Why Google Will Not Move Away From PageRank
Recently, I’ve gotten a little flak in the Google Webmaster Help group for coming down hard on people in the “Google is broke” camp. Basically, some of them were upset that the supplemental index is based heavily on PageRank, because, in their view, Google’s PageRank paradigm is broken and unfair:
- Google may misread the intent of a link. For example, partly because so many people link to domain.com, it has amassed a TBPR of 8. Those links aren’t meant as citations, yet Google apparently counts them as such.
- A site can’t get organic links unless it already has links (a.k.a. Mike Grehan’s “rich get richer” syndrome). This would be true if Google were the only source of a site’s visibility. But we’ve also got Technorati, RSS, Yahoo, MSN, Reddit, Digg, Myspace, YouTube, paid advertising… True, in some niches (e.g. porn), Reddit or Digg isn’t going to work, and people are more hesitant to link to you. In general, though, while “rich get richer” is a fact of life, there are many ways around it. As Adam Lasnik pointed out, once upon a time YouTube.com was TBPR 0.
- The guy with the deepest pockets wins. Whoever can afford to spend the most on paid advertising and paid links will end up on top. Thus mom-and-pop sites will never have a chance in Google Search, or so they argue.
- The Paris Hilton syndrome. The PageRank paradigm degrades search into a popularity contest: people link to what’s popular, even if it has no value or is completely untrue.
Basing a page’s value on links thus has several potential downsides (like anything else). Instead of PageRank, they argue, indexing should be based on the quality of on-page text - a statement that prompts the question: “Are you on crack?”
Answer me this: how can a computer program read, understand, and judge the quality of an article compared to other articles written on the same topic? It can’t - not until Google discovers Artificial Intelligence. Sure, there are ways to look for on-page spammy fingerprints (e.g. illogical sentence structures, excessively high keyword density, overuse of bold and italics). But given two well-written articles, how does a machine decide - based solely on on-page text - which one is more valuable?
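For what it’s worth, that kind of spam fingerprinting is easy to mechanize. Here’s a minimal sketch in Python - the 10% threshold and the flagging rule are made up for illustration, not anything Google has published:

```python
def keyword_density(text: str, keyword: str) -> float:
    """Fraction of words in `text` that are exactly `keyword` (case-insensitive)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

def looks_stuffed(text: str, keyword: str, threshold: float = 0.10) -> bool:
    """Flag a page whose keyword density exceeds an arbitrary threshold."""
    return keyword_density(text, keyword) > threshold
```

A density check like this can catch clumsy keyword stuffing, but it says nothing about which of two clean, well-written articles is better - which is exactly the point.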
Relevance for a keyword can, of course, be guessed at by looking at things like the TITLE tag, keyword frequency, keyword location on the page, and keywords in H1. Relevance, however, has nada to do with page value or page quality.
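That kind of relevance guessing is mechanical enough to sketch. Here’s a toy scorer in Python using the signals just listed; the weights (3 for a TITLE match, 2 for an H1 match, raw frequency for the body) are hypothetical illustrations, not Google’s actual values:

```python
import re

def relevance_score(html: str, keyword: str) -> float:
    """Crude keyword-relevance score from TITLE, H1, and body frequency."""
    kw = keyword.lower()
    score = 0.0
    title = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    if title and kw in title.group(1).lower():
        score += 3.0  # hypothetical weight for a TITLE match
    for h1 in re.findall(r"<h1>(.*?)</h1>", html, re.I | re.S):
        if kw in h1.lower():
            score += 2.0  # hypothetical weight for an H1 match
            break
    body = re.sub(r"<[^>]+>", " ", html).lower().split()
    score += body.count(kw) / max(len(body), 1)  # keyword frequency
    return score
```

Notice that a scorer like this measures how *relevant* a page is to a query, not how *good* the page is - two very different things.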
How can a program judge the value of a page using on-page text alone when, from its POV, everything looks like a random string of symbols? To gauge a page’s value, there is simply no other option than to analyze off-page factors.
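And the off-page analysis Google chose is no mystery: PageRank treats links as votes and scores pages by iterating over the link graph. A stripped-down power-iteration sketch in Python (the toy graph in the test is invented, and real PageRank adds many refinements this sketch omits):

```python
def pagerank(links: dict, damping: float = 0.85, iters: int = 50) -> dict:
    """Toy PageRank: `links` maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)  # split rank among outlinks
                for q in outs:
                    new[q] += share
            else:
                # Dangling page with no outlinks: redistribute its rank evenly.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank
```

Note that nothing in the computation looks at page content at all: a page’s score comes entirely from who links to it, which is precisely the off-page analysis argued for above.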
Ok, so it sucks that without inbound links or a decent internal link structure, Google will chuck a potentially great site into the supplemental index. Been there, done that. But if you think Google should base its indexing on on-page text quality instead of inbound links, I suggest you try spending a few days coding your own search engine. You’ll eventually realize that what you want Google to do, at the present state of technology, is like wanting WordPress to write posts for you, or like wanting your wife to become a rock legend overnight, or like wanting a billion-dollar white-hat website that builds and markets itself (and all you have to do is deposit checks in the bank every month).