“PageRank Doesn’t Matter” is Now Officially an SEO Myth
Back in August, after having fixed my 99% supplemental site to the point where duplicate content could not possibly be an issue, Google nonchalantly stuck my original pages back in the supplemental index. At that point, I claimed the existence of a PageRank hurdle preventing supplemental pages from getting back into the main index:
On gfe-eh.google.com and other DCs, the supplemental pages cache dates no longer go all the way back to Aug 2005. But the new “system” makes it even harder to tell why a page is listed in the supplemental index, because now you’re required to jump over at least two major hurdles to break out of supplemental hell: 1) duplicate content issues (i.e. identical meta tags, multiple urls resolving to the same content, www/non-www, etc) and 2) “Trust” / PageRank. A perfectly structured page with original content could remain stuck in the supplemental index if a domain lacks juice.
On October 11, 2006, Matt Cutts confirmed that PageRank is the primary factor Google uses to decide whether to chuck pages into supplemental hell. He later reiterated the point at PubCon, and Adam Lasnik recently echoed the same idea in the Google Webmaster Help group.
That PageRank still matters in this day and age didn’t sit well with some SEOs, who often de-emphasize PageRank in favor of other factors like “trust”, domain age, user data, links from authority sites, link neighborhood, co-citation, link history, link age, webmaster profile, SERP CTR, DMOZ listing, and topical relevance of links. Some black hats, G-Man for example, claimed supplemental results are mainly due to duplicate content issues. Rand Fishkin also claimed that PageRank has a limited role in determining a page’s supplemental status. He reasoned that orphaned pages with no links to them may turn supplemental, but that in most cases duplicate content is the predominant factor.
This anti-PageRank attitude stems from several commonly-held beliefs: 1) PageRank has been devalued during the past few years; 2) PageRank by itself will not guarantee high rankings; and 3) PageRank is just one of hundreds of factors Google considers when ranking a page for a search term. Some webmasters also obsess too much over PageRank, paying hundreds of dollars to buy links on high-TBPR pages in hopes of boosting their own TBPR.
Still, PageRank isn’t quite dead. Google still cares about the quantity and quality of links pointing to a page. How does Google keep track of that? You guessed it.
The supplemental index is Google’s version of the Interweb junkyard. It’s the holding space for whatever lands on Google’s cutting-room floor. It’s where your pages end up if Google thinks they lack value. Again, how does Google measure page value? As far as dupe issues are concerned, it uses on-page text. Otherwise, Google looks at links. The primary metric Google uses to size up the links pointing to a page, of course, is PageRank.
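To make concrete what “a metric for links” means, here is a toy sketch of the classic published PageRank formula (the iterative PR(p) = (1-d)/N + d · Σ PR(q)/outdegree(q) from the original Brin and Page paper). The three-page link graph is purely hypothetical, and Google’s production system obviously layers much more on top of this, but it shows why a page with few inbound links accumulates little PageRank:

```python
# Toy power-iteration sketch of the classic PageRank formula:
#   PR(p) = (1 - d) / N + d * sum(PR(q) / outdegree(q) for each q linking to p)
# The link graph below is hypothetical, for illustration only.

def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links out to."""
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}          # start with uniform scores
    for _ in range(iterations):
        new = {}
        for p in pages:
            # Each inbound link passes a share of the linker's PageRank,
            # split evenly across that linker's outbound links.
            inbound = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            new[p] = (1 - d) / n + d * inbound
        pr = new
    return pr

# Hypothetical three-page site: both inner pages link back to "home".
graph = {
    "home":   ["about", "orphan"],
    "about":  ["home"],
    "orphan": ["home"],
}
scores = pagerank(graph)
# "home" ends up with the highest score: it has two inbound links,
# while "about" and "orphan" each have only one.
```

The point of the sketch is simply that PageRank is mechanical bookkeeping over links — quantity and quality in, a single number out — which is why a page on a domain without link juice has so little of it to inherit.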
That doesn’t mean low PageRank is the only reason a page lands in the supplemental index. For example, the following comment from Adam Lasnik implies that duplicate content is clearly another major factor:
Joe Parts, I took a look at the examples you gave (thanks!) and — aside from the PR Toolbar issue I noted above — I did notice that at least a few of the URLs you noted come up directly in searches for text on the page. But there’s not much text on those pages for us to go by, unfortunately, and so it’s not surprising that some seem to be perceived as similar content to other pages on the net.
So it’s not just about low PageRank, even if that’s the main reason pages “go” supplemental. In Joe’s case, other contributing factors, some of which Adam alludes to, include 1) similar shingles across pages, 2) devalued inbound links, and 3) an outdated TBPR.
“PageRank (toolbar PR) doesn’t matter (much anymore (in ranking))” is now officially an SEO myth, and a misleading statement at best. Links matter and have always mattered. PageRank is just a metric for links. It used to be an inaccurate and spam-prone metric, but now that Google nukes sold links, devalues link schemes, blocks PageRank from flowing to spam sites via nofollow, and updates PageRank on a daily basis, it has become a more reliable metric that better reflects page value.
The fact that Google already uses PageRank to determine crawl frequency, crawl depth, and supplemental indexing shows how confident Googlers feel (rightly or wrongly) about the accuracy of PageRank in place today.