Why Duplicate Content Causes Supplemental Results
Nearly a year after I had my tentative say about supplemental results, my thinking on what causes them has shifted away from duplicate content and toward a lack of inbound links and poor internal PageRank distribution. The reason: Matt Cutts recently stated that PageRank is the primary factor in determining supplemental results.
Back around May 18, 2006, Matt Cutts had this to say about combatting supplemental results by optimizing internal link structure:
typically the depth of the directory doesn’t make any difference for us; PageRank is a much larger factor. So without knowing your site, I’d look at trying to make sure that your site is using your PageRank well. A tree structure with a certain fanout at each level is usually a good way of doing it.
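To put some rough numbers on the "tree structure with a certain fanout" idea, here's a minimal sketch. It says nothing about how Google actually computes PageRank; it only counts how many clicks from the home page the deepest page sits at when every page links to the same number of child pages. The function name `clicks_to_reach` and the 10,000-page site are made up for illustration.

```python
def clicks_to_reach(total_pages, fanout):
    """Depth of the deepest page in a link tree holding total_pages pages,
    assuming every page links to exactly `fanout` child pages."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    depth = 0
    pages_at_level = 1  # depth 0 is just the home page
    reachable = 1       # pages reachable within `depth` clicks
    while reachable < total_pages:
        pages_at_level *= fanout
        reachable += pages_at_level
        depth += 1
    return depth


if __name__ == "__main__":
    # Hypothetical 10,000-page site at a few different fanouts
    for fanout in (5, 10, 100):
        print(fanout, clicks_to_reach(10_000, fanout))
```

A wider fanout keeps every page within fewer clicks of the home page, which is usually where most of a site's inbound links land. The URL directory depth never enters into it, only the link structure.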
Some people questioned my thinking when I advocated optimizing your internal link structure for better PageRank distribution, probably because they just hate hearing the word PageRank.
The Word PageRank Causes Friction (tangent)
So I’ve been replacing the word “PageRank” with words like “authority”, “trust”, “link juice”, “link value”, “visibility” or “link weight.” If I say “your page doesn’t have enough PageRank,” that makes some people:
- cringe, like they’re at the dentist, when suddenly the dentist whips out a dental drill. I think it’s some kinda Pavlovian response - the word “PageRank” just makes some people feel ill.
- look at the toolbar, and tell me “hey, this PR 0 page is in the main index, while this PR 6 page is marked supplemental. So you’re obviously wrong.” First of all, please don’t say “PR.” PR means Public Relations or Puerto Rico. Second of all, when I say TBPR, I’m talking about the Toolbar PageRank. When I say PageRank, I’m talking about internal PageRank, the one you can’t see. Just because the toolbar says 0 doesn’t mean there are no links to that page.
Seriously, some people just get stupid when they hear the word “PageRank”, so to get around that debacle, I say “your page doesn’t have enough trust” or “your page doesn’t have enough quality inbound links” or “your page lacks visibility.” But really, all I’m saying is your page doesn’t have enough internal PageRank.
Why Duplicate Content Is Responsible for Supplemental Results
Anyway, back to duplicate content. I’m starting to see people say duplicate content doesn’t matter at all when it comes to supplemental results. I disagree.
If you have two pages with identical or similar content, both with thousands of authoritative inbound links, those pages are not supplemental-index-bait. Google’s solution to duplicate content is not the supplemental index. On the contrary, Googlers insist that they filter out duplicate content.
But when you’ve got links to domain.com and domain.com/index.php, you’re splitting link juice between two pages instead of one. When you’ve got multiple URLs like index.php?sessionId=290302342, index.php?sessionId=20343400, and index.php?sessionId=023123321 generating identical content, and you’ve got links to all of those URLs, again, you’re splitting link weight between all those pages. The URLs with just a couple of low-value inbound links end up “going supplemental” while URLs with tons of other sites linking to them get to sit pretty in the main index.
So yeah, you can have multiple duplicate content pages in the main index as long as they’re well-linked-to. But links to multiple URLs hosting identical content within your own site will dilute PageRank and cause some link-starved pages to “go supplemental.”
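The link-splitting described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `canonicalize` function, the `IGNORED_PARAMS` and `DEFAULT_DOCS` rule sets, and the inbound-link counts are all made up, and raw link counts are a crude stand-in for PageRank, not how Google measures it.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical rules for this sketch; a real site would need its own.
IGNORED_PARAMS = {"sessionid"}              # query parameters that don't change content
DEFAULT_DOCS = {"index.php", "index.html"}  # documents served for a bare directory


def canonicalize(url):
    """Collapse URL variants that serve identical content into one URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Drop a trailing default document: /index.php becomes /
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_DOCS:
        path = head + "/"
    # Drop session-style parameters, keep the rest in their original order
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in IGNORED_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, urlencode(kept), ""))


# Made-up inbound links: each entry is the URL some other page links to.
inbound_links = (
    ["http://domain.com"] * 3
    + ["http://domain.com/index.php"] * 2
    + ["http://domain.com/index.php?sessionId=290302342",
       "http://domain.com/index.php?sessionId=20343400",
       "http://domain.com/index.php?sessionId=023123321"]
)

split_counts = Counter(inbound_links)                             # links as they are today
merged_counts = Counter(canonicalize(u) for u in inbound_links)   # links if consolidated

if __name__ == "__main__":
    print(split_counts)
    print(merged_counts)
```

In the split version, eight links are spread across five URLs and three of those URLs have a single link each. Consolidated, one URL gets all eight. In practice the consolidation would come from something like redirecting the variants to a single URL and linking consistently within your own site; the code just shows what you stand to gain.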