How Google Failed to Hide Supplemental Results
If you’re an SEO with clients who are worried about supplemental results, your job just got a whole lot harder. It’s like having a patient dying of a disease that shows no visible symptoms. Not only does he believe he isn’t sick anymore, but you can’t tell what he’s sick with.
First, your clients should know that just because they don’t see the supplemental results label anymore, it doesn’t mean their worries are over. Their supplemental pages are still supplemental. Google is just trying to hide the fact.
They should also know that just because the label is gone doesn’t mean you can’t detect supplemental results. You can, and here’s how:
- site:www.domain.com/& hack, which seems to pull up urls that used to be labeled supplemental. Of course, now that Danny Sullivan has blogged about it, that hack probably won’t last another week. (UPDATE: According to Danny, the hack was covered last week, the same week I pulled almost all my SEO feeds from Google Reader so I wouldn’t be bombarded by SEO news. Bad timing, I guess.)
- site:www.domain.com/* shows pages in the main index.
- Old cache date. If a page’s cache date is old, it’s a sign that the page may be supplemental. Why? Because Google doesn’t refresh a supplemental result’s cache all that often. For example, my blog’s main urls have cache dates of Jul 25-26, 2007 (today’s date: Aug 1, 2007) while my old supplemental pages have cache dates as old as Jul 6-7, 2007.
- Little or no traffic for competitive terms. If a specific URL gets no Google hits for two-word queries, or no traffic at all, it may be supplemental.
- Uneven PageRank distribution, which you can check by downloading the Supplemental Results Detector Tool. See how sugarrae.com and www.seo4fun.com distribute PageRank?
See how sugarrae’s PageRank distribution is pretty even, so that there isn’t a huge gap between the home page and the deep pages? The site is 99% free of supplemental results. Yeah, it’s a high Toolbar PageRank (TBPR) site with only ~100 pages (which means plenty of PageRank to go around for each page), but so is vanessafox.com (TBPR 7 with ~100 pages), which has more supplemental results than sugarrae.com.
(orange urls are supplemental)
In contrast, www.seo4fun.com concentrates PageRank on just a handful of pages while the rest of the site gets very little attention. Consequently, some of the urls near the bottom of the chart with low link popularity are supplemental.
- Low Toolbar PageRank (0-3). The toolbar is a weak indicator because it updates with a delay, but more green generally means less chance of a page being supplemental.
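Here is a rough Python sketch of how the checks in the list above could be strung together. Treat it as an illustration, not a detector. The `PageSignals` record, the function names, and the 14-day cache threshold are my own assumptions. Only the two site: query patterns and the 0-3 toolbar range come from the list. You would have to collect the cache dates, toolbar values, and visit counts yourself.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from statistics import median


def site_queries(host: str) -> dict[str, str]:
    """Build the two site: queries from the list above for a given host."""
    return {
        "formerly_supplemental": f"site:{host}/&",
        "main_index": f"site:{host}/*",
    }


@dataclass
class PageSignals:
    """Hypothetical record of signals gathered by hand for one URL."""
    url: str
    cache_date: date    # date shown on Google's cached copy
    toolbar_pr: int     # Toolbar PageRank, 0-10
    google_visits: int  # recent Google referrals to this URL


def warning_signs(page: PageSignals, today: date,
                  max_cache_age_days: int = 14) -> list[str]:
    """Return the warning signs a page shows (14 days is an assumed cutoff)."""
    signs = []
    if (today - page.cache_date).days > max_cache_age_days:
        signs.append("old cache date")
    if page.toolbar_pr <= 3:
        signs.append("low toolbar PageRank")
    if page.google_visits == 0:
        signs.append("no Google traffic")
    return signs


def pagerank_gap(home_pr: int, deep_prs: list[int]) -> float:
    """Gap between the home page and the median deep page (bigger = less even)."""
    return home_pr - median(deep_prs)


today = date(2007, 8, 1)
fresh = PageSignals("/", date(2007, 7, 26), toolbar_pr=5, google_visits=120)
stale = PageSignals("/old-post", date(2007, 7, 6), toolbar_pr=1, google_visits=0)

print(site_queries("www.domain.com"))
print(warning_signs(fresh, today))  # no signs
print(warning_signs(stale, today))  # all three signs
```

None of these signs is proof on its own. The more of them a URL collects, the more likely it is supplemental.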
(One interesting tidbit I found in the recent Google blog post is Matt/Prashanth Koppula saying a url with complicated query strings might also go supplemental. At this point, considering that Dave said stale pages go supplemental as well, it’s probably safe to assume a myriad of minor factors are involved.)
(After reading the post, the reasons why Google likes supplemental results are pretty clear:
1. Crawl the web more fully to serve ~1000 results (or maybe Google’s satisfied with just 10-100) for every possible search query, which means a) a happier user and b) more pages to display AdWords on.
2. Improve efficiency by taking advantage of prioritized crawling: crawl important, frequently updated pages more often and less important, rarely updated pages less often. Unfortunately, this often means only the home page and top-level nav pages get indexed while pages with actual content fail to make it into the main index. I often get frustrated by a search result that lands me on a blog category page with 40+ blog post links instead of the blog post itself.)
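To make the prioritized crawling idea concrete, here is a toy Python simulation. It is purely my own sketch and says nothing about how Google actually schedules crawls. The `importance` and `change_rate` scores, the 1-to-30-day revisit range, and all the names are hypothetical. The point is only that a scheduler like this visits the home page constantly and barely touches deep pages.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class CrawlJob:
    next_visit_day: float                      # the only field used for ordering
    url: str = field(compare=False)
    importance: float = field(compare=False)   # hypothetical score, 0..1
    change_rate: float = field(compare=False)  # hypothetical score, 0..1


def revisit_interval(importance, change_rate, min_days=1.0, max_days=30.0):
    """Higher scores give a shorter wait before the next crawl (assumed formula)."""
    score = (importance + change_rate) / 2
    return max_days - score * (max_days - min_days)


def simulate(pages, days):
    """pages: list of (url, importance, change_rate). Returns crawls per url."""
    queue = [CrawlJob(0.0, url, imp, chg) for url, imp, chg in pages]
    heapq.heapify(queue)
    counts = {url: 0 for url, _, _ in pages}
    # Always crawl whichever page is due soonest, then reschedule it.
    while queue and queue[0].next_visit_day < days:
        job = heapq.heappop(queue)
        counts[job.url] += 1
        job.next_visit_day += revisit_interval(job.importance, job.change_rate)
        heapq.heappush(queue, job)
    return counts


# A busy home page versus a deep page nobody links to, over 30 days.
print(simulate([("/", 1.0, 1.0), ("/deep-post", 0.0, 0.0)], days=30))
```

In this toy setup the home page gets crawled every day while the deep post gets crawled once in the whole month, which is the same imbalance described above.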
A site with many pages in the main index receives traffic for competitive two-word queries, and that traffic lands not on just a handful of pages but on thousands of them. Googlers promise that, by the end of the summer, supplemental results will generate more traffic and rank for more terms. We’ll see. There are a ton of spam pages in the supplemental index, so Google will have to walk a fine line. Otherwise, odd query terms will be swamped with low-PageRank spam while legitimate supplemental results never see the light of day.