Supplemental Results Tests
This page links to old tests I ran to see whether the supplemental index had anything to do with a duplicate content penalty. The test came back negative: multiple pages with identical or near-identical content were indexed. That isn't proof to me, but it is still enough evidence, on top of the logic that says weak link juice is the primary signal of low value. Anyway, I moved the links over here to cut off juice to them.
General Rules of Thumb on Supplementals (Untested)
- Unique title/description.
- Changing a few words in the title/description for a DB-driven site will not make them unique. Eliminate as many duplicate words as possible in the title/description, so that only a small portion of the title/description repeats.
- Increase page size.
- Link to the page if it's orphaned.
- Force revisiting by submitting a sitemap.
- Body text begins with unique text.
- The body text isn't a copy of a text found elsewhere.
- The body text doesn't contain text snippets from another page.
- Don't leave out the META description. If several pages lack one and their on-page text starts with the same MENU elements, those pages may end up listed as supplemental.
- Avoid index.html/index.htm pages.
- Once a page is flagged as supplemental, it will not easily come back to the main index.
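The title/description rule above can be checked roughly by measuring how many words two titles share. This is a minimal sketch, assuming a plain word-set (Jaccard) overlap. The function names and sample titles are made up for illustration, and nothing here reflects how Google actually compares titles.

```python
import re

def title_words(title):
    """Lowercase the title and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def title_overlap(a, b):
    """Jaccard overlap of the two word sets: 0.0 = nothing shared, 1.0 = same words."""
    wa, wb = title_words(a), title_words(b)
    if not wa and not wb:
        return 1.0  # two empty titles are treated as identical
    return len(wa & wb) / len(wa | wb)

# Hypothetical DB-driven titles that differ only in a product number.
print(title_overlap("Cheap Widgets - Product #1", "Cheap Widgets - Product #2"))
```

A high overlap on titles like these suggests that only a small portion of the title is actually unique. Punctuation is ignored on purpose, so titles that differ only in punctuation score as identical.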
Here are a few things to test about supplemental listings:
- Unique Title / description. Does a page with duplicate title/description automatically turn supplemental?
- Unique keywords. If the page
- Unique H1/H2
- Do duplicate text snippets create supplementals, or penalize the page?
- Too many query parameters may turn pages into supplemental listings. Other constraints Google places on the main index may flag a page as supplemental. What constraints?
- Starting on-page text is unique to the domain. This means avoiding topping the page with identical navigation menus.
- The pages don't share identical text snippets.
- The page is not a complete copy of another page on the domain.
- Will pages with completely different content but identical TITLE/DESCRIPTION be listed as non-supplemental?
- Does a page have to exceed a certain filesize to be indexed as non-supplemental?
- How about two unique pages that start with an identical text snippet?
- Do pages go supplemental if the pages target competitive keywords and the page content is thin?
- Test pages that are 60%, 70%, 80%, and 90% similar.
- Test a page that is an exact copy of another page on one of my other domains.
- If the on-page link text is identical, but the links go to different URLs, will the page be marked as supplemental?
- Does getting flagged as supplemental have to do with unique text on the page (regardless of page size), or will a page get flagged as supplemental if the size of the page is too low? Can't be... since some top listings show tons of small pages. Some spam pages barely contain any text.
- Does the amount of code between <head> and <body> make it more likely that a page goes supplemental?
- Does a huge chunk of copied text turn a page into supplemental? Will they even be indexed?
- If titles contain the exact same text but use different punctuation (e.g. =++//), will the titles be considered unique?
- Effect of text snippets...How often can I use snippets without getting penalized? What length of snippets is too large?
- What if the titles/descriptions are unique but nearly identical?
- What if the titles and descriptions' unique elements are numerical? e.g. <title>product #1</title>, <title>Product #2</title>.
- Does a lack of paragraphs make it easier for pages to go supplemental?
- Does adding a long / unique META description help pages avoid supplemental listings?
- Is inflating filesize with JS/CSS/HTML effective? Or does it matter more when the total filesize is lower but the amount of text on the page is bigger?
- If the copy text is identical but the in-line HREFs point to different URLs, will that make the pages unique? Or are they dupes?
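For the 60%, 70%, 80%, and 90% similarity question in the list above, it helps to have a way to put a number on "similar". This is a rough sketch, assuming word-level matching with Python's difflib. The helper names and the filler token are hypothetical, and the percentage is only one possible definition of similarity, not whatever a search engine uses.

```python
import difflib

def similarity_percent(text_a, text_b):
    """Word-level similarity of two texts as a percentage (0-100)."""
    a, b = text_a.split(), text_b.split()
    # autojunk=False keeps difflib from discarding frequent words on long texts.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return 100.0 * matcher.ratio()

def make_variant(text, keep_fraction, filler="zzqx"):
    """Keep the leading keep_fraction of the words; swap the rest for numbered filler."""
    words = text.split()
    keep = round(len(words) * keep_fraction)
    replaced = [f"{filler}{i}" for i in range(len(words) - keep)]
    return " ".join(words[:keep] + replaced)

# Placeholder "original" page text made of 100 distinct tokens.
original = " ".join(f"word{i}" for i in range(100))
for fraction in (0.6, 0.7, 0.8, 0.9):
    variant = make_variant(original, fraction)
    print(fraction, round(similarity_percent(original, variant), 1))
```

Because the variant keeps the original words at the top and changes the tail, it also doubles as a test page where the duplicated text comes first. Reversing the slices would put the unique text on top instead.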
How To Test Supplementals?
- Generate a 200-word text. Then copy the page exactly to another page, and see what happens.
- Do the same, except use two unique TITLES
- Generate a 200-word text block and use it on 2 pages. For each page, generate a unique 200-word text block and paste it below the first text block.
- Generate two pages with the same content, except for TITLE/DESCRIPTION that are unique.
- What if I put the page in a frame?
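The first few cases above can be generated with a short script. This is only a sketch: the random letter strings stand in for real copy, and the page titles, descriptions, seed, and function names are all placeholders I made up for illustration.

```python
import random
import string

def random_word(rng, min_len=3, max_len=9):
    """Return a nonsense lowercase word of random length."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def text_block(rng, n_words=200):
    """Return a block of n_words nonsense words."""
    return " ".join(random_word(rng) for _ in range(n_words))

def build_page(title, description, body_blocks):
    """Wrap the text blocks in a bare-bones HTML page, one <p> per block."""
    paragraphs = "\n".join(f"<p>{block}</p>" for block in body_blocks)
    return (
        "<html>\n<head>\n"
        f"<title>{title}</title>\n"
        f'<meta name="description" content="{description}">\n'
        "</head>\n<body>\n"
        f"{paragraphs}\n"
        "</body>\n</html>\n"
    )

rng = random.Random(2006)  # fixed seed so the pages can be regenerated
shared = text_block(rng)

# Case 1: an exact copy of the page.
page_a = build_page("Test A", "Description A", [shared])
page_b = page_a
# Case 2: same body, unique TITLE/DESCRIPTION.
page_c = build_page("Test C", "Description C", [shared])
# Case 3: shared first block, unique second block on each page.
page_d = build_page("Test D", "Description D", [shared, text_block(rng)])
page_e = build_page("Test E", "Description E", [shared, text_block(rng)])
```

Nonsense words have the side benefit that nothing else on the web competes for them, which makes it easier to see which test pages actually got indexed.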
Supplementals Test Pages
These pages are optimized for the words "se4funsupp testcases". UPDATE (9/3/2006): These pages had been wiped off the face of Google the last time I checked. Even the original copy is gone, which implies there may be some kind of directory-level duplicate content filter at play.
- Test 1: The original page.
- Test 4: NAV MENU PROBLEM TEST: Unique body text follows the original text. This tests whether a huge nav menu can cause two pages with some different text to be listed as similar.
- Test 5: Another original page with unique TITLE/description and body text, just to make sure we know what will definitely evade dup listing penalty.
- Duplicate Content Test: I'll test pages with different similarity percentages here. To simplify this test, I'll mostly use unique titles and descriptions for each page. Another question when dealing with snippets: does the position on the page where the snippets appear matter? Note that pages with longer META descriptions end up with a lower Text to HTML%.
||Visible Text Size ||Text to HTML% ||Similarity to the Original (%) ||Unique Text on Top
- Text Snippet Tests:
- Title/Description Tests:
- Test 6: This page has unique body text, but the title/description are identical to page 5. Both pages contain 100 words.
- Test 3: Unique TITLE/DESCRIPTION, otherwise identical to test 1.
- Head / Body Tag Test: Since I've read somewhere that a missing </head> or <body> tag can create supplemental pages, I'm testing this using pages with 500 words on them. TEST 16 is the original page with all tags intact, and should not be listed as supplemental.
- Page Size Test: This tests Google's threshold for page size. This still assumes Google only looks at the page size, and the actual text on the page doesn't matter. It also assumes pages with <P> are treated the same as pages with text snippets and links. There are no META descriptions for these pages; they do have all unique page text and <title> tags. Page size was checked using SEW's Webpage Size Checker (link's dead now). My assumption is that you can get a 5 word page indexed by Google (according to what I've seen Jim W do), but we shall see. Also, some SEOs recommend at least 200 words per page.
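Since the size checker link is dead, the word count, page size, and Text to HTML% figures can be approximated locally. This is a sketch under my own assumptions: "visible text" here means everything outside <script>, <style>, and <title>, and the ratio is visible text bytes over total HTML bytes. That is not necessarily the formula the original tool used, and the class and function names are made up.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text that is not inside <script>, <style>, or <title>."""
    SKIP = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)

def page_stats(html):
    """Return word count, byte sizes, and text-to-HTML percentage for a page."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not inflate the text size.
    text = " ".join(" ".join(parser.parts).split())
    text_bytes = len(text.encode("utf-8"))
    html_bytes = len(html.encode("utf-8"))
    return {
        "words": len(text.split()),
        "text_bytes": text_bytes,
        "html_bytes": html_bytes,
        "text_to_html_pct": 100.0 * text_bytes / html_bytes if html_bytes else 0.0,
    }

sample = "<html><head><title>T</title></head><body><p>one two three</p></body></html>"
print(page_stats(sample))
```

The META description lives in an attribute rather than in visible text, so a longer description adds HTML bytes without adding text bytes. That is why it pulls the percentage down.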
Supplementals Test Log
2/27/2006 - MSN. Only root listing on Google; 0 pages in Yahoo.
- only the index page ranks for "Brandnewsacx1 sasahz"
- site:seo4fun.com brandnewsacx1 sasahz: #1 index page #2 www.seo4fun.com/test-1/7.html
- www.seo4fun.com/test-1/5.html (META DESCRIPTION ONLY) is not displayed. This might mean including keywords in META description alone will do NOTHING for MSN; though including the META desc when a page has that keyword on the page is another question. Another weird thing is a search for brandnewsacx1 sasahz shows 1 result, but site:seo4fun.com brandnewsacx1 sasahz results in 2 listings.
- A search for site:seo4fun.com carcasherdotcom seocontest returns only the home page, whose link text matches the query. Why isn't the other page showing up? I don't see this problem in Google, where the contest page shows up above the index page.
- site:seo4fun.com "carcasherdotcom seocontest" pulls 2 listings, with the inner page on top. But why the hell do I need the quotes? I might add a link to another page using that text and see if that makes a difference. I'm already using <h1> and <strong> but that doesn't seem to do the trick.
- Same effect is going on with the other keyword. "brandnewsacx1 sasahz" returns 2 results (only 1 without quotes).
- A search for the single word brandnewsacx1 returns 2 results, which is what I expected. Does MSN have a problem identifying 2+ word phrases?
- Ok this is weird. Now "Brandnewsacx1 sasahz" is returning 2 urls, the home page and 1-7. Cache date is 2/24, and the page text has not changed. Did the home page change? Nope. Home page is also cached on the same date. The cache hasn't changed from what I can see, but why are both pages showing now? Weird.
- site:seo4fun.com carcasherdotcom seocontest also now returns 2 urls instead of just one, with the more "relevant" page showing up top.
- Interesting that the home page ranks above 1-7. Why? Higher PR?
- Both pages are unoptimized for the term.
- Home page uses the term in anchor text.
- In 1-7, the word shows up in one paragraph.
- So...outbound anchor text is a bigger boost than text in <P>?
Supplementals Test Conclusions
- Keywords in META description alone will not rank a page for those keywords.