Google Sitemap Reporting 404s under Summary
Note: This post is a follow-up to my earlier post, Googlebot Refreshing Supplementals.
I noticed this morning that Google Sitemap is reporting 404s on the Summary page. The pages Google Sitemap reports missing include some of the pages I’ve been getting emails for since May 20th.
Not all the urls showed up under HTTP errors, just around 11 of them.
As I said earlier, the only interesting thing about these 404 pages is that they no longer have any incoming links, and they exist only in Google’s supplemental index. So, I’m kinda hoping Google is in the process of refreshing their supplemental index by checking the urls in their database to see whether each one has changed, returns a 404/410, or 301 redirects to some other url.
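If you want to run the same kind of check on your own old urls, here is a rough Python sketch. To be clear, this is not how Googlebot does it (I have no idea what their code looks like); it just sorts a status code into the three buckets mentioned above. The names `classify_status` and `head_status` are made up for this example, and `head_status` only works for plain http/https urls.

```python
import http.client
from urllib.parse import urlsplit


def classify_status(status):
    """Sort an HTTP status code into a rough bucket (labels are my own)."""
    if status in (404, 410):
        return "gone"    # page is missing or permanently removed
    if status == 301:
        return "moved"   # permanent redirect to some other url
    if 200 <= status < 300:
        return "ok"      # page still resolves
    return "other"       # anything else: temporary redirects, 5xx, etc.


def head_status(url, timeout=10):
    """Send one HEAD request, without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Feed it the list of urls Google Sitemap flags, call `classify_status(head_status(url)[0])` on each one, and you can see which are truly gone and which quietly redirect somewhere.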
In fact, Matt Cutts has previously posted on his blog (520 freaking comments, longer than any forum thread I’ve ever seen, never mind a blog post) that a supplemental refresh has been going on since before April: “In early April, we started showing some refreshed supplemental results to users.” He also said the index/crawl team is turning its focus to refreshing supplementals: “Well, now that Bigdaddy is done, we’ve turned our focus to refreshing our supplemental results.” Another quote (this one I remember reading the day he posted it) suggests we have a glimmer of hope of having all of our supplementals refreshed by September: “I believe that folks here intend to refresh all of the supplemental results over the summer months, although I’m not 100% sure.”
Well…this is what I’d like to know: will a lack of high quality/relevant incoming links, or crappy outgoing links, put a site at the back of the bus during the big summer supplemental refresh, or what?
P.S. Anyway, assuming Google will refresh your supplemental pages by the end of the summer, if you or your client is having problems with supplemental pages, this is as good a time as any to make sure you don’t end up with a bunch of new supplemental pages and have to wait another year to have them refreshed again. Beef up those product pages, make sure your title/description metas are unique across your entire site, tighten up your dynamic url handling, etc.
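For the unique title/description part, a quick script can do the grunt work. This is only a sketch under a few assumptions: you already have each page’s HTML in a dict keyed by url, pages with an empty title or description are skipped rather than reported, and the names `HeadTagParser` and `find_duplicate_metas` are made up for this example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicate_metas(pages):
    """pages maps url -> HTML string.

    Returns (duplicate_titles, duplicate_descriptions), each a dict that
    maps the shared text to the list of urls using it.
    """
    by_title = defaultdict(list)
    by_desc = defaultdict(list)
    for url, html in pages.items():
        parser = HeadTagParser()
        parser.feed(html)
        parser.close()
        title = parser.title.strip()
        if title:  # assumption: empty titles are ignored here
            by_title[title].append(url)
        if parser.description:
            by_desc[parser.description].append(url)
    dup_titles = {t: u for t, u in by_title.items() if len(u) > 1}
    dup_descs = {d: u for d, u in by_desc.items() if len(u) > 1}
    return dup_titles, dup_descs
```

Any title or description that comes back in those dicts is shared by more than one url, so those are the pages to rewrite first.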