How To Exploit The PageRankBot Tool

Building a good house means more than buying a pine dining table or 1080p Plasma TV (more “quality” content) or telling your friends about the new house you’re building (marketing). You gotta know how to use hammers, drills and nails too.

If you’d rather build a good site than worry about supplemental results, why are you reading SEO blogs? Come on, be honest. When’s the last time you read an SEO blog that talked in depth about optimizing a dynamic page for fast page loads, repeating graphic elements on a page to create a sense of unity, or using element size and position to establish a visual hierarchy?

Never, right?

But if you’re a control freak like me, read on.

WTH Does It Do?

Though some of you guys gave me positive feedback via comments and email about PageRankBot, I’m not sure if all of you know exactly what to do with it.

In spite of the misleading name “Supplemental Results Detector”, it’s not a tool for detecting supplemental results. You have site:www.domain.com/& and site:www.domain.com/* for that. There are also other tools out there (I think Aaron Wall has one and sitemost just came out with a new tool).

I don’t really care how many of my pages are supplemental, but I do care when a page that deserves to rank in the SERP goes supplemental. One way to address that problem is PageRank distribution management. That’s what I built this tool for.

Tactics

First, figure out which pages on your site are important and which pages aren’t. Ask yourself: is this page valuable to my visitors? If the answer is no, the page can go. You might also ask yourself: what is this page supposed to rank for? If the answer is “contact me” or “privacy policy”, then ask yourself why the hell you would want traffic for “privacy policy”, and whether you’re out of your mind thinking you can rank on the first page for “privacy policy” alongside Google, Sun, Apple, Adobe, and NY Times.

But if your “contact me” page contains your email address or IM information and your clients find you by Googling for your contact info, I would keep the page in the main index.

To mark unimportant URLs, multi-select URLs that are unimportant, then Edit > Toggle Importance.

Now flag supplemental URLs. Some of you wish the tool did this for you automatically. It doesn’t. Instead, label the URLs returned by the site:www.domain.com* command by going to Edit > Mark Page As > Main Index.

You can use the search tool to find URLs. For example, the following image shows a search on seo4fun.com for urls containing the word “pagerank”:

[Image: supplemental results tool search feature]

Now go to View > Filters > Hide Marked, which hides all the URLs you just marked. Select all the URLs you see, and then set their status to supplemental.
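If you think in code, the marking step above boils down to a set difference: everything the crawl found, minus whatever you labeled Main Index, gets treated as supplemental. Here is a minimal Python sketch of that idea. The URLs are made up and `split_index` is a hypothetical helper name, not something PageRankBot exposes.

```python
def split_index(crawled_urls, main_index_urls):
    """Split crawled URLs into (main, supplemental), keeping crawl order."""
    main_set = set(main_index_urls)
    main = [url for url in crawled_urls if url in main_set]
    supplemental = [url for url in crawled_urls if url not in main_set]
    return main, supplemental


# Made-up example data: every URL the crawler found...
crawled = [
    "http://www.example.com/",
    "http://www.example.com/pagerank-basics",
    "http://www.example.com/old-post",
    "http://www.example.com/privacy-policy",
]
# ...and the URLs you labeled Main Index by hand.
marked_main = [
    "http://www.example.com/",
    "http://www.example.com/pagerank-basics",
]

main, supplemental = split_index(crawled, marked_main)
print(supplemental)
```

Anything you forget to mark ends up in the supplemental list, so it pays to double-check the main index list before trusting the split.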

Find Your Link Targets

To manage internal PageRank flow, you add internal links to your site. Decide which page you’re going to add a link on (link source) and which page you want that link to point to (link target).

To fish out your link “targets”, view only supplemental pages and sort them by PageRank (View > Filters > Show Supplementals and then click on the PageRank column). The topmost URL marked “important” is your best candidate:

1. The page is important to you (you feel the page deserves to rank in the SERPs).
2. The page is supplemental.
3. The page with the highest PageRank = easiest url to pull back into the main index.

[Image: Supplemental results sorted by pagerank]

There’s your link “target.”
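The three criteria above translate into a filter plus a sort. The sketch below assumes you have the crawl results as simple records; the `Page` class, its field names, and the sample numbers are all invented for illustration and are not the tool's data model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Page:
    url: str
    pagerank: float      # raw internal PageRank reported for the page
    important: bool      # you decided this page deserves to rank
    supplemental: bool   # you flagged it as supplemental


def best_link_target(pages: Iterable[Page]) -> Optional[Page]:
    """Return the important, supplemental page with the highest PageRank."""
    candidates = [p for p in pages if p.important and p.supplemental]
    return max(candidates, key=lambda p: p.pagerank, default=None)


# Invented sample data.
pages = [
    Page("/", 5.0, important=True, supplemental=False),
    Page("/seo-tips", 1.2, important=True, supplemental=True),
    Page("/old-rant", 0.4, important=True, supplemental=True),
    Page("/privacy-policy", 2.0, important=False, supplemental=True),
]

target = best_link_target(pages)
print(target.url if target else "no candidate")
```

Note that "/privacy-policy" loses out despite its higher PageRank because it isn't marked important.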

Note: If your site has multiple “entry points” (i.e. not all inbounds point to the home page), PageRank flowing into your site from those entry points will change the dynamics of how PageRank is distributed. In that case, take the PageRank values this tool gives you with a grain of salt.

If you’re anal enough to want to account for IBLs pointing at specific pages, then you can “add juice” by going to Tools > Simulate Backlinks. First, set the home page TBPR (use a float, like 4.2 for more accuracy). Go to View > Column Filters > Approximate TBPR. That will show you approximate TBPR numbers translated from raw PageRank numbers. Choose a URL, and adjust as needed using the + and - keys.

[Image: add juice]
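To see why entry points matter, here is a rough Python sketch of an internal PageRank calculation with optional "juice" injected at specific pages. It uses the classic formula PR(A) = (1 - d) + d * sum(PR(T) / C(T)); the extra `external` term is my own simplification of what simulated backlinks would do, and none of this is taken from PageRankBot's actual code. The three-page site is made up.

```python
def internal_pagerank(links, external=None, damping=0.85, iterations=50):
    """Iterate a simplified PageRank over an internal link graph.

    links:    dict mapping each page to the list of pages it links to.
              Every linked page must also appear as a key.
    external: optional dict of page -> extra PageRank injected on each
              pass (an assumed stand-in for backlinks hitting that page).
    """
    external = external or {}
    rank = {page: 1.0 for page in links}
    for _ in range(iterations):
        # Every page starts with the base amount plus any simulated juice.
        new_rank = {p: (1 - damping) + external.get(p, 0.0) for p in links}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Made-up site: the home page links to two posts, both link back home.
site = {"home": ["a", "b"], "a": ["home"], "b": ["home"]}

base = internal_pagerank(site)
boosted = internal_pagerank(site, external={"b": 1.0})
print(base)
print(boosted)
```

Without external juice, pages "a" and "b" come out identical. Pointing simulated backlinks at "b" lifts "b" and, through its link, the home page as well, which is the shift in dynamics the note above warns about.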

Find Your Link Sources

There are a few ways to figure out your link “sources.” One way is to find the page with the most PageRank bleed. (Don’t believe PageRank bleeds? We’ll argue about that in another post.) The amount of PageRank bleed depends on a URL’s (non-visible) PageRank and on the share of its links that are outbound. For example, a PageRank X URL with two outbound links and two internal links would bleed (X/4)*2 PageRank. A bigger X (more or better IBLs pointing to the URL) means more PageRank bleed. More internal links means less PageRank bleed, even if the number of outbound links stays the same.
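The (X/4)*2 arithmetic generalizes to "PageRank divided by total links, times outbound links." A tiny Python sketch, with a hypothetical function name and X set to 1.0 purely for illustration:

```python
def pagerank_bleed(pagerank, outbound_links, internal_links):
    """PageRank leaving the site: (PageRank / total links) * outbound links."""
    total_links = outbound_links + internal_links
    if total_links == 0:
        return 0.0
    return pagerank / total_links * outbound_links


# The example from the text: two outbound and two internal links.
print(pagerank_bleed(1.0, outbound_links=2, internal_links=2))

# Same outbound links, more internal links: less bleed.
print(pagerank_bleed(1.0, outbound_links=2, internal_links=6))
```

With X = 1.0, the first call gives half the page's PageRank leaving the site; adding four more internal links cuts that to a quarter.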

Let’s not get too obsessed with PageRank bleeds though. You can solidify Google’s trust in your links by linking out organically. A site that doesn’t link out needs a strong set of credible, trusted IBLs to “validate” with Google (e.g. amazon.com). Consider your outbound links a part of your link profile and a key ingredient in proving to Google that your linking habits are 100% natural with no artificial colors, flavors or sweeteners (yeah, I know that was bad).

Link from Pages with the Highest PageRank Bleed

First, limit results to URLs in the main index by going to View > Filter > Main Index Only, so you only link from URLs in the main index. Then sort by Outbound PageRank (click on the “Outbound PageRank” column header; if you don’t see the column displayed, go to View > Column Filter to activate it). The topmost URL with the biggest outbound PageRank is your link “source.”

[Image: outbound pagerank]

Link from Pages that Flow the Most PageRank

Another way is to find the page that flows the most PageRank with each link. Go to View > Filter > Main Index Only. Then click on the “Increment” column header, which sorts the results by PageRank flowing per link. The topmost URL with the biggest Increment bar is your link “source.”

[Image: pagerank increment]
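A sketch of the per-link idea in Python. I'm assuming "Increment" means a page's PageRank divided by the number of links on it; how the tool actually computes the column isn't spelled out here, so treat the formula, the field names, and the sample numbers as assumptions.

```python
def increment(pagerank, link_count):
    """PageRank passed through each link (assumed: PageRank / link count)."""
    return pagerank / link_count if link_count else 0.0


def rank_link_sources(pages):
    """Return main-index pages sorted by per-link PageRank, biggest first."""
    eligible = [p for p in pages if p["main_index"]]
    return sorted(
        eligible,
        key=lambda p: increment(p["pagerank"], p["links"]),
        reverse=True,
    )


# Invented sample data.
pages = [
    {"url": "/", "pagerank": 6.0, "links": 60, "main_index": True},
    {"url": "/popular-post", "pagerank": 3.0, "links": 10, "main_index": True},
    {"url": "/thin-page", "pagerank": 0.5, "links": 1, "main_index": False},
]

sources = rank_link_sources(pages)
print([p["url"] for p in sources])
```

In this made-up data the home page has the most PageRank overall, but it spreads it across 60 links, so "/popular-post" passes more with each link and sorts first.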

Connect the Dots

Finally, point a link from your link source to your link target.

If your modification isn’t sitewide, select the URL you just updated and recrawl that URL only instead of recrawling the entire site to update the site’s PageRanks.

[Image: recrawl url feature]

You can also try flattening out your site’s PageRank curve (see the two graphs in my previous post about Google hiding supplemental results).

Take It Slow

If your site has enough PageRank, Google should update your pages in the main index every 3-4 days, if not sooner (e.g. if you show up for Google News) - though dramatic on-page edits like rewriting a TITLE tag might make Google sit on a page for a week or two. It should take you no more than 3 days to get a URL out of the supplemental index, as long as you have enough URLs in the main index to play around with.


24 Responses to “How To Exploit The PageRankBot Tool”

  1. Great Post - I’ll need to take a few hours out with a test site to figure out all the functions above. I never realised I could simulate flow like this - great tool.

    Best rgds
    Richard

  2. Hey Halfdeck - this tutorial was way overdue… I wish I had this last week when playing with your PageRank Simulator - aehm … supplemental fixer… :-)

    So, now that the great doc is ready, it would be great to have that “pagerank request script url” parameter we talked about… if you need a test script, LMK please

    best,christoph

  3. Thanks Richard. If you don’t see bars for Outbound PageRank, download the latest version.

    Hey Christoph, thanks, I actually haven’t had the time to read the follow-up on your blog post; I’ll save that for this weekend.

  4. Wow, this is a great tool.

    I have a very weird problem with it tho. When it tries to crawl my site, it only produces 22 URLs out of a few hundred. I don’t understand why. I looked over my robots.txt file and that doesn’t look like that’s the problem. I’ve tried it on some other sites that I own, and it works fine, crawls the whole site.

    Any suggestions ?

    Ps. The site in question is the site that my name links to.

  5. Alex, the reason is that the tool handles “StyleTips101.com” and “styletips101.com” as two unique URLs. Google Toolbar also happens to have the same problem: in IE, example.com may return a green bar while example.COM returns 0 TBPR. Capitalization variations in a domain name aren’t a bad thing, but since capitalization in a path will cause canonical problems, the solution involves de-capitalizing only the domain portion of a URL - which is more costly than just de-capitalizing the entire URL.

    I might get around to fixing this, but in the mean time, if you use all lowercase URLs, the tool should work.
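For the curious, lowercasing only the domain portion might look like the Python sketch below. It uses the standard library's urllib.parse; `lowercase_host` is a hypothetical helper name and the path in the sample URL is made up. This is not how PageRankBot handles it.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_host(url):
    """Lowercase the host (and port) of a URL, leaving path and query alone."""
    parts = urlsplit(url)
    # Keep any "user:password@" prefix untouched, since it is case-sensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(parts._replace(netloc=netloc))


print(lowercase_host("http://StyleTips101.com/Summer/Looks?Page=2"))
```

Both "StyleTips101.com" and "styletips101.com" then map to the same URL, while the capitalization in the path and query string survives.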

  6. Hey Half, just a quick update. I’ve tried the tool on a few different machines, different ISPs, and still the same. I also tried altering the site (added 1 new post), and the crawler increased from 21 pages to 22…but the new page wasn’t even included. Hope that helps !

  7. I hope it’s ok to ask this here, if not let me know, but I’ve just downloaded the program and I’ve run it on a few small sites fine, but when I run it on a site with 590 URLs it seems to hang up every time at 582. Any possible explanation or fix?

    Thanks, it’s an awesome tool, looking forward to using it!
    Paul

  8. Hey Paul, thanks for the feedback.

    If the program hangs at, say, 590, restart it and run a partial crawl instead of a full crawl. Then look at the Status Code column (might have to activate it, just look under filters). Select the urls with a 405 status code and recrawl them.

    You can also run the program through DOS (see the install page for instructions - it’s a cake walk) and then email me the errors you see. Or just email me the URL you’re having trouble with and I’ll get back to you.

  9. Hey there,

    I changed it to all lowercase, and it works like a charm ! Thanks for taking the time to find a quick fix !

  10. Hello There,

    I have a question concerning the PR that is given in the green bar. How is this calculated? I’ve tested the tool and there is a great difference between the published PR in the Google toolbar and the PageRank given by the tool. That alone shouldn’t be a problem, but there is also a relative difference: pages with the highest PR (from the Google toolbar) should also have the highest PageRank given by the tool, yet this is not the case.

    Can anybody give me further info on this? Thanks!

  11. Hey Tsim,

    The “raw PageRanks” you see represent an internal PageRank calculation that isn’t a predictor of what you actually see in the toolbar. Actual PageRanks depend on PageRank entry points (which pages are your backlinks pointing at?) and the amount of PageRank flowing into your site from those entry points. There are other factors that affect the accuracy of the tool (e.g. PageRank discounted from certain links, some pages on your site that are excluded from PageRank calculation, etc). To simulate backlinks, like I explained in my post, use the “add juice” tool.

    I’d suggest you read up on the difference between Toolbar PageRank and internal PageRank to understand how Google calculates PageRanks internally and how Google exports them to the toolbar.

  12. […] Fortunately, there is a tool out there that you can use… and it's free (ain't the web cool?). Halfdeck (of SEO4Fun) has recently released a free tool called the PageRankBot that will spider your site and map out the distribution of PageRank. He's labeled it badly as a supplemental results detector, because it's actually a lot cooler than that. There will be some work involved in installing it, and I am not on board for tech support. With that caveat, it can be all kinds of fun to play with once you get it running. […]

  13. Fair warning… Get ready for some downloads… I didn’t Sphinn it or anything, but I will be mailing on it in the morning:
    http://www.seofaststart.com/blog/internal-nofollow-help

  14. Great tool! Thanks a lot for making this available.

    I am puzzled by what I’m seeing in the results for my site though. My contacts, warranty, shipping etc pages are coming up with the highest pagerank, which you would expect given that they are linked from the main navmenu on every other page - except that I have rel=”nofollow” on those links. The only thing I can think of is that the navmenu is in an ssi include file - but if the ssi wasn’t being processed, then the links wouldn’t be visible at all. Any idea what I’m missing here?

  15. Kevin, take a closer look at your source code:

    You need to swap rel-”nofollow” with rel=”nofollow”
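To illustrate why that one character matters, here is a small Python sketch of how a crawler might check links for nofollow using the standard library's html.parser. The `LinkCollector` class and the sample markup are made up for this example; PageRankBot's own parsing may work differently.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, is_nofollow) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow external".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Made-up markup: the first link has the hyphen typo, the second is correct.
sample = (
    '<a href="/contact" rel-"nofollow">Contact</a> '
    '<a href="/shipping" rel="nofollow">Shipping</a>'
)
links = collect_links(sample)
print(links)
```

With the hyphen typo there is no rel attribute at all as far as the parser is concerned, so the "/contact" link counts as a normal followed link.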

  16. Thanks Dan, I gave your article a nudge on Sphinn. I’d kinda like to see it on the “Greatest Hits” front page.

  17. Duhhh! .

    Thanks. That will certainly make a difference, eh.

  18. […] Originally Posted by wige Any ideas or suggestions how to test that the SI ratio calculation ONLY takes into account SI pages? 1. Supplemental Index Ratio Calculator 2. How To Exploit The PageRankBot Tool Half’s SEO Notebook 3. Manual checking I was working on our site’s supplementals today. Here is what the one tool found: Supplemental Index Ratio Calculator 1 page in an https page on http://www. 8 pages are the categories of our forums which are duplicated due to redirect mistake: To be specific, there two are indexed by Google: Code: […]

  19. […] Originally Posted by crankydave Ahhh yes. I rember when you posted this before. Not sure why you believe this tool shows you "hidden PageRank" but it is a very good tool for being able to illustrate and help you better visualize your internal link structure. Dave A good idea would be to read more about it here: How To Exploit The PageRankBot Tool Half’s SEO Notebook I mainly use the desktop version, and sometimes the web based version we installed on the Webnauts Net server. __________________ Search Engine Optimization Consulting Company | SEO Analysis Tool | SEO Articles & Tutorials […]

  20. Just a quick one - excellent tool and I look forward to using it to improve things slightly.

    One thing though is that with some things using PHP sessions (e.g. gallery2, oscommerce) the ?gcid parameter means the same URL will be added repeatedly with different session IDs because the bot presumably doesn’t keep the same one - meaning for my 50k+ site I can’t really use it - if you get a chance a future feature would be removing parameters from URLs or similar :D

    Other than that - Count me in as a newly subscribed reader :D
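One way the "removing parameters from URLs" idea could work, sketched in Python with urllib.parse. The parameter list is an assumption: gcid comes from the comment above, while PHPSESSID and osCsid are common session parameter names I added for illustration. `strip_session_params` is a hypothetical helper, not a PageRankBot feature.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names, compared case-insensitively.
SESSION_PARAMS = {"gcid", "phpsessid", "oscsid"}


def strip_session_params(url, drop=SESSION_PARAMS):
    """Remove session-style query parameters so duplicate URLs collapse."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in drop
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Made-up URLs that differ only by session ID.
first = strip_session_params("http://example.com/main.php?item=5&gcid=abc123")
second = strip_session_params("http://example.com/main.php?item=5&gcid=zzz999")
print(first)
print(first == second)
```

A crawler that normalized URLs this way before queueing them would see both variants as the same page. Note that urlencode re-encodes the remaining values, so unusual characters in a query string may come out escaped differently than they went in.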

  21. Hey Woody,

    Thanks for the feedback. If you email me an example URL, I’ll be happy to look it over.

  22. Thanks for nice tools, and nice explanation here. I use it regularly to check my site.

    Thanks
    Dewa, Bali

  23. Is there a way to export the results as PDF or XML / XLS?

  24. “Is there a way to export the results as PDF or XML / XLS?”

    Sorry Phillip, that’s on the todo list. I’m thinking of doing an upgrade on the tool either this month or next month, so if you shoot me an email I’ll email you back when the upgrade is released.
