Help the robots help you, and mankind
Wednesday, September 17th, 2008
The default mission of robots, more commonly known as spiders, is to crawl the web and meticulously index each site, archive important data and follow every link. Since they have been programmed to do this at birth, you do not have to tell them to act naturally.
The question then becomes: why would you not want them to index, archive or follow the links on your site? Although nobody knows the specific algorithms that the search giants use to rank pages, we can safely assume that duplicate data does not help your cause.
If you haven’t done so already, make it your goal to get the “right” pages to rank well. Picking those pages should be a fairly straightforward task. If you would like the robots to focus on your product-centric pages, direct them away from shopping carts, account login pages, driving directions and your returns or shipping policies.
Many sites include tons of duplicate data, CGI scripts and temporary data, all of which the robots can easily be told to overlook with a robots.txt file or a robots meta tag.
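As a rough illustration of the robots.txt approach, here is a minimal sketch in Python. The paths (/cart/, /login/, /cgi-bin/, /tmp/) and the helper name is_crawlable are assumptions made for the example, not anything your site necessarily uses. The standard library's urllib.robotparser module is used only to check how a compliant robot would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; substitute the paths your own site uses.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /login/
Disallow: /cgi-bin/
Disallow: /tmp/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if a compliant robot may fetch the given path.

    "ExampleBot" is a made-up user agent; it falls under the "*" rules.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Print each sample path with whether the rules above allow it.
    for path in ("/products/sweater.html", "/cart/view", "/cgi-bin/search.pl"):
        print(path, is_crawlable(path))
```

The file itself would normally be served from the root of the site as /robots.txt. Rules of this kind match by path prefix, so keeping the pages you want skipped under a few common directories keeps the file short.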
Think of a site that includes a separate page for every color an item is sold in. One sweater may have 10 or more pages, plus the master item page. By having the robots skip over the various flavors, you can have them concentrate on the more important item master.
Many sites include “printer friendly” versions of each page, or Adobe PDF versions available for download. These provide another rich source of repetitive data that should be made off limits to the robots, to help them do their job better.
Similarly, you can direct the robots not to archive certain pages. This is of particular importance if you run “buy it now” type sales and hate negotiating, after the sale expiration date, with aggressive bargain hunters who found an archived copy of the expired offer.
A couple of cases come to mind where you may want to have the robots skip over some select links. The first is if your website has pages that must be read in a specific order, such as a multi-page registration form; those pages could easily be misinterpreted if the robots take them out of context. The second is if you remove pages often and wish to avoid unnecessary 404 errors.
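The archiving and link-following cases above are usually handled per page with a robots meta tag in the page's head section. The sketch below is only an illustration: robots_meta_tag is a hypothetical helper name, and it covers just the three widely recognized directives noindex, nofollow and noarchive.

```python
# Directives this sketch knows about; search engines may support others.
KNOWN_DIRECTIVES = ("noindex", "nofollow", "noarchive")


def robots_meta_tag(*directives):
    """Build a robots meta tag string from one or more directives."""
    if not directives:
        raise ValueError("at least one directive is required")
    unknown = [d for d in directives if d not in KNOWN_DIRECTIVES]
    if unknown:
        raise ValueError("unsupported directive(s): " + ", ".join(unknown))
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


if __name__ == "__main__":
    # A time-limited sale page: keep it indexed, but ask robots not to archive it.
    print(robots_meta_tag("noarchive"))
    # One step of a multi-page form: keep it out of the index and skip its links.
    print(robots_meta_tag("noindex", "nofollow"))
```

Unlike robots.txt, which keeps robots from fetching a page at all, the meta tag is read only after the page has been fetched, so the two tools complement each other.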
By helping robots do their job better, you ultimately help searchers. In your quest to eliminate duplicate data and steer the robots away from non-essential parts of your website, you make it easier for searchers to get what they need faster, without wading through duplicate listings. It is a worthy cause for robots and humans alike.









