Thursday, December 08, 2005

Search Engine Saturation

Search engine saturation is a metric that shows how many of the pages in your web site a search engine has found. There are free online tools that will measure web site saturation. My favorite is MarketLeap.

Another site that offers a free tool to measure saturation is PageRank.net.

Many web sites offer a tool similar to the one you’ll find on PageRank.net. I prefer the MarketLeap tool because it maintains a saturation history, which lets you see trends. Each time you measure saturation, it stores another data point. Checking your saturation once a month is a good way to build up a useful trend report.

Monitoring your search engine saturation helps identify problems that may be preventing a search engine from seeing your complete site, such as poor navigation, poor page design or even possible banning of the site.

Low Page Counts

For example, I’ve just started working on a web site that has over 50,000 pages. The MarketLeap saturation report shows this site as having 234 pages in the Google index. There appears to be a problem, or actually several problems. Here are three:

a) About 35,000 of the pages require the visitor to enter a password. This prevents Google from indexing those pages. The password requirement is just to ensure the visitor has read the terms of use. The site owner would like Google to index the complete web site.

b) The web site is set up like a directory. There is a manual navigation system that provides “index” pages, each with 600 links. Although these are all internal links, this is rather unusual and Google may be seeing this as a link farm.

c) The link text is not meaningful. Links are labeled with just an alphabetical range (a-ad, ad-ah, ah-am, etc.) or with page numbers (page 1, page 2, page 3, etc.).

High Page Counts

The page count reported by a tool for measuring saturation will often exceed the total number of pages on your web site. For example, I have a web site that has about 3,000 pages, but saturation tools report that Google has over 10,000 pages in its index. What is going on?

You can search for the pages in Google by using the following in the Google search box (with no spaces):

site:www.website.com/

The problem is that, although Google will tell you how many pages are in the index, you can only see the listing for a limited number of those pages.

Part of the reason for the high number of pages is that search engines will list the same page multiple times using different URLs. For example, here are four URLs for the same page:

www.yoursite.com/
www.yoursite.com/index.htm
yoursite.com
yoursite.com/index.html
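
If you have a list of your own URLs (from a crawl or your server logs, say), you can get a rough idea of how many of them are really the same page. Here is a minimal Python sketch. It assumes that the "www." prefix and a trailing index.htm or index.html never change which page is served; that is true for the example above but not for every site. The function name canonical_key and the INDEX_FILES list are made up for this illustration.

```python
from urllib.parse import urlsplit

# Assumed default-document names; adjust for your own server setup.
INDEX_FILES = ("index.htm", "index.html")

def canonical_key(url):
    """Reduce a URL to a host + path key so duplicate URLs compare equal."""
    # urlsplit only recognizes the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Assumption: the www and non-www hosts serve identical content.
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    # Drop a trailing default document such as /index.htm.
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_FILES:
        path = head + "/"
    return host + path

urls = [
    "www.yoursite.com/",
    "www.yoursite.com/index.htm",
    "yoursite.com",
    "yoursite.com/index.html",
]
unique = {canonical_key(u) for u in urls}
print(len(urls), "URLs,", len(unique), "unique page(s)")
```

This only tells you about duplicates among URLs you already know. It cannot tell you which of them the search engine has actually indexed.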

Because you have no way to know how many duplicate pages are in the search engine’s index, you cannot treat saturation numbers as absolute. If there is a big discrepancy, such as in my first example, you know there is a problem. In most cases, however, saturation numbers should be treated as relative numbers that tell you whether your saturation is improving or not. For example, if you’ve been adding pages to your web site but your saturation is decreasing, there is a problem.
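
If you keep your own monthly record, the check described above is easy to automate. The following Python sketch flags any month in which the site grew but the reported index count fell. The numbers in the history list are invented for illustration, and the function name flag_problem_months is hypothetical; it is not part of any saturation tool.

```python
def flag_problem_months(history):
    """Return the months where the site grew but the indexed count dropped.

    history is a list of (month, pages_on_site, pages_in_index) tuples,
    in chronological order.
    """
    flagged = []
    # Compare each month with the one before it.
    for prev, curr in zip(history, history[1:]):
        site_grew = curr[1] > prev[1]
        index_shrank = curr[2] < prev[2]
        if site_grew and index_shrank:
            flagged.append(curr[0])
    return flagged

# Invented sample data: (month, pages on site, pages reported in index)
history = [
    ("2005-09", 2800, 9500),
    ("2005-10", 2900, 9900),
    ("2005-11", 3000, 9200),
    ("2005-12", 3000, 9100),
]
print(flag_problem_months(history))
```

With this sample data only 2005-11 is flagged, because that is the one month where pages were added and the index count went down. December is not flagged since the site did not grow that month, although a continued drop is still worth a look.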
