Google Answers: Only 3% of one of our dynamic info pages are listed in Google. Why?


Q: Only 3% of one of our dynamic info pages are listed in Google. Why? ( Answered, 1 Comment )

Question

Subject: Only 3% of one of our dynamic info pages are listed in Google. Why?
Category: Computers > Internet
Asked by: xsjah-ga
List Price: $20.00

Posted: 20 Jun 2003 07:09 PDT
Expires: 20 Jul 2003 07:09 PDT
Question ID: 219622

Our site (eil.com) is composed almost entirely of dynamic ASP pages.
One of our information pages, moreinfo.asp, shows people all the details
of each item we have in stock, which is over 250,000 items.

Only about 7000 of a potential 250,000+ of our moreinfo.asp pages are
listed.  I'm getting this number by searching for:
site:eil.com moreinfo.asp
The listings appear without the Title or Subject; they show just
as a link, e.g.
http://eil.com/shop/moreinfo.asp?catalogid=8566
The example above is a SUZANNE VEGA Blood Makes Noise US promo.  If I
search for this item in Google using the regular search, the results
don't include our moreinfo page; somewhere further down in the listing
I can see our Suzanne Vega main listing page, which includes this item.
It's very strange.

Could someone shed some light on what's happening?

Answer

Subject: Re: Only 3% of one of our dynamic info pages are listed in Google. Why?
Answered By: robertskelton-ga on 21 Jun 2003 17:51 PDT

Hi there,

I think the answer is best understood when you see Google's point of
view.

As far as I can tell, the main reason why Google takes roughly one
month to update their index is that it takes that long to spider
and index the 3 billion pages in their database. It is important to
Google that they refrain from indexing any data that might be
superfluous to the needs of a searcher.

Obviously every page in your website is unique, and would be an
important search listing for anyone searching for a precise product.
However, the Google software doesn't make human decisions; it makes
computer ones. The biggest problem is that Google has no idea how many
dynamically generated pages your site has, nor how useful they are. If
you were hosted by Geocities then obviously they wouldn't index
any of your .asp pages, because it would probably be a waste of time.
Probably the only criterion they can judge your site by is link
popularity.

It is quite possible for a site to have 100 billion dynamically
generated pages. So Google makes a decision to index only X number of
pages. Only Google search engineers know how such a decision is
arrived at.

It is obviously of importance to Google to have the most information
in their index that they can (this service, for example, is not just a
means of getting answers for a fee, it is becoming, and will be, a
huge database of qualified information). GoogleGuy (the closest to an
official spokesperson for Google you can find online) says:

"We're getting better on dynamic pages every month thanks to better
analysis. I think we crawl dynamic pages better than any general
search engine at this point.. "
http://www.webmasterworld.com/forum3/12370.htm

"General rule of thumb is that Googlebot is willing to ingest just
about anything. The corollary is to keep the number of parameters
small and to keep those parameters short (no session IDs, for
example)."
http://www.webmasterworld.com/forum5/1730.htm
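
To make that rule of thumb concrete, here is a minimal sketch in Python
of how you might screen your own URLs against it. The thresholds
(MAX_PARAMS, MAX_VALUE_LENGTH) and the list of session parameter names
are my own assumptions for illustration; Google has not published the
actual rules Googlebot applies, and the second URL is made up.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical thresholds and names, chosen only for illustration.
MAX_PARAMS = 2
MAX_VALUE_LENGTH = 10
SESSION_NAMES = {"sid", "sessionid", "session_id", "phpsessid", "jsessionid"}


def crawl_warnings(url):
    """Return reasons a URL might conflict with the quoted rule of thumb."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    warnings = []
    if len(params) > MAX_PARAMS:
        warnings.append(f"{len(params)} parameters (more than {MAX_PARAMS})")
    for name, value in params:
        if len(value) > MAX_VALUE_LENGTH:
            warnings.append(f"long value for '{name}'")
        if name.lower() in SESSION_NAMES:
            warnings.append(f"'{name}' looks like a session ID")
    return warnings


# The moreinfo.asp URL from the question: one short parameter.
print(crawl_warnings("http://eil.com/shop/moreinfo.asp?catalogid=8566"))

# A made-up URL that carries a session ID.
print(crawl_warnings(
    "http://example.com/item.asp?id=1&cat=2&sessionid=ABCDEF0123456789"))
```

The first call returns an empty list, the second returns three
warnings. A clean result only means the URL fits the rule of thumb
under these assumed thresholds, not that the page will be indexed.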

You pass the guideline above: you have only one parameter. The only
reason for Google limiting its index of your site is:

"We are able to index dynamically generated pages. However, because
our web crawler can easily overwhelm and crash sites serving dynamic
content, we limit the amount of dynamic pages we index."
http://www.google.com/webmasters/2.html

It's that simple. I like to think that once a site becomes well known
(think Amazon), then all of its pages will be indexed. However, the
largest count I found for Amazon was 1,140,000:

site:amazon.com ISBN
http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=site:amazon%2Ecom+ISBN

AlltheWeb have 30,000 of your pages indexed:
http://www.alltheweb.com/urlinfo?q=eil.com&c=web

The biggest number of results I got from Google was 24,100:

site:eil.com Artist
http://www.google.com/search?num=30&hl=en&lr=&ie=UTF-8&oe=UTF-8&newwindow=1&safe=off&q=site%3Aeil.com+Artist
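
To put these counts in proportion, here is a rough sketch in Python
that expresses each one as a share of the 250,000 item pages mentioned
in the question. Treat it as an approximation: all of the counts are
search engine estimates, and the "Artist" and AlltheWeb figures cover
every eil.com page that matched, not only moreinfo.asp pages. The names
in the code are my own.

```python
# Approximate number of moreinfo.asp item pages, from the question.
TOTAL_ITEM_PAGES = 250_000

# Counts reported in this thread; each is an estimate from the search engine.
reported_counts = {
    "Google, site:eil.com moreinfo.asp": 7_000,
    "Google, site:eil.com Artist": 24_100,
    "AlltheWeb, eil.com": 30_000,
}


def coverage_percent(indexed, total=TOTAL_ITEM_PAGES):
    """Return indexed pages as a percentage of the total."""
    return 100.0 * indexed / total


for label, count in reported_counts.items():
    print(f"{label}: {coverage_percent(count):.1f}%")
```

The first figure comes out at 2.8%, which is the "3%" in the subject
line; the other two come out at roughly 10% and 12%.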

Have you investigated Froogle?
http://froogle.google.com

Froogle is designed to index sites such as yours, and hopefully one
day it will be a widely used web shopping tool.

Official Froogle info:
http://froogle.google.com/froogle/merchants.html

Article about Froogle:
http://www.sitepoint.com/article/1060

Let me know (via a clarification) if you need additional info on this
topic...

Best wishes,
robertskelton-ga

Comments

Subject: Re: Only 3% of one of our dynamic info pages are listed in Google. Why?
From: respree-ga on 20 Jun 2003 10:16 PDT

I'm afraid I can't offer you an answer, but are you sure it's not
around 50,000 that have been indexed?

http://www.google.com/search?hl=en&lr=&ie=UTF-8&q=+site:eil.com+eil

