Sorry you are having trouble.
Many webmasters wonder how to ensure their sites will be included in
Google's index of web sites. Although Google crawls more than 2
billion pages, it is inevitable some sites will be missed. When Google
does miss a site, it is frequently for one of the following reasons:
The site is not well connected through multiple links to others on the web.
The site launched after Google's last crawl was completed.
The design of the site makes it difficult for Google to effectively crawl its content.
Google's intent is to represent the content of the Internet fairly and
accurately. To help make that goal a reality, here is a guide to
building a "crawler-friendly" site. There are no guarantees a site
will be found by our crawler, but following these guidelines should
increase the probability that your site will show up in Google search
results.
Do...
Provide high-quality content on your page - especially your home page.
If you follow only one tip from this page, this should be it. Our
crawler indexes web pages by analyzing the content of the pages
themselves. Google will index your site better if your pages contain
useful information. Plus, your site has a better chance of becoming a
favorite among web surfers and being linked to by others if the
information it contains is relevant and useful.
Do submit your site to the appropriate category in a web directory.
Listing your site in the Open Directory Project or Yahoo! increases
the likelihood it will be seen by robot crawlers and web surfers.
Do pay attention to HTML conventions. Make sure that your <TITLE> tags and
ALT attributes are accurate and descriptive. Also, check your <A HREF>
links for errors, since broken or improperly formatted links can prevent
Google from indexing your page.
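As a rough illustration of the checks described above, here is a minimal Python sketch that scans a page for a missing title, images without ALT text, and links with an empty HREF. The names ConventionChecker and check_page are made up for this example, and the three checks are only an assumption about what "accurate and descriptive" might mean in practice. It cannot judge whether a title or ALT text is actually descriptive.

```python
from html.parser import HTMLParser


class ConventionChecker(HTMLParser):
    """Collects simple HTML-convention problems (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_text = ""
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.problems.append("img without ALT text: " + str(attrs.get("src")))
        elif tag == "a" and "href" in attrs and not (attrs["href"] or "").strip():
            self.problems.append("link with empty HREF")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Accumulate only the text that appears inside <title>.
        if self.in_title:
            self.title_text += data


def check_page(html):
    """Return a list of problem descriptions found in the HTML text."""
    checker = ConventionChecker()
    checker.feed(html)
    checker.close()
    if not checker.title_text.strip():
        checker.problems.insert(0, "missing or empty TITLE")
    return checker.problems
```

Calling check_page on the text of a page returns an empty list when none of these three problems is found.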
Do make use of the robots.txt file on your web server. This file tells
crawlers which directories can or cannot be crawled. Make sure it is
current for your site so that you don't accidentally block our
crawler. Visit: http://www.robotstxt.org/wc/faq.html for an FAQ
answering questions regarding robots and how to control them once they
visit your site.
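One way to check that a robots.txt file is not blocking pages by accident is to test your important URLs against it. The sketch below uses Python's standard urllib.robotparser module. The rules and the example.com URLs are invented for illustration, and "Googlebot" is assumed as the crawler's user-agent name.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt contents and URLs, for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /drafts/
"""

EXAMPLE_URLS = [
    "http://www.example.com/",
    "http://www.example.com/drafts/new-listing.html",
]

print(blocked_urls(EXAMPLE_ROBOTS_TXT, EXAMPLE_URLS))
```

If a page you want indexed shows up in the returned list, the file needs updating.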
Do ensure that your site is accessible through HTML hyperlinks.
Generally, your site is crawlable if the pages are connected to each
other with ordinary HTML links. If certain areas are not linked, you
may be excluding older browsers, differently-abled users, and Google.
Google can crawl content from a database or other dynamically
generated content as long as it can be found by following links. If
you have many unlinked pages, you may want to create a jump page from
which the crawler can find all of your pages.
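To see which pages a link-following crawler could miss, you can model the site as a map from each page to the pages it links to and walk it from the home page. This is only a sketch under that assumption. The page names are hypothetical, and a real check would first have to extract the links from your HTML.

```python
from collections import deque


def unreachable_pages(links, start):
    """Return pages that cannot be reached from start by following links.

    links maps each page to the list of pages it links to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(links) - seen)


# Hypothetical site: /old-listings.html links out, but nothing links to it.
SITE = {
    "/": ["/towns.html", "/golf.html"],
    "/towns.html": ["/"],
    "/golf.html": [],
    "/old-listings.html": ["/"],
}

print(unreachable_pages(SITE, "/"))
```

Adding a jump page that is linked from the home page and that links to every other page makes the returned list empty.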
Do build your site with a logical link structure. A hierarchical link
structure is not only beneficial to you, but also to Google. More of
your site can be crawled if it is laid out with a clear
architecture.
Don't...
Fill your page with lists of keywords, attempt to "cloak" pages, or
put up "crawler only" pages. If your site contains pages, links or text
that you do not intend visitors to see, Google considers them
deceptive and may ignore your site.
Do not feel obligated to purchase a search optimization service. Some
companies "guarantee" your site a place near the top of a results
page. While legitimate consulting firms can improve your site's flow
and content, others employ deceptive tactics to try to fool search
engines. Be careful - if your domain is affiliated with one of these
services, it could be permanently banned from our index.
Do not use images to display important names, content or links. Our
crawler does not recognize text contained in graphics. Use ALT attributes
if the main content and key words on your page cannot be formatted in
regular HTML.
Do not provide multiple copies of a page under different URLs. Many
sites offer text-only or printer-friendly versions of pages that
contain the same content as the graphic-enriched version of the page.
While Google crawls these pages, duplicates are removed from our
index. In order to ensure that we have the desired version of your
page, place the other versions in separate directories and use the
robots.txt file to block our crawler.
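As a sketch of that last step, the snippet below builds robots.txt rules for the directories holding the alternate versions and confirms that only those directories are blocked. The directory names "print" and "text-only", the example.com URLs, and the helper robots_txt_for are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser


def robots_txt_for(duplicate_dirs, user_agent="*"):
    """Build robots.txt text that disallows each directory of duplicate pages."""
    lines = ["User-agent: " + user_agent]
    for directory in duplicate_dirs:
        lines.append("Disallow: /" + directory.strip("/") + "/")
    return "\n".join(lines) + "\n"


# Hypothetical directories that hold printer-friendly and text-only copies.
text = robots_txt_for(["print", "text-only"])
print(text)

# Confirm that the copies are blocked and the main version is still allowed.
parser = RobotFileParser()
parser.parse(text.splitlines())
print(parser.can_fetch("Googlebot", "http://www.example.com/print/page.html"))
print(parser.can_fetch("Googlebot", "http://www.example.com/page.html"))
```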
For more information, see:
://www.google.com/webmasters/index.html
Or contact Google at:
help@google.com
Hope this helps.
Request for Answer Clarification by duckman-ga on 02 May 2002 02:44 PDT
Thanks for your very informative answers.
I do need clarification on the following points that you made since
they do not seem to be the cause of http://www.capebuyerbroker.com/
having been dropped from your database:
1. Re: "The site is not well connected through multiple links to
others on the web"
My husband tells me that several organizations have links to our site.
There were 10 to 20 links to http://www.capebuyerbroker.com/ during
Google's crawl in March, when http://www.capebuyerbroker.com/ was
listed on Google's fifth page of results of a search on "Cape Cod Real
Estate."
2. Re: "The site launched after Google's last crawl was completed"
Our site was launched over a year ago.
3. Re: "The design of the site makes it difficult for Google to
effectively crawl its content."
We had already followed many of your excellent suggestions, and are
trying to implement the others. On the first page there are links to
information about arts and music, fishing, golf and other sports,
disability information, government, towns, schools, museums, outdoor
activities, restaurants, and places to stay. Gathering this
information took my husband and me many hours. No other Cape Cod real
estate site has this much information about Cape Cod.
Also, http://www.capebuyerbroker.com/ has been listed by DMOZ.
Could the following be part of the reason that
http://www.capebuyerbroker.com/ is not in the Google database:
We created a prototype site on free web pages provided by our ISP
before we launched our real site at http://www.capebuyerbroker.com/.
The address was www.tiac.net/users/marietta.
Google and DMOZ added www.tiac.net/users/marietta to their databases
along with http://www.capebuyerbroker.com/.
My husband asked Google and DMOZ to eliminate
www.tiac.net/users/marietta from their databases because it was just a
prototype. In addition, we no longer use tiac as our ISP. He was
trying to save Google users from being directed to a dead link.
Might Google have eliminated our real site at
http://www.capebuyerbroker.com/ instead of the prototype site at
www.tiac.net/users/marietta?
Thanks,
Marietta Nilson REALTOR and Google fan