Subject: Getting indexed better by Google
Category: Computers > Internet
Asked by: andrew_gray-ga
List Price: $5.00
Posted: 17 Mar 2003 05:56 PST
Expires: 16 Apr 2003 06:56 PDT
Question ID: 177287
I work for a publisher who owns a news and information service dedicated to accountants called "AccountingWEB" - the world's largest, with about 200,000 members. It's a database-driven site, and most of the daily news (about 10 stories a day) is delivered via CGIs, with only a synopsis appearing on the main home page. Google indexes the site, but we've never been able to get any of the news stories into the Google index (and our news stories are particularly valuable - at least to accountants). Can you give me guidance as to how to get our CGI-delivered news content indexed by Google without having to prepare a "static" HTML page for each one? Thanks.

ANDREW
www.accountingweb.co.uk
www.accountingweb.com
www.accountingweb.nl
Subject: Re: Getting indexed better by Google
Answered By: serenata-ga on 23 Mar 2003 19:01 PST
Hi Andrew ...

I am very familiar with AccountingWEB's website. It has important information which is often quoted or mentioned on sites I subscribe to, such as TaxMama.com and other accounting information sites.

Dynamic pages can indeed be a problem for indexing purposes. Google says in its Design and Content Guidelines to try to avoid them: "If you decide to use dynamic pages ... be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them small."
- http://www.google.com/webmasters/guidelines.html

I feel the information you have on your site is important, too, so it's a shame the stories are not getting indexed.

I did a search of Danny Sullivan's Search Engine Watch, and he has a suggestion which you may be able to use. This is from his Search Engine Placement Tips, updated October 2002: "Generating pages via CGI or database-delivery? Expect that some of the search engines won't be able to index them. Consider creating static pages whenever possible, perhaps using the database to update the pages, not to generate them on the fly."
- http://www.searchenginewatch.com/webmasters/tips.html

A search of Google for creating static pages with a dynamic content feed found a discussion on WebmasterWorld. It discusses using enough static text on a page to carry the rest of the dynamic content. You can see the discussion here:
- http://www.webmasterworld.com/forum3/6542.htm

The point is, a search engine needs a static page to index. So if you can make your pages static, with enough content to get indexed, you can feed the dynamic content into the page instead of making the entire page dynamic. It's not tricking the search engine, but it is giving the search engine something to actually find.

I am not familiar enough with your site's database or how it is dynamically built to help you achieve this, but I bet your designer can help you figure out how to do it (a rough sketch of the idea follows this answer).

Search terms used -
- search engines +dynamically generated pages
- searching dynamically generated pages
- search engines + cgi

I hope this helps and that you can get these pages indexed soon - even if they are dynamically generated.

Yours ever so,
Serenata
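To make the "use the database to update the pages, not to generate them on the fly" idea more concrete, here is a minimal sketch in Python of a script that writes one static, crawlable HTML file per story. The SQLite database, table, column names and output directory are assumptions made purely for illustration; AccountingWEB's actual schema and setup are not known here.

# A minimal sketch of regenerating static pages from the news database.
# Database path, table and column names are illustrative assumptions only.
import os
import sqlite3

TEMPLATE = """<html>
<head><title>%(title)s - AccountingWEB</title></head>
<body>
<h1>%(title)s</h1>
<div>%(body)s</div>
</body>
</html>"""

def publish_static_pages(db_path="news.db", out_dir="news"):
    os.makedirs(out_dir, exist_ok=True)
    conn = sqlite3.connect(db_path)
    for story_id, title, body in conn.execute(
            "SELECT id, title, body FROM stories"):
        # One crawlable static file per story, e.g. news/12345.html
        path = os.path.join(out_dir, "%d.html" % story_id)
        with open(path, "w") as f:
            f.write(TEMPLATE % {"title": title, "body": body})
    conn.close()

if __name__ == "__main__":
    publish_static_pages()

Run on a schedule (for example from cron) each time new stories are published, something along these lines keeps static copies in step with the database, while the live CGI pages can continue to carry user comments and other dynamic elements.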
Subject: Re: Getting indexed better by Google
From: robertskelton-ga on 17 Mar 2003 14:02 PST
Hi there,

Google sometimes indexes such content, but they don't say what criteria they use to decide whether or not to index it. I suggest just emailing them and letting them know: help@google.com

And while you are at it, suggest your site for inclusion in Google News: news-feedback@google.com
Subject: Re: Getting indexed better by Google
From: shobjanta-ga on 18 Mar 2003 14:56 PST
Usually, the search engines will not index URLs that use cgi-bin style parameters, i.e.

http://www.somedomain.com/cgi-bin/script?param1=value1&param2=value2

If your site is implemented in CGI and you still want your dynamic pages to get indexed, you can configure your web-server/cgi-bin subsystem to use a path scheme instead, so the corresponding URL might look like:

http://www.somedomain.com/cgi-bin/script/value1/value2

In that case Google and other search engines won't "know" this is a cgi-bin generated page and will happily index it. Of course, as with any other pages, if you want these dynamic pages to be indexed by Google (and others), they need to be linked from other pages.

How you set this up depends on the web server you are using. For instance, if you are using Apache, you can use the "RewriteRule" directive to do this, as sketched below.

Related Sites:
http://www.sitepoint.com/article/485
http://www.phpbb.com/phpBB/viewtopic.php?t=76843
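As a minimal sketch of that RewriteRule approach, assuming Apache with mod_rewrite enabled and the rule placed in the main server configuration - the script name (story.cgi) and parameter (id) are placeholders, not AccountingWEB's actual URLs:

# Assumes Apache with mod_rewrite; paths and names are placeholders.
RewriteEngine On
# Present a "clean" path such as /news/12345 to crawlers, and map it
# internally onto the underlying CGI call /cgi-bin/story.cgi?id=12345.
RewriteRule ^/news/([0-9]+)$ /cgi-bin/story.cgi?id=$1 [PT,L]

The pages are still generated dynamically; only the URL shape changes, so there is no need to pre-build a static HTML file for each story.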
Subject: Re: Getting indexed better by Google
From: andrew_gray-ga on 19 Mar 2003 02:24 PST
Thanks for this guidance. Am I right in thinking that if we attempt to solve the indexing problem in this way, we may introduce a different problem - more firewalls will start caching more pages? The news pages include dynamic elements (like user comments), so it's important that users get a fresh copy each time (not an old version from a cache along the way). As I understand it, the "?" in the CGI string alerts most caches not to cache. Thanks.

ANDREW
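For what it is worth, the usual way to keep proxy and firewall caches from serving stale copies once the "?" disappears from the URL is to have the CGI script send explicit cache-control headers itself. A minimal sketch of such a script in Python - hypothetical, not AccountingWEB's actual code:

#!/usr/bin/env python
# Hypothetical CGI sketch: emit headers that tell HTTP/1.1 and HTTP/1.0
# caches to revalidate, so users see fresh comments even on "clean" URLs.
print("Content-Type: text/html")
print("Cache-Control: no-cache, must-revalidate")  # HTTP/1.1 caches
print("Pragma: no-cache")                          # older HTTP/1.0 caches
print("Expires: 0")                                # already-expired response
print()                                            # blank line ends the headers
print("<html><body>...story text and the latest user comments...</body></html>")

Well-behaved caches honour these headers regardless of whether the URL contains a query string.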
Subject: Re: Getting indexed better by Google
From: shobjanta-ga on 21 Mar 2003 11:08 PST
"Am I right in thinking that if we attempt to solve the indexing problem in this way that we may introduce a different problem - more firewalls will start caching more pages" I am not sure I follow what you are saying here. Yes the search engines do cache these pages and they generally will index pages at some regular interval. So in your page, if some one has added the word "foobar", the search engine indexer will take a while to pick this word up. Is this what you mean? If so, there really is no way out of here other than to provide your site-users with your very own search engine, where you can control how frequently your content is indexed. Or are you referring to the fact that your page is wide open to the world, there by you are allowing all search engines to "see" your pages. Your initial question seems to suggest this is what you wanted, so I dont understand why this is a problem. Anyways, if you did want to prevent search engines from indexing your pages, you can place a file called "robots.txt" at the root of your webserver. For help, visit: http://www.robotstxt.org/wc/robots.html You can place directives in this file, asking web search indexing engines (called "spiders" or "robots") to either not visit certain sections of your sites, or do something specific, etc. |
Subject: Re: Getting indexed better by Google
From: googleexpert-ga on 23 Mar 2003 17:37 PST
You might want to try webmasterworld.com