Hi gjcmg-ga ~
You ask a question that many people ask, usually because they want to
know how to get dynamic pages indexed in search engines.
Please remember that Google Answers Researchers are independent
contractors. We are not privy to any inside information about Google's
(or any other search engine's) well-guarded algorithms.
In answering questions of this nature, I rely on information used on a
regular basis in my own business and other information provided by
experts in design for search engine optimization (SEO), recognized
experts in the field of SEO and other reliable resources to provide
the most current reliable information available.
My quick interpretation of what you were told about being 'penalized'
for dynamic pages is that they will not always be indexed, so any
content on those pages that is created dynamically is not likely to
show up in search engine results. I think your marketing person meant
'penalized' in this sense: if you want that information to be
indexed and to show up in search engines, it won't be.
==================
What Google Says
==================
Here's what Google says about indexing 'dynamic' pages:
"Your pages are dynamically generated. We are able to index
dynamically generated pages. However, because our web crawler can
easily overwhelm and crash sites serving dynamic content, we limit the
amount of dynamic pages we index." (See Google's "Reasons your site
may not be included".)
- ://www.google.com/webmasters/2.html#A1
Google also addresses dynamically generated pages in its "Design and
Content Guidelines", saying:
"If you decide to use dynamic pages (i.e., the URL contains a '?'
character), be aware that not every search engine spider crawls
dynamic pages as well as static pages. It helps to keep the parameters
short and the number of them small."
- ://www.google.com/webmasters/guidelines.html
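To make that guideline concrete, here is a minimal Python sketch that
reports whether a URL is 'dynamic' in Google's sense (it contains a '?')
and how many parameters it carries. The function name describe_url and
the limit of two parameters are my own assumptions for illustration;
Google does not publish a specific number.

```python
from urllib.parse import urlsplit, parse_qsl


def describe_url(url, max_params=2):
    """Report whether a URL looks dynamic and how many parameters it has.

    max_params is an arbitrary illustrative threshold, not a published limit.
    """
    query = urlsplit(url).query
    # keep_blank_values=True so that "?a=&b=1" still counts as two parameters
    params = parse_qsl(query, keep_blank_values=True)
    return {
        "dynamic": "?" in url,
        "param_count": len(params),
        "within_limit": len(params) <= max_params,
    }


if __name__ == "__main__":
    print(describe_url("http://www.yoursite.com/index.cfm?category=widgets&size_id=11"))
```

Running something like this over a list of your own URLs is a quick way
to spot pages that carry long parameter strings.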
Your concept of dynamically generated pages, "I've used 'dynamic' to
refer to web pages built from code and information from databases.
..." is correct, and that is precisely why some search engines cannot
always index dynamically generated pages and produce those pages as
search results.
If every page, including your index page, is created dynamically, then
it will be especially difficult, if not impossible, for any search
engine to index and list your site. Therefore, you are 'penalized' by
not being included.
=========================
But What About MetaTags?
=========================
In an ideal world, you could dynamically generate your entire site,
and let the search engines rely on your keyword and description
metatags to deliver your site on a search of your key words.
Unfortunately, a lot of people caught on to the 'keyword' thing and
were stuffing those and the description metatags with terms which
didn't necessarily exist on the page, but which might get them a high
placement under certain search terms.
Because of that, most metatags are now ignored or carry so little
weight in search engine algorithms that there has to be something else
on the page for a search engine to index and follow.
For designing a page that's easy to index, Google recommends:
"* Make a site with a clear hierarchy and text links. Every
page should be reachable from at least one static text link.
* Offer a site map to your users with links that point to the
important parts of your site. If the site map is larger than
100 or so links, you may want to break the site map into
separate pages.
* Create a useful, information-rich site and write pages that
clearly and accurately describe your content.
* Think about the words users would type to find your pages,
and make sure that your site actually includes those words
within it.
* Try to use text instead of images to display important names,
content, or links. The Google crawler doesn't recognize text
contained in images.
* Make sure that your TITLE and ALT tags are descriptive and
accurate.
* Check for broken links and correct HTML.
* If you decide to use dynamic pages (i.e., the URL contains a
'?' character), be aware that not every search engine spider
crawls dynamic pages as well as static pages. It helps to keep
the parameters short and the number of them small.
* Keep the links on a given page to a reasonable number (fewer
than 100)."
- ://www.google.com/webmasters/guidelines.html
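A few of those points can be checked mechanically. The following is only
a rough Python sketch, using the standard library's html.parser, that
counts the links on a page, separates static links from dynamic ones
(those containing a '?'), and counts images without ALT text. The names
LinkAudit and audit are hypothetical, and the check is far simpler than
anything a search engine actually does.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect links and count images that lack ALT text."""

    def __init__(self):
        super().__init__()
        self.static_links = []
        self.dynamic_links = []
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            href = attrs["href"]
            # A '?' in the URL is what the guidelines call a dynamic page
            if "?" in href:
                self.dynamic_links.append(href)
            else:
                self.static_links.append(href)
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1


def audit(html, max_links=100):
    """Summarize a page against a few of the guidelines quoted above."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    total = len(parser.static_links) + len(parser.dynamic_links)
    return {
        "total_links": total,
        # The guideline asks for fewer than 100 links on a page
        "too_many_links": total >= max_links,
        "static_links": len(parser.static_links),
        "dynamic_links": len(parser.dynamic_links),
        "images_missing_alt": parser.images_missing_alt,
    }
```

You would feed audit() the HTML your server actually sends out, since
that generated output, not your source code, is what a spider sees.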
The reason for having at least one page with all the recommended
design points is to give the search engine something to link to and
search.
On the other hand, if you're dynamically generating your entire site,
there are ways to make it easier for search engines to crawl your site
and index the pages. There are several articles which explain how this
can be done.
Some articles you may want to take a look at:
1. J.K. Bowman, proprietor of the Spider Food web site, offers an
informative overview of both the problem and the solution for many
types of web databases.
- Optimization for Dynamic Web Sites
http://spider-food.net/dynamic-page-optimization.html
2. Search Engine Optimization Ethics discusses it in an article
entitled "Optimizing Dynamic Web Pages"
http://www.searchengineethics.com/dynamicpages.htm
3. SEO Chat also has an excellent article by Barry Schwartz, "Dynamic
URLs In The Eyes Of A Search Engine" dated June 9, 2003, that helps
explain what the search engine sees and how to get the dynamic pages
indexed.
http://www.seochat.com/articles/1/page1.html
4. Jill Whalen, recognized as one of the foremost search engine guides,
recently wrote another article, "Optimizing Dynamic Content for
Search Engines" (8/11/2003)
http://www.searchengineguide.com/whalen/2003/0811_jw1.html
5. Brian Gilley of SEO Position wrote an article, "Dynamic Content for
Search Engines" (January 2003)
http://www.seoposition.com/articles/seo1.html
===============================
Dynamic Pages
With Certain Search Engines
===============================
In addition to the above articles, there are some discussions among
the webmasters and website owners on Webmaster World about search
engines and dynamic pages.
I performed a search on Webmaster World for just the term "dynamic",
and you will notice that I left the dynamic parameters in the URLs. If
you list those URLs within your links (as I have done below), the
information will be dynamically delivered with the latest additions to
the discussion threads, yet would also be indexed by most search
engines.
1. In this thread, there is a discussion of Inktomi crawling
dynamically generated pages:
http://www.webmasterworld.com/forum1/2210.htm?highlight=dynamic
2. There is another thread that gives good examples of dynamically
driven pages and static pages with dynamic content, which may help
answer your own question about 'dynamic' pages here:
http://www.webmasterworld.com/forum12/882.htm?highlight=dynamic
3. Another discussion specifically about Google and indexing dynamic
pages can be found here:
http://www.webmasterworld.com/forum3/15532.htm?highlight=dynamic
4. A specific discussion on ways to build templates and aid in
recognition of dynamically generated pages can be found in this
thread:
http://www.webmasterworld.com/forum88/296.htm?highlight=dynamic
================
Summary
================
There *are* ways to get dynamically generated web pages indexed by
search engines.
For smaller sites with a manageable number of dynamic pages, sometimes
it is easier to create a site map that features links to each of the
dynamically generated pages. The obvious benefit is that if you're
using a content management system, it can still be used to update the
dynamic pages - but search engines have a static page that serves as a
doorway to them so that they can be properly spidered.
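As a rough illustration of the site map approach, the Python sketch
below writes a plain static HTML page of text links from a list of
dynamic URLs. The function name build_site_map and the sample pages are
hypothetical; in practice the list would come from your own database or
content management system.

```python
from html import escape


def build_site_map(pages, title="Site Map"):
    """Return a static HTML page of text links.

    pages is an iterable of (link_text, url) pairs.
    """
    items = "\n".join(
        # escape() turns '&' into '&amp;', which is required inside href values
        f'  <li><a href="{escape(url, quote=True)}">{escape(text)}</a></li>'
        for text, url in pages
    )
    heading = escape(title)
    return (
        "<html>\n"
        f"<head><title>{heading}</title></head>\n"
        "<body>\n"
        f"<h1>{heading}</h1>\n"
        "<ul>\n"
        f"{items}\n"
        "</ul>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    sample = [
        ("Widgets, size 11", "http://www.yoursite.com/index.cfm?category=widgets&size_id=11"),
        ("Widgets, size 12", "http://www.yoursite.com/index.cfm?category=widgets&size_id=12"),
    ]
    with open("sitemap.html", "w", encoding="utf-8") as fh:
        fh.write(build_site_map(sample))
```

The resulting sitemap.html is an ordinary static file, so it can be
linked from your home page. If the list grows past 100 or so links,
Google's guidelines quoted earlier suggest splitting it into separate
pages.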
For larger sites, the best way is to employ one of the many methods
used to change the syntax of a dynamic URL so that it appears to be
static.
An example could be:
http://www.yoursite.com/index.cfm?category=widgets&size_id=11
which could be rewritten as:
http://www.yoursite.com/category/widgets/size_id/11/index.cfm
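To show the mapping itself, here is a small Python sketch that turns a
dynamic URL into the static-looking form and back again. It only
illustrates the string transformation. On a live site the web server
(for example through a rewrite rule) has to translate the static-looking
URL back into the real one before your script runs. The function names
to_static and to_dynamic are my own, and to_dynamic assumes the script
lives at the site root and that no parameter value is empty.

```python
from urllib.parse import (
    parse_qsl, quote, unquote, urlencode, urlsplit, urlunsplit,
)


def to_static(url):
    """Move query parameters into the path as /key/value/ pairs."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)  # parameters with blank values are dropped
    if not pairs:
        return url
    directory, _, script = parts.path.rpartition("/")
    segments = [quote(piece, safe="") for pair in pairs for piece in pair]
    path = "/".join([directory, *segments, script])
    return urlunsplit((parts.scheme, parts.netloc, path, "", parts.fragment))


def to_dynamic(url):
    """Reverse to_static, assuming the script sits at the site root."""
    parts = urlsplit(url)
    *segments, script = parts.path.strip("/").split("/")
    if len(segments) % 2:
        raise ValueError("expected key/value pairs before the script name")
    pairs = [(unquote(k), unquote(v))
             for k, v in zip(segments[0::2], segments[1::2])]
    return urlunsplit(
        (parts.scheme, parts.netloc, "/" + script, urlencode(pairs), parts.fragment)
    )


if __name__ == "__main__":
    original = "http://www.yoursite.com/index.cfm?category=widgets&size_id=11"
    print(to_static(original))
    print(to_dynamic(to_static(original)))
```

The point of the exercise is that the spider only ever sees a URL
without a '?', while your application still receives the same
parameters it always did.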
About.com's Jennifer Laycock gives a good example of rewriting for
either a Unix server or using .asp in her discussion of Optimizing
Dynamic Content here:
- http://websearch.about.com/library/weekly/bl-seo101-buildf.htm
Remember, search engine algorithms change constantly in order to
deliver the most relevant information. GoogleGuy, a Google engineer
and regular poster to Webmaster World (and probably the closest we
will get to an "official" Google answer) says:
"We're getting better on dynamic pages every month thanks to better
analysis. I think we crawl dynamic pages better than any general
search engine at this point.."
"In general, it's still a good idea to keep the number of parameters
short. But we are getting better over time"
http://www.webmasterworld.com/forum3/12370.htm
I am very sure your marketing consultant used 'penalizing' in the
sense of not getting indexed. As you can see from the above links (both
the articles and the Webmaster World discussions), it is still
difficult to get dynamic content listed unless you restrict the
parameters or 'help' the search engines by the use of static pages or
rewrites.
===================
Search Strategies
===================
Google:
search engine + dynamic content
webmaster world + dynamic content
mod rewrites
I trust this helps in understanding the issue of search engines and
dynamic content a bit better.
Best regards,
Serenata
Clarification of Answer by serenata-ga on 11 Sep 2003 19:22 PDT
Hi Gary ~
I like "Gary" better than gjcmg, if you don't mind.
Under Google's "Facts & Fiction" - last item on the page:
"Fiction: Sites are not included in Google's index if they use
ASP (or some other non-html file-type.)
Fact: At Google, we are able to index most types of pages and
files with very few exceptions. File types we are able
to index include: pdf, asp, jsp, hdml, shtml, xml, cfm,
doc, xls, ppt, rtf, wks, lwp, wri."
A Google search for the term "default.aspx" returns over 1 million
pages, so yes, it is possible to index dynamic pages.
I did not look beyond the source code of a half dozen of the results
(and not the "obvious" pages like Microsoft's). I would imagine it is
safe to assume that some of those pages are dynamically generated,
although I see the results of the generated pages and not the actual
coding.
So far as your two examples go:
Searching Google for http://www.catechjobs.org/Default.aspx
and for
www.NMTechJobs.org/default.aspx
produced no results.
Searching for www.catechjobs.org/ and www.NMTechJobs.org/ shows that
both sites are indexed by Google.
I realize this doesn't answer why your particular default.aspx pages
aren't being indexed, but it does show that such pages CAN be indexed
by Google, which answers your original question. I also suspect that
there are static portions of some of the pages, with the dynamic
content generated after certain criteria are met. Unfortunately, I
cannot determine what those criteria are.
I think you might compare your own coding to the suggestions I cited
in my answer, which may be able to help.
Good luck,
Serenata