Thanks for asking!
Without seeing the before and after URLs I can only offer
possibilities, rather than a specific analysis. However, there is
enough material available regarding the actual behaviors of search
engine spiders and dynamically generated .ASP pages to provide you
with a reasonable picture of the factors involved.
Google Answers Researchers do not speak officially for Google.
However, we can provide insight into the behavior of the Googlebot and
the PageRank algorithm, based upon our own observation, study, and
experiences.
Let's take your last question first, and begin with Google's own words
on the subject, from the Google Facts and Fiction page of their
Webmaster Help section.
"Fiction: Sites are not included in Google's index if they use ASP (or
some other non-html file-type.)
Fact: At Google, we are able to index most types of pages and files
with very few exceptions. File types we are able to index include:
pdf, asp, jsp, hdml, shtml, xml, cfm, doc, xls, ppt, rtf, wks, lwp,
wri."
Google Facts & Fiction
://www.google.com/webmasters/facts.html
.ASP vs .HTML
-------------
"How do asp pages relate to html pages and the ranking of each?"
A short excerpt from one of many similar discussions at WebmasterWorld
as early as 2000 provides evidence to debunk the myth that .asp pages
rank lower, or differently, than equivalent .html pages, all other
factors being equal:
">asp pages are ranked less highly than HTML pages....
<boasting>
This is not the case in my experience. Rather, my asp pages are
ranking very well (read top positions) based on the content,
optimisation and PR factors, and regularly beat HTML pages.
</boasting>
With all the possible variations of page suffixes available today, and
I am sure more in the future, I don't believe this is a factor any
more."
WebmasterWorld
ASP and Search Engines
Will it read the include statement?
Woz - Site Administrator, August 13, 2000
http://www.webmasterworld.com/forum47/81.htm
There are a couple of barriers to indexing certain types of .asp
pages, notably those with session IDs incorporated into the URLs or
those that depend solely upon site search query strings for page
generation. These difficulties can often be overcome by conversion of
the dynamic URL characters into a format that the Googlebot can more
easily digest, and "hallway pages" -- static pages which contain lists
of dynamic URLs. If the pages have been reindexed, however, these
items are not the most likely culprits.
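
To illustrate the session ID point, here is a minimal sketch of
removing session-style parameters from a dynamic URL before it is
published in links. The parameter names and the example.com URL are my
own assumptions for illustration, not something taken from your site
or from Google's documentation.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names; substitute whatever your application emits.
SESSION_PARAMS = {"sessionid", "sid", "aspsessionid"}

def strip_session_params(url):
    """Return the URL with any session-style query parameters removed."""
    parts = urlsplit(url)
    # Keep every query parameter except the session-style ones.
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

print(strip_session_params(
    "http://www.example.com/catalog.asp?item=42&sessionid=ABC123"))
```

The same idea applies inside the .asp code itself: the links a spider
sees should not vary from one visit to the next.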
The question now becomes: What new factors added by the switch to .asp
can affect PageRank?
XML Parsing
-----------
A possible issue might be the source locations of the elements that
make up the .asp pages. It sounds as if the source material is coming
from more than one server. Although ASP runs on the server side, if
data must still be retrieved from both servers every time a URL is
called, a timing issue might be involved, especially if certain page
elements (page title, or other ranking-critical elements) must be
parsed from XML data.
Some omissions might be obvious from the actual new Google search
result; however, it may take several spider cycles to determine
whether the complete text of each page is being retrieved and
spidered. You can perform a quick check to try to rule out this
possibility by using Search Engine World's Spider Simulator. I'd also
recommend comparing these results to Google's cached versions of the
pages when those appear in the index. If it turns out that some data
is being missed, you might have to modify the parser to speed up the
process.
Spider Simulator at Search Engine World
http://www.searchengineworld.com/cgi-bin/sim_spider.cgi
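
If you would like to run a similar check yourself, the following is a
rough sketch of the idea: reduce a page to its title and visible text,
then report which expected phrases are missing. It is not how the
Spider Simulator or the Googlebot actually works; the class name,
function name, and sample markup are assumptions made up for this
example.

```python
from html.parser import HTMLParser

class SpiderView(HTMLParser):
    """Collects the title and visible text, roughly as a text-only spider might."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text.append(data.strip())

def missing_phrases(html, expected):
    """Return the expected phrases that do not appear in the title or text."""
    view = SpiderView()
    view.feed(html)
    view.close()
    haystack = (view.title + " " + " ".join(view.text)).lower()
    return [phrase for phrase in expected if phrase.lower() not in haystack]

# Hypothetical page source, saved from a request to one of the .asp URLs.
sample = ("<html><head><title>Blue Widgets</title></head>"
          "<body><h1>Blue Widgets</h1><p>Price list</p></body></html>")
print(missing_phrases(sample, ["Blue Widgets", "Price list", "Shipping rates"]))
```

Any phrase reported as missing is a candidate for content that did not
arrive in time from the second data source.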
Also see "Search engine standards for website placement, ranking and
positioning" for a listing of page elements that contribute to
PageRank:
Attributes and tags-effect on major search engines
http://search-engines-web.com/
The Transition Process
----------------------
In the changeover between an .html extension and a .asp extension,
another possibility might be the transition period itself. PageRank is
based upon links to (votes for) a page. If the inbound links still
read .html, then they may not be counted as votes for the .asp pages.
The page owner should request that all backward links be changed to
the new URLs.
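
As a small aid for those requests, this sketch builds an old-to-new
URL list to send to linking sites. It assumes that only the file
extension changed, which may not be true of your site, and the
example.com addresses are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit

def asp_equivalent(old_url):
    """Swap a trailing .html or .htm extension for .asp, leaving the rest alone."""
    parts = urlsplit(old_url)
    path = parts.path
    for ext in (".html", ".htm"):
        if path.lower().endswith(ext):
            path = path[:-len(ext)] + ".asp"
            break
    return urlunsplit(parts._replace(path=path))

# Hypothetical inbound link targets gathered from a backward-links search.
old_links = [
    "http://www.example.com/products/widgets.html",
    "http://www.example.com/about.htm",
]
for old in old_links:
    print(old, "->", asp_equivalent(old))
```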
It is also commonly reported by those switching from static .html
pages to .asp pages that it takes a bit of time, several spider cycles
at least, for the changes to fully propagate and PageRank to
stabilize. Many webmasters' natural panic and anxiety over the drop in
PageRank has been "cured" by nothing more than the application of a
few months' patience.
Deep Crawl
----------
It's been observed that there is sometimes an initial reluctance on
the part of the Googlebot to transit all the levels of a database
driven website. In figuring out the navigational scheme, the Googlebot
seems to be cognizant of the warning signs of a "spider trap". Until
it discovers a way back out of a new link on a deeply layered
hierarchical page that lacks a high PageRank of its own, it may simply
pass over that link. After a couple of spider cycles, with better
knowledge of the full site, more links are spidered, and make their
way into the index. This is, again, a function of time, but the
process can sometimes be accelerated by providing static link lists of
"safe" URLs.
Page/File/URL Naming Considerations
-----------------------------------
Finally, if page names, file names, or the URLs have changed in any
other way beyond the file extension, due to the implementation of the
database, the old and new names should be compared for their effect on
keyword ranking, and the new names adjusted accordingly.
Summary
-------
All other factors remaining unchanged, there is no difference
demonstrated between the ranking of .asp pages and the ranking of
.html pages. XML parsing difficulties, the time required for changes
to propagate during the transition process, and adjustment of the
search engine spider to the .asp navigation are the primary factors
that would best account for a change in PageRank.
Additional Resources
----------------------------------------------------------------------
WebmasterWorld
News and Discussion for the Advanced Web Professional
Site Search - use search terms: .asp spider PageRank
http://www.webmasterworld.com/help.cgi?cat=search
Forums Index - Microsoft Related - .NET and ASP
http://www.webmasterworld.com/forum47/
Google PageRank:
The Google Pagerank Algorithm and How It Works, by Ian Rogers
http://www.iprcom.com/papers/pagerank/index.html
Google's PageRank Explained, by Phil Craven
http://www.webworkshop.net/pagerank.html
Phil Craven's PageRank Calculator
http://webworkshop.net/pagerank_calculator.html
Note: These WebWorkshop pages may be a bit slow to load or require a
couple of tries, but they are well worth the effort.
Answer Strategy
----------------------------------------------------------------------
My answer comes from knowledge of the search engine concerns raised by
dynamically driven websites, earned as webmaster and web developer,
working in tandem with SEO specialists on website development, plus
the following Google searches to verify the most current information.
.asp +urls Google pagerank considerations
dynamic ".asp pages" spidering Google ranking
If anything I've said is unclear or if you (oh no!) discover a broken
link, please let me know. I'll be happy to make it right.
--- larre