Hi,
It took me several read-throughs to get this, but I finally did. So
I'm going to answer this for you as clearly as I can, taking much of
the tech fluff out of it, and I'll explain why in a bit. Basically,
you are right, but the way you said you were right had me thinking you
were dead wrong. So, clear your mind of what you think you know for a
moment and read this through. If you have other questions regarding
this, please feel free to use the Clarification button and I'll see
that you get your answers. But let's start here first.
First off, it is popular now in the SEO arena to suggest that dynamic
links of all kinds are not indexed by Google, when in fact Google's
own SEO pages say they handle these just fine, and if you look at the
addresses of the pages Google itself uses, they are just as dynamic as
everyone else's.
Secondly, if someone were to suggest to me that PREURL is not indexing
as well as CFID (or vice versa), my first thought would be that the
PREURL pages are not as static as the CFID pages. Meaning: when the
Google bot returns, it can't find the same pages it did the last time
it was here, and really, that's all it cares about. If it found a page
that it thought noteworthy enough to index, it wants to find that page
on its next trip out to your site. If it doesn't find those pages,
they drop out of the index as not being stable enough to keep in the
data files.
Third, the utility you are using to test this is just that, a utility,
a game, a thing to get an idea with, and it shouldn't be confused with
a diagnostic tool, because it's not and was never intended to be a
diagnostic tool for your SEO program. There are times (quite a few,
really) when the site:, inurl:, and link: searches do reflect what the
main engine says, and quite a few other times when they do not.
Much of this has to do with the massive updating required to keep
those tools and the main engine in sync. Much of it also has to do
with the Google Dance, the monthly update to the main engines, which
spans weeks as the main engines across the globe sync with each other.
A great deal more has to do with the well-known fact that the only
ones really using the site: search are people searching their own
sites, so it's not as important to keep it accurate as the main engine.
I went over their website. They have a very good ratio of indexed
pages, and no doubt this has come from many long hours of detailed
attention. They also have a great deal of content there, which is the
biggest variable to keep working. There are several factors involved
with the Google engine (and not just Google's; all of the engines have
their quirks these days). But really, consider what you are suggesting
here: it would take 'extra' code in the Google bot to have the
preference you are suggesting.
What is happening there with the PREURL code is not the code itself
but the information it is relating. They are using it as a marker for
the last page the visitor was on while going through the site, instead
of using sessions or a cookie or something of that nature. If they
remove this code but continue to put this information (the last page's
URL) in the string, they will have the same results, because these
pages won't be there the next time the bot comes through. If they
remove the other codes and keep this one, the net result will be zero
as well. No, that's not really true: they will still have changed
every URL on every page of their site, so they will probably lose a
great deal of indexed pages for a while. If they are moving to a
schema that results in more stable URLs, then it is worth it; if they
are not, then they should probably reconsider their logic.
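To make that concrete, here is a rough sketch of what link generation
of this kind amounts to. This is illustrative TypeScript only; I
haven't seen their actual ColdFusion code, and the function name is my
own invention:

  // Illustrative only: how a PREURL-style link scheme behaves.
  // Every link embeds the URL of the page the visitor is currently
  // on, so the same target page gets a different URL depending on
  // the route taken to reach it.
  function buildLink(targetPage: string, currentPage: string): string {
    return targetPage + "?PREURL=" + encodeURIComponent(currentPage);
  }

  // Two visits (or two bot crawls) arriving by different routes:
  console.log(buildLink("/e_home.cfm", "/index.cfm"));
  // -> /e_home.cfm?PREURL=%2Findex.cfm
  console.log(buildLink("/e_home.cfm", "/r_home.cfm"));
  // -> /e_home.cfm?PREURL=%2Fr_home.cfm

One page, two URLs, and neither one is guaranteed to exist, as
written, the next time the bot comes through.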
If they change these to static-looking URLs using mod_rewrite, they
will still have the same problem.
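To see why, consider the kind of rewrite rule they would have to
write. The rule below is hypothetical (I'm guessing at paths; it is
not their configuration), but any version of it has the same flaw:

  # Hypothetical Apache mod_rewrite rule: make the dynamic URL
  # look static by folding the parameters into the path.
  RewriteEngine On
  RewriteRule ^page/([a-z_]+)/from/([a-z_]+)$ /$1.cfm?PREURL=/$2.cfm [L]

The URL now looks static, but the 'from' segment still changes with
every route through the site, so the bot still sees a different URL
for the same content on every visit. The rewriting changes the
spelling, not the instability.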
The main thing, the number one thing, which is important to any search
engine, not just Google, is the longevity of the page. The question is
"will it exist tomorrow?" and for the pages we see here, the answer is
almost always "no." So they will not be indexed, not for long anyway.
Consistency and content are the two main factors in Google indexing.
There's not much mystery involved. You build a good site with good
content, keep it up and consistent, and you rank high. If you change
URLs constantly and have hundreds of URLs pointing to the same content
(which happens by default with their type of setup, because the bot
comes in by different routes and later finds that it has several
different URLs pointing to the exact same content), then you don't. It
doesn't take highly paid SEOs to tell you that.
Think of it this way: you have two friends who give you great advice.
One has a single number, and every time you call it, he's there to
answer the question. The other, well, he's not always at the same
number, and sometimes he's not anywhere. Over the course of a couple
of months, which one are you most likely to call on a regular basis?
Now, as for your question, the answer is no, and I've explained it
rather well, I think, but not for the reasons you started out with.
Any change to the URLs will affect them in the short run. Expect them
to drop quite a bit with a site-wide change like that. Keeping the
PREURL tag in there is, well, ludicrous really (why change at all?).
Having dynamically created links is not a problem. What is a problem
is creating links that will not be created again, or that will be
inconsistently created in the future. There are better, more reliable,
and much more accurate methods of tracking user progress through a
site.
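For instance, a plain browser cookie can carry the "previous page"
information without touching a single URL. Here is a minimal sketch
(written as TypeScript, though plain JavaScript in the page would do
the same; the cookie name and helper functions are my own, not
anything from their site):

  // Minimal sketch: remember the previous page in a cookie instead
  // of rewriting every URL on the site. The cookie name "prevPage"
  // is hypothetical.
  function rememberCurrentPage(): void {
    document.cookie =
      "prevPage=" + encodeURIComponent(location.pathname) + "; path=/";
  }

  function getPreviousPage(): string | null {
    const match = document.cookie.match(/(?:^|;\s*)prevPage=([^;]*)/);
    return match ? decodeURIComponent(match[1]) : null;
  }

Call rememberCurrentPage() on each page load, and every link on the
site stays a plain, stable URL that the bot sees the same way on every
visit.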
So, yes, you are right, and you are being very conscientious in
pointing this out to them.
I've explained this in very simple terms in this answer because,
although you were on the right track, it took me several read-throughs
to realize what you were getting at and where you were coming from.
This simple way of explaining it is also meant to give you a method of
relating your ideas in a context your clients will understand as well.
You and they need to realize that in the area you are addressing, the
'deep inner workings of the Google engine' are not necessary to see
and understand basic facts of life. Google, and every other engine,
has a limited amount of space and has to keep the engine as clean as
possible to get results from that base as fast as possible. That is
reality. There is no getting around it.
Second, they (the search engines) want as many different results for a
given query, each relating to that query, as possible. Finding that a
search (any search) shows hundreds of links to the exact same content
is frustrating for the user and embarrassing for the search engine. So
when they discover sites that create this phenomenon, they remove them
or filter them down heavily. Again, no secret here, just basic
business.
Third, a simple JavaScript cookie placed into the body of the page,
along the lines sketched above, would solve this. In fact, they are
missing the simple basics, like a site map:
http://www.execunet.com/sitemap.html
Google will use that page to re-index your site. It doesn't care that
the links in that file are dynamic. All it cares about is that the
page is there when it comes back next week. It also cares about the
content, and that there is something meaningful there, but that's
another topic, and one we aren't ready to address at this point.
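For what it's worth, a site map doesn't need to be anything fancy; a
stable page of plain links is enough. Something like this (example
only; the link text is my guess, the pages are theirs):

  <!-- Example only: a site map is just a stable page of plain links -->
  <html><head><title>Site Map</title></head><body>
    <a href="/e_home.cfm">Executive Home</a>
    <a href="/r_home.cfm">Recruiter Home</a>
  </body></html>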
They don't have a robots.txt file to help the bot know where to go and
what to do once it is there. There might be a directive in your HEAD
tag (a META robots tag), but that's not where the bot is going to look
for 'site constancy':
http://www.execunet.com/robots.txt
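A robots.txt is nothing more than a plain text file sitting at the
site root. Even a minimal one gives the bot a clear signal. The
Disallow line below is only an example; I don't know which of their
directories, if any, should actually be blocked:

  # Example contents only -- not their actual file
  User-agent: *
  Disallow: /cgi-bin/

That says: every bot may crawl everything except /cgi-bin/. Leaving
the Disallow value empty would open the whole site.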
All of this is on Google's main webmaster pages; it's not hidden
information, and it doesn't require a highly paid SEO to gather it up
for them. (Or maybe it does. It seems that more and more businesses
would rather pay than play these days.)
://www.google.com/webmasters/
://www.google.com/webmasters/4.html
://www.google.com/webmasters/3.html
Quoate--"Fiction: Sites are not included in Google's index if they use
ASP (or some other non-html file-type.)
Fact: At Google, we are able to index most types of pages and files
with very few exceptions. File types we are able to index include:
pdf, asp, jsp, hdml, shtml, xml, cfm, doc, xls, ppt, rtf, wks, lwp,
wri, swf."
-- from
://www.google.com/webmasters/facts.html
://www.google.com/webmasters/faq.html
Google is very good at being straightforward with you, and has put up
a great deal of content on what they look for and how they act when
they find it.
A final note on this and I'll end this answer. The note is in the
results of the latest SEO Google ranking contest; here's the link.
Single Post Wins Google Contest
http://www.wired.com/news/infostructure/0,1377,64130,00.html?tw=wn_2culthead
I wish you luck,
thanks
webadept-ga

Clarification of Answer by webadept-ga on 21 Jul 2004 04:38 PDT
Hi,
Normally it is wise to use the Clarification button before rating an
answer, but that's okay.
I'm a bit confused by your listed response, however, since he is
saying exactly the same thing I did: bots don't arrive at pages the
same way, so the dynamic link this website builds from the last page
visited will be different on each visit:
--"What is happening there with a PREURL code is not the code itself,
but the information it is relating. They are using this as a marker
for
the last page I was on, while going through the site, instead of using
sessions or a cookie or something of that nature. .... The main thing,
the number one thing, which is important to any search engine, not
just Google, is the longevity of the page. The question is "will it
exist tomorrow?" and in pages we see here, the answer is almost
always, "no". So they will not be indexed, not for long anyway." --
and his:
--"The EuN architecture seems to be
in part based on the assumption that visitors are arriving at and
beginning their visits at the default home page. Note that if they do
arrive at the default home page the PREURL variable seems to be
included in each html link on the page (as are the CFID and CFTOKEN
variables) except the login link.
But visitors and bots also arrive at pages other than the home page; a
link to EuN from some other site may look simply like another heavily
trafficked page URL (not the home page), like
http://www.execunet.com/e_home.cfm, or like
http://www.execunet.com/r_home.cfm, for example. The query portion of
the URL is not likely to be present unless the link is from a tracked
source (e.g. ?welcome=xxxxxxxxx), but in any event, the PREURL, CFID
and CFTOKEN variables will not be in the requested URLs. The html
links on those pages, called in this manner, sometimes include the
PREURL variable and sometimes do not. " --
As for Google's bias, their bias is stated very clearly on the Facts
and Fiction page:
--"Fiction: Sites are not included in Google's index if they use ASP
(or some other non-html file-type.)
Fact: At Google, we are able to index most types of pages and files
with very few exceptions. File types we are able to index include:
pdf, asp, jsp, hdml, shtml, xml, cfm, doc, xls, ppt, rtf, wks, lwp,
wri, swf.
---"
://www.google.com/webmasters/facts.html
You did notice the "swf" there at the end, yes?
The problem is not the PREURL in and of itself; it is the displayed
information the PREURL is gathering for the GET string. Both this
other service and I have said this in different ways. You can name
PREURL "cash" or "string" or anything you want to; it's not going to
matter. The bots see the GET string, the whole string, as the "name of
the page." This other service and I have both agreed on this as well.
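To put that in concrete terms, here is roughly what it looks like from
the indexing side. This is purely illustrative TypeScript; nobody
outside Google knows their real data structures:

  // Illustrative: an index keyed by the full URL, query string and
  // all. Two URLs differing only in PREURL are two separate "pages".
  const index = new Map<string, string>();

  function recordPage(fullUrl: string, content: string): void {
    index.set(fullUrl, content); // the whole string is the key
  }

  recordPage("/e_home.cfm?PREURL=/index.cfm&CFID=123", "same content");
  recordPage("/e_home.cfm?PREURL=/r_home.cfm&CFID=456", "same content");

  console.log(index.size); // 2 -- one page, two index entries

When the bot comes back and the PREURL value has changed, the old key
simply never shows up again, and that entry eventually drops out.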
His quote: --"The query portion of
the URL is not likely to be present unless the link is from a tracked
source (e.g. ?welcome=xxxxxxxxx), but in any event, the PREURL, CFID
and CFTOKEN variables will not be in the requested URLs. The html
links on those pages, called in this manner, sometimes include the
PREURL variable and sometimes do not."---
I don't know where he gets the one-parameter, two-parameter dance of
logic he has there. It doesn't hold up to observation, or for that
matter, to what Google says on their pages and in their publications.
I didn't see a reference to his source, so it's just opinion as far as
I can tell. Google didn't say it.
He might be thinking of this quote --"If you decide to use dynamic
pages (i.e., the URL contains a '?' character), be aware that not
every search engine spider crawls dynamic pages as well as static
pages. It helps to keep the parameters short and the number of them
small." --
from this page
://www.google.com/webmasters/guidelines.html
But Google isn't referring to themselves there; they are letting you
know that "other" bots don't crawl dynamic pages well.
Be all that as it may, he and I agree completely on the PREURL
problem, and both of us stated that the page reference in the GET
string needs to be taken care of.
So I don't understand your statements in the comment area. I'm fine
with the rating, because I'm assuming this is your first time using
the service and you didn't know that if you used the Clarification
button, I would search out and find more information for you, and
would have addressed this other 'advice column' as well. Our goal is
to research your answer. Sometimes we get it on the first go; other
times we don't. But once we start, we do our best to ensure you have
the answer you are after. Next time you use the service, please keep
this in mind. The researchers are very dedicated to their level of
service.
With that said, rating or no, payment or no, if you would like a
greater understanding of this issue, or would like to post something
else that someone else said for me to research for you, please do. You
are obviously not quite certain about this issue, so I'm happy to help
you out with it.
webadept-ga