Hi Dan,
There is a simple remedy using a META tag.
<meta name="robots" content="noindex,follow">
When the Googlebot sees this, it will follow the links in the page,
but will not index the content.
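If it helps, here is a rough sketch of where the tag sits - it goes inside the HEAD section of the page (the title below is just a placeholder, not your actual page):

<html>
<head>
<!-- placeholder title; keep whatever your page already has -->
<title>Article Archive</title>
<!-- tell robots: don't index this page, but do follow its links -->
<meta name="robots" content="noindex,follow">
</head>
<body>
... your links and descriptions ...
</body>
</html>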
This will solve your problem, but may have an effect on your ranking
in Google search results. Fortunately, because your site is updated so
regularly, Googlebot visits often, and it won't take long to see what
the effect is. My guess is that your ranking will improve.
The page comparch.shtml doesn't appear to be made by hand, so I'd
leave the descriptions there; they can't do any harm once the META tag
is in place.
More information on robots.txt can be found at:
http://www.robotstxt.org/wc/robots.html
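For comparison, robots.txt works one level earlier than the META tag: it stops robots from fetching a page or directory at all, whereas the META tag lets Googlebot fetch the page and follow its links while skipping the text. A minimal robots.txt, using a made-up /private/ directory purely as an example, would look like this:

# applies to all robots
User-agent: *
# hypothetical directory - do not crawl anything under /private/
Disallow: /private/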
Search strategy:
Personal experience
I trust this answers your question. If any portion of my answer is
unclear, please ask for clarification.
Best wishes,
robertskelton-ga
Request for Answer Clarification by hammerhands-ga on 09 Sep 2002 15:12 PDT
Robert,
Thank you for your META tag instruction. I just want to make sure you
still agree we should leave the descriptions there in the
comparch.shtml page, because currently the file is 1.4 megs! If I
deleted the descriptions it'd be 149k. Do you see a problem with it
being such a large file? Do the spiders refuse to look at the entire
page since it's so huge? Would they spider the whole thing if it were
chopped down to 149k?
Thanks,
Dan
Clarification of Answer by robertskelton-ga on 09 Sep 2002 16:14 PDT
Yikes! No wonder it was too large for Notepad!
A good thing / bad thing about Google is that it only indexes the
first 101K of a page, so any page over 101K is still listed in search
results as 101K - it never crossed my mind that your page could be so
huge. My apologies.
Google's cache for the page stops at 101K, and so any article links
older than July 14, 2002 have not been indexed.
http://216.239.33.100/search?q=cache:qTkJAK3s6KgC:www.petroleumnewsalaska.com/comparch.shtml+northstar+gunkel&hl=en&ie=UTF-8
I tried searching for some June articles and Google couldn't find
them. I used to think that Googlebot would follow all links regardless
of page size, but your site is evidence that this is not the case.
Keywords found in and around links are important to Google, but more
important for you is getting as many links as possible followed.
149K is still over the 101K limit, so the only remedy would be to
spread the links across multiple pages - perhaps one covering the last
two months and others serving as archives.
My revised suggestion is that you get rid of the descriptions.
Request for Answer Clarification by hammerhands-ga on 09 Sep 2002 17:35 PDT
Do you suggest I get rid of the descriptions and break the file into 2
files? (One file would be 100k and the other approx 49k.)
Dan
Clarification of Answer by robertskelton-ga on 09 Sep 2002 18:24 PDT
One file of 100K and the other of 49K would work fine, although 90K
and 59K would give you a bit of a buffer against accidentally going
over 101K. Make sure that the two pages link to each other.
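A rough sketch of those cross-links, with made-up filenames (comparch1.shtml and comparch2.shtml) standing in for whatever you actually call the two pages:

On comparch1.shtml (the recent articles):
<!-- link forward to the older half of the archive -->
<a href="comparch2.shtml">Older articles - archive part 2</a>

On comparch2.shtml (the older articles):
<!-- link back to the recent half of the archive -->
<a href="comparch1.shtml">Recent articles - archive part 1</a>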
Removing the descriptions is the only real option, apart from splitting
the 1.4MB file into 20 files of 70K each. In my experience, though,
that deviates too far from the "site map" type of page that Google
seems to like - a single page that links to every other page. In your
case it needs to span at least two pages, but that cannot be helped.