Q: Google returns links to folders that don't exist! (Answered, 4 out of 5 stars, 17 Comments)
Question  
Subject: Google returns links to folders that don't exist!
Category: Computers
Asked by: drewlin-ga
List Price: $6.00
Posted: 25 Feb 2003 15:32 PST
Expires: 27 Mar 2003 15:32 PST
Question ID: 167103
I launched a site, www.smart-threads.com, a few weeks ago, and I have
been checking google to monitor its crawl.  Typing the following into
the google search box:
site:www.smart-threads.com "www.smart-threads.com"

yields the following results:
www.smart-threads.com/more/petite/
Similar pages 

www.smart-threads.com/merchants/nordstroms/
Similar pages 

Now, there never were, and still are not, any folders on my server with the names
"merchants", "nordstroms", "more", or "petite".  How could it have found these results?
Go ahead and try it: click on one of the links google provides to
my site, and you'll see that you arrive at my custom 404 page!

How is this possible, and is it detrimental to search engine
optimization?

Thanks a million!

Request for Question Clarification by serenata-ga on 25 Feb 2003 18:03 PST
Hello Drewlin -

You said you "launched the site a few weeks ago" ... and a check of
Arin Whois shows that the records for both "smartthreads.com" and
"smart-threads.com" were created on December 6, 2002.

Did you also recently acquire the domains at that time?

If so, it could be possible that someone else owned the domain(s) and
what you're seeing is either cached or old information previously
retained in Google.

In the event the above is the reason, there is information on how to
contact Google for errors on this Web page:
http://www.google.com/contact/search.html

and an email to Google may help clear up the matter.

Yours ever so,
Serenata

Clarification of Question by drewlin-ga on 26 Feb 2003 11:16 PST
Serenata-

Yes, you're right, I did have both of the domains registered in
December, and I'm pretty sure that the googlebot gathered its
information from this site recently, because the googlebot does appear
in the server log files, and also because I had searched google
before, using the syntax I provided above, several times with no
results.  So, because currently it returns these skewed results, and
they are related to information on my website, I assume that it
collected that information from my site, but it is very puzzling as to
how it gathered those particular links.

I did write to Google last week as well to ask them, but haven't heard
anything.  I've got friends that work for companies that spend upwards
of 50k/year with Google, and they say it still takes them weeks to get
back to them, so I'll be very surprised to hear back from them.

Thanks for your ideas...got any more?

I'm still puzzled!
Answer  
Subject: Re: Google returns links to folders that don't exist!
Answered By: justaskscott-ga on 02 Mar 2003 18:57 PST
Rated: 4 out of 5 stars
 
Hello drewlin-ga,

A fellow Researcher (thanks secret901-ga!) alerted me to your February
26 comment, and so I will go ahead and post an answer.

The answer, as noted, is basically here:

"Q: Have google perceived my website as spam?" 
Google Answers 
http://answers.google.com/answers/main?cmd=threadview&id=160848 

It sounds like your web site was established after Google's most
recent update, which began on January 25, 2003.  In that case, you
might have to wait for two more updates for the old information to be
replaced by the current web site information.  If you established your
web site in time for that update, you might only have to wait for one
more update, which webmasters are speculating should begin any day
now:

"Alright enough lets update" (February 25, 2003 to present) 
WebmasterWorld 
http://www.webmasterworld.com/forum3/9608.htm 

In the event that your web site is not properly indexed -- or in order
to ensure, as much as possible, that it does get indexed -- you might
want to consider the information in another answer, which concerns
another web site apparently established at about the same time as
yours.

"Q: Time frame for our MUSKY GUIDE service to be listed by
GOOGLE??????"
Google Answers
http://answers.google.com/answers/main?cmd=threadview&id=168128

I hope that this information is helpful.

- justaskscott-ga

Request for Answer Clarification by drewlin-ga on 02 Mar 2003 19:30 PST
jdog-ga has a point.  I read through that entire link with the
"explanation" to my question, and it was comforting, but the question
still remains, "why was it returning these links, when I'm absolutely
positive these folders never existed?"

Clarification of Answer by justaskscott-ga on 02 Mar 2003 19:41 PST
Someone must have had this content before you registered the domains
in December.  In the case of the site described in "Have google
perceived my website as spam?", the old information had been there
since at least December 2, 2002.  It seems that the same thing has
happened in your case.
drewlin-ga rated this answer: 4 out of 5 stars
Great, and reassuring, thank you!

Comments  
Subject: Re: Google returns links to folders that don't exist!
From: justaskscott-ga on 26 Feb 2003 11:36 PST
 
Perhaps my answer to another question -- involving a web site changed
at a similar time -- is applicable to this situation:

"Q: Have google perceived my website as spam?"
Google Answers
http://answers.google.com/answers/main?cmd=threadview&id=160848

(Indeed, I assume that it is applicable.  If you think that it answers
your question, I would be happy to post it as an answer, modified to
your situation.)
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 26 Feb 2003 13:12 PST
 
justaskscott-ga:

That definitely helped!  Thanks a million, and I won't worry too much
about it now.  You can post it as an answer and cash out if you want.

Thanks again,

Drewlin
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 02 Mar 2003 01:31 PST
 
That would only explain it if there were, at one time, "/more/petite"
and "/merchants/nordstroms" directories.
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 02 Mar 2003 19:35 PST
 
jdog-ga has a point... I'm absolutely positive that these folders
never existed, I developed the entire site myself, and would've had no
reason to create these folders.  Any other ideas?

Thanks!
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 03 Mar 2003 10:35 PST
 
It seems likely to me that drewlin would have had to buy the domain if
someone had previously held it, not register it. In any case, I
wouldn't worry about the dead links, some whacky stuff can happen
around the time of the update. If they're still there after the
update, let us know and we'll see if we can help.
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 05 Mar 2003 12:41 PST
 
jdog-ga, that's what I was thinking too, and because I did register it
as a new domain, and NOT buy it from someone else, it is puzzling.

It seems like the dance came and went on the 3rd, and the googlebot
hit every page on my other sites, but on smart-threads.com, it only
crawled my robots.txt file, index.html, and one product page, and then
it left.  This site should be very robot-friendly, with straight HTML,
lots of text, small page size, no content that is too similar to
other sites, and no javascript or dynamic pages.  Why would it have
left?  Could it have something to do with those non-existent links
that somehow got there before?  I'm beginning to worry that it will
not be indexed well.

Thanks again for your help!
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 05 Mar 2003 13:01 PST
 
As of 2 minutes ago, I couldn't access the page. Could it be possible
that the site (or part of it) was down at that time?
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 05 Mar 2003 13:12 PST
 
The site's working again now. Anyway, you can submit the dead links
for removal at
[ http://services.google.com:8882/urlconsole/controller?cmd=reload&lastcmd=login ],
but I see no reason they would prevent your site from being
indexed. Unfortunately, you might have to wait till next month to find
out if the problem gets solved. In the meantime, you may want to write
another letter to google and include the latest developments.
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 05 Mar 2003 13:21 PST
 
Your help is greatly appreciated - thanks a million, and I'll take your advice!
Have a good one,
Drewlin
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 07 Mar 2003 20:50 PST
 
Just another thought: why do you have robots index your 404 page?
Following the links I can understand, but I'm not sure what robots
would do if they are told to index after receiving a 404 response.
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 07 Mar 2003 21:28 PST
 
Great catch!  I didn't realize I'd put those meta tags in the 404
page.  I'll remove them for sure.  However, I thought that the
googlebot ignored meta tags.  I read this somewhere, and so I included
meta tags on my site for the rest of the robots that do read them.
Maybe I received poor information.  Regardless, your point may be a
very good explanation for why the rest of my site didn't get crawled.
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 08 Mar 2003 13:30 PST
 
Someone lied to you ;)

"Googlebot obeys the noindex, nofollow, and noarchive meta-tags. If
you place these tags in the head of your HTML document, you can cause
Google to not index, not follow, and/or not archive particular
documents on your site."

[ http://www.google.com/bot.html#noindextags ]

I looked at some other pages on your site, and they all have an "index,
follow" tag (which, by the way, is the standard behavior of robots if
you don't have a meta tag on the page). That shouldn't prevent
Googlebot from indexing your page; I just thought it was weird that
the tag in the 404 page said to index. Still, even there, it shouldn't
have stopped the bot.
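
To illustrate the point about defaults, here is a minimal Python sketch of
how a crawler might read the robots meta tag. The names RobotsMetaParser and
robots_policy are made up for this example, and it is only an assumption
about how such a check could be written, not Googlebot's actual code. It
treats a missing tag the same as "index, follow".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Compare the name case-insensitively; a missing name yields "".
        if (attr.get("name") or "").lower() != "robots":
            return
        for part in (attr.get("content") or "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_policy(html):
    """Return (may_index, may_follow); both default to True without a tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return ("noindex" not in parser.directives,
            "nofollow" not in parser.directives)


# An explicit "index, follow" tag and no tag at all give the same policy.
print(robots_policy('<head><meta name="robots" content="index, follow"></head>'))
print(robots_policy('<head><title>No robots tag</title></head>'))
print(robots_policy('<head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"></head>'))
```

Under this sketch the "index, follow" tag on the 404 page changes nothing
compared with leaving the tag out.
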
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 08 Mar 2003 14:50 PST
 
You're right.  I'm still stumped, and anxious, because I don't want to
lose out on getting my pages indexed next month too.  I believe all my
meta tags to be correct, I've got links pointing to this site from
other sites that are indexed by google, I've submitted my URL to DMOZ
and other directories, and there is nothing that I know of on my site
to deter spiders.  I guess I'll just have to wait and see, keeping my
fingers crossed.  If there is anything else that you might suggest I
do, I would greatly appreciate it!

Thanks as always,
Drewlin
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 10 Mar 2003 11:45 PST
 
Well here's a bit of good news: Google seems to have at least indexed
your index page (do a search for "smart-threads.com") and dropped the
dead links. I'm guessing you should be set for the next crawl, but the
results may not show up for another two months.
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 25 Mar 2003 10:57 PST
 
I think I may have figured out why the googlebot didn't get past my
index.html page!  Using searchengineworld.com's Sim Spider, which
allows you to enter a URL and displays what a spider will see, I made
an interesting discovery.  The spider could not follow any of the
relative links on my page.  I know what triggers it and have fixed
the problem, but I don't really know why it happens!  Get this: every
relative link on my page had a typical <a href="somefile.html"> tag.
What the spider saw was "http://somefile.html", and it tried to follow
this link, which, of course, was dead.  I went back and changed all my
relative references to something like this: <a href="/somefile.html">,
and just by adding the "/" before the filename, the spider can now
follow the correct link.  Very strange!  Another puzzling fact is, the
spider only acts this way on my index.html page.  Once it gets to
another page, it can follow any link that has <a href="somefile.html">
without the "/".  Very strange, test it out if you like, sim spider
can be found here:
http://www.searchengineworld.com/cgi-bin/sim_spider.cgi

Enter http://www.smart-threads.com and notice that the spider can now
follow the links.  It can also follow the links on the other pages. 
However, viewing the source code will reveal that on the index.html
page, I have inserted the "/" on all relative references, and on the
other pages, even though the "/" is not there, the spider can follow
the links.

Have you ever heard of anything like this before?  I haven't!
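
For comparison, here is a rough Python sketch of a link lister that resolves
each href with the standard library's urljoin. It is not Sim Spider's code;
LinkCollector and links_seen are hypothetical names, and the sample markup is
invented. Under standard resolution, both link styles on a page fetched as
"http://www.smart-threads.com" point to the same file.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Records every <a href> on a page as an absolute URL."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # urljoin applies the standard relative-reference rules.
                self.links.append(urljoin(self.page_url, href))


def links_seen(page_url, html):
    """Return the absolute URLs a standards-following spider would queue."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return collector.links


sample = '<a href="somefile.html">one</a> <a href="/somefile.html">two</a>'
for url in links_seen("http://www.smart-threads.com", sample):
    print(url)
```

Both lines printed here are "http://www.smart-threads.com/somefile.html", so
a spider that reports "http://somefile.html" is resolving the link some other
way.
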
Subject: Re: Google returns links to folders that don't exist!
From: jdog-ga on 27 Mar 2003 14:57 PST
 
Glad you figured out how to solve it. Anyway, the problem (strangely
enough) seems to be caused by a combination of your relative links and
the URL used to access the index.

When analyzing your index page, the robot probably accessed it through
"http://www.smart-threads.com". The URL used to resolve most relative
links in this case (assuming it's not overridden by an HTTP response
or a BASE tag) is apparently just "http://". Trying to find the
directory the current document was located in, the robot stripped the
URL of characters on the right until it found a '/'. This would work
in most cases, but index pages can be an exception because they can be
referenced without a complete path (browsers are obviously smart
enough to get around this). Adding the '/' to the beginning of your
relative URLs resolved this little fluke. Accordingly, you could have
kept the links the same if the base URL used by the bots had been the
equivalent "http://www.smart-threads.com/" or
"http://www.smart-threads.com/index.html". Even though it's hard to
control how the bots get to the index page, you can use the BASE tag
to specify the exact base URL you want to be used.
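
The stripping behavior described above can be sketched in Python. The
function naive_resolve is a hypothetical reconstruction of the suspected
shortcut, not code from Sim Spider or Googlebot; urljoin from the standard
library is shown next to it as the standards-based result.

```python
from urllib.parse import urljoin, urlsplit


def naive_resolve(page_url, href):
    """Hypothetical buggy resolver: cut the page URL back to its last '/'."""
    if href.startswith("/"):
        # Root-relative links are attached straight to scheme and host.
        parts = urlsplit(page_url)
        return f"{parts.scheme}://{parts.netloc}{href}"
    # Strip characters from the right until a '/' is found. With no path,
    # the last '/' is the one inside "http://".
    base = page_url[: page_url.rfind("/") + 1]
    return base + href


bases = (
    "http://www.smart-threads.com",
    "http://www.smart-threads.com/",
    "http://www.smart-threads.com/index.html",
)
for base in bases:
    print(base)
    print("  naive:  ", naive_resolve(base, "somefile.html"))
    print("  urljoin:", urljoin(base, "somefile.html"))

# The leading-slash fix works even with the bare host as the base.
print(naive_resolve("http://www.smart-threads.com", "/somefile.html"))
```

Only the first base makes the naive version return "http://somefile.html";
the other two bases, and the "/somefile.html" form, give the correct URL.
Supplying an explicit base with a trailing slash, which is what a BASE tag
would do, corresponds to the second case.
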
Subject: Re: Google returns links to folders that don't exist!
From: drewlin-ga on 27 Mar 2003 22:29 PST
 
jdog you're awesome... That was a great explanation and will be
valuable information to have in the future.  If this site ever makes
any money, I'll remember to tip you well!  Thanks again,
Drewlin
