Google returns links to folders that don't exist!
Category: Computers Asked by: drewlin-ga List Price: $6.00 |
25 Feb 2003 15:32 PST
Expires: 27 Mar 2003 15:32 PST Question ID: 167103 |
I launched a site, www.smart-threads.com, a few weeks ago, and I have been checking Google to monitor its crawl. Typing the following into the Google search box: site:www.smart-threads.com "www.smart-threads.com" yields the following results: www.smart-threads.com/more/petite/ Similar pages www.smart-threads.com/merchants/nordstroms/ Similar pages Now, there never were, and still are not, any folders on my server with the names "merchants", "nordstroms", "more", or "petite". How could it have found this result? Go ahead and try it, and click on one of the links Google provides to my site, and you'll see that you arrive at my custom 404 page! How is this possible, and is it detrimental to search engine optimization? Thanks a million! | |
Re: Google returns links to folders that don't exist!
Answered By: justaskscott-ga on 02 Mar 2003 18:57 PST |
Hello drewlin-ga, A fellow Researcher (thanks secret901-ga!) alerted me to your February 26 comment, and so I will go ahead and post an answer. The answer, as noted, is basically here: "Q: Have google perceived my website as spam?" Google Answers http://answers.google.com/answers/main?cmd=threadview&id=160848 It sounds like your web site was established after Google's most recent update, which began on January 25, 2003. In that case, you might have to wait for two more updates for the old information to be replaced by the current web site information. If you established your web site in time for that update, you might only have to wait for one more update, which webmasters are speculating should begin any day now: "Alright enough lets update" (February 25, 2003 to present) WebmasterWorld http://www.webmasterworld.com/forum3/9608.htm In the event that your web site is not properly indexed -- or in order to ensure, as much as possible, that it does get indexed -- you might want to consider the information in another answer, which concerns another web site apparently established at about the same time as yours. "Q: Time frame for our MUSKY GUIDE service to be listed by GOOGLE??????" Google Answers http://answers.google.com/answers/main?cmd=threadview&id=168128 I hope that this information is helpful. - justaskscott-ga | |
rated this answer: Great, and reassuring, thank you! |
Re: Google returns links to folders that don't exist!
From: justaskscott-ga on 26 Feb 2003 11:36 PST |
Perhaps my answer to another question -- involving a web site changed at a similar time -- is applicable to this situation: "Q: Have google perceived my website as spam?" Google Answers http://answers.google.com/answers/main?cmd=threadview&id=160848 (Indeed, I assume that it is applicable. If you think that it answers your question, I would be happy to post it as an answer, modified to your situation.) |
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 26 Feb 2003 13:12 PST |
justaskscott-ga: That definitely helped! Thanks a million, and I won't worry too much about it now. You can post it as an answer and cash out if you want. Thanks again, Drewlin |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 02 Mar 2003 01:31 PST |
that would only explain it if there were, at one time, "/more/petite" and "/merchants/nordstroms" directories. |
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 02 Mar 2003 19:35 PST |
jdog-ga has a point... I'm absolutely positive that these folders never existed, I developed the entire site myself, and would've had no reason to create these folders. Any other ideas? Thanks! |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 03 Mar 2003 10:35 PST |
It seems likely to me that drewlin would have had to buy the domain if someone had previously held it, not register it. In any case, I wouldn't worry about the dead links, some whacky stuff can happen around the time of the update. If they're still there after the update, let us know and we'll see if we can help. |
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 05 Mar 2003 12:41 PST |
jdog-ga, that's what I was thinking too, and because I did register it as a new domain, and did NOT buy it from someone else, it is puzzling. It seems like the dance came and went on the 3rd, and the googlebot hit every page on my other sites, but on smart-threads.com, it only crawled my robots.txt file, index.html, and one product page, and then it left. This site should be very robot-friendly, with straight HTML, lots of text, small page size, no content that is too similar to other sites, and no javascript or dynamic pages. Why would it have left? Could it have something to do with those non-existent links that somehow got there before? I'm beginning to worry that it will not be indexed well. Thanks again for your help! |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 05 Mar 2003 13:01 PST |
As of 2 minutes ago, I couldn't access the page. Could it be possible that the site (or part of it) was down at that time? |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 05 Mar 2003 13:12 PST |
The site's working again now. Anyway, you can submit the dead links for removal at [ http://services.google.com:8882/urlconsole/controller?cmd=reload&lastcmd=login ], but I see no reason they would prevent your site from being indexed. Unfortunately, you might have to wait till next month to find out if the problem gets solved. In the meantime, you may want to write another letter to google and include the latest developments. |
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 05 Mar 2003 13:21 PST |
Your help is greatly appreciated - thanks a million, and I'll take your advice! Have a good one, Drewlin |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 07 Mar 2003 20:50 PST |
Just another thought: why do you have robots index your 404 page? Following the links I can understand, but I'm not sure what robots would do if they are told to index after receiving a 404 response. |
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 07 Mar 2003 21:28 PST |
Great catch! I didn't realize I'd put those meta tags in the 404 page. I'll remove them for sure. However, I thought that the googlebot ignored meta tags. I read this somewhere, and so I included meta tags on my site for the rest of the robots that do read them. Maybe I received poor information. Regardless, your point may be a very good explanation for why the rest of my site didn't get crawled. |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 08 Mar 2003 13:30 PST |
Someone lied to you ;) "Googlebot obeys the noindex, nofollow, and noarchive meta-tags. If you place these tags in the head of your HTML document, you can cause Google to not index, not follow, and/or not archive particular documents on your site." [ http://www.google.com/bot.html#noindextags ] I looked at some other pages on your site, and they all have an "index, follow" tag (which, by the way, is the standard behavior of robots if you don't have a meta tag on the page). That shouldn't prevent Googlebot from indexing your page; I just thought it was weird that the tag in the 404 page said to index. Still, even there, it shouldn't have stopped the bot. |
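For anyone who wants to check which robots directives a page actually carries, here is a minimal Python sketch. The names RobotsMetaParser and robots_directives are made up for this illustration, and falling back to "index, follow" when no tag is present is an assumption based on the default behavior described above, not a statement about how any particular crawler is implemented.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.append(directive)


def robots_directives(html):
    """Return the robots directives in a page, assuming 'index, follow' if none are given."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives or ["index", "follow"]


# Hypothetical 404 page that tells robots not to index it.
error_page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_directives(error_page))  # ['noindex', 'nofollow']
```

Running this against the source of a custom 404 page is a quick way to spot a leftover "index, follow" tag like the one discussed here.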
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 08 Mar 2003 14:50 PST |
You're right. I'm still stumped, and anxious, because I don't want to lose out on getting my pages indexed next month too. I believe all my meta tags to be correct, I've got links pointing to this site from other sites that are indexed by Google, I've submitted my URL to DMOZ and other directories, and there is nothing that I know of on my site to deter spiders. I guess I'll just have to wait and see, keeping my fingers crossed. If there is anything else that you might suggest I do, I would greatly appreciate it! Thanks as always, Drewlin |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 10 Mar 2003 11:45 PST |
Well here's a bit of good news: Google seems to have at least indexed your index page (do a search for "smart-threads.com") and dropped the dead links. I'm guessing you should be set for the next crawl, but the results may not show up for another two months. |
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 25 Mar 2003 10:57 PST |
I think I may have figured out why the googlebot didn't get past my index.html page! Using searchengineworld.com's Sim Spider, which allows you to enter a URL and displays what a spider will see, I made an interesting discovery. The spider could not follow any of the relative links on my page for some reason. I have fixed the problem, but I don't really know why the fix works! Get this, every relative link on my page had a typical <a href="somefile.html"> tag. What the spider saw was, "http://somefile.htm" and it tried to follow this link, which of course, was dead. I went back and changed all my relative references to something like this: <a href="/somefile.html">, and just by adding the "/" before the filename, the spider can now follow the correct link. Very strange! Another puzzling fact is, the spider only acts this way on my index.html page. Once it gets to another page, it can follow any link that has <a href="somefile.html"> without the "/". Very strange, test it out if you like, Sim Spider can be found here: http://www.searchengineworld.com/cgi-bin/sim_spider.cgi Enter http://www.smart-threads.com and notice that the spider can now follow the links. It can also follow the links on the other pages. However, viewing the source code will reveal that on the index.html page, I have inserted the "/" on all relative references, and on the other pages, even though the "/" is not there, the spider can follow the links. Have you ever heard of anything like this before? I haven't! |
Re: Google returns links to folders that don't exist!
From: jdog-ga on 27 Mar 2003 14:57 PST |
Glad you figured out how to solve it. Anyway, the problem (strangely enough) seems to be caused by a combination of your relative links and the URL used to access the index. When analyzing your index page, the robot probably accessed it through "http://www.smart-threads.com". The URL used to resolve most relative links in this case (assuming it's not overridden by an HTTP response or a BASE tag) is apparently just "http://". Trying to find the directory the current document was located in, the robot stripped the URL of characters on the right until it found a '/'. This would work in most cases, but index pages can be an exception because they can be referenced without a complete path (browsers are obviously smart enough to get around this). Adding the '/' to the beginning of your relative URLs resolved this little fluke. Accordingly, you could have kept the links the same if the base URL used by the bots had been the equivalent "http://www.smart-threads.com/" or "http://www.smart-threads.com/index.html". Even though it's hard to control how the bots get to the index page, you can use the BASE tag to specify the exact base URL you want to be used. |
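The behavior described above can be reproduced with a short Python sketch. The function naive_resolve is hypothetical: it is only a guess at what a simple spider might do (cut the base URL back to its last '/' and append the link), not the actual code of Sim Spider or Googlebot. urljoin from Python's standard library is shown alongside it as an example of a resolver that handles a base URL with no path.

```python
from urllib.parse import urljoin


def naive_resolve(base_url, href):
    """Hypothetical buggy resolver: strip the base URL back to its last '/'."""
    directory = base_url[: base_url.rfind("/") + 1]
    return directory + href


bare = "http://www.smart-threads.com"

# With no trailing slash, the last '/' belongs to "http://", so the host is lost.
print(naive_resolve(bare, "somefile.html"))
# http://somefile.html

# The equivalent base URLs with an explicit path resolve as intended.
print(naive_resolve(bare + "/", "somefile.html"))
# http://www.smart-threads.com/somefile.html
print(naive_resolve(bare + "/index.html", "somefile.html"))
# http://www.smart-threads.com/somefile.html

# A standards-following resolver copes with the bare host name.
print(urljoin(bare, "somefile.html"))
# http://www.smart-threads.com/somefile.html
```

Under this assumption, a BASE tag pointing at "http://www.smart-threads.com/" would have the same effect as the second call: the base URL ends in '/', so plain relative links such as "somefile.html" resolve correctly without a leading slash.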
Re: Google returns links to folders that don't exist!
From: drewlin-ga on 27 Mar 2003 22:29 PST |
jdog you're awesome... That was a great explanation and will be valuable information to have in the future. If this site ever makes any money, I'll remember to tip you well! Thanks again, Drewlin |