Hello joycen
Sometimes only a partial answer is possible
within the practical constraints of a problem.
Thousands of Amazon affiliates link to the Amazon
site. So, instead of giving you the fish, I will just
give you instructions on how to fish:
1) enter the following into the Google search engine:
http://amazon.com
2) select option 4: Find web pages that link to amazon.com
3) get a spider to crawl through the thousands of resulting links
(links with the obidos string in them).
4) modify the spider (usually a Perl script) to extract the information
you want (title, format, ...) from the target page
(the page at Amazon) and format it the way you want.
5) do not spam the affiliates or anyone else.
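The filtering in step 3) can be sketched in a few lines of Python (any scripting language would do); the example URLs below are invented for illustration:

```python
# A minimal sketch of step 3)'s filtering: keep only the links that
# contain the "obidos" string. The URLs here are made up.

def affiliate_links(urls):
    """Return only the URLs containing the 'obidos' string."""
    return [url for url in urls if "obidos" in url]

urls = [
    "http://example.com/books.html",
    "http://amazon.com/exec/obidos/ASIN/0596000278/ref=someaffiliate",
]
print(affiliate_links(urls))  # only the obidos link remains
```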
The search term for the above is: reverse URL lookup (or reverse search).
Search engines other than Google can do this as well, but to get the
output you want, you would need a customised spider.
Here is an article on the topic:
Reverse Search Inside Out - Part One: Why and How to Search ...
... process is to look at links inbound to a particular site or URL...
http://websearch.about.com/library/weekly/aa061101a.htm
The search terms for the other part are: spider crawler
You will find many spiders, ready to crawl all sites
from a given list. If you do not feel like modifying the spider
yourself, I would suggest posting that task as a separate question,
not a clarification request.
I have noticed there are a few researchers willing and able
to hack a Perl script like this in less than a day,
for some $40 to $75, which in my humble opinion is quite a bargain.
I hope this is useful.
Hedgie
Clarification of Answer by hedgie-ga on 24 Jun 2002 04:15 PDT
OK. We have a good start.
To do step 3) I suggest you first read a bit about spiders, e.g.
here:
Writing a Web Crawler in the Java Programming Language
http://developer.java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/
A spider is a short program, in a language such as Perl or Tcl, which
does this:
a) takes a URL from a list
b) contacts the server and asks for a page
c) unlike a browser, it does not display the page; it extracts the
links
d) adds them to the list
e) goes back to a)
When the program is doing this, it is said (poetically) that the
spider is crawling the web.
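The a)-e) loop above can be sketched like this, in Python rather than Perl or Tcl. The regular expression for link extraction is deliberately crude, and the max_pages limit is something I have added as a safety valve, not part of the description above:

```python
# A bare-bones spider implementing the a)-e) loop.
import re
import urllib.request

def extract_links(html):
    """c) pull the http links out of a page instead of displaying it."""
    return re.findall(r'href="(http[^"]+)"', html)

def fetch(url):
    """b) contact the server and ask for the page."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", "replace")

def crawl(start_urls, max_pages=10):
    to_visit = list(start_urls)      # the list of URLs for step a)
    seen = set()
    while to_visit and len(seen) < max_pages:
        url = to_visit.pop(0)        # a) take a URL from the list
        if url in seen:
            continue
        seen.add(url)
        try:
            page = fetch(url)
        except OSError:              # unreachable server: skip it
            continue
        to_visit.extend(extract_links(page))  # d) add new links to the list
    return seen                      # e) the loop goes back to a) each time
```

A real spider would also respect robots.txt and pause between requests, which I have left out to keep the loop readable.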
You need a programmer to modify the program (called a script in this
case) so that there are a few more steps there, namely:
d1) extract the links with the obidos string and store them
d2) get the Amazon page for each and extract data (title, ...)
d3) store these data in the desired format
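One way the d1)-d3) additions might look, again in Python; the &lt;title&gt; regex is a stand-in for whatever extraction the real script would do, and the tab-separated output file is just one possible format:

```python
import re

def obidos_links(links):
    """d1) keep only the links with the obidos string in them."""
    return [link for link in links if "obidos" in link]

def extract_title(page_html):
    """d2) pull the title out of a fetched Amazon page (simplified)."""
    match = re.search(r"<title>(.*?)</title>", page_html, re.S | re.I)
    return match.group(1).strip() if match else None

def store(records, filename="titles.tsv"):
    """d3) write (url, title) pairs in a simple tab-separated format."""
    with open(filename, "w") as out:
        for url, title in records:
            out.write(f"{url}\t{title}\n")
```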
That covers steps 3) and 4). You need a programmer who knows Perl (or
another scripting language) to add these steps. It is a simple process
(once you know the language).
Step 5) is just a :-) note, meaning: once you are done, you will have
lists of thousands of Amazon associates. I hope you are not going to
send them unsolicited e-mail (spam), since I would not want to be
helping anyone do that.
I am leaving on a trip for a few days; if you need more clarification,
please be patient.
A Perl programmer may comment and offer to do the modification for
you, as a different question, or you may try elance or another such
place to hire a programmer. I am not a Perl hacker.