Q: Perl script for printing out product search engine queries (up to $25 tip) (Answered, 4 out of 5 stars, 1 Comment)
Question  
Subject: Perl script for printing out product search engine queries (up to $25 tip)
Category: Computers > Programming
Asked by: froogler-ga
List Price: $5.00
Posted: 07 May 2003 19:36 PDT
Expires: 06 Jun 2003 19:36 PDT
Question ID: 200943
My manager at work has assigned me the task of analyzing the rank of
our company's products on specific shopping search engines, like
Froogle, Y! Shopping, BizRate, etc. by querying specific searches on a
list of terms that should (hopefully) bring up results which include
our products.

Part of my task is to print out the results pages for each query
(usually 3-4 pages) to create a hard copy record of our analyses (we
do them on a regular basis).  This process is fairly cumbersome, as I
currently take each term from our list, query it and then print out
the results pages for that query.  Since I have to do 100-200 every
week or two, it is a huge time sink and I'm looking for a way to
automate the process.

So, my question is as follows:

*Is it possible to write a Perl script that will automate these print
jobs?  Mind you, I'm working with Internet Explorer 6.0.  If the
answer to this is yes...

	A.	Could you write a Perl script (and post the code here on Google
Answers) that will take a list of terms that I could cut and paste
into a web form, do the relevant queries on them and print each one
out?  You only need to write code which will do this for one of the
search engines.  Since I like Google and I'm using Google Answers,
just make the code run for Froogle (http://www.froogle.com).  ($20
tip)

A sample list of queries is as follows:

Men's gloves
Leather jacket
Men's leather shoes
Worsted wool suit
men's cashmere sweater

	B.	Could you include code in the Perl script to print the pages in
this format: double-sided, 2 pages per side? ($5 tip)

If the answer is no...explain why not and, if possible, give me a way
to accomplish this task.

Thanks,
Froogler

Request for Question Clarification by dogbite-ga on 07 May 2003 20:46 PDT
Hi froogler,

  I am interested in answering your question
  but want to clarify it first.

  Is it essential that you print the result
  pages from IE?  How many HTML pages do you
  print for each search?  When you say you
  print 3-4 pages, does that mean you print
  the first 3-4 pages of results, or that
  one webpage takes up 3-4 paper pages?

  It would be easiest to put the search terms
  in a text file and then have a Perl script
  issue the queries to Froogle and retrieve
  the HTML pages.  That script could then
  simply print out the result text from the
  search results.  Alternatively, the script
  could use a simple html renderer to create 
  an IE-like printout.

  The more complex solution would be using
  a Perl module like Win32::OLE to interact
  with IE.  

  Can you help me better understand your question?

                  dogbite-ga
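
The first step described above (search terms in a text file, one query per term) can be sketched roughly as follows. This is an illustration in Python rather than the Perl the question asks for, and it is not dogbite-ga's script. The base URL and the `q` and `btnG` parameter names are taken from the Froogle link quoted further down in this thread, so treat them as assumptions about how the service worked in 2003. The helper names `read_terms` and `froogle_url` are made up for this sketch.

```python
from urllib.parse import urlencode

# Base URL and parameter names are copied from the Froogle link quoted in
# this thread; they are assumptions about the 2003 service, not a documented API.
FROOGLE_BASE = "http://froogle.google.com/froogle"


def read_terms(text):
    """Return the non-empty lines of a pasted term list, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def froogle_url(term):
    """Build one search URL; urlencode escapes apostrophes and spaces."""
    query = urlencode({"q": term, "btnG": "Froogle Search"})
    return f"{FROOGLE_BASE}?{query}"


if __name__ == "__main__":
    # Sample list from the question.
    sample = """Men's gloves
Leather jacket
Men's leather shoes
Worsted wool suit
men's cashmere sweater
"""
    for term in read_terms(sample):
        print(froogle_url(term))
```

Keeping URL construction separate from downloading means the term list can be checked by eye before any request is sent.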

Clarification of Question by froogler-ga on 08 May 2003 00:04 PDT
Hi dogbite,

Thank you for taking a look at my question.

To answer your first question...what I want are (up to) the first 4
pages of the search results of each query.  Most will take 4, some
will take fewer.  When printing 2 pages per side, double sided, that
should mean one physical sheet of paper is being used to print out on
our printers at work.  Also, the orientation of the pages should be
portrait relative to themselves but the physical sheet of paper will
be landscape (think of what the pages of a booklet would look like).

To answer your second item, you can issue the queries to Froogle
any way you please; a text list is fine.  However, I want the
actual HTML rendering of the results pages printed...with the
formatting, relative placement and images.  The script should run and
yield print-outs that look just like what the browser displays.  It
does not have to print from IE, per se...I mentioned that in case it
was relevant information.

Whether you use a simple HTML-renderer or the more complex module that
you mentioned is of no consequence to me.

Another alternative would be to take the browser results pages for
each query and append the HTML for each set of results to the previous
query's results, thus making (and saving) one large file.  So, you
would get the results for query 1 on pages 1-4, the results for query
2 on pages 5-8, the results for query 3 on pages 9-12, etc. juxtaposed
in that particular order in one large HTML file.  The only stipulation
is that there would have to be some sort of code to create page breaks
after each set of results...so that the results for Query 2 would
start on a new fresh "page" rather than down the middle of the last
page of Query 1.  The sets of printed results for each query should
remain discrete from the results for the other queries.

The internal page breaks in this large file would be registered by the
browser and then I could format the printouts any way I wanted.  That
would also solve my problem.

Thanks,
Froogler
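
The "one large file with page breaks" idea above could look roughly like the sketch below. It is written in Python purely for illustration, and nobody in this thread posted such a script. It assumes the result pages have already been downloaded as HTML strings, and it uses the CSS `page-break-before: always` property so that each query's results start on a fresh printed page. The names `body_of` and `merge_results`, and the added `<h1>` heading per query, are hypothetical.

```python
import re
from html import escape

# Matches the contents of the first <body>...</body> pair, case-insensitively.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def body_of(html_doc):
    """Return the inner <body> markup, or the whole text if no body tag is found."""
    match = BODY_RE.search(html_doc)
    return match.group(1) if match else html_doc


def merge_results(results):
    """Combine results, given as a list of (query, [html_doc, ...]) in print order."""
    parts = ["<html><head><title>Search results</title></head><body>"]
    for index, (query, pages) in enumerate(results):
        # Force a printed page break before every query except the first.
        style = ' style="page-break-before: always"' if index else ""
        parts.append(f"<div{style}>")
        parts.append(f"<h1>{escape(query)}</h1>")
        parts.extend(body_of(page) for page in pages)
        parts.append("</div>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

Because only the `<body>` content is kept, anything defined in each page's `<head>` (stylesheets, for instance) is dropped, so the printout may not match the browser view exactly. Relative image links would also need attention before the merged file renders as the original pages did.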

Request for Question Clarification by dogbite-ga on 08 May 2003 08:56 PDT
Hi froogler-ga,

  I propose a solution that has
  the html2ps program at its core.
  The program's homepage is here:

http://www.tdb.uu.se/~jan/html2ps.html

  That program should be able to
  download all the result pages
  and render them, images and all,
  into PostScript.  You can then
  print the PostScript files however
  you want.

  I could write a script that would
  convert all of your search terms
  into Froogle URLs and then feed those
  URLs into html2ps.

  There are a few caveats, though.  First,
  you will have to handle installing all
  of the modules that html2ps requires.
  Those include Perl, ImageMagick, and 
  Ghostscript.  Second, I cannot guarantee
  html2ps will render the pages correctly.

  What do you think?

            dogbite-ga

Clarification of Question by froogler-ga on 08 May 2003 12:27 PDT
Hi dogbite,

Thanks for your help so far.  I found out that I don't have
authorization to install ImageMagick or Ghostscript.  So if your
PostScript solution requires those things, it's a no go.  It sounds
like it would work (practically speaking).

How would you tackle the problem with the Win32::OLE module?

Froogler

Request for Question Clarification by dogbite-ga on 08 May 2003 23:09 PDT
Hi froogler-ga,

  My experience with interacting with a 
  program like IE from an outside script
  is that it always requires a lot
  of fiddling.  Also, it often requires
  a C or C++ compiler for Windows, like
  Visual C++.

  Upon further thought, I think your 
  suggestion of putting everything into
  one file is best.  Here are two pages
  that I put into one .html file:

http://froogle.google.com/froogle?q=men%27s+gloves&btnG=Froogle+Search

  Are you able to install curl on your
  Windows computer?  There is an installation
  file here:

http://www.cag.lcs.mit.edu/curl/download.html

  Also, do you have Perl installed?

           dogbite-ga
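
For the curl step suggested above, a minimal sketch might drive the command-line tool once per URL and save each page to its own file. It is shown in Python for illustration only and is not what dogbite-ga actually ran. It assumes a `curl` executable is on the PATH and uses only curl's `-s` (silent) and `-o` (output file) options. The `result_NNN.html` naming scheme and the function names are made up for this example.

```python
import subprocess
from pathlib import Path


def curl_command(url, out_path):
    """Argument list for one download: -s hides progress output, -o names the file."""
    return ["curl", "-s", "-o", str(out_path), url]


def download_all(urls, out_dir, run=subprocess.run):
    """Fetch each URL into out_dir as result_001.html, result_002.html, and so on."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, url in enumerate(urls, start=1):
        target = out_dir / f"result_{number:03d}.html"
        # check=True raises CalledProcessError if curl exits with an error.
        run(curl_command(url, target), check=True)
        saved.append(target)
    return saved
```

The `run` parameter exists so the function can be tried with a stand-in instead of a real network call; by default it is `subprocess.run`.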

Clarification of Question by froogler-ga on 09 May 2003 13:50 PDT
Dogbite,

Is the URL you mentioned in your last clarification request a
cut/paste typo?  Curl sounds like it would work fairly well. 
However...

I'm going to close the question (a coworker came up with a potential
solution), but I think you should be paid for your time and help.  If
I end up needing help on this again, I'll specify you to answer it
first.

Thanks,
Froogler

Clarification of Question by froogler-ga on 09 May 2003 13:51 PDT
Dogbite,

Please cut and paste your clarification requests into the Answer and I
will pay you.

Thanks,
Froogler
Answer  
Subject: Re: Perl script for printing out product search engine queries (up to $25 tip)
Answered By: dogbite-ga on 09 May 2003 14:02 PDT
Rated: 4 out of 5 stars
 
Hi Froogler,

  Yes, I'm sorry -- I meant to paste this:

http://nms.lcs.mit.edu/~gch/google/froogle/foo.html

  It's simply two <html>...</html> documents that
  I downloaded from Froogle with curl and then appended 
  in the same .html file.

  Thank you for the payment -- I'm happy that
  you found a solution.

                    dogbite-ga
froogler-ga rated this answer: 4 out of 5 stars and gave an additional tip of $5.00
Thanks for your help, dogbite.  Your suggestions were very helpful.

Comments  
Subject: Re: Perl script for printing out product search engine queries (up to $25 tip)
From: studboy-ga on 08 May 2003 21:56 PDT
 
I strongly recommend taking dogbite-ga's suggestion of outputting to a
file, then printing that file from Internet Explorer.  Besides, you can
use Windows scripting or a batch (.bat) file to print it AFTER you
configure your printer's default settings (see
http://www.robvanderwoude.com/printfiles.html).  The work of
automating the printing part here buys you *very little value* in
light of the effort that you have to put in to get it to work, since
Win32::OLE is a pain to use and does not guarantee portability over the
long run.  Just my 2 cents.
