Q: Perl (CGI) -script that makes thumbnails of webpages ( No Answer,   2 Comments )
Question  
Subject: Perl (CGI) -script that makes thumbnails of webpages
Category: Computers > Programming
Asked by: masali-ga
List Price: $10.00
Posted: 18 Dec 2002 15:05 PST
Expires: 07 Jan 2003 09:03 PST
Question ID: 126643
I would like to find a Perl script that uses a module or something to
access some *nix program (or some similar process) that takes
snapshots (JPEG thumbnails) of a web page's layout, that is to say,
what you see in a web browser. It would be similar to pressing the
Print Screen button on a PC and pasting the contents of the clipboard
into a graphics program (e.g. Photoshop). For example, if I call the
script located at
http://www.domain.com/cgi-bin/snapshot.pl?url=www.google.com, it
should make a JPEG thumbnail of the front page of Google and save it
on the server.

Request for Question Clarification by tar_heel_v-ga on 18 Dec 2002 17:37 PST
I have an ASP script that will do exactly what you are looking for. 
It is an ActiveX control that you run locally.  If you would like
this, let me know.

THV

Clarification of Question by masali-ga on 19 Dec 2002 06:30 PST
Hi again, 

The script/interface I would like needs to be for *nix. I've also
thought something could be done with Mozilla, Virtual Frame Buffer
(xvfb) and some screen capturing program. I was thinking of making it
myself but I don't have much knowledge in the *nix world, and thought
there might be some open source program that I could use. But I would
also like to have a look at the ActiveX that tar_heel_v-ga suggested.
Please tell me where to find it.

Request for Question Clarification by tar_heel_v-ga on 19 Dec 2002 09:43 PST
I just want to verify whether you will accept this script as an
answer, or whether you would rather wait to see if anyone finds a CGI
script that meets your needs and only then look at the ASP script.

-THV

Clarification of Question by masali-ga on 19 Dec 2002 14:07 PST
No, I would like to wait a bit and hope that someone finds a script for *nix.

Request for Question Clarification by coral-ga on 24 Dec 2002 18:24 PST
I have a possible answer for you, but it only applies to Mac OS X --
which is, technically, a *nix.  If that's acceptable, say the word and
I'll respond.

Clarification of Question by masali-ga on 25 Dec 2002 08:24 PST
Nope, I need one for Unix; Mac OS X programs do not function on Unix.
Thanks anyway.
Answer  
There is no answer at this time.

Comments  
Subject: Re: Perl (CGI) -script that makes thumbnails of webpages
From: triniman-ga on 18 Dec 2002 17:19 PST
 
Hey,

I knew I had seen a search engine that does this: It's at
http://www.searchshots.com

The guy who did that site, Paul Rydell, has it listed on
http://paul.rydell.com

Apparently he got the images using VB on a Windows platform. Perhaps
you can contact me for general algorithm info.

If I had to do this on a UNIX platform with Perl, it would have to
involve something similar, using Mozilla or something on XWindows, I
would guess.

That is the closest existing implementation that I've seen.

Good luck finding/creating a script!


-triniman
Subject: Re: Perl (CGI) -script that makes thumbnails of webpages
From: cwrl-ga on 07 Jan 2003 07:58 PST
 
This isn't trivial. I'd probably do it using Xvfb, but the overhead is
not small. Below is a small shell script which should serve as a
start.

How it works: we run Xvfb, which produces a new X display which is
mapped in memory and not associated with a physical screen. We then
set the DISPLAY variable to refer to this screen, and run mozilla,
passing it the URL to view. At this stage we *should* wait for it to
display the page in some sensible manner, but for this example we just
sleep for five seconds. Then, we dump the contents of the display
using xwd -root -screen and pass the output of that into xwdtopnm and
cjpeg, to convert it to a JPEG. If you wanted a small thumbnail, you'd
want to invoke pnmscale in the pipeline between xwdtopnm and cjpeg.

Some other notes: I invoke Mozilla using a profile called `test',
which is set up to not display any toolbars and so forth, and to avoid
clashing with my real Mozilla profile. Various other options in the
script are probably not necessary, but were handy for getting it
running as a prototype.
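
The pnmscale step mentioned above could look roughly like the sketch
below. It is only an illustration: the make_thumbnail function name,
the THUMB_WIDTH variable and the 160-pixel default are all made up
for this example, and it assumes the netpbm tools and cjpeg are
installed.

```shell
#!/bin/sh
# Hypothetical helper: read an X window dump on stdin and write a
# JPEG thumbnail to stdout.
# THUMB_WIDTH and its default of 160 pixels are assumptions.
thumb_width=${THUMB_WIDTH:-160}

make_thumbnail() {
    # xwdtopnm : X window dump -> PNM
    # pnmscale : shrink to the requested width, keeping aspect ratio
    # cjpeg    : PNM -> JPEG
    xwdtopnm | pnmscale -width "$thumb_width" | cjpeg
}
```

With DISPLAY pointing at the Xvfb display, it would be used as
`xwd -root | make_thumbnail > thumb.jpg`.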

This script is quite hard work for the server to run. Mozilla itself
is a big, slow, inefficient program. If you want to use this, you'd
certainly want to have Mozilla sitting around in memory, and use the
mozilla -remote command to tell the existing Mozilla to view new URLs.
But that has the problem that every so often Mozilla will pop up a
dialog box or something which will spoil the image. To do this really
seriously, I'd recommend writing a little X program which embeds the
Mozilla rendering engine. This could also do the screen dumping and so
forth, and you could just feed it URLs in reasonable certainty that it
will do everything properly.
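
As a rough sketch of the mozilla -remote idea: the function below
asks a running Mozilla to load the URL, and starts a new one only if
none answers. The open_url name is invented for this example, the
`test' profile is the one described above, and it does nothing about
the dialog box problem.

```shell
#!/bin/sh
# Hypothetical helper: reuse a running Mozilla if there is one.
open_url() {
    url=$1
    # "ping()" succeeds only if a Mozilla is already running on
    # the current DISPLAY.
    if mozilla -remote "ping()" 2>/dev/null; then
        # Ask the existing instance to load the page.
        # Assumption: the URL contains no commas or parentheses,
        # which would likely need escaping inside openURL(...).
        mozilla -remote "openURL($url)"
    else
        # No instance yet: start one in the background.
        mozilla -P test "$url" &
    fi
}
```

Called as `open_url http://www.google.com/` with DISPLAY already set
to the Xvfb display.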

The script itself:

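#!/bin/sh
set -x   # trace each command (handy while prototyping)
set -e   # stop on the first failing command

url=$1
if [ -z "$url" ] ; then
    echo "must specify a URL" 1>&2
    exit 1
fi

# Start a virtual framebuffer X server on display :5.
Xvfb :5 -fp "unix/:-1" -screen 0 800x600x24 &
xvfb_pid=$!
DISPLAY=:5
export DISPLAY

#xdpyinfo

# Launch Mozilla with the toolbar-free `test' profile.
mozilla -height 600 -width 800 -P test -geometry 800x600+0+0 "$url" &
mozilla_pid=$!

xclock&

# Crude wait for the page to render.
sleep 5

# Dump the root window and convert it to JPEG.
xwd -screen -root | xwdtopnm | cjpeg > fish.jpg

set +e

kill $mozilla_pid
kill $xvfb_pid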
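
One weak spot in the script is that Mozilla is launched right after
Xvfb is put in the background, so the display may not be accepting
connections yet. A possible workaround, sketched here with an
invented function name (wait_for_display) and an arbitrary default of
10 tries, is to poll with xdpyinfo before starting Mozilla. It does
not solve the separate problem of knowing when the page has finished
rendering.

```shell
#!/bin/sh
# Hypothetical helper: wait until an X display accepts connections.
# $1 = display name (e.g. :5), $2 = number of tries (default 10).
wait_for_display() {
    display=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # xdpyinfo exits successfully once it can connect.
        if xdpyinfo -display "$display" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    # The display never came up.
    return 1
}
```

In the script above it would go just after the Xvfb line, for
example `wait_for_display :5 || exit 1`.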
