Q: google image thumbnails: what determines format and other questions (Answered, 3 out of 5 stars, 0 Comments)
Question  
Subject: google image thumbnails: what determines format and other questions
Category: Computers > Algorithms
Asked by: lewisdgriffin-ga
List Price: $50.00
Posted: 08 Sep 2003 15:29 PDT
Expires: 08 Oct 2003 15:29 PDT
Question ID: 253652
Google Images returns pages of thumbnail images, presumably created by
Google while the index is being updated. The thumbnails are in various
formats, most commonly .jpg and .gif, but there are small numbers of
other formats and variants in the filename extension.

i) what determines the format used for a particular thumbnail? 

ii) what details of the algorithms used by Google for thumbnail
creation are available? In particular for formats such as .gif that
use a 256 entry colormap, rather than a full 24bit representation,
what algorithm is used to select the 256 colors.
Answer  
Subject: Re: google image thumbnails: what determines format and other questions
Answered By: slawek-ga on 08 Sep 2003 19:30 PDT
Rated: 3 out of 5 stars
 
Good Day lewisdgriffin-ga,


Before I answer your question, please keep in mind: 
 
There are very few people in this world who KNOW Google's algorithms
and procedures on anything. We Google researchers are not included in
that small group. This means that my answer will be based on things
like trial and error, and some general beliefs circulating on the
Internet.


HOW IS THE THUMBNAIL FORMAT DETERMINED?
---------------------------------------

The format of the thumbnail file is the same as the original file. 
Although I cannot find any statements on this anywhere on the
Internet, after trying a few dozen images myself I have yet to find an
image that would disprove this theory.

You can try doing a search for "png", without the quotes, and you will
get quite a few .PNG files as the search result. Now, PNG is a less
common file format on the Internet, and I figured it makes a good
guideline. All thumbnails that are in PNG format lead to original
images that are also in the PNG format. Of course I tested this theory
with a couple of other formats, namely GIF and JPG.

I have also performed a search for a random term, which of course
resulted in images of various file types being returned as search
results. Again, I have followed a few of these thumbnails back to
their original website. The results were as I predicted: the thumbnail
file format matched the original file type.
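The comparison described above can also be done on the file contents rather than the filename extension. Below is a minimal sketch in Python, assuming you have already downloaded the thumbnail and the original as bytes. The helper names sniff_format and same_format are hypothetical, and only the PNG, GIF and JPEG signatures are checked.

```python
# Hypothetical helpers: identify an image format from its leading bytes.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),   # PNG file signature
    (b"GIF87a", "gif"),              # GIF, 1987 version
    (b"GIF89a", "gif"),              # GIF, 1989 version
    (b"\xff\xd8\xff", "jpeg"),       # JPEG start-of-image marker
]


def sniff_format(data: bytes) -> str:
    """Return 'png', 'gif', 'jpeg', or 'unknown' based on the file signature."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def same_format(thumbnail: bytes, original: bytes) -> bool:
    """True when both byte strings carry the same recognised signature."""
    fmt = sniff_format(thumbnail)
    return fmt != "unknown" and fmt == sniff_format(original)
```

This only checks what the files claim to be; it says nothing about how the thumbnail was produced.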


SO WHAT DOES THAT MEAN FOR COLOUR ALGORITHMS?
---------------------------------------------

There is no need for any colour conversions. Since the thumbnail is in
the same format as the original image, the file limitations do not
change and the palette settings can be carried over to the thumbnail.
Of course, since the thumbnail is smaller physically it might contain
fewer colours because some pixels will get edited out during the
"shrinking process".
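As an illustration of pixels being dropped during shrinking, here is a minimal Python sketch that downscales a palette-indexed image by keeping every second pixel (nearest-neighbour sampling). This is an assumption made for the example, not a claim about what Google does. The palette carries over unchanged only with this kind of sampling; a smoothing filter blends neighbouring pixels and can produce colours that are not in the original palette, which would then need to be re-quantized.

```python
def shrink_indexed(pixels, factor):
    """Nearest-neighbour downscale: keep every `factor`-th pixel in each direction.

    `pixels` is a list of rows of palette indices. No new indices are created,
    so the original palette still applies to the result.
    """
    return [row[::factor] for row in pixels[::factor]]


def used_indices(pixels):
    """Set of palette indices that actually occur in the image."""
    return {index for row in pixels for index in row}


# A 4x4 image that uses four palette entries (0..3).
image = [
    [0, 1, 0, 1],
    [2, 3, 2, 3],
    [0, 1, 0, 1],
    [2, 3, 2, 3],
]
thumb = shrink_indexed(image, 2)  # 2x2 result built from rows 0 and 2
```

Here the thumbnail ends up using a single palette entry, while the original used four.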


WHAT ABOUT THE SOFTWARE BEHIND THE PROCESS?
-------------------------------------------

As I mentioned in my first paragraph, very few people know for sure
what processes, algorithms, and software are used by Google behind the
scenes. Guessing what software is actually used for the process would
be just that: a guess.

Still, there are a few good programs out there that can accomplish
that which Google is doing. Some of these programs have an option to
convert files to JPG format. I have not found a program that can
convert from a high colour format to a low colour format. I suspect
this is due to the dilemma you presented: selecting only the best 256
colours.


WHAT ABOUT THE ALGORITHM?
-------------------------

Here is the calling syntax for one image resizing filter I have found,
LanczosResize from AviSynth. Of course, there is no way of telling if
this is the one that Google uses for their thumbnail creation:

LanczosResize Algorithm:
LanczosResize(clip, int target_width, int target_height)
LanczosResize(clip, int target_width, int target_height, float "src_left", float "src_top", float "src_width", float "src_height")

The AutoSiteGallery software, and probably many others employ this
algorithm.

WEBSITE: AutoSiteGallery by Brizsoft
URL: http://www.brizsoft.com/galmaker.html
SOURCE OF ALGORITHM:
http://ftp.eenet.ee/doc/AviSynth/filters/lanczosresize.html
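The LanczosResize lines above only show how the filter is called. The resampling itself is based on the Lanczos kernel, sinc(x) * sinc(x / a). Below is a minimal Python sketch of that idea for a single row of samples. The function names and the choice a=3 are assumptions for illustration; this is not the AviSynth implementation and says nothing about Google's.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_resize_row(row, target_width, a=3):
    """Resample a 1-D list of samples to `target_width` values."""
    src_width = len(row)
    scale = src_width / target_width
    # When shrinking, stretch the kernel so it also smooths out fine detail.
    stretch = max(scale, 1.0)
    support = a * stretch
    out = []
    for i in range(target_width):
        centre = (i + 0.5) * scale - 0.5  # output pixel centre in source coordinates
        lo = math.floor(centre - support) + 1
        hi = math.floor(centre + support)
        total = 0.0
        weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos_kernel((j - centre) / stretch, a)
            sample = row[min(max(j, 0), src_width - 1)]  # clamp at the edges
            total += w * sample
            weight_sum += w
        out.append(total / weight_sum)  # normalise so flat areas stay flat
    return out
```

A two-dimensional resize can be built from this by resampling every row to the target width and then every column to the target height, once per colour channel.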


OTHER THUMBNAIL SOFTWARE
------------------------

WEBSITE: Au2HTML
URL: http://www.filehouse.com/au2html/

WEBSITE: SimplyTheBest Thumbnail Shareware
URL: http://simplythebest.net/shareware/graphics/thumbnail_tools.html
NOTES: This link is a library of 11 separate thumbnail creation
utilities.


FINAL NOTES
-----------

From my experience in graphics manipulation and design, I venture that
if a thumbnail creation program did offer the option to convert to a
lower colour depth image, the process would be two stage:

- Image would be resized with a resize algorithm, thus creating the
thumbnail
- The image would be put through a colour reduction filter to fit the
specific file format into which it is to be converted

Why the two separate stages? Well, reducing size on an image with few
colours results in very poor quality of the smaller image. Any
resizing should be done in as many colours as possible.
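The two stages can be sketched in Python as follows. This is only an illustration under simple assumptions: the resize is a plain block average and the colour reduction snaps each channel to a few evenly spaced levels. Both are stand-ins for whatever a real thumbnail program would use, and the function names are hypothetical.

```python
def box_shrink(pixels, factor):
    """Stage 1: average each factor x factor block of (r, g, b) pixels."""
    height, width = len(pixels), len(pixels[0])
    out = []
    for y in range(0, height - factor + 1, factor):
        row = []
        for x in range(0, width - factor + 1, factor):
            block = [pixels[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(tuple(sum(p[c] for p in block) // len(block)
                             for c in range(3)))
        out.append(row)
    return out


def posterize(pixels, levels):
    """Stage 2: snap each channel to one of `levels` evenly spaced values."""
    step = 255 / (levels - 1)
    return [[tuple(int(round(round(v / step) * step)) for v in p) for p in row]
            for row in pixels]


def make_thumbnail(pixels, factor, levels):
    """Resize in full colour first, then reduce the colours."""
    return posterize(box_shrink(pixels, factor), levels)


# A 2x2 black and white checkerboard.
BLACK, WHITE = (0, 0, 0), (255, 255, 255)
checker = [[BLACK, WHITE], [WHITE, BLACK]]
```

With this checkerboard, make_thumbnail(checker, 2, 2) yields a pixel that is still one of the two allowed colours, whereas box_shrink(posterize(checker, 2), 2) yields a mid grey that is outside the reduced colour set and would have to be reduced a second time.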

I have searched around for a colour reduction algorithm, but did not
find anything. I would be happy to attempt to pry some information
out of graphics software engineers. Please let me know if you do
indeed want this kind of information.

Since Google needs everything done automatically, there are PHP and
ASP programs that are designed to create thumbnails.  Again, please
let me know if you require links to such programs.


SEARCH STRATEGY
---------------

Google search for: "create thumbnails"
                   "images.google.com"+"thumbnails"+"how"
                   "colour reduction"+"algorythm"
            
Answer includes experience gained from graphics manipulation and
design for my web design company.

I sincerely hope that I have answered all your questions in full and
to your satisfaction. If this is not the case, please do not hesitate
to ask for a clarification. I will respond to it in a timely manner.

Thank you for choosing Google Answers!


Regards,
slawek-ga

Request for Answer Clarification by lewisdgriffin-ga on 09 Sep 2003 02:12 PDT
Dear slawek-ga,

Thanks for your prompt reply.

1. The first part of your answer - that thumbnails have the same
format as the image from which they are made - is very useful. I take
your point - therefore Google doesn't need to do any color reduction
itself - that follows from that.

2. I guess what I really need to know then is - "what are the most
commonly used algorithms for choosing the reduced colormaps of .gif
images?". I have a fair body of academic papers dealing with color
quantization, colour indexing, color histogram reduction, but these
don't tell me what algorithms are actually used in practice. Perhaps
your graphics software engineer contacts would be able to help with
this?

3. Thank you for the other material and comments that you found, but
it's not quite what I'm after.

    lewisdgriffin-ga

Clarification of Answer by slawek-ga on 09 Sep 2003 07:58 PDT
Hi lewisdgriffin-ga,

Glad that you could use most of the information I provided.
I have to step out for about half the day, and will respond in detail
when I return. My friend's grandmother passed away, and I am off to
help with some of the arrangements and other things... I expect to be
back in 4-6 hours, and will then work on delivering more information
to you on colour reduction.

Regards,
slawek-ga

Clarification of Answer by slawek-ga on 09 Sep 2003 17:40 PDT
Good Day lewisdgriffin-ga,


I am looking at how Paint Shop Pro 7 reduces colours. The two most
powerful options for reducing colour are the Median Cut and Optimised
Octree. This is what I used as my starting point...

Median Cut, like the NeuQuant method, is a colour quantization
technique: it works by mapping a range of colour values to a single
quantum value. By reducing the number of discrete symbols in a given
stream, the stream also becomes more compressible.

Having looked around for the Optimised Octree method, I have found
very little information on it. It would appear that the Median Cut and
NeuQuant methods are much more popular, and of higher quality.
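For readers who want to see the idea in code, here is a minimal Python sketch of median cut as it is commonly described: repeatedly split the group of pixels with the widest colour range at its median, then use the average of each group as a palette entry. It covers median cut only (NeuQuant is a separate, neural-network based method), the function names are hypothetical, and it is not taken from Paint Shop Pro or any other product.

```python
def spread(box, channel):
    """Range of values on one channel within a group of pixels."""
    values = [p[channel] for p in box]
    return max(values) - min(values)


def widest(box):
    """Largest range over the three channels."""
    return max(spread(box, c) for c in range(3))


def median_cut(pixels, n_colours):
    """Return up to n_colours representative (r, g, b) tuples for `pixels`."""
    boxes = [list(pixels)]
    while len(boxes) < n_colours:
        # Only groups that contain at least two different colours can be split.
        candidates = [b for b in boxes if len(b) > 1 and widest(b) > 0]
        if not candidates:
            break
        box = max(candidates, key=widest)
        boxes.remove(box)
        channel = max(range(3), key=lambda c: spread(box, c))
        box.sort(key=lambda p: p[channel])
        mid = len(box) // 2  # split at the median pixel
        boxes.extend([box[:mid], box[mid:]])
    # Each group is represented by the average of its pixels.
    return [tuple(sum(p[c] for p in box) // len(box) for c in range(3))
            for box in boxes]


def nearest(colour, palette):
    """Palette entry closest to `colour` by squared RGB distance."""
    return min(palette,
               key=lambda q: sum((a - b) ** 2 for a, b in zip(colour, q)))
```

For a .gif you would call median_cut with n_colours=256 and then replace each pixel by nearest(pixel, palette).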

I did find a lot of information on the Median Cut / NeuQuant methods,
some of which appears to contain partial formulas. I have to admit,
although I am familiar with graphic editing and manipulation, I know
very little about how my software does what I ask of it.  Therefore, I
will resist the urge to draw my own conclusions and assemble the
material I have found into something bigger and better. Instead I will
provide some links that I have read over and appear to have the
information you are looking for. Hopefully this makes more sense to
you than it does to me. :)

WEBSITE: NeuQuant Method
URL: http://members.ozemail.com.au/~dekker/NEUQUANT.HTML

WEBSITE: The Median Cut Algorithm
URL: http://euklid.mi.uni-koeln.de/c/mirror/www.cs.curtin.edu.au/units/cg351-551/notes/lect2r1.html

WEBSITE: An Overview of different Clustering Algorithms
URL: http://www.comp.lancs.ac.uk/~kristof/research/notes/clustr/

Because I found all this information, I have yet to E-mail anyone
about more information. Of course, this is still an option if the
material I found does not answer your questions. Please let me know if
I should E-mail away, or if the above is quite sufficient. I am
learning a lot here! :)


SEARCH STRATEGY:

Google search for: "median cut"+"what is"
                   "optimized octree"+"what is"


Regards,
slawek-ga
lewisdgriffin-ga rated this answer: 3 out of 5 stars
thanks very much.

Comments  
There are no comments at this time.
