Q: Google Meta query ( Answered 5 out of 5 stars,   1 Comment )
Question  
Subject: Google Meta query
Category: Computers > Internet
Asked by: pafalafa-ga
List Price: $5.00
Posted: 27 Oct 2002 08:08 PST
Expires: 26 Nov 2002 08:08 PST
Question ID: 90542
For the Google query:

://www.google.com/search?num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url

(1) What do the different parts of the query mean?  For instance, what
is "hl=en" (or is it "hl=en&lr"?), etc.

(2) Please point me to a plain-English primer on the commands and
variables used to build a query like this.

Thanks so much.

Clarification of Question by pafalafa-ga on 27 Oct 2002 15:13 PST
P.S.  I think this will be a very easy question to answer for someone
who is already familiar with the language/protocol of URLs, and a
pretty tough question for anyone who isn't.  I'd appreciate an answer
from the former rather than latter, if possible.

Thanks.
Answer  
Subject: Re: Google Meta query
Answered By: theta-ga on 28 Oct 2002 02:19 PST
Rated:5 out of 5 stars
 
This answer will be a lot more understandable to you if you know some
programming and are familiar with basic programming concepts such as
variables.

To answer the first part of your query:
In the URL you have given,
'://www.google.com/search?num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url',
'://www.google.com/search' is the address of the search script
used by Google to execute your query.
All the data occurring after the ? in the URL refers to the different
variables used by Google to tailor the search to your particular
needs. This data is of the form 'variable=value'. Multiple
variable-value pairs are separated by an ampersand (&).
So in the remaining URL,
'num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url', num is a
variable and '100' is the value assigned to it. Similarly, hl is
another variable and 'en' is the value assigned to it.
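
To see the same split done mechanically, here is a minimal Python sketch. It assumes the scheme is "http" (the URL in the question is shown without one), and split_pairs is a hypothetical helper name used only for illustration.

```python
from urllib.parse import urlsplit

# Assumption: the scheme is "http"; the URL in the question is shown without one.
URL = ("http://www.google.com/search"
       "?num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url")


def split_pairs(url):
    """Return (script_address, [(variable, value), ...]) for a URL."""
    parts = urlsplit(url)
    # Everything before the '?' is the address of the script.
    script_address = f"{parts.scheme}://{parts.netloc}{parts.path}"
    pairs = []
    # Pairs are separated by '&'; inside a pair, '=' separates name and value.
    for chunk in parts.query.split("&"):
        name, _, value = chunk.partition("=")
        pairs.append((name, value))
    return script_address, pairs


address, pairs = split_pairs(URL)
print(address)
for name, value in pairs:
    print(f"{name} -> {value!r}")
```

Note that this sketch leaves the values exactly as they appear in the URL, so q still reads 'parse+url' and lr is an empty string.
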
You should understand that the variable names and values used are
specific to Google and can be changed by them as and when they like.
As to what they mean, here is a simple explanation:
 1) num=100 refers to the number of results that are shown at once on
the results page. The default value is 10 results per page.
 2) hl=en refers to the language in which the Google home page will be
shown to you. The default value is 'en', which is the language code for
English.
 3) lr= is used to specify the language of the pages to
be returned in reply to your query. For example, you may specify that
you want only pages in Spanish to be returned. By default, no
particular language is specified, which is why there is nothing after
the = sign, and Google returns matching pages in all languages.
 4) ie=UTF-8&oe=UTF-8 refer to the input and output character
encodings. A character encoding is the way the characters
making up the particular language you use are represented in computer
form. There are a lot of different character encodings (the most
famous is ASCII), and which one is specified in the Google URL depends
on your browser and your operating system. Here the encoding used is
the Unicode UTF-8 encoding.
 5) safe=off refers to the SafeSearch feature offered by the Google
engine which, if enabled, automatically removes sites that contain
pornography and explicit sexual content from the search results. Here
the value 'off' indicates that SafeSearch has not been enabled.
 6) q=parse+url refers to the search terms you specified in your query.
Here Google will search for pages containing both the words 'parse'
and 'url'. The '+' symbol is actually the space you entered between
the two words. Since URLs cannot contain a space character, all space
characters are converted to the '+' symbol.
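
If you want to read these variables in a program, the following Python sketch decodes the query string from your URL into a dictionary. The names params, results_per_page and search_terms are just illustrative, and the meanings in the comments are the ones described in the list above, which Google may change.

```python
from urllib.parse import parse_qsl

# The part of the URL after the '?', copied from the question.
QUERY = "num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url"

# keep_blank_values=True keeps 'lr', whose value is empty;
# without it, variables with no value are dropped.
params = dict(parse_qsl(QUERY, keep_blank_values=True))

results_per_page = int(params["num"])   # results shown per page
search_terms = params["q"].split()      # '+' has been decoded to a space

for name, value in params.items():
    print(f"{name} = {value!r}")
print(results_per_page, search_terms)
```
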

You can experiment with the values of these variables (and many
others) at the Google Advanced Search page (
://www.google.com/advanced_search )
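
You can also experiment by building the URL yourself. This is only a sketch: build_search_url is a hypothetical helper, the "http" scheme is an assumption (the URLs here are shown without one), and the variable names are the ones from your example URL, which Google may change at any time.

```python
from urllib.parse import urlencode

# Assumption: the scheme is "http"; the source shows the address without one.
SEARCH_SCRIPT = "http://www.google.com/search"


def build_search_url(terms, num=10, hl="en", lr="", safe="off"):
    """Build a search URL using the variables seen in the example URL."""
    variables = {
        "num": num,      # results per page
        "hl": hl,        # interface language
        "lr": lr,        # language restriction; empty means all languages
        "ie": "UTF-8",   # input encoding
        "oe": "UTF-8",   # output encoding
        "safe": safe,    # SafeSearch setting
        "q": terms,      # search terms; spaces become '+'
    }
    # urlencode joins the pairs with '&' and URL-encodes each value.
    return SEARCH_SCRIPT + "?" + urlencode(variables)


url = build_search_url("parse url", num=100)
print(url)
```
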

Now for the second part of your question:
  These variables are passed to the relevant script using CGI (the
Common Gateway Interface). A form can send its data with either of two
HTTP request methods: GET and POST. With GET, as in your example, the
data travels in the URL itself.
The data being sent is URL encoded, i.e. since no space characters are
allowed in a URL, all space characters in the data are converted to a
'+' sign, and other special characters are encoded as '%xx', where 'xx'
is the hexadecimal ASCII code for that character. For example, an '='
character in the data is encoded as '%3d'.
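
A short Python sketch of this encoding, using the standard urllib.parse module. percent_code is a hypothetical helper written only to show where the 'xx' comes from. Python happens to emit the hex digits in upper case ('%3D'), which decodes to the same character as '%3d'.

```python
from urllib.parse import quote_plus, unquote_plus


def percent_code(ch):
    """Return the '%xx' form of a single ASCII character."""
    return "%{:02x}".format(ord(ch))


encoded_terms = quote_plus("parse url")   # the space becomes '+'
encoded_equals = quote_plus("a=b")        # the '=' becomes a %xx escape
manual_equals = percent_code("=")         # built by hand from the ASCII code

print(encoded_terms)
print(encoded_equals)
print(manual_equals)

# Decoding reverses both steps; hex digits are case-insensitive.
decoded = unquote_plus("a%3db+c")
print(decoded)
```
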
An easy-to-understand intro to CGI can be found at the NCSA's CGI page
(http://hoohoo.ncsa.uiuc.edu/cgi/overview.html).
This page also contains a tutorial for using CGI with HTML forms (which
is what you want to do).
 A simple explanation of CGI and using it with forms is given at
James Marshall's site (http://www.jmarshall.com/easy/cgi/).
 Again, since this is a programming topic, some programming experience
will be helpful.
  The best way to learn CGI is by looking at scripts others have
written. You can find links to various public domain CGI scripts and
tutorials in the Google Directory: Computers > Programming > Internet
> CGI > Tutorials ( http://directory.google.com/Top/Computers/Programming/Internet/CGI/Tutorials/?il=1
)

Well, that's it. I hope this answer satisfied your queries. If you need
any clarifications, just ask and I will be glad to help.
:)

RELATED LINKS :

 Customize GOOGLE Help Page
   Contains information on the various customizable options offered by
Google, and what they do.
   ( ://www.google.com/help/customize.html )

 Unicode Home Page
   Contains information on the Unicode character encoding and UTF-8
   ( http://www.unicode.org/ )

 An instantaneous introduction to CGI scripts and HTML forms
   A comprehensive introduction provided by the University of Kansas
   ( http://www.ku.edu/~acs/docs/other/forms-intro.shtml )
pafalafa-ga rated this answer:5 out of 5 stars
Great answer...and comment, too.  Thanks!

Comments  
Subject: Re: Google Meta query
From: osmosis_vic-ga on 27 Oct 2002 23:21 PST
 
Everything after the search? is a variable definition, used by the
'search' command being sent to the ://www.google.com/search
script.  For instance, num=100 could mean that the 'num' variable
defines number of returns, number of returns received already, number
to begin display at, or something else.  hl=en likely means homepage
language (hl) is equal to English (en).

When passing variables through HTTP, you need to use the ampersand (&)
to join different definitions together.  In plain text, the browser is
saying to the server:
Retrieve the code at ://www.google.com/search and use
num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url as the
variables when you execute the code.
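
As a rough illustration of what the browser actually sends, this Python sketch builds the text of a minimal HTTP/1.1 GET request for that URL. It assumes the "http" scheme (the URL above is shown without one), build_get_request is a hypothetical helper, and a real browser sends many more header lines than this.

```python
from urllib.parse import urlsplit

# Assumption: the scheme is "http"; the URL above is shown without one.
URL = ("http://www.google.com/search"
       "?num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url")


def build_get_request(url):
    """Return the text of a minimal HTTP/1.1 GET request for the URL."""
    parts = urlsplit(url)
    # The request line carries the script path plus the variables.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {parts.netloc}",   # the server name goes in a header
        "Connection: close",
        "",                        # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


request_text = build_get_request(URL)
print(request_text)
```
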

hope this helps you out.
