This answer will be a lot more understandable to you if you know some
programming and are familiar with basic programming concepts such as
variables.
To answer the first part of your query:
In the URL you have given,
'://www.google.com/search?num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url'
the part '://www.google.com/search' is the address of the search
script used by Google to execute your query.
All the data occurring after the ? in the URL refers to the different
variables used by Google to tailor the search to your particular
needs. This data is of the form 'variable=value'. Multiple
variable-value pairs are separated by an ampersand (&).
So in the remaining URL
'num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url', num is a
variable and '100' is the value assigned to it. Similarly, hl is
another variable and 'en' is the value assigned to it.
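If you know a little Python, the splitting described above can be
tried out directly. This is only a rough sketch using Python's
standard urllib.parse module; it works on the part of the URL after
the ?, and the variable names 'query' and 'pairs' are my own choice.

```python
from urllib.parse import parse_qsl

# Everything after the '?' in the URL from the question.
query = "num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url"

# Split on '&' into variable=value pairs.
# keep_blank_values=True keeps 'lr', whose value is empty.
pairs = parse_qsl(query, keep_blank_values=True)

for name, value in pairs:
    print(name, "->", repr(value))
```

Note that the value of q comes out as 'parse url', with a space,
because the '+' is decoded along the way.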
You should understand that the variable names and values used are
specific to Google and can be changed by them as and when they like.
As to what they mean, here is a simple explanation:
1) num=100 refers to the number of results that are shown at once on
the results page. The default value is 10 results per page.
2) hl=en refers to the language in which the Google home page will be
shown to you. The default value is 'en', which is the language code for
English.
3) lr= is used to specify the language of the pages to be returned in
reply to your query. For example, you may specify that you want only
pages in Spanish to be returned. By default, no particular language is
specified, which is why there is nothing after the = sign, and Google
returns matching pages in all languages.
4) ie=UTF-8&oe=UTF-8 refer to the input and output character
encodings. A character encoding refers to the way the characters
making up the particular language you use are represented in computer
form. There are a lot of different character encodings (the most
famous is ASCII), and which one is specified in the Google URL depends
on your browser and your operating system. Here the encoding used is
the Unicode UTF-8 encoding.
5) safe=off refers to the SafeSearch feature offered by the Google
engine which, if enabled, automatically removes sites that contain
pornography and explicit sexual content from the search results. Here
the value 'off' indicates that SafeSearch has not been enabled.
6) q=parse+url refers to the search terms you specified in your query.
Here Google will search for pages containing both the words 'parse'
and 'url'. The '+' symbol is actually the space you entered between
the two words. Since URLs cannot contain a space character, all space
characters are converted to the '+' symbol.
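To see the six items above going the other way, here is a small
sketch that builds the same query string from a list of variables and
values. It assumes Python's standard urllib.parse.urlencode; the list
name 'variables' is just for illustration.

```python
from urllib.parse import urlencode

# The variables from the list above, in the same order.
# Note the search terms are written with a real space.
variables = [
    ("num", "100"),
    ("hl", "en"),
    ("lr", ""),          # empty value: no language restriction
    ("ie", "UTF-8"),
    ("oe", "UTF-8"),
    ("safe", "off"),
    ("q", "parse url"),
]

# urlencode joins the pairs with '&' and turns the space into '+'.
query = urlencode(variables)
print(query)
# num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&q=parse+url
```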
You can experiment with the values of these variables (and many
others) at the Google Advanced Search page (
://www.google.com/advanced_search )
Now for the second part of your question:
These variables are passed to the relevant script through CGI (the
Common Gateway Interface). The data can be sent with either of two
HTTP request methods, GET and POST, and CGI defines how the web server
hands the data to the script in each case.
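As a rough illustration of the receiving side, the sketch below shows
how a script might pick up the variables. It assumes the usual CGI
conventions: for GET the server puts the data in the QUERY_STRING
environment variable, and for POST the script reads CONTENT_LENGTH
bytes from standard input. The function name read_form_data is
hypothetical, and the environment is faked with a plain dict so the
example can be run without a web server.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the submitted variables as a dict of name -> list of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the encoded data arrives on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # GET: the encoded data is whatever followed the '?' in the URL.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


# Simulated GET request.
get_form = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "q=parse+url&num=100"},
    io.BytesIO(b""),
)

# Simulated POST request carrying the same data in the request body.
body = b"q=parse+url&num=100"
post_form = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))},
    io.BytesIO(body),
)

print(get_form)
print(post_form)
```

In a real CGI script you would pass os.environ and sys.stdin.buffer
instead of the fake values used here.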
The data being sent is URL encoded, i.e. since no space characters are
allowed in a URL, all space characters in the data are converted to a
'+' sign, and other special characters are encoded as '%xx', where
'xx' is the hexadecimal ASCII code for that character. For example, an
'=' character in the data is encoded as '%3d'.
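The encoding rule above is easy to check for yourself. This is a
minimal sketch: percent_encode_char is a helper I made up to show
where the '%xx' comes from, while quote_plus and unquote_plus are the
standard functions in Python's urllib.parse. Python writes the hex
digits in upper case ('%3D'); upper and lower case mean the same
thing when decoding.

```python
from urllib.parse import quote_plus, unquote_plus


def percent_encode_char(ch):
    """Encode one ASCII character as '%xx' using its hexadecimal code."""
    return "%{:02X}".format(ord(ch))


encoded_equals = percent_encode_char("=")
print(encoded_equals)   # %3D

# Encoding a whole value: '=' becomes %3D and the space becomes '+'.
encoded_value = quote_plus("a=b c")
print(encoded_value)    # a%3Db+c

# Decoding accepts lower-case hex digits as well.
decoded_value = unquote_plus("a%3db+c")
print(decoded_value)    # a=b c
```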
An easy to understand intro to CGI can be found at the NCSA's CGI page
(http://hoohoo.ncsa.uiuc.edu/cgi/overview.html).
This page also contains a tutorial for using CGI with HTML forms (which
is what you want to do).
A simple explanation of CGI and using it with forms is given at
James Marshall's site (http://www.jmarshall.com/easy/cgi/).
Again, since this is a programming topic, some programming experience
will be helpful.
The best way to learn CGI is by looking at scripts others have
written. You can find links to various public domain CGI scripts and
tutorials in the Google Directory : Computers > Programming > Internet
> CGI > Tutorials ( http://directory.google.com/Top/Computers/Programming/Internet/CGI/Tutorials/?il=1
)
Well, that's it. I hope this answer satisfied your queries. If you need
any clarifications, just ask and I will be glad to help.
:)
RELATED LINKS :
Customize GOOGLE Help Page
Contains information on the various customizable options offered by
Google, and what they do.
( ://www.google.com/help/customize.html )
Unicode Home Page
Contains information on the Unicode character encoding and UTF-8
( http://www.unicode.org/ )
An instantaneous introduction to CGI scripts and HTML forms
A comprehensive introduction provided by the University of Kansas
( http://www.ku.edu/~acs/docs/other/forms-intro.shtml )