Q: Follow up to Google Meta Query (Answered, 5 out of 5 stars, 2 Comments)
Question  
Subject: Follow up to Google Meta Query
Category: Computers > Internet
Asked by: pafalafa-ga
List Price: $5.00
Posted: 28 Oct 2002 08:59 PST
Expires: 27 Nov 2002 08:59 PST
Question ID: 91040
This is a follow-up to an earlier question at:
https://answers.google.com/answers/main?cmd=threadview&id=90542

Please refer to the original question, before answering this one.

I understand now a bit about how CGI is used to form queries in the
URL.  Theta-ga, in his answer to my original question, noted that
these CGI commands can be custom tailored.  He said:  "You should
understand that the variable names and values used are specific to
Google and can be changed by them as and when they like".

I'm sure this is true.  However, there also seems to be a set of
"commands" used in URLs that are common to a lot of query-based
systems.  For instance, the "num=X" command is not unique to Google,
but is used in a lot of URL queries.

On the other hand, the command used in the URL to create this very
question is: [https://answers.google.com/answers/main?cmd=askquestion]
and perhaps the "cmd=askquestion" phrase is something unique to
Google, that was specifically created for Google Answers.

What I am looking for is *** a list of the major and/or most common
commands and variables that are used in URL queries, along with a bit
of explanation of each ***.  Since I am not an experienced programmer
or web site designer, I need these in pretty much a "plain English"
form.

For the meager amount of money I'm offering here, I'll settle for a
list of ten common commands/variables (other than the ones already
covered in my original question) though a longer list is certainly
preferable.

By way of background, here's why I'm interested.  I had long been
frustrated by searches at sites that would only return 10 results at a
time -- I guess they were designed for low bandwidth systems, but
seemed self-defeating for broadband access.  I found that by playing
with the "num=X" figure in the URL -- or adding it outright if it
didn't exist -- I could often (not always) override the default, and
get 100 results listed, rather than 10 (100 seems to be the max -- is
this true?).

I like being able to "take control" of results in this fashion, and am
wondering what other variables would be worth exploring by which I
might custom tailor the results I get.

Looking forward to an interesting answer.

Request for Question Clarification by duncan2-ga on 28 Oct 2002 09:50 PST
Hi pafalafa,

Since any website developer can write their own CGI and arbitrarily
assign names to the variables they want to use in their scripts,
there's not going to be a meaningful list of variables/attribute pairs
that works for all websites.

Would you settle for a listing of variable attributes used in Google
searches?

Clarification of Question by pafalafa-ga on 28 Oct 2002 11:37 PST
I can't quite bring myself to believe that each time a CGI program is
written, the developer reinvents the wheel for commonly occurring
commands and variables.  Do you really mean to say there is no
"library" of common commands for a function such as searching a
database, retrieving results, and presenting them in a desired format?
(Even if a site designer gives his/her own name to the variable, the
function, like "list X number of results" should be pretty well
conserved, n'est-ce pas?) If there really is not such a library, then
I guess an explanation of the Google attributes would be an acceptable
answer (and if you could check out some attributes on, say, AltaVista
and toss those in as well, all the better).

Bottom line is, this is only a five buck question, so I'm not
expecting the world.  Go ahead and give it your best shot!

Clarification of Question by pafalafa-ga on 29 Oct 2002 08:19 PST
Sgtcory, duncan, mooncricket et al,

Thanks for your input so far.  This is all helping me make sense out
of the URL queries.  Since Google and Altavista are my main search
engines, if anyone wants to take a crack at just describing a bunch of
the main commands/variables used at these sites (other than commands
like "num" already discussed) then please go ahead, list 'em, and
collect the fantastic sum of $5 (or at least, your fraction,
thereof...)

Cheers,

paf
Answer  
Subject: Re: Follow up to Google Meta Query
Answered By: robertskelton-ga on 29 Oct 2002 16:03 PST
Rated:5 out of 5 stars
 
Hi Paf,

Although I am not a programmer, one of my other jobs involves reverse
engineering HTML search forms. The following applies to any HTML
search form, including those of the main search engines.

Search forms have different types of variables. Most variables are
where the user enters some data, or chooses data from a drop-down box,
checkbox or radio buttons. There are also hidden fields, which cannot
be changed by the visitor using the form. The search button/image
values are also data that get sent, although they are almost never
used.

The search query data (that you see in the URL of the results pages)
begin with a question mark, and are separated by & signs. Each data
element consists of the variable name, an equals sign, and the value.
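
As a rough illustration of that structure (a sketch, not code from any
search engine), Python's standard urllib.parse module can split a
results URL into its name/value pairs. The URL and its values below are
made up for the demonstration.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical results URL, used only to show the ?name=value&name=value shape.
url = "http://www.google.com/search?q=hello+world&num=100&hl=en"

# Everything after the "?" is the query string.
query_string = urlsplit(url).query

# parse_qsl splits on "&" and "=", and decodes "+" back into a space.
pairs = parse_qsl(query_string)

for name, value in pairs:
    print(f"{name} = {value}")
```

Each printed line is one variable name and the value that was sent for
it, which is the same information you can read off the URL by eye.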

Although many search engines use variables with the same name, like q
for query, I get the impression that it's just a consensus standard,
rather than the programmers using the same libraries. Just like when
programmers test a program that outputs a sentence, they tend to test
it with "Hello World".

To reverse engineer an HTML search form, you need to compare the HTML
code with the options that the searcher sees on the page. This is how
to do it for the Advanced Search at Google:


Google
======

From your browser menu, select View Source (in IE6 it is in
View/Source). Look through the code for where the form starts and
ends.

<form method=GET  action="/search" name=f>

</form>

Within these two tags you will find lots of layout code and form
elements. The variables look like this:

<input type=text value="" name=as_q size=25>
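
If reading the page source by eye gets tedious, the same hunt can be
sketched in a few lines of Python using the standard html.parser
module. This is only an illustration: the class name FormFieldLister is
hypothetical, and the sample HTML is a cut-down imitation of the
snippets quoted above (the select line is an assumption about how a
drop-down would appear), not the real Google page.

```python
from html.parser import HTMLParser


class FormFieldLister(HTMLParser):
    """Collects the named <input> and <select> elements found inside a <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.fields = []  # list of (tag, type, name) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in ("input", "select") and "name" in attrs:
            self.fields.append((tag, attrs.get("type", ""), attrs["name"]))

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


# Cut-down sample in the style of the snippets above, not the real page.
sample_html = """
<form method=GET action="/search" name=f>
<input type=text value="" name=as_q size=25>
<select name=num><option value="10">10 results</select>
<INPUT type=submit value="Google Search" name=btnG>
</form>
"""

lister = FormFieldLister()
lister.feed(sample_html)
for tag, field_type, name in lister.fields:
    print(tag, field_type or "-", name)
```

The names it prints (as_q, num, btnG) are the variable names that end
up in the results URL.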

These are the variables for the Google Advanced Search. 

as_q 
"with all of the words"
the keyword you enter into first text box

num
10/20/30/50/100 search results (dropdown box)
Changing a search results URL sometimes allows you to enter other
values. With Google you can successfully change it to 37 results, if
you want, but any number over 100 defaults to 100.

hl/ie/oe
These are "hidden", with no clues on the visible page, which makes
them hard to decipher. Ultimately you can only make an educated guess.
I agree with theta-ga's answer on these three variables.

newwindow
This comes up when I do a search, because of my preferences at Google,
where I chose to have results appear in a new window. Other preference
data could appear - the only way to avoid this is by removing the
appropriate cookies from your system.

btnG
<INPUT type=submit value="Google Search" name=btnG>
This is just the search button. It has a name and value, but doesn't
do anything. I have never seen a button name/value affect search
results.

as_epq / as_oq / as_eq
These are the "exact phrase", "at least one" and "without" keywords.

lr
Return pages written in X language, selected from the drop-down box

as_ft / as_filetype
The two drop-down boxes regarding file type

as_qdr
Date criteria from drop-down box:

anytime = all
past 3 months = m3
past 6 months = m6
past year = y

as_occt
Where the search terms occur in the page, e.g. in the title

as_dt / as_sitesearch
Only/Don't return results from the site or domain entered

safe
Can be either off or active, to indicate if the safe search feature is
to be used.
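
To show how these variables combine into one URL, here is a minimal
Python sketch. The function name build_google_advanced_url is
hypothetical, the host www.google.com is assumed (the form only gives
the "/search" action), and only a handful of the variables above are
included. The cap at 100 simply mirrors the behaviour described under
"num"; Google applies its own limit regardless.

```python
from urllib.parse import urlencode


def build_google_advanced_url(all_words, exact_phrase="", without="", num=10, safe="off"):
    """Assemble a search URL from some of the form variables listed above."""
    params = {
        "as_q": all_words,       # "with all of the words"
        "as_epq": exact_phrase,  # "exact phrase"
        "as_eq": without,        # "without" keywords
        "num": min(num, 100),    # values over 100 are reported to fall back to 100
        "safe": safe,            # "off" or "active"
    }
    # Leave out any text box the searcher did not fill in.
    params = {name: value for name, value in params.items() if value != ""}
    # urlencode joins the pairs with "&" and turns spaces into "+".
    return "http://www.google.com/search?" + urlencode(params)


example_url = build_google_advanced_url(
    "url query variables", exact_phrase="search form", num=250
)
print(example_url)
```

Pasting a URL built this way into the address bar should be equivalent
to filling in the form and pressing the search button, as long as the
variable names have not been changed since this answer was written.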


Altavista
=========

The variables are from the search form on this page:
http://au.altavista.com/searchtxt

q
Keyword you are searching for.

kl
Language.

what
Because I live in Australia, it asks me to choose between Australian
and Worldwide results. This variable (in my case) can be either "au"
or "web".

pg / text
These are both hidden. To work out what they do, I fiddled with them
in the results URL. By changing "text=yes" to "text=no" I found that
it is for selecting a text-only results page (which means no ads or
sponsored results!)

Changing or removing the pg variable does not appear to make any
difference. I have no idea what it is for.
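
The kind of fiddling described above (changing one value in a results
URL and reloading) can also be sketched in Python. The helper name
set_query_param and the search.example.com URL are hypothetical; only
the variable names q, what and text come from the form discussed here.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def set_query_param(url, name, value):
    """Return url with the query variable `name` set to `value` (added if missing)."""
    parts = urlsplit(url)
    # Keep every existing pair except the one being replaced.
    pairs = [
        (n, v)
        for n, v in parse_qsl(parts.query, keep_blank_values=True)
        if n != name
    ]
    pairs.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# Hypothetical results URL; host and path are placeholders.
results_url = "http://search.example.com/results?q=kangaroo&what=web&text=yes"
text_only_off = set_query_param(results_url, "text", "no")
print(text_only_off)
```

The same helper adds a variable that is not in the URL yet, which is
handy for trying out a guess such as num=100 on a site that does not
show it.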


A cool tool
===========

IE Booster is an add-on for Internet Explorer that provides a few new
tools, accessible via a right-click menu. One of them shows all the
form elements on a page, and the related HTML code. It sounds perfect
for you.

IE Booster
http://www.paessler.com/products/ieb/index.html


Search Strategy
===============

Personal experience


I trust this answers your question. If any portion of my answer is
unclear, please ask for clarification.

Best wishes,
robertskelton-ga
pafalafa-ga rated this answer:5 out of 5 stars and gave an additional tip of: $5.00
Extremely cool answer, even if I don't know (yet!) what all of it
means.  But I certainly intend to play around and find out.  Thanks so
much.

Comments  
Subject: Re: Follow up to Google Meta Query
From: mooncrickett-ga on 28 Oct 2002 21:51 PST
 
Look man, here's the deal: I'm a programmer. I don't reinvent the
wheel every time I program, but I do sometimes reinvent the variable.
In layman's terms, I do make up my own names. What this means is that
just because a program acts the same way doesn't mean that it was
written the same way. Your question is not answerable.
Subject: Re: Follow up to Google Meta Query
From: sgtcory-ga on 29 Oct 2002 07:39 PST
 
Hello pafalafa,

Unfortunately - there is no definitive library. There are common query
syntax assignments that we as programmers would use, and by nature of
memory, we often re-use such commands. An example would be:

num=100

It's not hard to read this command as 'number' or 'number of
results'. Next time I write code, I might pull this out of memory, but
not off a list, or to meet any CGI standards.

In contrast - if someone is writing a CGI script that extracts data
from a simple program that allows users to enter multiplication
problems, the assignment may be something different. Here is a sample:

5 x 6 = 30

The hypothetical CGI in question could translate those numbers into
assigned values. In this case we will assign them as follows:

num x num2 = num3

Now our 'num' is something totally different from what it meant in the
context of Google. You could still manipulate the query syntax, but
it has nothing to do with the number of results. It changes the first
digit in the equation.

To further explain this point - here is how you would manipulate the
number of results at a few other search engines:

Altavista    :    nbq= desired number up to 100

Yahoo        :    n= doesn't work anymore

Hotbot       :    numresult_field= unsure

The point is - each engine builds its own proprietary code.
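
One way to picture this is a small Python lookup table that maps each
engine to its own name for "number of results". The variable names are
the ones listed above as reported in this comment, and some may no
longer work. The function name results_query is hypothetical, and using
"q" as the keyword variable for every engine is an assumption made only
to keep the sketch short.

```python
from urllib.parse import urlencode

# Names as reported in the comment above; not checked against the live sites.
RESULT_COUNT_PARAM = {
    "google": "num",
    "altavista": "nbq",
    "yahoo": "n",
    "hotbot": "numresult_field",
}


def results_query(engine, keywords, count, keyword_param="q"):
    """Build the query-string part of a URL, using the engine's own variable name."""
    count_param = RESULT_COUNT_PARAM[engine]
    return urlencode({keyword_param: keywords, count_param: count})


for engine in RESULT_COUNT_PARAM:
    print(engine, "->", results_query(engine, "url variables", 100))
```

The request is the same every time ("give me 100 results"), but the
name that carries it differs from engine to engine.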

Hope that clarifies a little -
SgtCory
