Subject: Follow up to Google Meta Query
Category: Computers > Internet
Asked by: pafalafa-ga
List Price: $5.00
Posted: 28 Oct 2002 08:59 PST
Expires: 27 Nov 2002 08:59 PST
Question ID: 91040
This is a follow-up to an earlier question at:

https://answers.google.com/answers/main?cmd=threadview&id=90542

Please refer to the original question before answering this one.

I understand now a bit about how CGI is used to form queries in the URL. Theta-ga, in his answer to my original question, noted that these CGI commands can be custom tailored. He said: "You should understand that the variable names and values used are specific to Google and can be changed by them as and when they like".

I'm sure this is true. However, there also seems to be a set of "commands" used in URLs that are common to a lot of query-based systems. For instance, the "num=X" command is not unique to Google, but is used in a lot of URL queries. On the other hand, the command used in the URL to create this very question is:

[https://answers.google.com/answers/main?cmd=askquestion]

and perhaps the "cmd=askquestion" phrase is something unique to Google, that was specifically created for Google Answers.

What I am looking for is *** a list of the major and/or most common commands and variables that are used in URL queries, along with a bit of explanation of each ***. Since I am not an experienced programmer or web site designer, I need these in pretty much a "plain English" form. For the meager amount of money I'm offering here, I'll settle for a list of ten common commands/variables (other than the ones already covered in my original question), though a longer list is certainly preferable.

By way of background, here's why I'm interested. I had long been frustrated by searches at sites that would only return 10 results at a time -- I guess they were designed for low bandwidth systems, but seemed self-defeating for broadband access. I found that by playing with the "num=X" figure in the URL -- or adding it outright if it didn't exist -- I could often (not always) override the default, and get 100 results listed, rather than 10 (100 seems to be the max -- is this true?).

I like being able to "take control" of results in this fashion, and am wondering what other variables would be worth exploring by which I might custom tailor the results I get.

Looking forward to an interesting answer.
Subject: Re: Follow up to Google Meta Query
Answered By: robertskelton-ga on 29 Oct 2002 16:03 PST
Hi Paf,

Although I am not a programmer, one of my other jobs involves reverse engineering HTML search forms. The following applies to any HTML search form, including those of the main search engines.

Search forms have different types of variables. Most variables are where the user enters some data, or chooses data from a drop-down box, checkbox or radio buttons. There are also hidden fields, which cannot be changed by the visitor using the form. The search button/image values are also data that get sent, although they are almost never used.

The search query data (that you see in the URL of the results pages) begin with a question mark, and are separated by & signs. Each data element consists of the variable name, an equals sign, and the value.

Although many search engines use variables with the same name, like q for query, I get the impression that it's just a consensus standard, rather than the programmers using the same libraries. Just like when programmers test a program that outputs a sentence, they tend to test it with "Hello World".

To reverse engineer an HTML search form, you need to compare the HTML code with the options that the searcher sees on the page. This is how to do it for the Advanced Search at Google:

Google
======

From your browser menu, select View Source (in IE6 it is in View/Source). Look through the code for where the form starts and ends.

    <form method=GET action="/search" name=f>
    </form>

Within these two tags you will find lots of layout code and form elements. The variables look like this:

    <input type=text value="" name=as_q size=25>

These are the variables for the Google Advanced Search.

as_q
"with all of the words": the keyword you enter into the first text box

num
10/20/30/50/100 search results (drop-down box)
Changing a search results URL sometimes allows you to enter other values. With Google you can successfully change it to 37 results, if you want, but any number over 100 defaults to 100.

hl/ie/oe
These are "hidden", with no clues on the visible page, which makes them hard to decipher. Ultimately you can only make an educated guess. I agree with theta-ga's answer on these three variables.

newwindow
This comes up when I do a search, because of my preferences at Google, where I chose to have results appear in a new window. Other preference data could appear; the only way to avoid this is by removing the appropriate cookies from your system.

btnG
    <INPUT type=submit value="Google Search" name=btnG>
This is just the search button. It has a name and value, but doesn't do anything. I have never seen a button name/value affect search results.

as_epq / as_oq / as_eq
These are the "exact phrase", "at least one" and "without" keywords.

lr
Return pages written in X language, selected from the drop-down box

as_ft / as_filetype
The two drop-down boxes regarding file type

as_qdr
Date criteria from drop-down box:
    anytime        = all
    past 3 months  = m3
    past 6 months  = m6
    past year      = y

as_occt
Where the data occurs in the page, e.g. Title

as_dt / as_sitesearch
Only/Don't return results from the site or domain entered

safe
Can be either off or active, to indicate whether the safe search feature is to be used.

Altavista
=========

The variables are from the search form on this page:
http://au.altavista.com/searchtxt

q
Keyword you are searching for.

kl
Language.

what
Because I live in Australia, it asks me to choose between Australian and Worldwide results. This variable (in my case) can be either "au" or "web".

pg / text
These are both hidden. To work out what they do, I fiddled with them in the results URL. By changing "text=yes" to "text=no" I found that it is for selecting a text-only results page (which means no ads or sponsored results!). Changing or removing the pg variable does not appear to make any difference. I have no idea what it is for.
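
The "View Source and look for the form" step described above can also be done by a short script. What follows is only a rough Python sketch, assuming the page source is already saved as a string. It uses the standard library's html.parser to list every named field inside a form. The names FormFieldLister and list_form_fields are made up for illustration, and so is the sample form: only its as_q and btnG lines come from the answer, while the hl value and the num option are invented placeholders.

```python
from html.parser import HTMLParser


class FormFieldLister(HTMLParser):
    """Collect (name, type, value) for every named field found inside a <form>."""

    FIELD_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <INPUT> arrives as "input".
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in self.FIELD_TAGS and attrs.get("name"):
            # Drop-down boxes have no type attribute, so fall back to the tag name.
            field_type = attrs.get("type", tag)
            self.fields.append((attrs["name"], field_type, attrs.get("value") or ""))

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def list_form_fields(html):
    """Return the named form fields in the order they appear in the HTML."""
    parser = FormFieldLister()
    parser.feed(html)
    return parser.fields


# Hypothetical sample: the hl value and the num option are placeholders.
SAMPLE = """
<form method=GET action="/search" name=f>
<input type=text value="" name=as_q size=25>
<input type=hidden name=hl value="en">
<select name=num><option value="10">10 results</option></select>
<INPUT type=submit value="Google Search" name=btnG>
</form>
"""

if __name__ == "__main__":
    for name, field_type, value in list_form_fields(SAMPLE):
        print(name, field_type, repr(value))
```

Hidden fields show up with type "hidden", which is a quick way to spot variables such as hl/ie/oe that give no clue on the visible page. The script only lists the names; working out what each one does still takes the kind of experimenting described above.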

A cool tool
===========

IE Booster is an extra for Internet Explorer, which gives a few new tools which are accessible via a right-click menu. One of them shows all the form elements on a page, and the related HTML code. It sounds perfect for you.

IE Booster
http://www.paessler.com/products/ieb/index.html

Search Strategy
===============

Personal experience

I trust this answers your question. If any portion of my answer is unclear, please ask for clarification.

Best wishes,
robertskelton-ga
pafalafa-ga rated this answer and gave an additional tip of: $5.00

Extremely cool answer, even if I don't know (yet!) what all of it means. But I certainly intend to play around and find out. Thanks so much.
Subject: Re: Follow up to Google Meta Query
From: mooncrickett-ga on 28 Oct 2002 21:51 PST
Look man, here's the deal. I'm a programmer. I don't reinvent the wheel every time I program, but I do sometimes reinvent the variable. In layman's terms, I do make up my own names. So what this means is that just because a program acts the same way doesn't mean that it was written the same way. Your question is not answerable.
Subject: Re: Follow up to Google Meta Query
From: sgtcory-ga on 29 Oct 2002 07:39 PST
Hello pafalafa,

Unfortunately, there is no definitive library. There are common query syntax assignments that we as programmers would use, and by nature of memory, we often re-use such commands. An example would be:

num=100

It's not hard to read this command as 'number' or 'number of results'. Next time I write code, I might pull this out of memory, but not off a list, or to meet any CGI standards.

In contrast, if someone is writing a CGI script that extracts data from a simple program that allows users to enter multiplication problems, the assignment may be something different. Here is a sample:

5 x 6 = 30

The hypothetical CGI in question could translate those numbers into assigned values. In this case we will assign them as follows:

num x num2 = num3

Now our 'num' is something totally different than in the context it was used on Google. You could still manipulate the query syntax, but it has nothing to do with the number of results. It changes the first digit in the equation.

To further explain this point, here is how you would manipulate the number of results at a few other search engines:

Altavista : nbq=              desired number up to 100
Yahoo     : n=                doesn't work anymore
Hotbot    : numresult_field=  unsure

The point is that each engine builds its own proprietary code.

Hope that clarifies a little,
SgtCory
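
To make the point above concrete, here is a rough Python sketch of changing the results count in a search URL when each engine uses its own variable name. It is an illustration only: the table holds the names mentioned in this thread as of 2002, the Hotbot name was flagged as "unsure" above, Yahoo's n= is left out because it was reported as no longer working, and the function name set_result_count and the search.example URLs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Variable names as reported in this thread (2002); any of them may have changed.
RESULT_COUNT_VARIABLE = {
    "google": "num",
    "altavista": "nbq",
    "hotbot": "numresult_field",  # marked "unsure" in the comment above
}


def set_result_count(url, engine, count):
    """Return url with the engine's results-count variable set to count.

    Raises KeyError for an engine that is not in the table.
    """
    name = RESULT_COUNT_VARIABLE[engine]
    parts = urlsplit(url)
    # Split "a=1&b=2" into (name, value) pairs and drop any existing count.
    pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    pairs.append((name, str(count)))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


if __name__ == "__main__":
    print(set_result_count(
        "http://www.google.com/search?q=cgi+variables&num=10", "google", 100))
    print(set_result_count(
        "http://search.example/results?q=test&text=yes", "altavista", 50))
```

The same variable name on a different site (such as 'num' in the multiplication example) would mean something else entirely, which is why the lookup is keyed by engine rather than assuming one name works everywhere.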