Q: Querying Google from Java program (URL, openStream()) (Answered, rated 4 out of 5 stars, 1 comment)
Question  
Subject: Querying Google from Java program (URL, openStream())
Category: Computers > Programming
Asked by: mrz-ga
List Price: $5.00
Posted: 20 Aug 2002 03:26 PDT
Expires: 19 Sep 2002 03:26 PDT
Question ID: 56470
I am trying to send a query directly to Google from a Java program and
receive the results (without using the Google API).

Here is part of the Java code that I wrote for this purpose.

String urlName = "http://www.google.com/search?q=java";
URL url = new URL(urlName);
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
String line;
while ((line = in.readLine()) != null){  
   System.out.println(line);
}

The problem is that the openStream call above throws a
FileNotFoundException even though the URL
(http://www.google.com/search?q=java) is valid (i.e. a browser
will show the results if I put the URL into its address field).

FYI, I have also tried the openConnection method of the URL class,
but that failed as well.

Is there another way to successfully send the query to Google and retrieve the
answer from a Java program (without using the Google API)?
Answer  
Subject: Re: Querying Google from Java program (URL, openStream())
Answered By: iaint-ga on 20 Aug 2002 07:58 PDT
Rated: 4 out of 5 stars
 
Hi mrz


I strongly suspect that there is nothing wrong with your code; rather
it is Google's policy on robots and automated requests that is causing
you to have problems. If you review the Google Terms & Conditions at
http://www.google.com/terms_of_service.html you will find the
following clause:

"No Automated Querying

"You may not send automated queries of any sort to Google's system
without express permission in advance from Google."

What's more, Google seems to take active steps to enforce this rule.
In particular they seem to check that the User-Agent HTTP header does
not identify one of the well-known software automation tools. For an
anecdotal tale of how readily Google will deny access to robots you
could look at:
http://www.everything2.com/index.pl?node_id=1229892

Of course, there are a number of ways to try to get round this policy,
but as you would still be in breach of Google's T&C, and as Google's
programmers have probably thought of most of them anyway, I'm afraid I
can't list any of them here.
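
To check this kind of diagnosis without sending anything to Google, a
minimal sketch along the following lines might help. It is not a
workaround: it only shows how a refused request looks from Java. The
class and method names (StatusProbe, statusOf, openStreamFails,
startRefusingServer) are made up for illustration, and the local
stand-in server answering 403 is an assumption about how a refusing
server might respond, not something confirmed for Google. It assumes a
JDK that ships the com.sun.net.httpserver package (Java 8 or later for
the lambda).

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class StatusProbe {

    // Returns the HTTP status code for the URL without reading the body.
    public static int statusOf(String urlName) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) new URL(urlName).openConnection();
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Returns true if URL.openStream() fails with an IOException
    // (FileNotFoundException is a subclass of IOException).
    public static boolean openStreamFails(String urlName) {
        try {
            new URL(urlName).openStream().close();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // Hypothetical stand-in for a server that refuses automated clients:
    // every request to /search is answered with 403 and no body.
    public static HttpServer startRefusingServer() throws IOException {
        HttpServer server =
            HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/search", exchange -> {
            exchange.sendResponseHeaders(403, -1); // -1 means no response body
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startRefusingServer();
        try {
            String url = "http://127.0.0.1:"
                + server.getAddress().getPort() + "/search?q=java";
            System.out.println("status = " + statusOf(url));
            System.out.println("openStream failed = " + openStreamFails(url));
        } finally {
            server.stop(0);
        }
    }
}
```

The point of the sketch is that openStream() only reports a failure as
an exception, whereas getResponseCode() exposes the status the server
actually sent, which makes it easier to tell a refusal from a genuinely
missing page.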

Sorry to be the bearer of bad tidings, and I wish you luck in the rest
of your development project.


Regards
iaint-ga


Search strategy:
Visited http://www.google.com/ and clicked around to find the Terms &
Conditions page!
mrz-ga rated this answer: 4 out of 5 stars
Thanks for the answer.
You reminded me of an important point.
I should read the terms of service carefully before writing the program.

Comments  
Subject: Re: Querying Google from Java program (URL, openStream())
From: tne-ga on 26 Aug 2002 19:28 PDT
 
import java.applet.Applet;
import java.net.*;
import java.awt.*;

public class Search extends Applet {
    TextField searchParameter;
    Choice    searchEngine;
    Button    searchButton;

    // initialize the display
    public void init() {
	setBackground(Color.white);
	searchParameter = new TextField(20);
	add(searchParameter);
	searchEngine = new Choice();
	searchEngine.addItem("AltaVista");
	searchEngine.addItem("WebCrawler");
	searchEngine.addItem("Yahoo");
	searchEngine.addItem("AskJeeves");
	searchEngine.addItem("Hotbot");
	searchEngine.addItem("Google");
	searchEngine.select(0);
	add(searchEngine);
	searchButton = new Button("Search");
	add(searchButton);
    }
    
    public boolean action(Event e, Object o) {
	if (e.target.equals(searchButton)) {
	    try {
		sendSearch();
	    }
	    catch (Exception e1) {
		showStatus("Exception caught:" + e1.toString());
	    }
	}
	return true;
    }

    public void sendSearch() throws Exception {
	String searchString = searchParameter.getText();
	if (searchString.equals("")) {
	    showStatus("Must enter a search string");
	    return;
	}
	String url;
	switch (searchEngine.getSelectedIndex()) {
	case 0: url = "http://www.altavista.digital.com/cgi-bin/query?pg=q&what=web&fmt=.&q=";
	    break;
	case 1: url = "http://www.webcrawler.com/cgi-bin/WebQuery?searchText=";
	    break;
	case 2: url = "http://search.yahoo.com/bin/search?p=";
	    break;
	case 3: url = "http://www.askjeeves.com/main/askjeeves.asp?ask=";
	    break;    
	case 4: url = "http://hotbot.lycos.com/?MT=";
	    break; 
	case 5: url = "http://www.google.com/search?q=";
	    break; 	    
	default: showStatus("Invalid search engine selected.");
	    return;
	}
	
	// encode the search data
	url += URLEncoder.encode(searchString);
	
	// launch the search engine
	showStatus("Connecting to search location " + url);
	getAppletContext().showDocument(new URL(url), "_top");
    }
}

I wrote this code years ago; it still works as far as I know.
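
One detail worth noting about the encoding step: the one-argument
URLEncoder.encode(String) has been deprecated since Java 1.4 in favour
of the overload that names the character encoding. A minimal sketch of
the same step with that overload follows; the class and method names
(QueryUrl, buildSearchUrl) are hypothetical, and it only builds the URL
string without contacting any search engine.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryUrl {

    // Appends the URL-encoded search string to a prefix that already
    // ends in "...?q=" (or the equivalent parameter for the engine).
    public static String buildSearchUrl(String prefix, String searchString)
            throws UnsupportedEncodingException {
        return prefix + URLEncoder.encode(searchString, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Spaces become '+', reserved characters become %XX escapes.
        System.out.println(
            buildSearchUrl("http://search.yahoo.com/bin/search?p=", "c++ & java"));
    }
}
```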
