Q: Retrieving data from URL with PHP (No Answer, 2 Comments)
Question  
Subject: Retrieving data from URL with PHP
Category: Computers > Programming
Asked by: dpslot-ga
List Price: $20.00
Posted: 17 Nov 2005 05:28 PST
Expires: 17 Dec 2005 05:28 PST
Question ID: 594137
I want to retrieve the contents of http://snout.omroep.nl/tekst/501-01.html
with PHP.

In the browser the text shows fine, but when I request it from a PHP script
I keep getting a 404 Not Found message.

I use $data = file_get_contents("http://snout.omroep.nl/tekst/501-01.html");
but that gives the 404.

Can someone with more PHP knowledge give me a working script for this
annoying problem?
Answer  
There is no answer at this time.

Comments  
Subject: Re: Retrieving data from URL with PHP
From: isitaboat-ga on 17 Nov 2005 08:15 PST
 
When I do:
<?php
$data=file_get_contents("http://snout.omroep.nl/tekst/501-01.html");

echo $data;
?>

I get:
Warning: file_get_contents(http://snout.omroep.nl/tekst/501-01.html):
failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in
/home/dev/google.php on line 2

My guess is that they are checking either the referrer (doubtful,
as a direct link works) or the browser type (the User-Agent sent in the
request headers).
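
If the server really is filtering on the browser type, one thing to try before switching libraries is sending a browser-like User-Agent through a stream context. This is a minimal sketch, not a confirmed fix for this site: the helper names fetch_with_user_agent and parse_status_code are made up for illustration, and any User-Agent string passed in is only an example.

```php
<?php
// Hypothetical helper: pull the numeric status out of a line such as
// "HTTP/1.1 403 Forbidden". Returns null if the line does not look like one.
function parse_status_code($statusLine)
{
    if (preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: fetch a URL with file_get_contents() while sending
// a caller-supplied User-Agent header.
function fetch_with_user_agent($url, $userAgent)
{
    $context = stream_context_create(array(
        'http' => array(
            'method'        => 'GET',
            'user_agent'    => $userAgent,
            // Return the body even on 4xx/5xx so the status can be inspected.
            'ignore_errors' => true,
        ),
    ));

    // @ silences the warning raised when the connection itself fails.
    $body = @file_get_contents($url, false, $context);

    // The http wrapper fills $http_response_header in the calling scope.
    // Entry 0 is the status line of the first response; after redirects,
    // later status lines appear further down the array.
    $status = isset($http_response_header[0])
        ? parse_status_code($http_response_header[0])
        : null;

    return array('status' => $status, 'body' => $body);
}
```

Calling fetch_with_user_agent() with the URL from the question and a Firefox-style User-Agent string returns the status code and the body together, which makes it easier to tell a 403 from a 404.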

Have a read of http://uk.php.net/curl - the cURL library. It's a much
better way of getting files (it allows storing cookies, setting
cookies, referrers, etc).
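
The cookie and referrer handling mentioned above maps onto a handful of cURL options. A rough sketch, assuming the cURL extension is loaded; build_curl_options, fetch_with_cookies and the cookie file path are hypothetical names chosen for this example.

```php
<?php
// Hypothetical helper: collect the cURL options for a request that sends a
// referrer and keeps cookies in a file between requests.
function build_curl_options($url, $referer, $cookieFile)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_REFERER        => $referer,    // value of the Referer request header
        CURLOPT_COOKIEFILE     => $cookieFile, // read previously stored cookies from here
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies here when the handle is closed or freed
    );
}

// Hypothetical helper: run one request with the options above.
// Returns the response body, or false if the transfer failed.
function fetch_with_cookies($url, $referer, $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $referer, $cookieFile));
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}
```

Reusing the same cookie file for a second call lets the server see the cookies it set on the first one, which plain file_get_contents() without a context does not do.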

This is working and tested on my server:

<?php
echo url_get('snout.omroep.nl', '/tekst/501-01.html');

// Fetch http://$domain$uri with browser-like request headers.
// Returns the response body, or false if the transfer failed.
function url_get($domain, $uri, $referer = '')
{
	// Extra request headers. cURL builds the request line and the
	// Host header from the URL, so they are not listed here.
	$header = array();
	$header[] = 'Connection: close';
	$header[] = 'Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5';
	$header[] = 'Accept-Language: en-gb,en;q=0.5';
	$header[] = 'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7';
	$header[] = 'User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8) Gecko/20051107 Firefox/1.5 Web-Sniffer/1.0.22';

	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, 'http://'.$domain.$uri);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
	// Only send a Referer header when one was passed in.
	if ($referer !== '') {
		curl_setopt($ch, CURLOPT_REFERER, $referer);
	}
	curl_setopt($ch, CURLOPT_HTTPHEADER, $header);

	$result = array();
	$result['exec'] = curl_exec($ch);
	$result['info'] = curl_getinfo($ch);

	// Release the handle before returning.
	curl_close($ch);

	//use this line to get more info
	//return $result;
	return $result['exec'];
}
?>

I believe they are checking the headers strictly - but the above works
:P. NB: cURL is also faster than the method you were using, and more
flexible and reliable. If it is not installed on your server (it's on
most), contact me for instructions on installation.
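
To check whether cURL is available and to see why a request failed, something along these lines may help. It is only a sketch: summarize_fetch and fetch_checked are hypothetical helper names, and the messages they return are made up for this example.

```php
<?php
// Hypothetical helper: turn the pieces cURL reports into a one-line summary.
// $body  is the return value of curl_exec() (false when the transfer failed),
// $info  is the array returned by curl_getinfo($ch),
// $error is the string returned by curl_error($ch).
function summarize_fetch($body, array $info, $error)
{
    if ($body === false) {
        return 'transfer failed: ' . $error;
    }
    $code = isset($info['http_code']) ? (int) $info['http_code'] : 0;
    if ($code >= 400) {
        return 'HTTP error ' . $code;
    }
    return 'OK (' . strlen($body) . ' bytes)';
}

// Hypothetical helper: fetch a URL and describe the outcome.
function fetch_checked($url)
{
    // curl_init() only exists when the cURL extension is loaded.
    if (!function_exists('curl_init')) {
        return 'cURL extension is not loaded';
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    $summary = summarize_fetch($body, curl_getinfo($ch), curl_error($ch));
    curl_close($ch);
    return $summary;
}
```

Echoing fetch_checked() for the URL in question reports either the missing extension, a transfer error, the HTTP error code, or the size of the body that came back.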

Demo @ http://dev.isitaboat.co.uk/google.php
Subject: Re: Retrieving data from URL with PHP
From: dpslot-ga on 18 Nov 2005 05:27 PST
 
Thanks,

Your code works fine!
