Okay,
I have come up with the following code. It uses an RSS parsing class
(phpRSS) available from here: http://www.thewebmasters.net/
It does not require any unusual PHP libraries or modules and should
work pretty much as is.
My code assumes the phpRSS class ("class.RSS.php") is in a
subdirectory called "RSS". It also has a caching function which
expects a writable directory called "cache" under the directory from
which the file is executed. This can be disabled with a global
variable. Also, the directory and default cache times are easily
modified.
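As a rough illustration of the caching rule described above, the
sketch below derives a cache file name from the feed URL and decides
whether a cached copy is stale. The helper names cacheFileFor() and
isStale() and the example.com URL are hypothetical and are not part of
the code I posted; the md5 naming and the 600-second default follow
the code further down.

```php
<?php
// Hypothetical helpers illustrating the caching rule; not part of the posted code.

// Build the cache file path: cache directory plus the md5 hash of the feed URL.
function cacheFileFor($feed, $cachedir = "cache/") {
    return $cachedir . md5($feed);
}

// A cached copy is stale once its modification time plus the cache
// lifetime lies in the past.
function isStale($mtime, $now, $cachetime = 600) {
    return ($mtime + $cachetime) < $now;
}

// Assumed example URL, used only to show the resulting file name.
$file = cacheFileFor("http://example.com/feed.xml");
print $file . "\n";
print (isStale(1000, 1700) ? "stale" : "fresh") . "\n";
```

Hashing the URL keeps the file name free of slashes and other
characters that are awkward in a path.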
It can open remote RSS files (via HTTP) or local files.
It should be fairly self-explanatory, but let me know if you run into
problems.
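The remote-versus-local decision mentioned above could also be
sketched with PHP's built-in parse_url(). This is only an alternative
illustration, not what the code below does: splitFeedUrl() is a
hypothetical name, and the example locations are made up.

```php
<?php
// Hypothetical helper: split a feed location into host, port and path,
// or return null when it is not an http:// URL (treated as a local file).
function splitFeedUrl($feed) {
    $parts = parse_url($feed);
    if ($parts === false || !isset($parts["scheme"], $parts["host"])
        || strtolower($parts["scheme"]) !== "http") {
        return null;
    }
    // Default to the root path, and keep any query string with the path.
    $path = isset($parts["path"]) ? $parts["path"] : "/";
    if (isset($parts["query"])) {
        $path .= "?" . $parts["query"];
    }
    return array(
        "host" => $parts["host"],
        "port" => isset($parts["port"]) ? $parts["port"] : 80,
        "path" => $path,
    );
}

// Made-up locations: one remote feed with an explicit port, one local file.
var_dump(splitFeedUrl("http://example.com:8080/news/rss.xml"));
var_dump(splitFeedUrl("feeds/local.xml"));
```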
The CODE:
----
<?php
$cachetime = 600;
$cachedir = "cache/"; // needs trailing slash.
function webFetch($host,$path,$port=80) {
$fp = fsockopen ($host,$port);
if(!$fp) {
die("Could not connect to host.");
}
$header_done = false;
$request = "GET ".$path." HTTP/1.0\r\n";
$request .= "User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)\r\n";
$request .= "Host: ".$host."\r\n";
$request .= "Connection: Close\r\n\r\n";
$content = '';
fputs ($fp, $request);
$line = fgets ($fp, 128);
$header["status"] = $line;
while (!feof($fp)) {
$line = fgets ( $fp, 128 );
if($header_done) {
$content .= $line;
} else {
if($line == "\r\n") {
$header_done=true;
} else {
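// Split on the first ": " only, and skip lines that are not headers.
$data = explode(": ",$line,2);
if (count($data) == 2) {
$header[$data[0]] = trim($data[1]);
}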
}
}
}
fclose ($fp);
return array("header"=>$header,"content"=>$content);
}
function cacheExpired($feed) {
global $cachedir,$cachetime;
$cachefile = md5($feed);
if ((is_readable($cachedir.$cachefile))&&(is_writable($cachedir.$cachefile)))
{
$mtime = filemtime($cachedir.$cachefile);
$now = time();
print "<!-- Mtime: $mtime -->\n";
print "<!-- Expiry: ".($mtime+$cachetime)." -- Time: $now -->\n";
if (($mtime+$cachetime) < $now) {
return TRUE;
} else {
return FALSE;
}
} else {
return TRUE;
}
}
function checkCache($feed) {
global $cachedir,$cachetime;
$cachefile = md5($feed);
if (cacheExpired($feed)) {
return ($feed);
} else {
return $cachedir.$cachefile;
}
}
function refreshCache($feed,$data) {
global $cachedir,$cachetime;
$cachefile = md5($feed);
if (cacheExpired($feed)) {
$fp = @fopen($cachedir.$cachefile,"w");
if (!$fp) {
return FALSE; // Open failed. Cannot write cache.
} else {
fwrite($fp,$data);
fclose($fp);
return TRUE;
}
}
return TRUE;
}
function getFeed($feed) {
global $caching;
if ($caching) {
$get = checkCache($feed);
} else {
$get = $feed;
}
print "<!-- Getting: $get -->\n";
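// preg_match/strpos/explode replace eregi/ereg/split, which were removed in PHP 7.
if (preg_match("#^http://([^/]+)(.+)#i",$get,$reg)) {
if (strpos($reg[1],":") !== false) {
list($host,$port) = explode(":",$reg[1]);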
} else {
$host = $reg[1];
$port = 80;
}
$path = $reg[2];
$out = webFetch($host,$path,$port);
$feeddata = $out["content"];
} else {
// Read from $get so that a valid cache file is used instead of the original feed.
$ftemp = file($get);
$feeddata = implode("",$ftemp);
}
if ($caching) {
if(!refreshCache($feed,$feeddata)) {
//There was an error!
print "\n\nCaching Error\n\n";
}
}
return $feeddata;
}
function processFeed($data) {
include_once("RSS/class.RSS.php"); // include_once allows calling processFeed() more than once
$rss = new RSS($data);
$allItems = $rss->getAllItems();
$itemCount = count($allItems);
for($y=0;$y<$itemCount;$y++)
{
print "<a href=\"".htmlspecialchars($allItems[$y]['LINK'])."\">" .
htmlspecialchars($allItems[$y]['TITLE'])."</a><br>\n";
}
}
$caching = true;
$data = getFeed("http://xml.newsisfree.com/a/a3/a3b0130fb5d59fa647121895617891e4.xml");
processFeed($data);
?>
----
Some lines may have been wrapped when I pasted the code. Wrapped
statements are harmless, but a string literal split across two lines
should be rejoined before running it.
The feed I am using in the example is the BBC News feed available to
me as a personal user of News Is Free.
Regards,
sycophant-ga

Request for Answer Clarification by webjam-ga on 20 Apr 2003 20:58 PDT
Good day sycophant,
Thank you for your contribution.
I tried the code you wrote: I created a file, feed_php.htm, and posted
it at:
http://www.incross.ca/feed_php.htm
It reports "Caching Error".
I created an RSS directory and copied the extracted PHP-RSS files into
it.
1) What am I missing? I just copied the PHP code and pasted it
between:
<body>
..
</body>
2) When I click on the news links, I get "The page cannot be found":
http://www.incross.ca/%22%22.Array[%27LINK%27].%22/%22
3) I still do not know how to customize it to get the news categories
or the news source languages.
4) Could I use it to get the news from Moreover, in another section of
the page?
Therefore, I am still looking for the answer.
Thank you for your help.

Clarification of Answer by sycophant-ga on 21 Apr 2003 04:31 PDT
Hi Webjam,
The Caching Error is mainly there for my own debugging. It appears
when there is no writable directory called "cache". It does not affect
the operation; it may just slow the loading of the page sometimes.
You can remove it by finding the line that says "//There was an
error!" and deleting the print line below it.
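If you would rather keep the cache than silence the message, a check
along these lines may help confirm that the directory is usable. It is
only a sketch: ensureCacheDir() is a hypothetical helper that is not in
the code I posted, and the temp-directory path is an assumption chosen
so the example runs anywhere.

```php
<?php
// Hypothetical helper: make sure the cache directory exists and is writable.
// Returns true when the cache can be used, false otherwise.
function ensureCacheDir($cachedir) {
    if (!is_dir($cachedir)) {
        // Try to create it; @ suppresses the warning if creation fails.
        if (!@mkdir($cachedir, 0755, true)) {
            return false;
        }
    }
    return is_writable($cachedir);
}

// Assumption: a scratch location under the system temp directory is used
// here; the posted code expects "cache/" next to the script instead.
$dir = sys_get_temp_dir() . "/rss-cache-demo/";
print ensureCacheDir($dir) ? "cache usable\n" : "cache not writable\n";
```

On shared hosting the web server often runs as a different user, so
the directory may need its permissions changed by hand even when it
already exists.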
I am not sure about the link problem you are describing. I checked
your example page, and it worked fine.
To customize it, you just need to change the link that is provided to
getFeed(). You can get the RSS links from NewsIsFree, by browsing the
feed categories and getting the unique RSS URL for the ones you want
to use. For information about that, see here:
http://www.newsisfree.com/syndicate.php -- Basically, go to the feed
page, click on the blue RSS link and then generate a URL for RSS 0.92
with no descriptions (all that's available to free users).
I know that Moreover used to offer RSS/RDF newsfeeds, however I have
not been able to find them on the current site, so I do not know if
they are still available.
Basically, it should work with any RSS 0.92 compliant newsfeed. Try
the following third-party ones, for example:
http://xml.metafilter.com/rss.xml
http://www.bbc.co.uk/syndication/feeds/news/ukfs_news/world/rss091.xml
http://sitereview.org/rss/
You could include the functions in a page, and then use getFeed() to
call different feeds in different parts of the page.
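As a minimal sketch of that idea, the helper below turns a list of
items into links so that each section of a page can print its own
feed. renderItems() and the sample data are hypothetical; the 'LINK'
and 'TITLE' keys mirror what processFeed() reads in the code above,
and in a real page the arrays would come from the parsed feeds rather
than being written by hand.

```php
<?php
// Hypothetical helper: turn a list of items (each with 'LINK' and 'TITLE'
// keys, as used by processFeed() in the answer) into HTML links.
function renderItems($items, $limit = 5) {
    $html = "";
    // Show at most $limit items per section.
    foreach (array_slice($items, 0, $limit) as $item) {
        $html .= '<a href="' . htmlspecialchars($item['LINK'], ENT_QUOTES) . '">'
               . htmlspecialchars($item['TITLE'], ENT_QUOTES) . "</a><br>\n";
    }
    return $html;
}

// Made-up sample data standing in for two parsed feeds.
$world = array(array('LINK' => 'http://example.com/1', 'TITLE' => 'World & politics'));
$tech  = array(array('LINK' => 'http://example.com/2', 'TITLE' => 'Tech news'));

print "<h2>World</h2>\n" . renderItems($world);
print "<h2>Technology</h2>\n" . renderItems($tech);
```

Returning the HTML as a string, instead of printing inside the helper,
makes it easier to place each feed wherever it is needed in the page.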
Hope this helps.
Regards,
sycophant-ga