Q: How to automatically have the contents of a website transfer into a database. ( No Answer,   1 Comment )
Question  
Subject: How to automatically have the contents of a website transfer into a database.
Category: Computers > Programming
Asked by: philadel-ga
List Price: $2.00
Posted: 03 Dec 2005 09:09 PST
Expires: 02 Jan 2006 09:09 PST
Question ID: 600910
There are several websites that I go to several times per hour. They
contain information such as sports statistics, weather information,
and traffic information. One of the sites is plain text, one is basic
clean HTML, and none have complex Flash or Java; all the information
is text on the pages. If you want to check the sites out, email me and
I will send you links so you can get a feel for what the data looks
like.

What I am looking for is a website that I host, with a database (SQL
or MySQL, whichever is easiest), that pulls all of this content into
one "live" page. I could either host this page, or maybe that is not
necessary and I can do this locally in an app on my computer; I do not
know. The Google.com/IG homepage is similar to what I am looking for,
but under customization you cannot specify exact URLs; you can only
search topics that Google has added to the possible feeds. These sites
do not offer RSS or any live feeds, but they update as often as every
10-30 minutes. If I could enter a URL on a form and have it give me
all of this information at a set interval, this would be very useful
for me. How can I create an application, or find the code, to do this
on a website that I have hosting access to?
Answer  
There is no answer at this time.

Comments  
Subject: Re: How to automatically have the contents of a website transfer into a database
From: larkas-ga on 04 Dec 2005 15:37 PST
 
First, make sure you carefully read the legal terms of the sites you
are screen scraping. Often, there are legal restrictions on using
their information on your own pages.

With the disclaimer out of the way, I would suggest that you use Java
and XQuery. This IBM article should get you started:

http://www-128.ibm.com/developerworks/java/library/j-jtp03225.html?ca=dgr-lnxw16XQuery
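
As a rough illustration of the extraction step, here is a minimal
sketch in Python rather than the Java and XQuery approach recommended
above; it is a stand-in for the idea, not the article's method. It
uses only the standard library's html.parser to pull the visible text
out of a page. The names TextExtractor and extract_text are
hypothetical, and the sketch assumes the pages are simple HTML with
the data in plain text, as the question describes.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.chunks)
```

For real pages you would likely narrow this to the specific table or
element that holds the statistics, which is the part a query language
such as XQuery is meant to handle.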

Although it discusses creating a new page from the results, you could
just as easily dump the data into a database, running the Java program
as a cron job (e.g. every minute or so). Otherwise, you could turn
the code into a servlet that fetches the content on the fly, but
this would be inappropriate if your site has many users, since every
page view would trigger a fresh fetch from the source sites.
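
The "dump the data into a database on a schedule" idea can be sketched
as follows. This is a hedged example in Python with SQLite, swapped in
for the Java program and the SQL/MySQL database discussed in this
thread so that it runs with the standard library alone. The table name
snapshots, the file name snapshots.db, and the functions open_store,
save_snapshot, fetch and main are all assumptions made for
illustration.

```python
import sqlite3
import sys
import urllib.request
from datetime import datetime, timezone


def open_store(path):
    """Open (or create) the SQLite file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT NOT NULL,"
        " fetched_at TEXT NOT NULL,"
        " content TEXT NOT NULL)"
    )
    return conn


def save_snapshot(conn, url, content):
    """Store content for url unless it matches the latest stored copy.

    Returns True if a new row was written, False if nothing changed.
    """
    row = conn.execute(
        "SELECT content FROM snapshots WHERE url = ? ORDER BY id DESC LIMIT 1",
        (url,),
    ).fetchone()
    if row is not None and row[0] == content:
        return False
    conn.execute(
        "INSERT INTO snapshots (url, fetched_at, content) VALUES (?, ?, ?)",
        (url, datetime.now(timezone.utc).isoformat(), content),
    )
    conn.commit()
    return True


def fetch(url, timeout=30):
    """Download a page and decode it using the charset the server declares."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def main(urls, db_path="snapshots.db"):
    """Fetch every URL once and record any page whose content changed."""
    conn = open_store(db_path)
    try:
        for url in urls:
            save_snapshot(conn, url, fetch(url))
    finally:
        conn.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])  # URLs are passed on the command line
```

Saved under a hypothetical name such as scrape.py, this could be run
from cron with the URLs as arguments. Since the sites in question
update every 10-30 minutes, an interval of about ten minutes is
probably enough and is gentler on them than polling every minute. The
"live" page would then read the newest row per URL from the table
instead of contacting the source sites itself.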

An alternative choice is to use MonkeyGrease, the server-side
equivalent of Greasemonkey, for the job, using XMLHttpRequest to
get the data and the DOM to reduce it to the data you want and insert
it into your page. See:

http://monkeygrease.org/
