Hello,
I am currently looking for sources for news headlines for a project I
am working on involving word frequency. We are currently pulling
great volumes of headlines from services like "newsisfree.com" so the
data we have now is extremely valuable. Unfortunately, the headlines
we have from the past (1988 through 2 weeks ago) are heavily skewed
towards arts and entertainment headlines (e.g., "bone-thugs-and-harmony
take it up a notch"). We are looking for sources, free or priced less
than $750, which would provide this data in total. Sources must be:
a) electronic and retrievable by scripting (e.g., writing a Perl script
to collect headline data). If there is a web interface, this is good.
b) there should be significant volume (100-500 articles per day,
minimum - target is at least a thousand)
c) the collection should be searchable by date. For example, I am
interested in headlines for "January 31, 1988." Certain services, like
lexis/nexis document on demand, have a non-specific date setting ("I am
interested in articles from the last 5 years" is no good).
d) the headlines should be sortable at least into "world, politics,
business"
e) those three categories are our primary interest.
f) the sources should be varied and international. Just having one
newspaper will not cut it for us. We need at least several major
newspapers or news sources.
Some things to note about our needs:
a) we don't need abstracts/bylines/sources (although sources would be
nice)
b) we don't need the text of the articles, or even access to the text
of the article
c) we are compiling a database of headlines for an art project, so
query-based services are a bit expensive for us. The cheaper the
better. If there is a database already compiled that is accessible,
this is the best solution.
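To make the requirements above concrete, here is a minimal sketch of the kind of processing the data needs to support: selecting headlines for one specific date, keeping only the three categories of interest, and counting word frequency. The sample records, the `WANTED` set, and the `word_counts` function are all made up for illustration; they are not from any real service, and a real source would supply the records.

```python
from collections import Counter
from datetime import date
import re

# Hypothetical records: (publication date, category, headline).
# A real source would need to expose at least these three fields.
HEADLINES = [
    (date(1988, 1, 31), "world", "Talks Resume as Border Tension Eases"),
    (date(1988, 1, 31), "business", "Markets Rally as Talks Resume"),
    (date(1988, 1, 31), "arts", "Band Takes It Up a Notch"),
    (date(1988, 2, 1), "politics", "Senate Delays Vote on Budget"),
]

# The three categories named in the requirements.
WANTED = {"world", "politics", "business"}


def word_counts(records, day):
    """Count words in headlines published on `day` in a wanted category."""
    counts = Counter()
    for pub_date, category, headline in records:
        if pub_date == day and category in WANTED:
            # Lowercase and split on anything that is not a letter or apostrophe.
            counts.update(re.findall(r"[a-z']+", headline.lower()))
    return counts


counts = word_counts(HEADLINES, date(1988, 1, 31))
print(counts.most_common(3))
```

Only the date, category, and headline text are used here, which is why abstracts, bylines, and article text are not needed.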
Questions or clarifications?