Q: Trends In Word Frequency In News Sources ( Answered,   0 Comments )
Subject: Trends In Word Frequency In News Sources
Category: Reference, Education and News
Asked by: norwegian-ga
List Price: $40.00
Posted: 03 Oct 2006 20:14 PDT
Expires: 02 Nov 2006 19:14 PST
Question ID: 770639
I am trying to find how often certain words appear in combination in
the same article, and how that changes over time.  How often have the
words "obesity" and "soda", for instance, appeared together in the news
media, and how has that changed over time?  Or "energy" and "crisis"?

I was hoping to use Google News or something similar (so as to limit the
search to news sources) and was hoping to see the results in something
similar to what "Google Trends" does for the number of times a search
term has been used.  It would also be useful to see the actual number of
occurrences, and even better if I could download it into Excel so I
could use it in presentations.

Are there any websites like this out there, or any web-crawler
software that would do this for me?

Request for Question Clarification by pafalafa-ga on 05 Oct 2006 05:19 PDT

I can think of a few possibilities here, but none of them quite as
automated as Google Trends.  You're going to have to do a fair amount
of keyboard work to get the data you're looking for.

First off, I'd steer clear of Google News as a source.  They are just
beginning to build up content for prior years, and I don't think you
can realistically use the service to compare, say, 2000 to 2006, since
the more recent years in their system are much more complete than the
earlier years.

Your absolute best way to do this would be to make use of news source
content providers, like Factiva, or Lexis-Nexis.  With either of
these, you can get these sorts of results (which I pulled from Factiva):

A search on [ obesity w/5 soda ], that is, the word 'obesity'
appearing within 5 words of the word 'soda' shows the following:

2006 (thus far):  77

2005:  42

2004:  32

and so on.  Factiva can take you back at least to the 1980s.

Similarly, a search on the exact phrase "energy crisis" shows:

2006:  7,540

2005:  8,629

2004:  8,118

If you have access to either of these, they would be your best source,
rather than a web-based (i.e. non-subscription) service.
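
If you ever end up with article text on your own machine rather than
inside a subscription database, the proximity idea behind
[ obesity w/5 soda ] can be roughly sketched in Python.  This is only an
illustration under my own assumptions: the function names are made up,
the match is allowed in either word order, and the articles are assumed
to be available as (year, text) pairs.  The databases themselves may
define "within 5 words" somewhat differently.

```python
import re


def within_n_words(text, first, second, n=5):
    """Return True if `first` and `second` occur within n words of each other.

    Assumption: either order counts, and words are split on anything that
    is not a letter, digit, or apostrophe.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first]
    second_pos = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= n for i in first_pos for j in second_pos)


def count_by_year(articles, first, second, n=5):
    """Count matching articles per year from an iterable of (year, text) pairs."""
    counts = {}
    for year, text in articles:
        if within_n_words(text, first, second, n):
            counts[year] = counts.get(year, 0) + 1
    return counts
```

Each article is counted at most once per year, which matches the way a
database reports the number of matching articles rather than the number
of matching word pairs.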

Let me know your thoughts on this.


Clarification of Question by norwegian-ga on 14 Oct 2006 20:56 PDT
Thanks pafalafa,

I was afraid that was the answer.  Have been searching around for a while.

The research library has a database called "EBSCO Host - Business
Source Corporate", but unfortunately not Factiva or Lexis-Nexis.  I am
able to search through a tremendous amount of sources on it, but the
return format is not very useful for my project as it is very focused
on links/articles as opposed to statistics on number of finds, dates

I might be able to get someone to pay for one of these databases. 
Would you recommend Factiva over Lexis-Nexis?  From their web page, it
seems like Factiva can provide a date/occurrence over time overview. 
I assume that what you used for your search was

I appreciate your help with this.  It is too bad that there is no web
tool like this, as more discussions are moving from newspaper op-eds
to blogs every day.

Request for Question Clarification by pafalafa-ga on 15 Oct 2006 06:13 PDT

I hope I didn't misstate my earlier remarks.

None of the databases will crank out a neat list of statistics for
you.  Instead, you need to conduct your search for a specific year,
see how many results you get, then repeat the search for a different
year.  And so on, for all the years of interest to you.

You can do this in Ebsco Business as easily as elsewhere.  For
instance, for the term [energy crisis] there are:

529 hits in 2005

548 hits in 2004

497 in 2003

and so on.
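
Once the yearly hit counts have been copied down by hand, a short Python
script can put them into a CSV file that Excel opens directly.  The
sketch below uses the three Ebsco figures quoted above; the file name,
the column names, and the year-over-year percentage column are my own
choices for illustration, not anything the database produces.

```python
import csv

# Hit counts for [energy crisis], copied by hand from the searches above.
hits = {2003: 497, 2004: 548, 2005: 529}


def yearly_rows(hits):
    """Return (year, hits, pct_change) tuples in chronological order."""
    rows = []
    previous = None
    for year in sorted(hits):
        count = hits[year]
        # No change figure for the first year or when the prior count is zero.
        change = round(100 * (count - previous) / previous, 1) if previous else None
        rows.append((year, count, change))
        previous = count
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["year", "hits", "pct_change"])
        for year, count, change in rows:
            writer.writerow([year, count, "" if change is None else change])


if __name__ == "__main__":
    write_csv(yearly_rows(hits), "energy_crisis_hits.csv")
```

Adding another year is just a matter of running the search again and
adding one more entry to the hits table.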

Should I post the full details for creating these data in Ebsco as an
answer to your question?

Alternatively, you could always post the terms of interest to you as a
question at Google Answers (and the years of interest), and we could
run the searches for you, and post the statistical results as an
answer.
Lastly, I was a bit surprised by your final comment about op-eds vs
blogs, as I understood from your original question that you
specifically wanted to limit your searches to mainstream news sources.

Let me know your thoughts on all this at this point.



Clarification of Question by norwegian-ga on 27 Oct 2006 11:15 PDT

Thanks.  No, I don't think I misunderstood.  I do understand that I
have to repeat the search manually.  Your comments have been
very useful, however, as I had no familiarity with Factiva, which seems
like a tool I can use for several other parts of my work.  Ebsco
Business is a tool we have in the research library, but it is seldom
used.  If you compare the number of answers your search returned on
each of the databases it might indicate why people often have not
found what they are looking for on Ebsco.  It is possible that we
might change over to Factiva as it might be more suitable for some of
the projects we are working on.

My reference to was simply to the upper right hand graph
on the following page>

Oh, and the reference to blogs was simply a more forward-looking
comment.  As blogs don't really have much of a history, they do not
help me much in my research.  However, in a couple of years, they will
have a few years of history and be worth researching.



Request for Question Clarification by pafalafa-ga on 27 Oct 2006 12:20 PDT

Thanks for the update.

I'm not sure, at this point, whether you are in need of any additional
information or not?

I haven't yet posted an answer to your question, so you have not been
charged.  But if you're still looking for some information on this,
I'll be glad to try and pull it together, and post an answer.

Let me know what you think.


Clarification of Question by norwegian-ga on 27 Oct 2006 13:17 PDT

Thanks for all you have gathered for me.  I don't think I need anything further.

I'm not sure how we move forward from here, but I'm very happy with
what you have provided me.  Do you post it as an answer so I can pay

Subject: Re: Trends In Word Frequency In News Sources
Answered By: pafalafa-ga on 27 Oct 2006 13:23 PDT


You don't need to do anything.  By my posting here, your question is
now officially answered.

As you begin to play around with Factiva and other sources, feel free
to post an update here if you'd like any more advice as to how to make
best use of them for the type of word frequency work you are doing.


There are no comments at this time.
