Subject: Archives of 2 news groups (and no, google groups won't work for this)
Category: Computers > Internet
Asked by: lorin-ga
List Price: $10.00
Posted: 01 May 2002 14:56 PDT
Expires: 03 May 2002 21:37 PDT
Question ID: 9015
For a research project, I need the last six months of posts, with headers, from two newsgroups (alt.baldspot and alt.support.diet). I need to be able to download and parse that data using a Perl script (which is why I cannot use Google Groups, as they really don't seem to like scripts). Using Perl, I can access almost any standard news server, but the only ones I have been able to find have, at most, 4 months of data. Therefore, I need either a server with at least 6 months of history or the location of an archive of those particular groups. I am willing to pay a reasonable amount of money to access that data if necessary. | |
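As a rough illustration of the parsing step described above, here is a minimal sketch in Python rather than Perl. It assumes each article has already been downloaded as raw text (headers, a blank line, then the body) and that its Date header carries a numeric timezone offset. The sample article, the function names `parse_article` and `in_window`, and the 183-day window are all hypothetical choices made for this example.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def parse_article(raw):
    """Parse one raw article (headers, blank line, body) into a small dict."""
    msg = message_from_string(raw)
    date_header = msg.get("Date")
    return {
        "message_id": msg.get("Message-ID"),
        "subject": msg.get("Subject"),
        # Newsgroups is a comma-separated list when an article is crossposted.
        "newsgroups": [g.strip() for g in msg.get("Newsgroups", "").split(",") if g.strip()],
        # None when the Date header is absent; malformed dates raise an exception.
        "date": parsedate_to_datetime(date_header) if date_header else None,
    }


def in_window(article, now, days=183):
    """True if the article was posted within `days` days before `now`."""
    posted = article["date"]
    if posted is None:
        return False
    return timedelta(0) <= now - posted <= timedelta(days=days)


# Hypothetical article, made up for illustration only.
SAMPLE = (
    "From: someone@example.invalid\n"
    "Newsgroups: alt.baldspot, alt.support.diet\n"
    "Subject: example post\n"
    "Date: Wed, 01 May 2002 14:56:00 -0700\n"
    "Message-ID: <hypothetical-1@example.invalid>\n"
    "\n"
    "Body text goes here.\n"
)

article = parse_article(SAMPLE)
now = datetime(2002, 5, 3, tzinfo=timezone.utc)
print(article["newsgroups"], in_window(article, now))
```

Passing `now` in explicitly, instead of reading the clock inside `in_window`, keeps the six-month cutoff reproducible when the same archive is re-parsed later.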
There is no answer at this time. |
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: j0e-ga on 01 May 2002 16:25 PDT |
Is it possible for you to start your own news server? The age of the data stored is configurable, based on the storage space you allot to the groups. |
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: missy-ga on 01 May 2002 17:05 PDT |
lorin,

The reason that you've only been able to find about 4 months of data on the news servers you've tried is that Usenet retention time is based on traffic and disk space: articles totalling *several* gigabytes per day flow through the system. Space limitations at various servers dictate that articles only remain on the server for a set period of time (anywhere from a couple of days to 2 or 3 months) before dropping off to make room for new posts. If a group is busy, its retention may be lower due to disk space limitations.

Firing up Gravity to see how far back my own Usenet provider [ news.cis.dfn.de ] went in alt.baldspot, the earliest header I was able to pull was from March. I asked Gravity to give me 2000 headers; it gave me about 1000.

Most commercial servers offer, at most, about 90 days of retention time for text newsgroups. I'm amazed that you found some with 4 months' worth! Several major Usenet providers tout their measly 21-day retention times as "Best Retention Ever!"

There are a couple of providers who claim not to set retention times on their servers, allowing posts to expire only as disk space fills up. Here are two:

[ http://www.meganetnews.com/index.html ] (The FAQ claims no retention limits)
[ http://www.meganetnews.com/index.html ] (Their FAQ makes the same claim)

I'd highly recommend talking to their technical support department to verify the date ranges before subscribing. To the best of my knowledge, the only decent, long-term archive is the one at Google. I do hope one of the two servers above has what you need!

Usenet junkie--> missy-ga |
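The relationship between disk space, traffic, and retention described above can be put as simple arithmetic: retention is roughly the spool size divided by the daily feed volume. The sketch below is only a back-of-the-envelope model; the function name `estimated_retention_days` and both figures are hypothetical, and real servers typically apply per-group or per-hierarchy expiry rules on top of this.

```python
def estimated_retention_days(spool_bytes, daily_feed_bytes):
    """Rough retention estimate: how many days of traffic fit in the spool."""
    if daily_feed_bytes <= 0:
        raise ValueError("daily_feed_bytes must be positive")
    return spool_bytes / daily_feed_bytes


GB = 10**9

# Hypothetical numbers, chosen only to illustrate the arithmetic.
print(estimated_retention_days(spool_bytes=450 * GB, daily_feed_bytes=5 * GB))
```

The same division explains why a busy group ages out faster: for a fixed spool, doubling the daily volume halves the number of days that fit.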
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: lorin-ga on 01 May 2002 20:59 PDT |
To answer the first question, yes I could set up my own news server, but then I would have to wait 6 months for my data :) Missy, |
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: lorin-ga on 01 May 2002 21:03 PDT |
(prev. comment continued...darn enter key) Missy, I am aware of the problem and indeed emailed several of these hosts, but no one had more than the 4 months that I found. I don't think I tried the one you recommended, so I have emailed them just in case. I was hoping someone on Google would have an answer to that problem. Otherwise I will either have to wait for an API for Google Groups or, as j0e suggested, wait 6 months (or at least an additional 2 months for the one server I did find with 4 months) to get my data. This is not the end of the world, but I would prefer not to wait. |
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: evie-ga on 01 May 2002 23:52 PDT |
Have you tried emailing Google and asking if they can provide you with this data? Have you tried posting to those newsgroups to see if anyone has kept such an archive of their own? Just askin' :) -Yvonne |
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: idea-ga on 02 May 2002 00:51 PDT |
Lorin, GigaNews [www.giganews.com] has posts in both groups starting from November 27. That's still not the six months you're looking for, but it's getting closer. -Idea |
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: encounterwithrama-ga on 02 May 2002 16:16 PDT |
Try http://netscan.research.microsoft.com/. It will show information and statistics for both groups, although extracting the information may be difficult. |
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: aquila-ga on 02 May 2002 16:27 PDT |
Supernews has messages dating back to August 2001 for both these groups - about 75,000 messages in total. You can get a 30-day trial at http://www.supernews.com |
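One way to check figures like these before subscribing is the NNTP GROUP command, whose success reply has the form `211 count low high group`: an estimated article count followed by the lowest and highest article numbers the server holds. The sketch below only parses such a reply line; it does not open a connection, and the sample reply and the names `GroupStatus` and `parse_group_response` are made up for illustration. Requesting the headers of the lowest-numbered article would then show the date of the oldest retained post.

```python
from typing import NamedTuple


class GroupStatus(NamedTuple):
    count: int  # estimated number of articles in the group
    low: int    # lowest article number held by the server
    high: int   # highest article number held by the server
    name: str   # newsgroup name echoed back by the server


def parse_group_response(line):
    """Parse an NNTP GROUP success reply: '211 count low high group'."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError(f"unexpected GROUP response: {line!r}")
    return GroupStatus(int(parts[1]), int(parts[2]), int(parts[3]), parts[4])


# Hypothetical reply; the numbers are invented for this example.
status = parse_group_response("211 1000 3000 3999 alt.baldspot")
print(status.low, status.high)
```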
Subject:
Re: Archives of 2 news groups (and no, google groups won't work for this)
From: lorin-ga on 03 May 2002 21:36 PDT |
Supernews seems to have done it for me. Thanks all :) |