Q: Program to collect (several different) discussion groups' recent posts ( Answered 5 out of 5 stars,   0 Comments )
Question  
Subject: Program to collect (several different) discussion groups' recent posts
Category: Computers > Programming
Asked by: jude1-ga
List Price: $75.00
Posted: 06 Sep 2003 17:36 PDT
Expires: 06 Oct 2003 17:36 PDT
Question ID: 253038
I am looking to have a sort of "Talk Soup" (the show on E!) sort of
page on my site, but instead of recapping the highlights of talk
shows, I would like to recap the daily highlights of several different
discussion groups, on several different forums. (I.E.: A few groups
are on Yahoo, some on Delphi, some on EZBoard, etc.). I want my
visitors to be able to get the lowdown on all the discussion groups
without having to sign up for all of them or visit many different
sites. My questions are:
1. Is this possible?
2. Is this legal?
3. How can I do this? (I.E.: Is there an existing program that I can
use to do this, if not, then how?)
**If this is not possible, or not legal, please do not answer, as it
will simply be a waste of $75.00. I hope this is acceptable.**
Thanks, Jude1
Answer  
Subject: Re: Program to collect (several different) discussion groups' recent posts
Answered By: webadept-ga on 07 Sep 2003 00:02 PDT
Rated:5 out of 5 stars
 
Hi, 

First of all, it is legal, provided you get permission from the groups
to do so. That's easy: send a message to the group owner(s), get
their permission, and have them announce in the group and add to the
group's FAQ that this is being done. It is likely that they
will accept it as 1) it gives their group more attraction from the
outside, and 2) the users of the groups themselves may enjoy having
such a "Highlights" page they can access.

Second, it is definitely possible. In fact, it is done quite a bit. The
main method of doing this is using a script written in Perl. There is
a Mod (module) in Perl called LWP, which is designed for exactly this
task.

I checked the Yahoo groups and they are not in secure (HTTPS) mode, nor
does it look like any of the others are either, but the login program
that gets your username and password and passes back the cookie
information we need to access the Group pages is in a secure area.

https://login.yahoo.com/config/login

To see this, go to this page and look at the source code:
http://login.yahoo.com/config/login?.intl=us&.src=ygrp&.done=http://groups.yahoo.com%2F


This is not really a problem for LWP, but you need another Perl Mod
called Net::SSLeay to do so.
http://search.cpan.org/author/SAMPO/Net_SSLeay.pm-1.25/SSLeay.pm

But we also need to be running our Perl script on a Linux box, or a
Unix type box which has OpenSSL installed with the Developer
libraries. As far as I know, Net::SSLeay is not available for a
Windows installation.

Also, in the JavaScript area you will see that the user name is made
into an MD5 string. MD5 is a hash function: it takes your string (in
this case your email address) and produces a fixed-length digest from
it, which is always 32 hexadecimal characters (128 bits) long. The
reason they are doing this is that this sign-in page is not encrypted,
so they are hashing the information before sending it to the login
program which is in the background. That is not a worry either; we just
install the MD5 mod and hash our email address before sending it
with LWP.
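
As a rough illustration of the hashing step, here is a minimal sketch
in Python (used here only for illustration, in place of the Perl MD5
mod). The address is a made-up placeholder, and exactly which string
the Yahoo login script hashes is an assumption that would need to be
checked against the page source.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a lowercase hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# "someone@example.com" is a placeholder, not a real account.
digest = md5_hex("someone@example.com")
print(len(digest))  # an MD5 hex digest is always 32 characters long
```

Whatever the input, the digest has the same length, which is why the
login form can send it in a fixed-size field.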

Pages on LWP and how to use it can be found here :

LWP Page on CPAN
http://search.cpan.org/dist/libwww-perl/lib/LWP.pm

lwpcook 
http://www.xav.com/perl/site/lib/lwpcook.html

LWP CPAN Download Page
http://www.cpan.org/modules/by-module/LWP/

Page on Net::SSLeay
http://search.cpan.org/author/SAMPO/Net_SSLeay.pm-1.25/SSLeay.pm

Page on MD5 Mod
http://search.cpan.org/author/GAAS/MD5-2.02/MD5.pm


A Mod, or Module, is a set of code that is common, has
a large interest base and is set up to perform a basic task that is
done over and over, to save you and me time when we come to that task
ourselves. Just out of curiosity, and knowing that someone had to have
done this before, I looked for Mods dealing with the Yahoo groups, and
found quite a few.

http://search.cpan.org/search?query=WWW::Yahoo::Groups&mode=module

There is another, new Mod, WWW-Mechanize, which allows your code to
act exactly as a web-browser would. Follows links, pushes buttons,
saves pages, has a back-button object. The whole deal. If you can see
it with your web browser, Mechanize can get it and parse it for you.
The WWW::Yahoo::Groups Mod is built on it, which is what reminded me
of its existence.

So, Yes, very possible and being done by others as we speak. Just need
the right tools for the job, and a system that can use those tools.
Specifically we need a Linux box with RedHat or something like it
installed. OpenSSL installed for those areas that we need to address
the secure pages, and the Perl with Mods installed so we can use the
code. Then, we write the code, and get our information, parse it out
to your Talk Soup page. Since we have a Linux box, we can then setup a
timed job to get the pages, parse them and put them on your website,
once an hour, once a day or once a week, depending on your needs.

I don't personally have any experience with Mac OS X, but I am
told that OpenSSL and all the Perl Mods can be installed on it, and
used just as they are used under any other Unix-like system. I can
check into that more if you have that option available to you; just
ask for Clarification on your question.

Thanks, 

webadept-ga

Request for Answer Clarification by jude1-ga on 07 Sep 2003 03:09 PDT
Dear Webadept,
Thank you for the quick answer. I must admit, however, that it was
mostly "Greek to me," meaning that I do not understand it. While I do
run a website, I use Frontpage to create & maintain it and so I've
never used Perl. Is it possible to reword your answer in layman's
terms for the non-programmer? I understand that yes, it is possible,
but beyond that...
I have a Windows XP home computer. Since you mentioned that it cannot
be done on this operating system, can I have someone else write the
code and implement it? (I am thinking of Elance, eBay's freelance
service.) What do you think? Thank you - Jude1

Clarification of Answer by webadept-ga on 07 Sep 2003 10:52 PDT
Hi again, 

There is another way of doing this, which can run on Windows, but I
need some more information from you. First, do you have the ability to
set up an email address which can be used only for those groups? Each
of them appears to have the option of sending you new posts via email.
It would be best if you could have a single-purpose email address for
each group; in other words, the Yahoo group sends its messages to one
address, while the Delphi group sends its messages to a different
email address.

If you have this ability, to have a separate email address for each
group, then you could have a program created in Perl on Windows, or VB,
or just about anything really, to access those POP mail accounts,
download the new messages and sort them out for you into HTML files.
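
To give a feel for the "access those POP mail accounts" step, here is
a minimal sketch in Python using its standard poplib and email
modules. The host, user name and password are whatever your mail
provider gives you; the function names are made up for illustration,
and the use of an SSL connection is an assumption about the server.

```python
import poplib
from email import message_from_bytes
from email.message import Message

def parse_raw_message(lines: list[bytes]) -> Message:
    """Rebuild one message from the list of raw lines that POP3 returns."""
    return message_from_bytes(b"\r\n".join(lines))

def fetch_new_messages(host: str, user: str, password: str) -> list[Message]:
    """Download every message waiting in one POP3 mailbox."""
    box = poplib.POP3_SSL(host)  # assumption: use poplib.POP3 if no SSL
    try:
        box.user(user)
        box.pass_(password)
        count, _size = box.stat()  # number of messages, total size
        messages = []
        for number in range(1, count + 1):
            _resp, lines, _octets = box.retr(number)
            messages.append(parse_raw_message(lines))
        return messages
    finally:
        box.quit()
```

Each returned message exposes its headers by name, for example
message["Subject"], which is what the sorting step would work from.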

These email accounts would be only accessed by your new program. You
wouldn't want to access them with Outlook or anything like that,
unless the new program used Outlook's VBA to get and sort the
messages. I do very little VB programming with Outlook, so I don't
know if it would allow you to build HTML pages or not. It should,... ?

Second, if this method is not feasible, then does your ISP run a
Unix/Linux system with SSL installed? And do they have Perl or PHP
installed as well. A phone call or email to them will find that out
for you. Check to see if they have CGI perl, with LWP and Net::SSLeay
installed. The program could then be created on your ISP server as a
CGI script and run from there.

The last avenue I can think of right now is to use Eudora as the mail
client, instead of the Microsoft clients. This wouldn't be much of a
hassle since all the email addresses should only be receiving mail for
the Groups you are scanning. The reason for Eudora is it has a
fantastic filtering system, and, has the ability to call a program
outside of itself and send the message to it, if that message meets
the filter requirements. I use this for clients all the time.

For instance, let's say you are a client, and I want to make sure I see
your message no matter where I am in this big world. I set up the
Eudora filter to see your email address, and then send that message to
a special folder and send it to an outside program. This outside
program sends the message via text to my Pager. Very cool stuff that.

Using this system you can gather and sort the messages rather easily,
and all you would need is a program which formats the message into
HTML from that point. Again, this outside program could be created in
Perl for Windows, PHP, Python, Visual Basic, or just about anything. No
system requirements would restrict the development, since the program
is not, by itself, trying to access anything that uses encryption.
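
A formatting program of that kind can be quite small. The following is
a minimal Python sketch, assuming the mail client hands the program
the full text of one message (headers plus body); how Eudora actually
passes the message along is not covered here, and the HTML class names
are invented for the example.

```python
import html
from email import message_from_string

def message_to_html(raw: str) -> str:
    """Turn the raw text of one email into a small HTML fragment."""
    msg = message_from_string(raw)
    subject = html.escape(msg.get("Subject", "(no subject)"))
    sender = html.escape(msg.get("From", "(unknown sender)"))
    body = msg.get_payload()
    if not isinstance(body, str):  # multipart mail: out of scope here
        body = "(multipart message omitted)"
    return (
        '<div class="post">\n'
        f"  <h3>{subject}</h3>\n"
        f'  <p class="from">{sender}</p>\n'
        f"  <pre>{html.escape(body)}</pre>\n"
        "</div>\n"
    )
```

Escaping the subject, sender and body keeps stray "<" or "&"
characters in a post from breaking the page.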

Installing Eudora takes out the need for the separate email addresses,
since we can now call separate handling programs, depending on which
group the message was from, so we only need one new email address
where all our group messages are sent.
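
The same "which group is this from" decision could also be made inside
a single handling program rather than in the mail filter. Here is a
minimal Python sketch of that alternative. The domain names in the
table are assumptions and would have to be checked against the
addresses that appear in real messages from each service.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical mapping from a mail domain to a service name.
SERVICE_DOMAINS = {
    "yahoogroups.com": "yahoo",
    "ezboard.com": "ezboard",
    "topica.com": "topica",
}

def identify_service(raw: str) -> str:
    """Guess which service sent a message from its address headers."""
    msg = message_from_string(raw)
    for header in ("To", "From", "Reply-To"):
        _name, address = parseaddr(msg.get(header, ""))
        domain = address.rpartition("@")[2].lower()
        for known, service in SERVICE_DOMAINS.items():
            if domain == known or domain.endswith("." + known):
                return service
    return "unknown"
```

Messages that come back as "unknown" can simply be skipped or set
aside for a manual look.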

Eudora EMail
http://www.eudora.com/

Let me know which of these are available to you and I'll look around
for someone that is doing this. I have a suspicion that all of these
have been done already by someone.

For example :

http://www.rentacoder.com/RentACoder/misc/BidRequests/ShowBidRequest.asp?lngBidRequestId=78925

and 
http://sourceforge.net/projects/grabyahoogroup/


thanks, 

webadept-ga

Request for Answer Clarification by jude1-ga on 07 Sep 2003 17:59 PDT
Webadept:
Thanks for the translation. It sounds like Eudora would be the best,
if I understand correctly. This can be done on my Windows XP home
machine and it doesn't matter if my ISP has Unix/Linux with SSL,
right? Plus I need only one email address. I just now have to find
someone to write the program? (And make it easy enough for a
non-programmer to understand & run.) If you do know anyone, I would
appreciate a recommendation very much (and if you don't mind, your
opinion on what would be a fair price to pay for the program). -Jude1

Clarification of Answer by webadept-ga on 08 Sep 2003 00:40 PDT
Hello again, 

Eudora is a good option for this, both in function and in your
understanding of how to set it up. I'll post a list of sites that have
help files on the Eudora filters here in a while.

To help you with pricing I would need to know a few things. First, the
exact number of groups you are trying to capture, and second (and more
importantly) what it is that defines a "highlight" from those
groups. This can be a little tricky.

One option I can think of off the top of my head is listing out just
the first and second level of the messages, meaning the first post
with the first reply.

Another would be a display of the postings with check boxes on your
program where you physically tell the program to put the ones you
choose on your webpage for display.

The first is an automated switch, and may or may not show real
"highlights", the second is more time consuming on your end and would
require more programming in the user interface area, thus lifting the
cost. User interfaces are always the most time consuming part of
programming. Despite the added cost and time, this second option gives
your page the highest percentage of getting real "highlights".

If you can describe to me what you think determines a highlight, I can
help you with the programmer and the pricing. Not to toot our horn too
much around here, there are more than a few experienced programmers in
the ranks of the Google Researchers. So you may be able to just post a
new question with a bid and have your program created for you right
here. There are other options as well and I'll post those once I have
an idea of what it is you are going to require.

Just as a note, my schedule is a bit cramped on Monday, so I may not
get back to you until late in the evening, once you reply to this.

thanks, 

webadept-ga

Request for Answer Clarification by jude1-ga on 08 Sep 2003 17:39 PDT
Webadept,
I am looking at about 15 - 20 groups. The "highlights" are basically
what's been said that day, exactly as you described in your first (and
thankfully easier!) solution. Please don't waste time on finding the
Eudora info, as I am thinking that I won't really know what to do with
it and I have already decided to bid it out. I have a few more
questions, but at this point, I feel I need to add more money to the
question, seeing as all the work you've done. Is it possible to do
this, and if not, to link two questions together? Thanks, -Jude1

Clarification of Answer by webadept-ga on 08 Sep 2003 21:56 PDT
Yes, you can start another question and request a researcher by name.
It is done, though it is not common. Another option is the tip option,
with which you can add quite a bit to the payment, but giving a tip
closes the question, so it would probably be better just to open
another question.

I would suggest at least having a passing knowledge of how the Eudora
mail program does filters, so it will be easy to understand what your
programmer is talking about. Here are some links; you can read them or
not. At least you will have them for future reference.


Here is Eudora's own tutorial
http://www.eudora.com/techsupport/tutorials/win_filters.html

This one is more involved and shows many of the abilities
http://www.cecilw.com/eudora/filters.htm

15-20 doesn't tell me much. I'm sorry for being vague in my request.
What I really need to know is how many per service. See, setting up a
ripper for the incoming email and creating the HTML you want for your
page is going to change very little between groups on the same
service, and may change a great deal between different services.

Having the first option puts the cost for that setup below a couple
hours of work. The messages will come in, the program will record the
subject line of the thread in a file, minus all the RE's and FW's, and
then HTML the first two for that thread, with a counter. Once the
counter reaches two, all others are ignored. Pretty basic stuff there.
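
The counter logic described above can be sketched in a few lines of
Python. This is only an outline under stated assumptions: messages are
taken to arrive as (subject, body) pairs in the order they were
posted, and the function names are made up for the example.

```python
import re
from collections import defaultdict

# Matches any run of leading "Re:", "Fw:" or "Fwd:" prefixes, any case.
_PREFIX = re.compile(r"^\s*((re|fw|fwd)\s*:\s*)+", re.IGNORECASE)

def thread_key(subject: str) -> str:
    """Reduce a subject line to the bare thread name."""
    return _PREFIX.sub("", subject).strip().lower()

def pick_highlights(messages, limit=2):
    """Keep only the first `limit` messages of each thread, in order."""
    seen = defaultdict(int)  # thread name -> messages kept so far
    kept = []
    for subject, body in messages:
        key = thread_key(subject)
        if seen[key] < limit:
            seen[key] += 1
            kept.append((subject, body))
    return kept
```

With the default limit of two, a thread's opening post and its first
reply are kept and every later reply is dropped.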

You will need to make an "example" page of what the HTML page(s) should
look like so the programmer can make it. It is common to make a
template, which has a place for an embedded table which can grow in
size.
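
As an illustration of such a template, here is a minimal Python
sketch. The page layout, the column headings and the render_page name
are all invented for the example; the real template would come from
your own example page.

```python
import html
from string import Template

# The $rows placeholder marks where the growing table body goes.
PAGE = Template("""<html>
<head><title>$title</title></head>
<body>
<h1>$title</h1>
<table>
<tr><th>Subject</th><th>Message</th></tr>
$rows
</table>
</body>
</html>
""")

def render_page(title, posts):
    """Fill the template with one table row per (subject, body) post."""
    rows = "\n".join(
        f"<tr><td>{html.escape(subject)}</td><td>{html.escape(body)}</td></tr>"
        for subject, body in posts
    )
    return PAGE.substitute(title=html.escape(title), rows=rows)
```

The table simply gets one more row for each post passed in, so the
same template works for a quiet day or a busy one.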

You also need to decide how often the page gets created, once a day,
at the end? Twice a day? Will you be the one making and posting the
page, or do you want the program to do that on a timer?

I know I seem to be asking more questions than you are, but if you get
a good idea of what you want before you hire the programmer, then it
will save you a great deal of money and frustration.

By the way, do you have a working knowledge of Cascading Style Sheets
and do you use those?


webadept-ga

Request for Answer Clarification by jude1-ga on 09 Sep 2003 07:41 PDT
Dear Webadept,
Whew! This turned into a long one, huh? I have an idea: How about I
include extra money in the tip now, and future responses from you (or
me) can be posted as comments? Would this be acceptable?

Clarification of Answer by webadept-ga on 09 Sep 2003 09:19 PDT
When you post a Clarification Request, the Researcher working on your
question, in this case, me, gets an email letting me know that you
need something else or have replied to my last post.

I get nothing for comments. I won't know when you post, so really
that's not such a good idea. I like to be as responsive as I can.

webadept-ga

Clarification of Answer by webadept-ga on 09 Sep 2003 10:50 PDT
Just a quick note here : So far I'm fine with the amount of your bid
and if you decide to add a tip at the end of this, that's great, but
the bid for now is fine. So far I'm just pulling from previous
experience of doing this type of thing in previous years. (in other
words, I'm not really working that hard here :-) So, let's continue in
this thread and get your program needs laid out, some more of your
questions answered and a real plan to move forward. Just forget about
the funds for a while and let's get you a real working answer.

If you can post the answers to my last questions today I'll have time
tonight to work up a project sketch for you and some instructions for
your programmer, as well as a bid expectation.

webadept-ga

Request for Answer Clarification by jude1-ga on 09 Sep 2003 19:28 PDT
Q)  How many per service: 
A)  About 20 Yahoo, 3 EZBoard, 1 Topica. 

Q)  "Having the first option puts the cost for that setup below a
couple hours of work. The messages will come in, the program will
record the subject line of the thread in a file, minus all the RE's
and FW's, and then HTML the first two for that thread, with a counter.
Once the counter reaches two, all others are ignored. Pretty basic
stuff there."
A)  YES! This is what I want.
 
Q)  You will need to make an "example" page of what the HTML page(s)
should look like so the programmer can make it. It is common to make a
template, which has a place for an embedded table which can grow in
size.
A)  I'm not exactly sure what you mean here... My forum is generated
by Bravenet ( www.bravenet.com  ) So it's their standard forum.

 
Q)  You also need to decide how often the page gets created, once a
day, at the end? Twice a day? Will you be the one making and posting
the page, or do you want the program to do that on a timer?
A)  Once a day, at the end of the day. I would like the program to be
on a timer.
  
Q)  By the way, do you have a working knowledge of Cascading Style
Sheets and do you use those?
A)  Yes, and Yes.

Whew.

Request for Answer Clarification by jude1-ga on 09 Sep 2003 19:35 PDT
Dear Webadept,
Can you possibly either:
1) email me the quote you think would be fair (as opposed to posting it here) 
-or if not possible-
2) not include it?
Thanks, -Jude1

Clarification of Answer by webadept-ga on 09 Sep 2003 20:28 PDT
Oh.. what is your web address.. I didn't realize that you wanted these
posted into a forum program.

Clarification of Answer by webadept-ga on 09 Sep 2003 20:32 PDT
Okay.. whew.. freaked me out a little there, thought we were back to
the linux box thing. I just checked the bravenet demo site, and if
yours is as basic as that, we shouldn't have much of a problem there.
Send me your website URL anyway, so I can go over it.

thanks

Request for Answer Clarification by jude1-ga on 09 Sep 2003 22:11 PDT
Webadept,
It's www.mammasmilk.com
More specifically, it is
http://www.mammasmilk.com/the_lowdown.htm
-Jude1

Clarification of Answer by webadept-ga on 09 Sep 2003 22:59 PDT
Okay, Here it is, 

You have 3 main setups. The individual lists won't change much, but
the program does need to know they exist and why they are being
called. So, 3 major setups, about 1 hour apiece there, and about 1
hour more for the tweaks necessary for each group. That's 4 hours.

Setup instructions for the Eudora filter, about 15-30 minutes. 

Setup instructions for the program, about 15-30 minutes. 

Program mail sorting (getting the right posts, in this case, first and
second to a thread) about 2 hours

Program upload into your forum, about 3 hours. 

Testing 1 hour. 

That's a real lean look; added up, those items come to roughly 10.5 to
11 hours, so I would plan on 11 hours or so just to be on the safe
side. It's not hard coding, but it is coding and all coding like
this takes a level of skill.

I would look for a Perl, Visual Basic, Visual C++ or .NET programmer
of some type. The program isn't really doing anything "visual" but
does need to run on Windows without causing problems. So one of those.


Your Project description is this:

Windows XP compatible program that can be called from command line. 
Program will receive email from Eudora, parse the email, and post it
to a basic Bravenet forum as a given user.
see http://www.bravenet.com/ for example forum. 

Program must "login" so that the posting is from a known user. 

Emails are coming from 3 separate group services with a total of about
24 separate groups
About 20 Yahoo, 3 EZBoard, 1 Topica. 

Program needs to identify each new thread from each group, and post
only the starting message of the thread and the first reply to it. All
other emails on that thread need to be ignored.

Name and Email will be from the predetermined user. 
Subject field will be the subject of the Email
and the Message field will be the body of the email, with any advert
lines ripped from the message.

Program will gather a data file throughout the day and post once a day
using the Windows XP scheduler program.

Acceptable languages are any .NET language, Visual Basic, Visual C++
or Perl.
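
To make two of the requirements above more concrete (ripping advert
lines, and gathering a data file through the day), here is a minimal
Python sketch. Python is used only for illustration and is not one of
the languages listed above. The advert markers are hypothetical, the
sketch assumes adverts sit in a trailing footer, and the file format
(one JSON record per line) is just one possible choice.

```python
import json
import os

# Hypothetical markers: the real advert or footer text each service
# adds would have to be copied from actual messages.
ADVERT_MARKERS = ("Yahoo! Groups Sponsor", "To unsubscribe")

def strip_adverts(body: str) -> str:
    """Keep the lines of body that come before the first advert marker."""
    kept = []
    for line in body.splitlines():
        if any(marker in line for marker in ADVERT_MARKERS):
            break
        kept.append(line)
    return "\n".join(kept).rstrip()

def append_post(path: str, subject: str, body: str) -> None:
    """Add one cleaned post to the day's data file."""
    record = {"subject": subject, "body": strip_adverts(body)}
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")

def drain_posts(path: str) -> list[dict]:
    """Read back the day's posts and clear the file for tomorrow."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        posts = [json.loads(line) for line in handle if line.strip()]
    os.remove(path)
    return posts
```

The mail filter would trigger append_post for each incoming message,
and the once-a-day scheduled run would call drain_posts and upload
whatever it returns.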

That should do it for you. 

thanks, 

webadept-ga

Clarification of Answer by webadept-ga on 09 Sep 2003 23:00 PDT
By the way, I have to say, now that I see the scope of the idea you
are trying to achieve, it is an awesome use of Internet resources. Very
cool.

webadept-ga
jude1-ga rated this answer:5 out of 5 stars and gave an additional tip of: $50.00
Impressive. All the information I needed and more. Very quick, too.

Comments  
There are no comments at this time.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.