Q: Simple tool for email parsing (Answered, 5 out of 5 stars, 0 comments)
Question  
Subject: Simple tool for email parsing
Category: Computers
Asked by: spinalwiz-ga
List Price: $3.00
Posted: 11 Nov 2002 11:38 PST
Expires: 11 Dec 2002 11:38 PST
Question ID: 105421
Could you find me a Linux program or Perl module to parse emails? I
need to be able to access individual emails in a Linux mbox (so I
can move them to various folders), and extract the sender, subject,
and body information. The simpler the better. Also, as many emails
contain HTML in the body, I would like a way of removing it.

Thanks
Answer  
Subject: Re: Simple tool for email parsing
Answered By: kencyber-ga on 11 Nov 2002 13:29 PST
Rated: 5 out of 5 stars
 
When looking for help on parsing anything, Perl is a very good place
to start.  When searching for Perl modules, I always start my search
on CPAN (the Comprehensive Perl Archive Network).  This site provides
a nearly complete list of all proven modules for Perl and will help
you extend Perl to accomplish many tasks.

Mark Overmeer has written an excellent suite of Perl modules, called
Mail-Box, that will assist you in managing mail.  It provides all the
functionality you will need to connect to a mail server, navigate
folders, retrieve messages and parse the headers.  It even provides
search functionality (Mail::Box::Search) that will allow you to locate
individual messages.

The Mail::Box module is what you need to open a mailbox and retrieve
all of the messages in it.  It supports the standard Linux "mbox"
format and makes reading a mail folder a breeze.

Another Perl module, Mail::Box::Parser (which is provided in this
package), gives you the ability to parse out the header using the
"readHeader" method.  This will place all of the message header
information into an associative array (hash table).  You can then
enumerate all of the header values using a "foreach" statement in Perl.
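As a rough illustration of the steps above, here is a minimal sketch
that opens an mbox file and prints the sender, subject, header fields
and body of each message.  It assumes a reasonably recent Mail-Box
release (method names may differ in 2.029), and it takes the mbox path
as a command-line argument, which is my own convention for the example.
It uses the accessors on the message object instead of calling the
parser directly, since the folder normally drives the parser for you.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Mail::Box::Manager;

# Hypothetical usage: perl list_mail.pl /path/to/mbox
my $path = shift @ARGV or die "Usage: $0 /path/to/mbox\n";

my $mgr    = Mail::Box::Manager->new;
my $folder = $mgr->open(folder => $path, type => 'mbox', access => 'r')
    or die "Cannot open folder $path\n";

foreach my $msg ($folder->messages) {
    # sender() returns an address object, or undef if none was found.
    my $sender = $msg->sender;
    printf "From:    %s\n", $sender ? $sender->address : '(unknown)';
    printf "Subject: %s\n", $msg->subject;

    # Enumerate every header field by name.
    foreach my $name ($msg->head->names) {
        my $value = $msg->get($name);
        print "  $name: $value\n" if defined $value;
    }

    # decoded() undoes any transfer encoding (base64, quoted-printable).
    # For multipart messages this prints all parts together.
    print $msg->decoded->string, "\n";
}

$folder->close;
```

Opening the folder read-only (access => 'r') keeps the sketch from
changing your mailbox while you experiment.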

In order to move a message from one folder to another, you will need
to create an instance of the mailbox manager class
(Mail::Box::Manager).  This provides you with the "moveMessage" method,
which will move a message from one folder to another.  The
documentation on the CPAN site is pretty straightforward and I have
provided references below.
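A minimal sketch of moving messages with the manager might look like
this.  The two folder paths and the subject filter (/invoice/) are
hypothetical and only there for illustration; the calls follow current
Mail-Box documentation as I understand it and may differ in older
releases.  Try it on a copy of your mailbox first.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Mail::Box::Manager;

# Hypothetical usage: perl move_mail.pl inbox-mbox target-mbox
my ($inbox_path, $target_path) = @ARGV;
die "Usage: $0 inbox-mbox target-mbox\n" unless defined $target_path;

my $mgr = Mail::Box::Manager->new;

# Both folders need write access; the target is created if missing.
my $inbox = $mgr->open(folder => $inbox_path, type => 'mbox',
                       access => 'rw')
    or die "Cannot open $inbox_path\n";
my $target = $mgr->open(folder => $target_path, type => 'mbox',
                        access => 'rw', create => 1)
    or die "Cannot open $target_path\n";

# Collect the matches first, then move them in one call.
my @matches = grep { $_->subject =~ /invoice/i } $inbox->messages;
$mgr->moveMessage($target, @matches) if @matches;

# Closing the folders writes the changes back to disk.
$mgr->closeAllFolders;
```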

Be sure to read the documentation for all of the Perl modules in this
package carefully.  A lot of these modules tie together so that one
class utilizes another class.  I've found that Perl documentation is
some of the best-written product documentation you can find.

You should be fairly proficient at Perl in order to utilize these
modules.  If you are programming this yourself, try to start with some
basic functionality and add features as you go along.  Best of luck!

kencyber-ga


References

CPAN - Comprehensive Perl Archive Network
http://www.cpan.org/

Mark Overmeer / Mail-Box 2.029
http://search.cpan.org/author/MARKOV/Mail-Box-2.029/

Mail::Box - manage a mailbox, a folder with messages
http://search.cpan.org/author/MARKOV/Mail-Box-2.029/Mail/Box.pod

Mail::Box::Parser - reading and writing messages
http://search.cpan.org/author/MARKOV/Mail-Box-2.029/Mail/Box/Parser.pod

Mail::Box::Manager - manage a set of folders
http://search.cpan.org/author/MARKOV/Mail-Box-2.029/Mail/Box/Manager.pod


Search Strategy
Personal Knowledge

Clarification of Answer by kencyber-ga on 11 Nov 2002 13:35 PST
For converting HTML in e-mails to text, you can make use of the
Mail::Message::Convert::HtmlFormatText class that is part of this same
package.  The "format" method will strip out all HTML coding from the
mail message and give you a plain text version.
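As a minimal sketch of that conversion, the script below prints a plain
text version of every HTML part it finds in an mbox.  It assumes a
reasonably recent Mail-Box release, and that the HTML::TreeBuilder and
HTML::FormatText modules the converter relies on are installed.  The
mbox path argument is a hypothetical convention for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Mail::Box::Manager;
use Mail::Message::Convert::HtmlFormatText;

# Hypothetical usage: perl html2text_mail.pl /path/to/mbox
my $path = shift @ARGV or die "Usage: $0 /path/to/mbox\n";

my $mgr    = Mail::Box::Manager->new;
my $folder = $mgr->open(folder => $path, type => 'mbox', access => 'r')
    or die "Cannot open folder $path\n";

my $converter = Mail::Message::Convert::HtmlFormatText->new;

foreach my $msg ($folder->messages) {
    # Check every part of a multipart message, or the message itself.
    my @parts = $msg->isMultipart ? $msg->parts('RECURSE') : ($msg);

    foreach my $part (@parts) {
        my $body = $part->decoded;
        next unless $body->mimeType eq 'text/html';

        # format() returns a new body object holding plain text.
        my $text = $converter->format($body);
        print $text->string, "\n";
    }
}

$folder->close;
```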


References

Mail::Message::Convert::HtmlFormatText
http://search.cpan.org/author/MARKOV/Mail-Box-2.029/Mail/Message/Convert/HtmlFormatText.pod
spinalwiz-ga rated this answer: 5 out of 5 stars

