Q: How to convert a *.pdf file to a text file (Answered, 5 out of 5 stars, 4 Comments)
Question  
Subject: How to convert a *.pdf file to a text file
Category: Computers > Programming
Asked by: vaac-ga
List Price: $2.00
Posted: 08 Feb 2004 20:40 PST
Expires: 09 Mar 2004 20:40 PST
Question ID: 304870
Does anybody know of a way to convert a *.pdf file, a WordPad
document, or a Paint bitmap image to a text file? I want to do
this so that I can edit it. The only methods I can think of are
copying it manually, or printing it out, scanning the printed page and
running OCR. Both methods are very inconvenient, laborious and
error-prone. Is there a better way? If not, can the *.pdf file at least be
OCR'd or imported into Visioneer's PaperPort without first copying it
to paper?
Answer  
Subject: Re: How to convert a *.pdf file to a text file
Answered By: webadept-ga on 09 Feb 2004 12:24 PST
Rated:5 out of 5 stars
 
Adobe has an online service, which you can use either by posting the
PDF file on a web server somewhere, so the bot can get at it, or by
emailing the PDF to them. I use the email version quite often; it
works rather well, and the reply email comes back very quickly.

The link for these tools is here:

http://www.adobe.com/products/acrobat/access_onlinetools.html

Google search: PDF to Text


webadept-ga

Request for Answer Clarification by vaac-ga on 10 Feb 2004 20:50 PST
I have tried right double-clicking and left double-clicking the "T" in
Acrobat Reader. Both let me highlight several lines in the first
column, but not without also highlighting several lines in the next
column. Pressing CTRL-C, going to Notepad or WordPad and pressing
CTRL-V produces text, but the text is from somewhere else in the
*.pdf document, NOT the text I have highlighted.
Could you please explain more precisely how I should go about
copying a sentence or a table to text?

Clarification of Answer by webadept-ga on 11 Feb 2004 03:23 PST
Hi, it appears you are asking for clarification about one of the
comments below. Did you try the address I gave you for the Adobe site?
That service will take all of the text from the document and email it
back to you in a plain text email. From there you can do what you want
with it.

As I mentioned, the turnaround is very fast, and this should give you
a more automated way of converting as well. I personally use a Perl
script which takes all the PDFs in a directory and emails them one at
a time to the Adobe site, and then collects the returning text files
as they come in, placing them in a database. Just an example of what
can be done.
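
For illustration only, here is a minimal sketch of that kind of batch
workflow, written in Python rather than Perl. It is not the script
described above. The recipient address is a placeholder (the real
address is on the Adobe page linked earlier), and the function names,
table name and column names are made up for this example. It builds one
email per PDF in a directory and stores returned text in a SQLite
database; sending the mail and fetching the replies are left out.

```python
import sqlite3
from email.message import EmailMessage
from pathlib import Path

# Placeholder only: substitute the address given on the service's own page.
CONVERSION_ADDRESS = "pdf-conversion@example.com"


def build_messages(pdf_dir, sender, recipient=CONVERSION_ADDRESS):
    """Return one email per PDF in pdf_dir, each with the PDF attached."""
    messages = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        # Using the file name as the subject is an assumption of this
        # sketch; it is one way to match a reply to its source file.
        msg["Subject"] = pdf_path.name
        msg.set_content("PDF attached for conversion to text.")
        msg.add_attachment(
            pdf_path.read_bytes(),
            maintype="application",
            subtype="pdf",
            filename=pdf_path.name,
        )
        messages.append(msg)
    return messages


def store_text(conn, filename, text):
    """Save the plain text that came back for one PDF (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS converted "
        "(filename TEXT PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO converted (filename, body) VALUES (?, ?)",
        (filename, text),
    )
    conn.commit()
```

Each message could then be sent one at a time with the standard
library's smtplib (SMTP.send_message), and each reply passed to
store_text once it has been read from the mailbox.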

The commenters below will not get a notice that you have asked for
clarification, so don't feel bad if they don't answer you.

Thanks, 

webadept-ga
vaac-ga rated this answer:5 out of 5 stars and gave an additional tip of: $1.00
Sorry I could not rate this earlier, but I was too busy and at first
things did not work. I hope to be able to use your answer and the
comments by others to advantage, to the extent that this messy task
can be made to work.

Comments  
Subject: Re: How to convert a *.pdf file to a text file
From: pinkfreud-ga on 08 Feb 2004 20:47 PST
 
This might be useful:

http://www.verypdf.com/pdf2txt/pdf2txt.htm
Subject: Re: How to convert a *.pdf file to a text file
From: pinkfreud-ga on 08 Feb 2004 20:52 PST
 
Another possibility:

http://www.simtel.net/product.php?url_fb_product_page=51612
Subject: Re: How to convert a *.pdf file to a text file
From: mathtalk-ga on 08 Feb 2004 21:09 PST
 
If you have an up-to-date copy of Adobe Acrobat Reader, you will see
a button with a T on it, which turns on the text selection tool. For
a modest-sized document you can copy and paste the text using this
mode. By default the Reader is not in text select mode, but rather in
"scroll" mode, so that pressing the left mouse button causes the
cursor to "grab" the document and move it up or down with the mouse.

For a complex document, text selection often picks up text from
sidebars and other "intrusion" elements interleaved with the main
text, so this is offered simply as a "quick and dirty" approach to
getting such a job done.

regards, mathtalk-ga
Subject: Re: How to convert a *.pdf file to a text file
From: mathtalk-ga on 11 Feb 2004 21:50 PST
 
Hi, vaac-ga:

I see that you read my Comment, part of which remarks on the
difficulty of disentangling intruding elements from a cut-and-paste
job in Acrobat Reader.

The only simple approach I'm aware of is to work with small patches of
text at a time. This may well be more effective than an OCR exercise,
certainly with a limited amount of text to process.

For an industrial-strength solution, try the method outlined by
webadept-ga in his Answer.

regards, mathtalk-ga
