Q: PDF conversion (Answered, 5 out of 5 stars, 4 Comments)
Question  
Subject: PDF conversion
Category: Computers > Software
Asked by: xsansara-ga
List Price: $40.00
Posted: 16 Jun 2004 08:09 PDT
Expires: 16 Jul 2004 08:09 PDT
Question ID: 361889
I am searching for an open source "pdf to ascii" conversion tool.
The programming language should preferably be Java, but any other will do.
Alternatively, a good source of information for writing my own would
help. (Not the 1000 pages of the Adobe specification.)
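
For anyone weighing the "write my own" route, here is a rough Java sketch of the core idea, under heavy assumptions. PDF page content shows text with operators such as Tj and TJ, whose operands are strings in parentheses. The class name ContentStreamText and the method extract are made up for illustration and belong to no library. The sketch assumes the content stream has already been located and decompressed, and that the font uses an ASCII-compatible encoding; real PDFs often break both assumptions, which is a large part of why an existing library is the easier path.

```java
/**
 * Hypothetical helper (not part of any library): collects the literal
 * strings shown by the Tj, TJ, ' and " operators of a PDF content stream.
 * Assumes the stream is already decompressed and ASCII-compatible.
 */
class ContentStreamText {

    private static final String DELIMITERS = "()[]<>/%";

    static String extract(String content) {
        StringBuilder text = new StringBuilder();
        StringBuilder pending = new StringBuilder(); // string operands seen so far
        int i = 0, n = content.length();
        while (i < n) {
            char c = content.charAt(i);
            if (c == '(') {
                i = readLiteral(content, i + 1, pending);
            } else if (c == '/') {
                // Name object such as /F1: skip it so it is not mistaken for an operator.
                i++;
                while (i < n && !isBoundary(content.charAt(i))) i++;
            } else if (c == '<') {
                // Hex strings and dictionaries are skipped, not decoded, in this sketch.
                int close = content.indexOf('>', i);
                i = close < 0 ? n : close + 1;
            } else if (c == '%') {
                // Comment: skip to end of line.
                while (i < n && content.charAt(i) != '\n' && content.charAt(i) != '\r') i++;
            } else if (Character.isLetter(c) || c == '\'' || c == '"') {
                int start = i;
                i++;
                while (i < n && !isBoundary(content.charAt(i))) i++;
                String op = content.substring(start, i);
                if (op.equals("Tj") || op.equals("TJ")) {
                    text.append(pending);
                } else if (op.equals("'") || op.equals("\"")) {
                    text.append('\n').append(pending); // these move to the next line first
                } else if (op.equals("T*") || op.equals("ET")) {
                    text.append('\n');
                }
                pending.setLength(0); // operands belong to this operator only
            } else {
                i++; // numbers, brackets, whitespace
            }
        }
        return text.toString();
    }

    private static boolean isBoundary(char c) {
        return Character.isWhitespace(c) || DELIMITERS.indexOf(c) >= 0;
    }

    /** Reads a literal string body starting just after '(' and returns the index after its ')'. */
    private static int readLiteral(String s, int i, StringBuilder out) {
        int depth = 1;
        while (i < s.length()) {
            char c = s.charAt(i++);
            if (c == '\\' && i < s.length()) {
                char e = s.charAt(i++);
                switch (e) {
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    case 't': out.append('\t'); break;
                    default:  out.append(e); // handles \( \) \\ ; octal escapes are not decoded
                }
            } else if (c == '(') {
                depth++;
                out.append(c);
            } else if (c == ')') {
                if (--depth == 0) return i;
                out.append(c);
            } else {
                out.append(c);
            }
        }
        return i; // unterminated string: return what was read
    }

    public static void main(String[] args) {
        String sample = "BT /F1 12 Tf 72 712 Td (Hello) Tj T* [(Wor) -120 (ld \\(PDF\\))] TJ ET";
        System.out.println(extract(sample));
    }
}
```

Run on the sample above, this prints "Hello" and "World (PDF)" on separate lines. It ignores fonts, encodings, positioning and compression entirely, so treat it as an illustration of the operator model rather than a usable converter.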
Answer  
Subject: Re: PDF conversion
Answered By: tox-ga on 16 Jun 2004 10:54 PDT
Rated: 5 out of 5 stars
 
Hi xsansara!

I have found an open-source Java library that includes text extraction from PDF files.

From its homepage:
"Description
PDFBox is a Java PDF Library. This project will allow access to all of
the components in a PDF document.
...
This ships with a utility to take a PDF document and output a text file."

It can be downloaded here:  http://www.pdfbox.org/
More information can be found at its SourceForge page here:
http://sourceforge.net/project/showfiles.php?group_id=78314

The source distribution includes documentation, which should be all
you need to get started.

If anything is unclear, or you would like some assistance using this
library, please don't hesitate to ask for clarification before closing
and rating this answer, so that I may best help you.

Cheers,
tox-ga

Clarification of Answer by tox-ga on 16 Jun 2004 11:02 PDT
Hi,
Just to clarify, the class you will want from this library is
  org.pdfbox.util.PDFTextStripper

--tox
xsansara-ga rated this answer: 5 out of 5 stars
Gee, thanks, just what I was looking for.

Comments  
Subject: Re: PDF conversion
From: crythias-ga on 16 Jun 2004 08:46 PDT
 
I am not a Google Answers researcher.

For less than the price of your question, you can get a program
($12.95) to do this:
http://www.thebeatlesforever.com/processtext/abcpdf.html

Open Source: try ImageMagick www.imagemagick.com

Search Criteria: PDF convert
Also personal knowledge.
Subject: Re: PDF conversion
From: pafalafa-ga on 16 Jun 2004 09:44 PDT
 
I'm not enough of a programmer (heck...I'm not a programmer at all!)
to know if this will do the trick or not:

http://rootr.net/man/man/ps2ascii/1
ps2ascii  -  Ghostscript translator from PostScript or PDF to ASCII

but it looked like it came awfully close.  

Let me know if it's on target or not.

pafalafa-ga
Subject: Re: PDF conversion
From: clergy-ga on 16 Jun 2004 10:07 PDT
 
www.gnu.org is where I'd look for open source, since it is the largest
open-source community.
Subject: Re: PDF conversion
From: xsansara-ga on 17 Jun 2004 01:10 PDT
 
The point was that I didn't want just some pdf2ascii tool. There are
enough of those out there already. I wanted something well documented
and open source that I could build some of my own stuff into. Tox found
me exactly what I needed.
