Q: Allow a user to perform OCR on pages of mathematical text. (Answered 3 out of 5 stars, 6 Comments)
Question  
Subject: Allow a user to perform OCR on pages of mathematical text.
Category: Computers > Graphics
Asked by: david_j_kaplan-ga
List Price: $4.00
Posted: 23 Apr 2002 22:53 PDT
Expires: 30 Apr 2002 22:53 PDT
Question ID: 3304
What applications exist for optical character recognition of documents 
containing extensive mathematical notation?  Such an application would 
allow a user to perform OCR on pages of mathematical text.
Answer  
Subject: Re: Allow a user to perform OCR on pages of mathematical text.
Answered By: roguedog-ga on 27 Apr 2002 10:49 PDT
Rated:3 out of 5 stars
 
Dear David_J_Kaplan,

In my research, I did not find any documentation for the professional,
commercial packages that mentioned their accuracy with, or even their
ability to handle, mathematical equations.  It seems that although OCR
applications can convert mathematical equations, the issue is accuracy.
To compensate, most people still use an additional application for the
proper layout of mathematical equations.

Interestingly, much of my research seems to point to the community of
people who are trying to convert textbooks into Braille as a group
that is actively trying to utilize OCR technology to convert
mathematical equations.  Even they still use TeX for equation layout.
( http://www.rit.edu/~easi/lib/oppo4.htm )

While OCR technology has come a long way in the last few years,
"current OCR technology does not always recognize scanned mathematical
or scientific notations accurately. Proofreading is an essential part
of the transcription process to ensure the accuracy of the material."
( http://www.washington.edu/doit/Faculty/Strategies/Academic/Science/science_lab_faq.html )

Or as another reviewer wrote, "OCR, or optical character recognition,
is one of life's disappointments. Like unwrapping a solid, heavy
Christmas present and finding eight pairs of socks inside, using OCR
with the expectation that what was printed on the paper will actually
appear in perfect form on your PC screen usually results in amazement,
followed swiftly by annoyance."
( http://www.itreviews.com/software/s59.htm )

From my readings, when people want to ensure proper notation of their
mathematical expressions, they frequently use TeX.  TeX is a
typesetting system written by Donald E. Knuth, who says in the Preface
to his book on TeX (see books about TeX) that it is "intended for the
creation of beautiful books - and especially for books that contain a
lot of mathematics".  A good reference for TeX vendors and information
is:

http://www.tug.org/interest.html

==Some of the primary vendors in the OCR market are:

ABBYY FineReader OCR 
http://www.abbyy.com/products/fine/

OmniPage Pro
http://www.scansoft.com/products/

NewSoft Presto
http://www.newsoftinc.com/redir/digitaloffice_all.asp?category=ocr4

PrimeOCR
http://www.primerecognition.com/augprime/ocr_accuracy_cost.htm

LaserFiche
http://www.laserfiche.com/products/index.html


OCRchie
http://www.cs.berkeley.edu/~fateman/kathey/ocrchie.html

OCRchie is a Modular Optical Character Recognition Software project
started by a group of UC Berkeley students using the algorithms from
Professor Richard J. Fateman whose "interests include scientific
programming environments; algebraic manipulation by computer (programs
like Macsyma, Mathematica, Maple, Axiom, Reduce); distributed
computing; analysis of algorithms; programming and measurement of
large systems; design and implementation of programming languages;
digital document analysis (optical character recognition). "  
Not surprisingly, there are no reviews of this application, and
probably no support for it either.

SourceForge
For other freeware or shareware OCR applications, you can go to
http://sourceforge.net/search/?type_of_search=soft&words=ocr .
david_j_kaplan-ga rated this answer:3 out of 5 stars
The answer is very close to what I thought it would be.  The comment
about the interest in the Braille community in problems of this sort
was interesting.  The comments concerning the insufficiency of the
current state of the art correspond to my views; I just wish it was
not so.  I first asked this question to the professional staff of a
research library about fifteen years ago and the answer on the whole
hasn't changed much.  It was certainly worth the cost to revisit the
problem.

Comments  
Subject: Re: Allow a user to perform OCR on pages of mathematical text.
From: watershed-ga on 23 Apr 2002 23:09 PDT
 
Greetings!

There seem to be many different products to choose from, ranging from very 
expensive to completely free.  Here is a resource that lists dozens of 
companies that provide OCR software:

http://dir.yahoo.com/Business_and_Economy/Business_to_Business/Computers/Software/Character_Recognition__OCR_ICR_/

More information on OCR software can be found here:

http://www.google.com/search?hl=en&q=optical+character+recognition

Hope this answers your question.

watershed
Subject: Re: Allow a user to perform OCR on pages of mathematical text.
From: david_j_kaplan-ga on 23 Apr 2002 23:43 PDT
 
Let me try to be more specific.  Although my question could be 
generalized, I want to be rather mundane.  Suppose I have a college or 
graduate level English text concerning a subject in physics, mathematics, 
statistics or the like.  

I want to be able to scan and perform OCR on selected pages of that 
text with a Macintosh or PC and be reasonably assured that the result will 
be translated into some standard font with perhaps special symbols for 
the less standard graphical constructs of the text.

A general description of current OCR technology is too broad.  My question 
might be implicitly answered someplace in the description; but it might 
also be answered by suggesting that I check the Library of Congress.

I want a pragmatically useful answer.
Subject: Re: Allow a user to perform OCR on pages of mathematical text.
From: olav-ga on 23 Apr 2002 23:57 PDT
 
This link refers to a company that does it for you. After searching for a 
while, I advise this because I think that existing packages generally will 
not be accurate enough to do the job right for you.

http://www.autotext.com/Services/Scanning_OCR.asp
Subject: Re: Allow a user to perform OCR on pages of mathematical text.
From: mhofstede-ga on 24 Apr 2002 00:56 PDT
 
My best guess would be an OCR program like Adobe Acrobat Capture. Have a look 
at the full feature set documentation: 
http://www.adobe.com/products/acrcapture/fullfeature.html, and especially these 
passages: 

"Automatically store suspect words as bitmapped images in PDF Formatted Text 
and Graphics files. The suspect word bitmap density is adjusted to closely 
match the visual appearance of the surrounding text."

"PDF Formatted Text and Graphics: for compact, searchable files with only one 
layer. The layer reproduces graphics and replaces bitmapped text with formatted 
text based on OCR. This file type (formerly known as PDF Normal) is smaller 
than any other Adobe PDF option, so it is the ideal Web format."

Hope this helps.
Subject: Re: Allow a user to perform OCR on pages of mathematical text.
From: mhofstede-ga on 24 Apr 2002 01:52 PDT
 
Also have a look at:
http://www.dbi-berlin.de/projekte/d_lib/einzproj/intkoop/fp1gb.htm
and
http://www-sop.inria.fr/cafe/Stephane.Lavirotte/Ofr/root.html

Best regards.
Subject: Re: Allow a user to perform OCR on pages of mathematical text.
From: jamesuk-ga on 24 Apr 2002 01:53 PDT
 
My suggestion would be to use the OCR software to scan in the text and produce 
the formulae using a specialised mathematical typesetting language such as 
LaTeX. Although not the pure answer you would have wished with regards to the 
OCR, this will allow you to manipulate the formulae as needed rather than have 
them in your document as purely graphical objects.

Once you have become familiar with LaTeX, the entry of formulae should not take 
too much time. 
For more information on LaTeX see
http://directory.google.com/Top/Computers/Software/Typesetting/TeX/LaTeX 
For information on other mathematical typesetting languages
http://directory.google.com/Top/Science/Math/Software/Typesetting/
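
As a rough illustration of the workflow suggested above, here is a minimal
LaTeX sketch: the running text is pasted in from the OCR pass and the
displayed formula is typed by hand. The sentence and the quadratic formula
are arbitrary examples chosen for illustration, not taken from any
particular textbook.

```latex
\documentclass{article}
\begin{document}
% Body text as it might come out of the OCR pass, pasted in as plain text.
The solutions of $ax^2 + bx + c = 0$ are given by
% The displayed formula is entered by hand instead of being recognized.
\begin{equation}
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\end{equation}
\end{document}
```

Because the formula is stored as markup rather than as a bitmap, it can
later be edited or renumbered like any other text.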

Regards,

James
