Q: Change, modify or correct text in a scan (Answered, 5 out of 5 stars, 0 Comments)
Question  
Subject: Change, modify or correct text in a scan
Category: Computers > Software
Asked by: pendleton-ga
List Price: $2.00
Posted: 13 Oct 2002 05:42 PDT
Expires: 12 Nov 2002 04:42 PST
Question ID: 76009
I am scanning booklets that I want to print. The text is sometimes
blurry.
I want some software, preferably shareware, to take that scanned text
and redo it in a clearer font, as well as to make corrections.

Just the text, some of which is like the "speech bubbles" in
cartoons. How do I do this as easily and cheaply as possible?
Answer  
Subject: Re: Change, modify or correct text in a scan
Answered By: arcadesdude-ga on 13 Oct 2002 20:39 PDT
Rated: 5 out of 5 stars
 
Hello pendleton!

Recognizing the text in a scanned image so that it can be edited is
called Optical Character Recognition, or OCR for short. OCR programs
take a scanned image and look for patterns that represent letters.
Good OCR programs can recognize both images and text and keep the
formatting, allowing you to edit the images or text to your liking
and reprint the changed document.

I have used OCR to convert my books for my personal ebook reader and
am pleased with the results. OCR programs are not perfect, but most
of the time they will be correct. Sometimes, though, they
incorrectly "recognize" a character (a letter) and I must edit it by
hand. This is not a problem for small books or single pages, but for
long books it can be tedious to read through the entire text. "Smart"
OCR programs can check each recognized word against a spelling
dictionary and flag any word that does not match, so you can see
where there is a discrepancy. This makes fixing OCR errors much
easier. So the main thing to do if you go the OCR route is to find
the best OCR program. (I have several recommendations in the OCR
section below.)
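
As a rough illustration of the dictionary check described above, here
is a minimal Python sketch. It assumes you already have the OCR output
as plain text and a word list to compare against. The function name
flag_suspect_words and the tiny word list are made up for this
example, and real OCR programs may well do this differently.

```python
import difflib
import re


def flag_suspect_words(ocr_text, dictionary, max_suggestions=3):
    """Return (word, suggestions) pairs for words missing from the dictionary."""
    known = sorted({w.lower() for w in dictionary})
    known_set = set(known)
    suspects = []
    # Split the OCR output into word-like tokens (letters, digits, apostrophes).
    for word in re.findall(r"[A-Za-z0-9']+", ocr_text):
        if word.lower() in known_set:
            continue  # Word is in the dictionary, nothing to flag.
        # Look for dictionary words that are spelled similarly.
        suggestions = difflib.get_close_matches(
            word.lower(), known, n=max_suggestions, cutoff=0.6
        )
        suspects.append((word, suggestions))
    return suspects


if __name__ == "__main__":
    # Hypothetical word list; a real check would load a full dictionary file.
    words = ["the", "text", "is", "sometimes", "blurry"]
    for word, suggestions in flag_suspect_words("The text is sornetimes b1urry", words):
        print(word, "->", suggestions)
```

On the sample line this should flag "sornetimes" and "b1urry" and
suggest nearby dictionary words for each, which is roughly the kind of
prompt a "smart" OCR program gives you while you proofread.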

That is one method for doing what you want. (See below for how to do
this step by step.)
There may be an easier way, depending on the text you wish to
recognize.

If there is not much text to type and it sits on a white (or solid
color) background (white inside a speech bubble, for example), then
you may be able to scan the image and edit it directly. You could
delete the blurry text and then use the text tool found in many image
editing programs to type in the correct, non-blurry text yourself.
That would be the best way if you do not have much to type. If you
have a lot to type, then try the OCR method.

The Image Editing method:
Use this method if you can type all the text in (if the amount to type
is not too much) and the background for the text is white or a solid
color.

1. Scan the image.
2. Open the image in an image editor.
3. Select the text in a box using the select tool.
4. Press delete, or use a command that will clear the selected area
   (turn it white or a solid color you set).
5. Use the Text tool (usually an uppercase 'A' or 'T') and select your
   font, size, and styles.
6. Type the text in the section you just cleared.
7. Voila! Cleaner text!
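
Steps 3 and 4 boil down to filling a rectangle with the background
color. Here is a minimal Python sketch of that idea, assuming the scan
is held as a grid (a list of rows) of grayscale values where 255 is
white. The name clear_region and its parameters are hypothetical and
only for illustration; in practice your image editor does this for
you.

```python
def clear_region(pixels, left, top, right, bottom, fill=255):
    """Return a copy of the grid with the selected box set to the fill value.

    The box covers columns left..right-1 and rows top..bottom-1.
    """
    cleared = [row[:] for row in pixels]  # Copy so the original scan is untouched.
    for y in range(max(top, 0), min(bottom, len(cleared))):
        row = cleared[y]
        for x in range(max(left, 0), min(right, len(row))):
            row[x] = fill  # Overwrite the blurry text with the background.
    return cleared


if __name__ == "__main__":
    # A tiny all-black 4x3 "scan"; clear a 2x2 box in the top middle.
    page = [[0] * 4 for _ in range(3)]
    for row in clear_region(page, left=1, top=0, right=3, bottom=2):
        print(row)
```

Once the area is cleared, the new text is drawn on top of it with the
text tool, as in steps 5 and 6.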

The OCR method:

This method uses a program that can preserve images while recognizing
text and allowing you to edit it. You need a "smart" OCR program such
as

TextBridge
http://www.scansoft.com/textbridge/
$80 for Pro 11 version

or

PaperPort
http://www.digitalriver.com/dr/v2/ec_MAIN.Entry10?V1=339132&PN=1&SP=10023&xid=21763
$99 for 8.0 full version

Generally, the best OCR software for home use costs under $100. It
will let you preserve your images; scan, recognize, and edit text;
and publish to PDF and various other formats, all with good accuracy
and the least amount of human intervention.

There are free OCR programs such as

SimpleOCR
http://www.spyfind.com/ocr.html
Free

But they are not as good at keeping images.

The following procedure is generalized for most OCR programs:

1. Scan the image[s].
2. Recognize the text.
3. Edit and correct the text.
4. Save, print, or reformat the final product. How much reformatting
   is needed depends on how good the OCR program is.
5. Done!

You can also combine the two methods: use the OCR program to
recognize the text, then paste that text into the text tool when you
are editing the image.
That could be a way to do this without buying a good OCR program.

Any way you do this, it is going to be time-consuming, so your best bet
is to do the above then show a friend how to do it for you. :)

This is not legal advice, just an opinion. You can edit the booklets
and use them for personal use only, unless you own the copyright on
them (or have permission from the copyright owner of the booklets, or
they are in the public domain). Such editing with non-commercial
intent may fall under the "fair use" provision of US copyright law
(Section 107 of the Copyright Act). That is not legal advice,
however, so use good judgment and contact a lawyer about such
copyright questions if you need to.

I hope that answers your question.

If you have any more questions about this, please "Request Answer
Clarification" and I'll do my best to help you further.

[In the meantime I'm looking for free OCR software that can keep
images.]



Useful Links

Project Gutenberg "Making Etexts from Paper Originals" paper
http://promo.net/pg/vol/a_v_anders.html

From Paper to PDF:
About scanning text
http://slashdot.org/askslashdot/00/06/05/2353219.shtml

About scanning books
http://slashdot.org/article.pl?sid=02/05/07/2117231&mode=thread&tid=137

OCR Programs:

Windows OCR Programs:

TextBridge
http://www.scansoft.com/textbridge/

Cuneiform '99
http://www.ocr.com/

OmniPage
http://www.caere.com/omnipage/


Unix Based Programs:
http://documents.cfar.umd.edu/ocr/


About OCR:
http://www.scansoft.com/omnipage/ocr/
http://www.dataid.com/aboutocr.htm
http://www.imageprocessingtools.com/ocr.html
http://www.webopedia.com/TERM/o/optical_character_recognition.html

Search Strategy

"free ocr software"
://www.google.com/search?num=100&hl=en&lr=&ie=UTF-8&oe=utf-8&q=%22free+ocr+software%22&btnG=Google+Search

"about ocr"
://www.google.com/search?num=100&hl=en&lr=&ie=UTF-8&oe=utf-8&q=%22about+ocr%22&btnG=Google+Search

Clarification of Answer by arcadesdude-ga on 16 Oct 2002 12:14 PDT
I searched many free and shareware ("trialware") OCR programs, and the
best free one that I could find (limited to 15 saves) is this:

TypeReader Professional 6.0 Trial Version
http://www.expervision.com/download_tr6.htm
Free

It is limited to 15 saves, but you can put all your images in one
large multipage TIFF image so that you can OCR an unlimited number of
images (scanned booklets), as long as you do the scanning first and
save all the pages to a multipage TIFF file. Then you can OCR and
fix the text or change the font, and it will keep the images and
formatting. From there you can print them out!

You could use any free OCR program you wish but this is the only free
one (even a limited trial version) that I found that will keep the
images.

Good luck and please request clarification if you need further help!
pendleton-ga rated this answer: 5 out of 5 stars
Excellent job. Gave some new ideas to my web master that he had not
tried. Should help us to get a really great image on the Internet and
then download to print. Thank you arcadesdude-ga and Google Answers.

Blessings.
JohnP.

Comments  
There are no comments at this time.
