Q: Scanning to produce small .pdf format files ( Answered 3 out of 5 stars,   10 Comments )
Question  
Subject: Scanning to produce small .pdf format files
Category: Computers
Asked by: web7-ga
List Price: $7.00
Posted: 21 Oct 2002 21:00 PDT
Expires: 20 Nov 2002 20:00 PST
Question ID: 86284
PROBLEM: Many of my Email recipients cannot open .TIF, .JPG, .BMP, or
.PNG files I send as Email attachments. Yet, when I scan a page and
save the image as a .pdf file, a 1-page scan produces a 1.4 MB file. 
Yet .pdf files I receive are as small as 50 KB.

QUESTION: How do I get my new HP-5470c scanner to produce small .pdf
format files that I can send via Email which almost everyone can open,
view and print?  Or is there some format that is small, almost
universal, and better to use than .pdf format?

MY HARDWARE & SOFTWARE: I have a new HP-5470c scanner with the
following scanning software: Precisionscan, PaperPort, ScanDirect,
ACDSee, Corel Print Office 5, and Microsoft Office Document Imaging.

Thank you,
WEB7
Answer  
Subject: Re: Scanning to produce small .pdf format files
Answered By: leapinglizard-ga on 22 Oct 2002 16:54 PDT
Rated: 3 out of 5 stars
 
Most email client programs should allow the user to view files in all
the formats you've mentioned. Some of your correspondents may be
inhibited not by the file format, but by their lack of experience in
opening attachments of any kind. You may wish to refer them to a web
tutorial on this subject.

Computertim Technologies FAQ
opening email attachments using Microsoft Outlook
http://www.computertim.com/howto/article.php?topic=outlook&idn=73

Seattle Pacific University Computer Help
opening email attachments using Microsoft Explorer
http://www.spu.edu/help/email/attachments.html

Silicon Connections FAQ
opening email attachments using Netscape Communicator
http://www-old.silcon.com/howto/faqattnet.htm

The reason you're used to receiving small PDF files is that they are
typically generated from text files, which contain little data apart
from some typesetting information and the text itself. In contrast,
graphics files such as the kind you're converting to PDF contain a
great deal of visual data, which tends to occupy much more space than
text does.

Your best bet is to use a compressed graphics format, such as .JPG,
that will squeeze the information into less space while doing its best
to preserve the appearance of the file. You may also reduce file size
by lowering your scanning resolution; for further guidance, consult
the documentation that came with your scanning software.

If you insist on using .PDF, I can only recommend that you compress
your files to reduce their size. By using the compression utility that
comes with every copy of Windows, you ensure that anyone who knows how
to uncompress attachments and can deal with the .PDF format will be
able to read what you send them. Your space savings will depend on the
nature of the graphics you are compressing.

Microsoft Help
compressing files and folders in Windows XP
http://www.microsoft.com/windowsxp/expertzone/tips/october/edwards1.asp

Smart Computing Learning Series
how to compress and uncompress files
http://www.smartcomputing.com/editcat/ SMART/STORAGE/158/11971/

Keywords used:
how to open email attachments
how to compress files windows
how to uncompress files windows

Regards,

leapinglizard

Clarification of Answer by leapinglizard-ga on 22 Oct 2002 17:02 PDT
That last link (Smart Computing Learning Series: how to compress and
uncompress files) wasn't reproduced properly in my posting. The URL
does, in fact, include a space between the "editcat/" and "SMART", so
you can't get to the page merely by clicking on the link. To reach
that page, you must either copy the entire URL into the navigation bar
of your browser, or use the link I've provided below.

Smart Computing Learning Series
how to compress and uncompress files
http://www.smartcomputing.com/editorial/article.asp?article=articles%2Farchive%2Fl0601%2F42l01%2F42l01%2Easp

Request for Answer Clarification by web7-ga on 22 Oct 2002 23:00 PDT
Thanks, Leapinglizard. Your info broadens my understanding -- and
makes me realize my question was not specific enough. Most of my scans
are of text without images.  Some are forms: text with blanks filled
in with writing.  I just scanned 1 page of plain text using HP
Precisionscan, black & white bitmap type and saved it as .pdf format
file.  The file size is 2.2 MB!

My specific question is, what do I need to do to scan a page of text
into a small .pdf file? Do I need to buy Adobe Acrobat to do that, or
can I use any of the software I already have (listed in the original
question)?  I know there has to be a way, since I receive small pdf
text files as Email attachments.
Thanks, WEB7

Clarification of Answer by leapinglizard-ga on 23 Oct 2002 00:00 PDT
There is no practical way to transform a scanned image into what you
call a "small" PDF file. Once again, PDF files are typically "small"
because they are generated directly from a text file, not from an
image. Since you do not have the source text at your disposal -- only
a picture of it! -- there is absolutely no program that will generate
a small PDF file. You can, however, use Windows to transform a large
PDF file into a somewhat smaller ZIP file. In the case of a page that
consists mostly of white space, you can expect space savings of 75% to
90%. To learn how you can compress PDF files into ZIP format, consult
the links provided above.
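
To give a feel for why a mostly white page shrinks so much when zipped, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the page is a made-up 1-bit bitmap (Letter size at 300 dpi) with bands of random speckle standing in for lines of text, and zlib's DEFLATE stands in for the Windows ZIP utility. It is not a measurement of a real scan.

```python
import random
import zlib

WIDTH_BYTES = 319  # 2550 pixels per row at 1 bit each, padded to whole bytes
HEIGHT = 3300      # rows on an 11-inch page at 300 dpi


def synthetic_page(seed=0, ink_fraction=0.15):
    """Build a made-up 1-bit page: white rows with bands of speckled 'text'."""
    rng = random.Random(seed)
    white_row = b"\xff" * WIDTH_BYTES  # assumption: a set bit means a white pixel
    rows = []
    for y in range(HEIGHT):
        if y % 50 < 20:
            # Inside a "line of text": some bytes carry random ink, the rest stay white.
            rows.append(bytes(
                rng.getrandbits(8) if rng.random() < ink_fraction else 0xFF
                for _ in range(WIDTH_BYTES)
            ))
        else:
            # Blank space between lines.
            rows.append(white_row)
    return b"".join(rows)


raw = synthetic_page()
packed = zlib.compress(raw, 9)  # DEFLATE, the method ZIP archives normally use
saving = 1 - len(packed) / len(raw)
print(f"raw: {len(raw):,} bytes, compressed: {len(packed):,} bytes, saving: {saving:.0%}")
```

The exact saving depends entirely on the page content. Note also that zipping only helps much when the image inside the PDF was stored uncompressed; if the scanning software already compressed it, a second pass of ZIP gains very little.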
web7-ga rated this answer: 3 out of 5 stars
The answer was very informative and provided good links, although it
did not provide a specific solution to the basic question.  I'll
follow his/her suggestions and see what happens. WEB7

Comments  
Subject: Re: Scanning to produce small .pdf format files
From: lot-ga on 21 Oct 2002 22:10 PDT
 
Hello web7-ga,

PDFs can be created in two ways.
1. Vector graphics in a program such as Adobe Illustrator, Corel Draw.
These files can be very small, and these are the ones you probably
receive.
2. Bitmap graphics created in a program like Adobe Photoshop or other
paint program / scan. These files are very big in comparison with
vector graphics. To reduce size the graphics can be compressed as tif
or jpg in the application you use to save the PDF file.

Generally speaking, the smallest file size for scans of text documents
(with simple diagrams) is .gif format. This reduces the size
considerably especially if you reduce redundant colors from your file.
For example if your scanned document only contained 10 colours it
would be pointless saving the full 256 colors that .gif is capable of,
it just bumps up the file size. By taking out the other 246 colors
that are 'not used', the file size is reduced without any noticeable
reduction in image quality.
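
The palette point can be sketched in a few lines of Python. The function name is made up for illustration; the only facts relied on are that a GIF colour table holds a power-of-two number of entries, up to 256, and that each pixel is an index into that table.

```python
def gif_bits_per_pixel(num_colors):
    """Smallest GIF colour-table depth (1 to 8 bits) that can hold num_colors."""
    if not 1 <= num_colors <= 256:
        raise ValueError("a GIF colour table holds between 1 and 256 colours")
    # (n - 1).bit_length() is ceil(log2(n)) for n >= 2; one colour still needs 1 bit.
    return max(1, (num_colors - 1).bit_length())


for colors in (2, 10, 256):
    bits = gif_bits_per_pixel(colors)
    print(f"{colors:>3} colours -> {bits} bits per pixel, table of {2 ** bits} entries")
```

So a 10-colour scan fits in a 16-entry table at 4 bits per pixel instead of 8. Because GIF also applies LZW compression to the pixel data, the final file size will not simply halve; treat this as a rough guide to why trimming unused colours helps.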

You can compare file sizes of your PDF file with a gif file and see.
For simple documents like text, and flat colours, gif files are
smaller, for photographic images jpg is smaller.
Sirius Web describes the formats
http://www.siriusweb.com/tutorials/gifvsjpg
Pacificsites too
http://www.pacificsites.com/~chrisk/graphics/b06mn40.htm
Web Photojournals
http://www.webphotojournals.com/gifvsjpeg.htm

As to scan resolution, as a rule of thumb you should scan in at 300
dpi for documents to be printed and 72 dpi for documents to be viewed
on screen only (to reduce file size).
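
To put rough numbers on that rule of thumb, here is a small Python sketch. It assumes a US Letter page (8.5 x 11 inches), which is not stated above, and the function names are made up for illustration.

```python
def pixel_dimensions(dpi, width_in=8.5, height_in=11.0):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def pixel_count(dpi, width_in=8.5, height_in=11.0):
    """Total number of pixels in the scanned page."""
    width_px, height_px = pixel_dimensions(dpi, width_in, height_in)
    return width_px * height_px


for dpi in (72, 300):
    width_px, height_px = pixel_dimensions(dpi)
    print(f"{dpi:>3} dpi: {width_px} x {height_px} = {width_px * height_px:,} pixels")

ratio = pixel_count(300) / pixel_count(72)
print(f"300 dpi holds about {ratio:.1f} times as many pixels as 72 dpi")
```

Pixel count grows with the square of the resolution, so the 300 dpi scan carries roughly 17 times as much raw data as the 72 dpi one before any compression is applied.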

Search Strategy:
gif vs jpg
http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&newwindow=1&c2coff=1&q=gif+vs+jpg

Kind regards
lot-ga
Subject: Re: Scanning to produce small .pdf format files
From: funkywizard-ga on 21 Oct 2002 22:26 PDT
 
I would have to say that among users who can access the internet, JPG
files are indeed the most universal. Since you have stated this is a
problem, I would think it would be possible to embed a .jpg file into
a pdf, thereby making the pdf file not much bigger than the .jpg would
have been. Your 1.4 meg files are probably a result of the program you
use to make pdfs not compressing the images. If your users can open
.zip files, you might try zipping the pdfs, as this would
significantly reduce filesize.
Subject: Re: Scanning to produce small .pdf format files
From: madmonk-ga on 22 Oct 2002 15:02 PDT
 
Is the problem with your recipients not having software to view these
attachments? If so (and they use PCs), send them (or get them to
download) Irfanview which will allow them to view all these - and many
more - file formats. Of course - there might be other reasons for them
not being able to view your files.
However, if your documents are text the best answer is to recreate
them somehow - maybe using OCR as a start point. Simple, pure text
pdfs can be very small - especially if they do not embed fonts. Even
containing small graphics they can be compact if prepared correctly (I
have found pngs to work well in pdfs).
Subject: Re: Scanning to produce small .pdf format files
From: gan-ga on 24 Oct 2002 12:50 PDT
 
Hi Web7-ga, One workaround I have used occasionally is to use some
form of optical character recognition (OCR) software on the scanned
document.

This only works for pages which consist purely of simply formatted
text (i.e. no complex tabulation of data etc), but can produce a very
small _text_ file from a large file-size image scan. You may then use
ghostview & postscript printer driver software (printer itself not
needed) to create a .pdf from the text document, which is considerably
cheaper than using the full version of Acrobat.

Obviously this method is limited to a very specific type of document,
but I hope it might be of some use.

Extra information:


"...Creating a pdf..."
http://www.eea-esem2002.it/esem/pdf.html



http://pdf995.com/

Quote:
 
"The pdf995 suite of products is a complete solution for your document
publishing needs, offering ease of use, flexibility in format, and
industry-standard security. And all at no cost to you.

Pdf995 is the fast, affordable way to create professional-quality
documents in the popular PDF file format. Its easy-to-use interface
allows you to create PDF files by simply selecting the "print" command
from any application, creating documents which can be viewed on any
computer with a PDF viewer."

Hope it helps :)
Subject: Re: Scanning to produce small .pdf format files
From: mplungjan-ga on 25 Oct 2002 05:36 PDT
 
Strange comments...
Scanned images in 300dpi b/w tiff are best stored as CCITT G4
compressed data inside the pdf. That gives a file size of around
10-100K per page, where the average for a normal page of text is
around 40K.

This is what I would expect the scanner to do.
My guess is that a setting somewhere saves it not at 300dpi but
rather at 1400 or so.
You did not tell us which of the software listed is the one you use to
save as pdf from.

Michel
Subject: Re: Scanning to produce small .pdf format files
From: krakilin2001-ga on 27 Oct 2002 03:56 PST
 
If you still want a smaller file, you can use
PrecisionScan's OCR (optical character recognition).  When you press
the scan button on your scanner select Microsoft Word under scan to. 
Make sure 'Select parts of page or View page first' is checked.  Once
the scan is finished select the text area you wish to capture and
click 'Accept'.  To go any further from here you will have to purchase
the full version of Acrobat.  If you have it then open up the scanned
Word document and go to File, Print.  As your printer select either
Acrobat Distiller or Adobe PDFWriter.  Either of these will save to a
PDF for you.  Good luck!
Subject: Re: Scanning to produce small .pdf format files
From: passthemustard-ga on 28 Oct 2002 23:44 PST
 
Actually I do this quite often myself and I can consistently get two,
8.5x11", B&W pages under 90K in .pdf format.  [leapinglizard] is right
in that you will get smaller pdf files if generated from electronic
text, such as from MS Word using Adobe PDFWriter, but you can still
get smaller pdfs than you are getting now by scanning.  The pdf format
(postscript) by its very file encoding structure is very
compressible.  That's why it's so popular in the first place as a
de facto document standard.  It's your scanning that is causing the
bloat.  When you scan, tell the HP PrecisionScan applet to scan at
Black&White/1-bit/300dpi.  The preview window will make it look
"dirty", but the finished product will be ok.  This is the preferred
combination for B&W text documents.  You most likely are scanning
greyscale or full-color mode, which unnecessarily adds size to your
file.  Try it.
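
For a sense of how much the scan mode matters, here is a quick Python calculation of the uncompressed size of one page in each mode. It assumes a Letter page at 300 dpi and rows padded to whole bytes; the names are made up for illustration, and real files will differ once the software applies compression.

```python
# Bits stored per pixel for each scan mode (uncompressed).
BITS_PER_PIXEL = {
    "black & white (1-bit)": 1,
    "greyscale (8-bit)": 8,
    "colour (24-bit)": 24,
}


def raw_scan_bytes(bits_per_pixel, dpi=300, width_in=8.5, height_in=11.0):
    """Uncompressed size in bytes; each row is padded to a whole byte."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


for mode, bpp in BITS_PER_PIXEL.items():
    print(f"{mode:<24} {raw_scan_bytes(bpp) / 1_000_000:6.2f} MB")
```

Even the 1-bit scan comes to about 1 MB before compression, while greyscale and colour are 8 and 24 times larger. Getting from 1 MB down to the tens of kilobytes mentioned in this thread therefore also depends on the software compressing the image inside the pdf, not just on choosing 1-bit mode.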
Subject: Re: Scanning to produce small .pdf format files
From: davidtalmage-ga on 25 Nov 2002 18:07 PST
 
You could use the OCR program that probably came with your scanner to
convert your scanned page to text to reduce its size.
Subject: Re: Scanning to produce small .pdf format files
From: helpers-ga on 02 May 2003 02:57 PDT
 
I guess this is a bit late - but you can use the free software "Save
to PDF" (http://www.savetopdf.com/). It adds a virtual printer to your
Windows system. A PDF-file will be printed when you "print" to this
printer. So in your case you'd use the process; 1.) scan, 2.) print
and 3.) voilà - the PDF! :-)
Subject: Re: Scanning to produce small .pdf format files
From: helpers-ga on 02 May 2003 02:58 PDT
 
Correction to my comment - a PDF file will be CREATED, not PRINTED :-)
