Q: Saving web pages on my computer -- just need some basic info (Answered, 5 out of 5 stars, 9 Comments)
Question  
Subject: Saving web pages on my computer -- just need some basic info
Category: Computers
Asked by: bbb-ga
List Price: $4.00
Posted: 05 Jul 2003 09:45 PDT
Expires: 04 Aug 2003 09:45 PDT
Question ID: 225370
I need help in deciding which format to save Web pages in. I'm
including the info I got from my Explorer Help file on this, which
gives me 4 options, which I've copied & numbered below. I understand
option 4, text only, but sometimes I want to keep a page in exact
graphic format, so I can print it or send it to other people. I'm not
sure about the other 3 formats. First, you'll see the help file info,
then my comments.

FROM THE HELP FILE:

1. To save all of the files needed to display this page, including
graphics, frames, and style sheets, click Web Page, complete. This
option saves each file in its original format.
 
2. To save all of the information needed to display this page in a
single MIME-encoded file, click Web Archive [single file*]. This
option saves a snapshot of the current Web page. This option is
available only if you have installed Outlook Express 5 or later.

*This is what my browser actually says, in the 2nd save option.

3. To save just the current HTML page, click Web Page, HTML only. This
option saves the information on the Web page, but it does not save the
graphics, sounds, or other files.

4. To save just the text from the current Web page, click Text Only.
This option saves the information on the Web page in straight text
format.

MY COMMENTS ON THE ABOVE OPTIONS:

I suppose that option 2 -- save in a single file -- is the best for my
purposes. I gather that it saves the entire page, as is. But still,
it's a bit puzzling. I'd expect it to turn the page into graphics,
which would be a fixed version. Yet in this saved version, I can still
try to enter data in the search and URL window, etc.-- so is this
really permanently saved or not? And do I have to worry that it may
not be retrievable at some point, if standard formats change? (The
info below says this is MIME format, which I don't know much about.)
Wouldn't it be better to save it as a kind of graphic file, so that
every pixel would be completely fixed? If so, how do you do that?

As for option 1, I've tried it and it keeps each graphic file separate
and puts them all in one folder. I assume you'd do this if you ever
want to utilize the graphics separately, or edit them, etc. -- but
otherwise, this is too complex... Which means that I can disregard it,
for my purposes.

As for option 3, it saves the shape of the page but without the
graphics. I can't imagine any use for this. (What IS it used for?)

Thanks for any help. I'm pricing this question low, since I think it's
pretty straightforward...I hope.
Answer  
Subject: Re: Saving web pages on my computer -- just need some basic info
Answered By: sublime1-ga on 05 Jul 2003 12:04 PDT
Rated: 5 out of 5 stars
 
bbb...

You ask:

"...in this saved version, I can still try to enter data
 in the search and URL window, etc.-- so is this really
 permanently saved or not?"

It is permanently saved, on your hard drive, in a single
file which is smaller than if you used option 1, which,
as you noted, saves all the relevant graphics in a
subfolder, and is unnecessarily complex, unless you 
wish to edit the page, as you also noted. The advantage
here is that it is a relatively small file, and it
preserves the full function of the webpage from which it
was derived. If you saved a copy of Google's search page,
for example, all the links would work, and you can enter
a search from it, as well. This is because this type of
file is associated with your browser, which opens it by
default - so when you click on the file, your internet
browser opens, and you can navigate from there as you would
from any open browser window. The file is also small enough
to email as an attachment, if, for some reason, you prefer
not to simply send the URL.

Option 1 is primarily used by website authors who want to
preserve copies of the various pages on their website in
an archive which can be used for editing the pages.
It could also be used to download an entire webpage, or even
an entire website, so that you could view it at your leisure
without being signed on to the internet through your ISP.


"And do I have to worry that it may not be retrievable at
 some point, if standard formats change?"

A file in this format will always be associated with and
opened by your browser. What may change is the content of
the actual webpage, in which case, some of the graphics
and links may not work in the future. Since these are not
stored on your hard drive, as in option 1, they may not
continue to work if the webpage author redesigns the page.
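
One way to see what a particular single-file archive actually carries is to list its parts rather than assume. The help text describes the Web Archive as a single MIME-encoded file, so a MIME parser can read it; the sketch below uses Python's standard email package. The sample archive, its example.com addresses, and the function names build_sample_archive and list_parts are made up for illustration, and a real .mht file saved by a browser may be laid out differently.

```python
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_sample_archive() -> bytes:
    """Build a tiny stand-in for a single-file web archive (illustrative only)."""
    archive = MIMEMultipart("related")

    # The HTML page itself, stored as one MIME part.
    page = MIMEText("<html><body><img src='logo.gif'></body></html>", "html")
    page["Content-Location"] = "http://example.com/index.html"
    archive.attach(page)

    # A placeholder image part; the bytes are dummy data, not a real picture.
    logo = MIMEImage(b"GIF89a" + b"\x00" * 10, "gif")
    logo["Content-Location"] = "http://example.com/logo.gif"
    archive.attach(logo)

    return archive.as_bytes()


def list_parts(raw: bytes):
    """Return (content type, original location) for every stored part."""
    message = message_from_bytes(raw)
    parts = []
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the container, keep only the stored resources
        parts.append((part.get_content_type(), part.get("Content-Location")))
    return parts


if __name__ == "__main__":
    for content_type, location in list_parts(build_sample_archive()):
        print(content_type, location)
```

To inspect a real file, reading it in binary mode and passing the bytes to list_parts should be enough. Any image that shows up as its own part is stored inside the archive file itself.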

 
"Wouldn't it be better to save it as a kind of graphic file,
 so that every pixel would be completely fixed? If so, how
 do you do that?"

It is possible to use certain graphics programs, such as
Adobe Photoshop, Paint Shop Pro, etc., to obtain what is
called a 'screen capture'. This will be an accurate image
of exactly what is showing on your computer screen. One
limitation of this is that, if the webpage is larger than
your screen, and you must scroll down to view it all, you 
would have to 'capture' each segment and blend them all
together to get an image of a page which is several screens
in length. Another consideration is that none of the links
would work. Additionally, the graphic image file would be
larger than the .mht file. The advantage of an image is 
that you will have an image of what the page used to look
like even if the webpage author redesigns it.
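
For a rough sense of why a screen capture tends to be large: an uncompressed image needs width x height x bytes per pixel. The sketch below works that out. The 1024x768 screen and 24-bit colour are assumed values, and saved formats such as GIF, JPEG or PNG compress the data, so real files are usually smaller than this upper figure.

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Size in bytes of an uncompressed image with the given dimensions."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    # Assumed example: one full screen at 1024x768 in 24-bit colour.
    size = raw_image_bytes(1024, 768)
    print(f"{size} bytes, about {size / 1024 / 1024:.2f} MB")
```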


"As for option 3, it saves the shape of the page but without the 
graphics. I can't imagine any use for this. (What IS it used for?)"

You might use this if you were designing a webpage, and wanted
to study the html (hypertext markup language) used by another
site, to determine if it might be something you wanted to 
incorporate into your own design. You wouldn't need to save
the images, since they would be irrelevant to your own site.
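
To see which pictures an HTML-only copy leaves behind, the saved file can be scanned for its image references. This is a minimal sketch using Python's standard html.parser module; the class name ImageLister, the function image_sources and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class ImageLister(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def image_sources(html_text: str):
    """Return the image addresses referenced by the given HTML text."""
    lister = ImageLister()
    lister.feed(html_text)
    lister.close()
    return lister.sources


if __name__ == "__main__":
    sample = '<p>Hi</p><img src="a.gif"><IMG SRC="b/c.jpg" alt="x">'
    print(image_sources(sample))
```

Each address in the list is a file that the HTML-only option did not save alongside the page.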


Please do not rate this answer until you are satisfied that
the answer cannot be improved upon by means of a dialog
established through the "Request for Clarification" process.

sublime1-ga

Request for Answer Clarification by bbb-ga on 05 Jul 2003 13:12 PDT
To sublime1-ga:

Thanks for a very, very thorough and clear answer. In fact, you make
it clear that if I want to record a certain web page permanently, I really
do have to take that "unnecessarily complex" option of storing all
parts of the page, in a folder. The reason: If I choose the simpler
"one-file" option, I can never be certain that the page will look the
same, per your warning here:

"A file in this [one-file] format will always be associated with and 
opened by your browser. What may change is the content of the actual
webpage, in which case, some of the graphics and links may not work in
the future. Since these are not stored on your hard drive, as in
option 1, they may not continue to work if the webpage author
redesigns the page."

Thus if I really want to store the page as it looks now, I've got to
use that complex option...

...Or use a graphics option, because as you note: 

"The advantage of an image [file] is that you will have an image of
what the page used to look like even if the webpage author redesigns
it."

Two final questions: 1. If I use the complex option, I gather I'll be
looking at the page with my browser, but my browser will have
available all the pieces needed to re-assemble it. So it's permanently
available (until browsers get modified)... right?

2. As for the image format, can I try to just save a webpage as a PDF
file, or convert one of those other formats to PDF after saving? That
is, what are the ways to TRY to save it as an image? (And I know that
some image files are not necessarily huge, depending on how much
resolution is needed; most of what I want to save is text material, so
I may not need much resolution... Maybe you can tell me how much to
use...? ) This may be too much to answer simply, but please do give me
a rough idea. Thanks!

bb

Clarification of Answer by sublime1-ga on 05 Jul 2003 14:00 PDT
bbb...

1. If I use the complex option, I gather I'll be
looking at the page with my browser, but my browser will have
available all the pieces needed to re-assemble it. So it's
permanently available (until browsers get modified)... right?

If you use the complex option, the pages you'll be viewing
will be in standard html format, and it's highly unlikely
that future browsers will not be able to view this format,
no matter what additional formats are created and added,
since the vast majority of current websites are in html.


2. As for the image format, can I try to just save a webpage as a PDF
file, or convert one of those other formats to PDF after saving? That
is, what are the ways to TRY to save it as an image? (And I know that
some image files are not necessarily huge, depending on how much
resolution is needed; most of what I want to save is text material, so
I may not need much resolution... Maybe you can tell me how much to
use...? ).

You can certainly save webpages in PDF format, at a relatively small
file size, up to about 70 KB. However, I know of no way to do this
short of purchasing Adobe Acrobat - the standard version is currently
selling for $299:
http://www.adobe.com/store/products/master.jhtml?id=catAcrobatStnd

The other option is to use some kind of graphics program, as I
mentioned earlier, like Adobe Photoshop or Paint Shop Pro. These
can also cost a pretty penny, though you may be able to locate
some freeware options that will do the job satisfactorily, as
from this Google search:

freeware "screen capture"
://www.google.com/search?q=freeware+%22screen+capture

As for the minimal resolutions which would allow for readable
screen captures of text, this is not my area of expertise, but
you could likely do so with a relatively small filesize as the
result. I would just try a few captures at the 'default' 
resolution, and, if the resulting filesizes are reasonable,
leave it at that. If they seem too large, then you can look
into the options which become available during the 'save'
process. Some will have options like "optimize for web viewing",
which essentially means "make the saved file small enough that
someone using a 33.6 modem will be able to see the image load
quickly in their browser if I upload it to my website". Then
see if that smaller file is easy enough to read to suit your
purposes.
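
To judge whether a file is "small enough" for a dial-up visitor, the load time can be estimated as file size in bits divided by the line speed. The sketch below does that arithmetic; the 33,600 bits-per-second rate and the 70 KB file are assumed example values, and the estimate ignores protocol overhead and modem compression, so treat it as a ballpark only.

```python
def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Idealised time to move a file over a link of the given speed."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    # Assumed example: a 70 KB file over a 33,600 bit/s modem.
    seconds = transfer_seconds(70 * 1024, 33_600)
    print(f"about {seconds:.0f} seconds")
```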

However, if you are primarily saving text from a particular
page, you can save hard drive space by simply selecting and
copying the text you want to save, and pasting it in a simple
text editor like Notepad, and saving it as a text file.
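
If there are many pages to handle, the copy-and-paste step can be approximated with a short script run on a saved HTML file. This is a minimal sketch using Python's standard html.parser module; the class name TextExtractor, the function page_text and the sample markup are made up for illustration, and it is not the same routine the browser's Text Only option uses.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keep the visible text of a page and drop the markup."""

    SKIPPED = {"script", "style"}  # their contents are code, not page text

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace and ignore empty fragments.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(" ".join(data.split()))


def page_text(html_text: str) -> str:
    """Return the text of the page, one fragment per line."""
    extractor = TextExtractor()
    extractor.feed(html_text)
    extractor.close()
    return "\n".join(extractor.chunks)


if __name__ == "__main__":
    sample = "<html><body><p>Hello   world</p><p>Bye</p></body></html>"
    print(page_text(sample))
```

The result can then be written to a .txt file and opened in Notepad like any other text file.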

sublime1-ga
bbb-ga rated this answer: 5 out of 5 stars and gave an additional tip of: $5.00
This was very thoroughly handled, initially and when further
clarification was requested. "Sublime1-ga" is not only well-informed,
but writes very clearly indeed. Also--"sublime1-ga" helped with a
separate matter involving another question--so thanks for that too!

Comments  
Subject: Re: Saving web pages on my computer -- just need some basic info
From: sublime1-ga on 05 Jul 2003 12:28 PDT
 
bbb...

In regards to your other recent question about the GA system:
http://answers.google.com/answers/main?cmd=threadview&id=225400

...it is currently locked by the GoogleAnswers bot because of the
use of the word 'Google' in the question. It will remain locked
until the GA editors have reviewed the content to see if it's 
something they prefer to answer themselves, since it is about
Google.

I can tell you here, however, that the option to 'close' the 
question is for customers who want to cancel or 'expire' their
question, prior to receiving an answer. What happened on the 
question you are referring to is that knowledge_seeker-ga
used the Clarification feature to have an ongoing diagnostic
dialog with you, but never posted a formal answer in the 
Answer box. To avoid this dilemma in the future, leave a 
message in your final Clarification saying "Yes, this worked
and solved my problem, please post something in the Answer
Box so I can pay you for your work and rate the question".

After the researcher posts something in the Answer Box, you
will see the options to rate the question and tip the 
researcher, if you so desire.

If this satisfies the question you posed in the link above,
you can 'close', 'cancel' and 'expire' that one (I'm not
positive if you can do so while the GA bot has it locked).

sublime1-ga
Subject: Re: Saving web pages on my computer -- just need some basic info
From: bbb-ga on 05 Jul 2003 13:17 PDT
 
to sublime1-ga:

Thanks! This is a big flaw in the Google system, then. My answerer
gave a clarification, or comment, that was in fact an answer. So I
closed, assuming Google knew what it was doing--i.e., it wouldn't let
me close the question without deciding whether to pay or not. I don't
see how they let this happen. (I'm going to figure out how to pay the
answerer in any case, but an answerer could easily get penalized for
making a helpful comment).
bb
Subject: Re: Saving web pages on my computer -- just need some basic info
From: sublime1-ga on 05 Jul 2003 14:10 PDT
 
bbb...

I agree that the use of the term 'close' the question would
seem potentially confusing, when it actually serves to 'cancel'
the question. Your feedback to the editors in this matter may
serve to motivate a change in this area. You can contact the
editors here:
mailto:answers-editors@google.com

As for the simplest way to pay the answerer of your previous
question, you could simply open another question with the
phrase "For knowledge_seeker-ga only" included in the subject
line. Then ask them to post a token remark of some kind in
the Answer box and you'll be able to rate them after that.
Subject: Re: Saving web pages on my computer -- just need some basic info
From: knowledge_seeker-ga on 05 Jul 2003 17:37 PDT
 
Hey bbb--

I found you here thanks to sublime! 

I see the Google Editors removed your second question. I sent them an
email and asked them to unlock it, but they deleted it instead. They
told me they were going to answer you directly.

To make things simple, to pay me, here's what you can do. Post another
$2 question with the subject line: FOR KNOWLEDGE SEEKER.

DO NOT use the word GOOGLE anywhere in your question and do not
include a link to the other question. Both of those will trigger the
bot.

When I see it I will post something in the Answer space and collect my
$1.50. Then, when you rate the question, you can add whatever tip you
deem appropriate. Make sure not to hit the CLOSE button AT ALL.

This is a rather awkward way to communicate, but it could have been
worse. I'm glad Sublime alerted me to this live question so I could
let you know how to proceed.

Thanks for your patience with all of this ...

(And thanks Sublime :-) ) 

-K~
Subject: Re: Saving web pages on my computer -- just need some basic info
From: damiam-ga on 05 Jul 2003 18:41 PDT
 
There are free ways to make PDFs and take good screen captures. For a
howto on creating PDFs without Acrobat, see
http://etd.jhu.edu/etdpublic/howto/pdfinstructions/. And a good free
program for screenshots, as well as an excellent image editing program
overall, is the GIMP - http://www2.arnes.si/~sopjsimo/gimp/. To take a
screenshot within the GIMP, just go to File -> Acquire -> Screen Shot.
Subject: Re: Saving web pages on my computer -- just need some basic info
From: sublime1-ga on 05 Jul 2003 20:32 PDT
 
bbb...

Thanks very much for the rating and the tip.

Damiam-ga's suggestion to use Ghostscript, available from
his first link, is an excellent idea for creating pdf files
at no cost for the software. I was aware of this option in
converting text files to pdf, but didn't realize you could
use it to convert webpages, as well. Thanks damiam!
Subject: Re: Saving web pages on my computer -- just need some basic info
From: bbb-ga on 05 Jul 2003 21:08 PDT
 
damiam--yes, thanks a lot!
bbb
Subject: Re: Saving web pages on my computer -- just need some basic info
From: owain-ga on 06 Jul 2003 05:58 PDT
 
For creating PDFs I use pdfFactory from Fineprint.com - it installs
as a printer driver in Windows, so may be easier to use than some
versions of Ghostscript. The demo (free) version puts a footer at the
bottom of each page of the pdf, but it's not that intrusive, and
acceptable if you're just using it for personal use.

Owain
Subject: Re: Saving web pages on my computer -- just need some basic info
From: bbb-ga on 06 Jul 2003 08:27 PDT
 
to owain--
Thanks! I'll try that, also.
bbb

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.