Q: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"? ( No Answer,   14 Comments )
Question  
Subject: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
Category: Computers > Programming
Asked by: severisth-ga
List Price: $15.00
Posted: 05 Aug 2003 09:18 PDT
Expires: 04 Sep 2003 09:18 PDT
Question ID: 240289
Going to http://www22.verizon.com, you can do view source and see that
the code is all left aligned.  But if you do File > Save As, and look
at the code saved on the computer, it has a ton of extra spaces
pushing the html to the right.  What causes this?

I need to be able to prove that the load time of the website does not
include those spaces.  Ideally, I'd like references from a microsoft
or similarly reputable website (e.g. W3C).  Please let me know if any
questions come up, I'll reply quickly.

Clarification of Question by severisth-ga on 05 Aug 2003 13:18 PDT
I was able to use the information you provided for the review we did
today.  Thank you!

I need to send out the explanation of the "Save As" function today. 
When you finish compiling the links, please post them as the answer.

Thank you so much for your help!
Answer  
There is no answer at this time.

Comments  
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save A
From: slawek-ga on 05 Aug 2003 09:50 PDT
 
Good Day,

The short and simple answer is that when you Save As in IE, you are
saving a file that has already been processed by the browser and has
had file-specific code appended.  The same will happen with Netscape if
you Save As: the code is a little different from what you would see
if you FTPed the file down.

Saving a file results in file type attributes being added to the
information: Each file type has different encoding for spaces, new
lines, etc.  When you actually save the file, these file type
preferences get added into the saved file.

The simplest way to see that "spaces take no time to load" is to try
putting multiple spaces between a set of words.  Only one space
will show up every time.  When rendering HTML, the browser collapses
any run of spaces into a single space; it skips the rest until it sees
the next character.
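
As a rough illustration of the collapsing described above, here is a
minimal Python sketch.  It is a simplified model, not how any browser
is actually implemented: the function name collapse_whitespace is
hypothetical, and it ignores cases such as <pre> where whitespace is
preserved.

```python
import re

# Simplified model of whitespace collapsing in normal text flow:
# any run of spaces, tabs, and line breaks is displayed as one space.
# Assumption: no <pre> blocks or CSS that preserves whitespace.
_WS_RUN = re.compile(r"[ \t\r\n\f]+")


def collapse_whitespace(text: str) -> str:
    """Return text with every whitespace run replaced by a single space."""
    return _WS_RUN.sub(" ", text)


source = "Hello      world,\n\n   this   is    HTML"
displayed = collapse_whitespace(source)
print(displayed)                     # Hello world, this is HTML
print(len(source), len(displayed))   # the source is longer than what is shown
```

Note that the collapsed spaces are still bytes in the file; the sketch
only shows that they do not change what is displayed.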

I am out the door, and will be back in an hour.  It sounds like it
will be too late to help you further by the time I return, but I hope
this helped at least somewhat.  Please let me know if I can post an
official response.

I have no time to find references right now, but can add them into my
official response if the above answers your question.


Regards,
Researcher slawek-ga
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: severisth-ga on 05 Aug 2003 10:37 PDT
 
Post as an answer if you have links showing that IE saves the rendered
version of the page instead of the downloaded version, and explains
why the saved version looks different than the version seen if you do
"Tools > View Source".

Thanks for your help!
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save A
From: slawek-ga on 05 Aug 2003 11:09 PDT
 
Good Day severisth,


Since time is of the essence, I am posting just one link while looking
for others.  Please let me know how many you want. I will post the
complete answer after I am sure I have assembled all the info you
require.

Site: HTML Source Explorer Bar
URL: http://home.worldonline.dk/viksoe/htmlbar.htm
Excerpt: "The Internet Explorer MSHTML component will parse the
downloaded HTML and add its own tags, close unclosed tags and even
remove tags, which violate its parser logic.

A good example of how the parsed HTML source code can differ from the
original HTML is the TBODY tag. This HTML tag is automatically added
after any TABLE tag by the Internet Explorer HTML parser."

Search Strategy: Google search for
"internet explorer"+"html"+"different"+"source"+"view"


Regards,
slawek-ga
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save A
From: slawek-ga on 06 Aug 2003 12:28 PDT
 
Hi,

I am assembling documentation with references on the Save As feature
in IE.  I will have something online within an hour or two.  Thanks
for your patience.

Regards,
slawek-ga
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save A
From: slawek-ga on 06 Aug 2003 15:26 PDT
 
Good Day severisth-ga,

The Save As feature saves the file in the default format of the
application that was used to open the file in the first place.  Each
application has its own way of handling spaces, tabs, etc.  When you
view the source you are probably using the Notepad application to view it.
 When clicking "Save As" in IE, you are using the IE software which
defaults the file type to HTML and saves the file differently than
Notepad would.  When you load the file again in an application
different than IE, you see changes in the code because IE modified it.

A great example is saving a text file in Word, and trying to open it
with notepad.  Notepad will display a series of characters that look
like gibberish when used to load a file saved in the default format of
Word.  This is due to the fact that Word uses different codes than
notepad to identify special characters like new line breaks, tabs,
etc.  When we open the file again in Word, everything will look fine
because the coding in the file is native to Word. I hope this example
gives you a good sense of the idea behind "Save As" and file types.

In short, Save As converts the file type and stores the data in a
different file using different methods. I did not find specific
examples for IE and Notepad compared, but have some other links that
explain the inner workings of the process.

Again, I am posting this as a comment because I am not sure this is
what you are looking for and have no hard sources that deal with
your situation specifically. Also, different versions of IE might save
the file differently, producing different results.  What it comes down
to is making sure that the file type you "Save As" is the same as the
file type you opened the file from.  If the file type is different
through defaults or user preference changes, the final file will look
different from the original even though both were saved using the same
software.

If you provide me with:

-	the name and version of your web browser
-	the name of the application you use to view the source in (before
save as)
-	the name of the application you use to load the code after save as
-	operating system version (win98, winME, XP, etc)

I can probably help you choose settings that will keep the extra
spaces out of the file.  Since I am not sure if my explanation
helps, I am hoping a solution that gets rid of the spaces will do just
as well.

Anyway, here are some very basic resources on Save As, which should
give you enough background on the feature, and why a file might look
different after using it:

Site: The "Save As" Option
URL: http://tlt.its.psu.edu/suggestions/briefcase/saveas.html

Site: Save? Save As? Save as WHAT?
URL: http://pubs.logicalexpressions.com/Pub0009/LPMArticle.asp?ID=23

Once again, thanks for your patience.


Regards,
slawek-ga
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: severisth-ga on 07 Aug 2003 13:57 PDT
 
This is for IE 6.0.2600 running on Windows XP.
The code is looked at using notepad (View > Source in IE) before it is
saved.
The code is loaded up in notepad after it is saved.

I have a solution for eliminating the spaces... I just need to prove
to a VP that Save As is not a viable tool for measuring the download
size of the website.
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: severisth-ga on 07 Aug 2003 13:58 PDT
 
I just re-read your post:

"I can probably help you choosing settings that will keep the spaces
extra spaces out of the file."

This would work as the answer if it can be done through IE's Save As.
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: aceresearcher-ga on 07 Aug 2003 17:38 PDT
 
Greetings, severisth!

<< I need to be able to prove that the load time of the website does
not include those spaces.>>

Unfortunately, according to Andrew B. King in his book "Speed Up Your
Site: Web Site Optimization":

"Optimizing HTML is a matter of using the fewest number of bytes to
deliver a valid page that renders properly. There are a number of
techniques you can use to shrink your HTML. These include removing
whitespace, omitting optional closing tags and quotes, removing
redundant tags and attributes, cutting comments, and minimizing HTTP
requests." (Chapter 3, page 48)

"Step 3: Remove Whitespace
...Browsers don't care how pretty your markup is; they're just looking
between tags -- real or implied.
***Those extra spaces, tabs, and returns make your markup easier to
read but slower to display.***"
(Chapter 3, page 53)

"This whitespace is entirely unnecessary (with some exceptions for
JavaScript) for browsers rendering HTML. They see the HTML file as a
stream of bytes with tags interspersed around data. Indents and
spaces before or at the end of lines are simply wasted bandwidth and
are ignored by browsers. If necessary, you can re-beautify your markup
for editing by using sophisticated text editors like BBEdit and
Homesite or by using regular expressions or short shell scripts."
(Chapter 3, page 54)
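
A minimal Python sketch of the kind of whitespace removal the excerpt
describes, offered as an illustration rather than the book's own
method.  The function names and the sample markup are hypothetical,
and the approach is deliberately naive: it would also alter content
inside <pre> blocks or whitespace-sensitive inline scripts.

```python
def strip_indentation(html: str) -> str:
    """Strip leading/trailing whitespace from each line and drop blank lines."""
    lines = (line.strip() for line in html.splitlines())
    return "\n".join(line for line in lines if line)


def bytes_saved(html: str, encoding: str = "utf-8") -> int:
    """Number of bytes removed by strip_indentation for the given encoding."""
    before = len(html.encode(encoding))
    after = len(strip_indentation(html).encode(encoding))
    return before - after


# Hypothetical sample markup with indentation and trailing spaces.
sample = (
    "<table>\n"
    "        <tr>\n"
    "            <td>cell</td>   \n"
    "        </tr>\n"
    "\n"
    "</table>\n"
)
print(strip_indentation(sample))
print(bytes_saved(sample), "bytes saved")
```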

I have found this book to be an excellent reference and I recommend
it:
http://www.amazon.com/exec/obidos/tg/detail/-/0735713243

BBEdit:
http://www.barebones.com/products/bbedit/index.shtml

Homesite:
http://www.macromedia.com/software/homesite

I hope that you will find this information extremely helpful.

Regards,

aceresearcher
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: severisth-ga on 08 Aug 2003 09:00 PDT
 
Thank you Aceresearcher!

The issue is that while it is true, as you have shown, that any
browser reading HTML will skip over any spaces after the first, it
will still have to download them.  A file with 10,240 spaces and 2,048
other characters will still take 4 seconds to download at 3.0 KB/s
because it has to download 12 KB.  In my case, extra spacing has
already been removed, but when saved in IE, thousands of spaces are
inserted into the resulting file.
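
The size-over-rate arithmetic can be sketched in a few lines of
Python.  This is only a back-of-the-envelope model: it assumes
1 KB = 1024 bytes and one byte per character, and it ignores latency,
HTTP headers, and compression.  The function name is hypothetical.

```python
def download_seconds(size_bytes: int, rate_kb_per_s: float) -> float:
    """Transfer time for size_bytes at rate_kb_per_s (1 KB = 1024 bytes)."""
    return size_bytes / (rate_kb_per_s * 1024)


spaces, other_chars = 10_240, 2_048

# With the spaces: 12,288 bytes = 12 KB
print(download_seconds(spaces + other_chars, 3.0))   # 4.0

# Without the spaces: 2,048 bytes = 2 KB
print(download_seconds(other_chars, 3.0))
```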

The challenge is to prove that those spaces are inserted by IE, and
are not in fact served up from the server.  i.e. I need to be able to
prove that the load time of the website [in question, verizon.com,]
does not include those spaces [which are added in by the Save As
feature].
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: slawek-ga on 09 Aug 2003 13:56 PDT
 
Good Day severisth-ga,

I am still looking for resources that can be of help to you.

In the meantime, what about FTPing the file down and opening it,
rather than saving the source?  The FTPed file should be unchanged, as
it will not be parsed by the browser.  You could upload the file, save
the source, and then FTP down the same file and save it beside the
saved source.  Compare the file sizes and spacing, and maybe you will
have yourself some proof?
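
If FTP access is not handy, a similar comparison could be done by
fetching the page over HTTP without a browser in the loop.  The
following is a minimal Python sketch, not a definitive tool: the
function names, the example URL, and the file name saved_by_ie.htm
are all hypothetical placeholders, and it assumes the server returns
the same page to the script as it does to IE.

```python
import urllib.request


def fetch_raw(url: str) -> bytes:
    """Return the response body as received, with no HTML parsing."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def whitespace_report(data: bytes) -> dict:
    """Count total bytes and whitespace bytes (space, tab, CR, LF)."""
    ws = sum(data.count(c) for c in (b" ", b"\t", b"\r", b"\n"))
    return {"bytes": len(data), "whitespace": ws}


def compare(served: bytes, saved: bytes) -> dict:
    """Report how many extra bytes and whitespace bytes the saved copy has."""
    a, b = whitespace_report(served), whitespace_report(saved)
    return {
        "served": a,
        "saved": b,
        "extra_bytes": b["bytes"] - a["bytes"],
        "extra_whitespace": b["whitespace"] - a["whitespace"],
    }


def main(url: str, saved_path: str) -> None:
    """Fetch url, read the browser-saved file, and print the comparison."""
    served = fetch_raw(url)
    with open(saved_path, "rb") as f:
        saved = f.read()
    print(compare(served, saved))
```

Usage would be something like main("http://example.com/",
"saved_by_ie.htm").  If extra_whitespace accounts for most of
extra_bytes, that suggests the spaces were added when the file was
saved rather than sent by the server.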

Let me know if this helps, and in the meantime I will crawl the net
for more resources that might be of help. Hope you are staying sane
through all this!  Hang in there!

Regards,
slawek-ga
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save A
From: slawek-ga on 10 Aug 2003 12:36 PDT
 
severisth-ga:

What do you use to create the original HTML file?  Is it also notepad?

Should I just skip ahead and find you resources on how a file download
time is best calculated?  You are absolutely right that just looking
at file size after a Save As is not the way to do it.  A 100K file
of straight text will take less time to load in a browser than a
100K file that is full of tables or frames.  Download time is only a
portion of the wait time, and it is becoming a smaller portion every
day.  Downloading a Flash file can take less time than it takes
the computer to actually process the downloaded file and
present it to the user, depending on how well the code was written.
Fade effects are a good example: on a slower PC the text can take a
few seconds to appear at full intensity, while the download itself
might have taken only a second or two.

As suggested by another researcher, you could skip some tags to "save
download time", but you will probably lose twice the saved time during
the parse process.  I am not aware of any webmaster that leaves tags
open to "save on download time".  It usually takes longer to load
and causes more compatibility problems than it is worth.
The browser spends time cleaning up the code instead of just
downloading a slightly larger file and loading it with few
changes.

There are web sites out there that will actually allow you to enter a
web site URL, and it will fetch the web site and calculate how long
the site will take to load at what speed connection.  This takes into
account all links to images, code, etc.  A Save As of the HTML file is
no way to judge download speed under any circumstances.  The fact is
that smaller files take less time to download, but if information is
missing in the file, or the code is complex, the load time on two
files with exact same size will differ although they were downloaded
in the same amount of time.

Let me know if I am on the right track in helping you... 

Have a great weekend.
slawek-ga
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: severisth-ga on 11 Aug 2003 13:51 PDT
 
I am extremely well versed in HTML and filesize reduction techniques.

My problem is this:
I'm trying to *prove* to a corporate VP that he can't use "Save As" as
a safe judge of filesize.  The only way I can prove it to him is to
email links which explain what "Save As" does to the file.

This link was heading down the right path when it discussed the MSHTML
component:
http://home.worldonline.dk/viksoe/htmlbar.htm
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: cerealdud-ga on 12 Apr 2004 06:11 PDT
 
If this is still an open issue, why not use FrontPage or a similar editor
to download the URL/page? "Spaces" won't even be an issue then... Yes?
Subject: Re: NEED ASAP (within 1.5 hours): Why does IE change html when you do a "Save As"?
From: severisth-ga on 12 Apr 2004 09:25 PDT
 
cerealdud,

Thanks for the idea!  Luckily, the VP didn't pursue the space issue
when I presented my rebuttal.  The bottom line was that the VP saved
using IE, so I was concerned that when I told him "IE adds spaces to
the file", that I wouldn't have an answer to "Why does it do that? How
do you know?".

I was hoping to be able to point him to a link on Microsoft or other
reputable site, citing the extra spaces as a bug.

Guess it was a rather complicated search; I wish you could tip
researchers for the effort even when they don't have an official
answer they're comfortable with posting!
