Q: Internet Explorer History Merge (Repost) ( Answered 5 out of 5 stars,   4 Comments )
Question  
Subject: Internet Explorer History Merge (Repost)
Category: Computers > Internet
Asked by: nhdw-ga
List Price: $50.00
Posted: 27 Nov 2002 23:19 PST
Expires: 27 Dec 2002 23:19 PST
Question ID: 115849
First, before I start with the explanation, I just wanted to note that
this is a rather complicated and technical question and likely won't
be solved by one simple Google search. It may require intuition
involving combinations of searches, etc. If you're not absolutely
sure your method will work, please request clarification. Don't
immediately post a guess as an answer. I'm in no real rush to get the
answer, but would like it within 2 months from 11/27/02.

Explanation:

Operating systems involved: Windows 2000 Server

Applications involved: Internet Explorer 6

My Goal: To take previously backed up Internet Explorer history files
which were originally located in
"C:\Documents and Settings\Username\Local Settings\History\History.IE5"
and 'MERGE' them in with a current, up-and-running Internet Explorer 6
history. E.g.: once this project is complete, I want to be able to
click the History button and see not only the history of the current
Windows installation, but also the history that I had previously
backed up. The current history dates back 3 months, so I currently see
"Today", "Yesterday", "3 weeks ago", etc. up to 3 months ago. Assume
the backed up history comes from before the year 2000, which would be
more than the allowable 999 days of history in Internet Explorer, so I
need to combine all previous backed-up history into 1 history
"Folder", for example "4 months ago".

What I have backed up and how I backed it up: The files I have backed
up from the previous Windows installation are a hierarchy of
"index.dat"s underneath what appear to be timestamped directories under
the %USERPROFILE%\Local Settings\History\History.IE5\ path -- the
timestamped directories look similar to "MSHist012002111420021115"...
(%USERPROFILE% is an environment variable that expands to
"C:\Documents and Settings\<Currently-Logged-On-Username>")...
This is how the directory/file hierarchy appeared on the original
Win2k installation when I originally backed up the history files:

%USERPROFILE%\Local Settings\History 
  | 
  |-desktop.ini 
  |-History.IE5\ 
         | 
         |-desktop.ini 
         |-index.dat 
         |-MSHist012002111420021115\ 
         |         |-index.dat 
         | 
         |-MSHist012002111320021114\ 
         |         |-index.dat 
         | 
         |-(many more of these MSHIST directories with the only file 
            inside being index.dat) 

I copied them from this path to another drive using the following
command under a DOS box:
(from %USERPROFILE%\Local Settings\History):
xcopy *.* d:\savedir /s/e/v/h 
The directory and file structure under d:\savedir looked identical to
%USERPROFILE%\Local Settings\History after the copy was complete.

What I have tried (i.e. what doesn't work :-)): I have attempted
xcopying all of the MSHist################## directories and the
index.dat's underneath from the backup d:\savedir back into the
%USERPROFILE%\Local Settings\History\History.IE5\ directory. This does
not work. It might be that the "Main" index.dat in the
%USERPROFILE%\Local Settings\History\History.IE5\ directory is an
index of all the timestamped directories themselves. This is right
about where I get lost...


Acceptable answers: 

1) Methods of "reindexing" the MSHIST directories in the main
index.dat file (if this is how IE history functions) and Methods of
"combining" the index.dat's from inside the individual (older) MSHIST
directories, in order to get the previous history to appear in 1 IE
History "Folder". (if this type of combination of index.dat's is
possible?).

or

2) Any other method of somehow manipulating the history folders to do
what I've described above in the "My Goal" section... (I have several
computers at my disposal, so the method can involve more than 1 Win2k
system if necessary).

or

3) URL(s) of tool(s) that will do specifically what I've described
above in the "My Goal" section... (I've searched, and found none that
do specifically what I'm looking for here...only tools that
delete/clean history/cookies/etc)

or

4) Some kind of proof that this simply can't be done.


Thanks... Please request clarification with any questions...
-nhdw

Request for Question Clarification by theta-ga on 29 Nov 2002 10:16 PST
Hi nhdw-ga,
  I am sorry that your previous posting was not able to provide you
with a satisfactory answer. Let's hope we can solve it this time. But
first, a few clarifications :
- Is it necessary for you that all your backed up history files appear
in the Internet Explorer History Bar, or would you be satisfied with a
method of generating an html or text file which contains all your
history links ?
  This way, you would still have all your history links intact, and
wouldn't have to worry about IE removing them in the future. Plus,
this will give you performance and stability improvements with IE, as
IE has been known to become quite slow if its history becomes very
large.
   This is the simplest solution to your problem, and I recommend that
you go for it.

 The main problem with your original requirements is that the format
of the index.dat files is undocumented, and changes with nearly every
new version of IE. So, although people have been able to hack it
enough to read the URL's from it, there are no programs available that
will update/create an index.dat file.

 If it is ABSOLUTELY necessary for you to get the page links in IE's
History SideBar, I could take a crack at trying to update/create a
valid IE6 index.dat file. However, I make no promises of success.
Also, this could take some time.
 From your question, it appears that you backed up your history from
IE5. If so, I will need to know the version of index.dat that you have
backed up.
 If you open the index.dat file in any text editor (NotePad, WordPad
etc.), you can see the first line contains text like this :
      Client UrlCache MMF Ver 5.2
 Here 5.2 is the file version.
It would be very helpful if you could post the version number for both
the backed up index.dat and your current index.dat.
Thanks,
Theta-ga

Clarification of Question by nhdw-ga on 29 Nov 2002 15:31 PST
theta-ga,

> - Is it necessary for you that all your backed up history files
> appear in the Internet Explorer History Bar, or would you be
> satisfied with a method of generating an html or text file which
> contains all your history links ?

Yes, they must appear when history is clicked, inside internet
explorer... Unfortunately, an html/text file is not acceptable. (I
know there's tools out there to do that) ... However, if there's a way
to re-import the URLs once they're in html format (into one of the
history folders), (short of clicking every URL in the html file), then
exporting to HTML can be used as part of the process.

> The main problem with your original requirements is that the format
> of the index.dat files is undocumented, and changes with nearly
> every new version of IE. So, although people have been able to hack
> it enough to read the URL's from it, there are no programs available
> that will update/create an index.dat file.

That's definitely a problem ... Still, the question remains whether
there is any kind of jury-rigged process I could follow to get this
history imported/merged. There are almost always alternate methods of
doing these kinds of things... I just can't think of one, hence this
rather tough question :-)
 
>  If it is ABSOLUTELY necessary for you to get the page links in IE's
> History SideBar, I could take a crack at trying to update/create a
> valid IE6 index.dat file. However, I make no promises of success.
> Also, this could take some time.

It's absolutely necessary to get them into the sidebar/history folders
... :-(
You could take a crack at manipulating the index.dat's directly if you
are good at that kind of thing... (I assume you'd be coding something,
which you may even be able to put up as a shareware utility and make
money off of, if there's actually anyone else out there interested in
doing this kind of thing :-) )...
I think another approach would be SOMEHOW (again, I'm not sure how)
getting the URLs to go through the OS, so that Windows itself writes
the URLs ... It could even be into today's folder for all I care
(doesn't even have to be the 4 months ago folder as in the original
text above)...
The original post (https://answers.google.com/answers/main?cmd=threadview&id=107482)
mentioned copying of the URL's directly, except the researcher didn't
understand the complexity of this, and assumed I was using 98 -- there
are definitely differences between doing this in Win98 & Win2k...

>  From your question, it appears that you backed up your history from
> IE5. If so, I will need to know the version of index.dat that you
> have backed up.

The version of IE5 that came with Win2k-Server ... I'm guessing the
index.dat's are pretty much the same between IE5 and IE6, since IE6 is
still using the files from inside the History.IE5...

>  If you open the index.dat file in any text editor (NotePad, WordPad
> etc.), you can see the first line contains text like this :
>       Client UrlCache MMF Ver 5.2
>  Here 5.2 is the file version.
> It would be very helpful if you could post the version number for
> both the backed up index.dat and your current index.dat.

Both backed up and current are "Client UrlCache MMF Ver 5.2" ...

Thank you VERY much,
nhdw-ga

Clarification of Question by nhdw-ga on 29 Nov 2002 15:49 PST
Also, a couple other notes... In my searches, I've found that the
desktop.ini file inside %USERPROFILE%\Local Settings\History is one of
those files that controls how windows "looks at" the history
directory...

Check this out ...
1) Go to the %USERPROFILE%\Local Settings directory in Explorer
   (everything's Hidden+System, so make sure you're viewing hidden
   files).
2) Check out the History folder icon... It's a sundial, right?
   Go into the History directory... "Today", "Last Week",
   "1 month ago" folders, etc., right? Go back up 1 directory to the
   Local Settings folder again. Keep this window up.
3) In DOS, go to %USERPROFILE%\Local Settings\History ... Type the
   following:
   CD %USERPROFILE%\Local Settings\History
   ATTRIB -S -H DESKTOP.INI
   RENAME DESKTOP.INI DESKTOP.BAK
     (This just removes the Hidden+System attributes, and renames
      the file temporarily... we'll put everything back... Keep this
      DOS box up).
4) Back in Explorer, check out the History folder icon again...
   Now it's a normal folder. If you get back to the Explorer window
   fast enough, you can watch it change right before your eyes :-)
   Go into the History directory... Now the History.IE5 directory is
   there, with all the MSHIST folders underneath...
5) Back in the DOS window, type the following:
   RENAME DESKTOP.BAK DESKTOP.INI
   ATTRIB +S +H DESKTOP.INI
6) Back in Explorer, the History icon should be back to normal and
   everything is back to its nice, fluffy, Microsoft-undocumented
   state :-)

Request for Question Clarification by theta-ga on 30 Nov 2002 12:11 PST
Hi nhdw-ga,
    Ooooooooh! A challenge! Well, there goes the weekend! Goodbye
Sleep!
     :-D
   
    Below are some of the ideas (that I can think of right now) that I
will try out in order to get your History URL's transferred :
    - Try writing to the .dat files directly
    - Try to find built in IE functions that write to the History file
    - Try to programmatically get IE to visit the sites again, thereby
adding them to the history. Hopefully I can do this without popping up
a million IE windows or tying up your net connection for hours. (This
should be simpler than the above two options.)
    

>which you may even be able to put up as a shareware utility and make
>money off of, if there's actually anyone else out there interested in
>doing this kind of thing :-) 
 I wish!  :-)  


 Whichever of the three methods I use to finally import your history
into explorer, it would help if these URL's were in a plain text file,
with one URL per line. This way I can dispense with code to read the
.dat files.
 Here is how you can get the URL's into a text file :
  - Download the program Spider, which can read and display the
contents of IE's .dat files
           - Ward van Wanrooij's : Spider v1.16
             ( http://www.fsm.nl/ward/ )
  - Once you have downloaded and installed the program, start it up
  - Go to the Options menu. The options dialog will be displayed. Make
sure that all the options are unchecked. We don't want to delete any
URL's by mistake.
  - Also, you will probably want to set the scan option to "Windows
drive only", since this will make for a shorter scan. Since we want
the program to scan your backed up history, make a new subfolder under
the windows directory and copy your backup dats into it.
  - After making the changes, Click Accept in the options dialog.
  - Now start the search
  - After the search completes, Spider should display that it found
dat files in the subfolder you created and also show the URL's they
contain.
  - Save the results of this search in a text file using the 'Save
Scanresults' item in the File menu.
  - Now open this text file in any text editor, and copy just the
URL's that were contained in your backed up dats. Save these URLs in a
separate text file.
  - When you are done, you should have a plain text file, which just
contains one URL per line. This will be a good time for you to remove
any URLs that you do not want to include in the import.
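
The last two steps (copying just the URLs out of the saved results and trimming the list) could also be scripted. Below is a rough sketch in Python. The layout of Spider's saved results file is not described in this thread, so the sketch simply assumes the file is plain text and that every URL of interest starts with a scheme such as http:// and runs to the next whitespace. The function name extract_urls and the pattern are illustrative, not part of Spider.

```python
import re

# Assumption: URLs in the saved scan results start with one of these schemes
# and run until the next whitespace character.
URL_PATTERN = re.compile(r"(?:https?|ftp|file)://\S+", re.IGNORECASE)


def extract_urls(text):
    """Return the URLs found in text, keeping only the first occurrence of each."""
    seen = set()
    urls = []
    for line in text.splitlines():
        for match in URL_PATTERN.findall(line):
            if match not in seen:
                seen.add(match)
                urls.append(match)
    return urls


def save_one_per_line(urls, path):
    """Write the URLs to a plain text file, one URL per line."""
    with open(path, "w") as handle:
        for url in urls:
            handle.write(url + "\n")
```

Unwanted URLs would still have to be removed by hand (or by filtering the returned list) before saving.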

Tell me if the above process worked for you. Also, it would help if
you could tell me how many URLs I am looking at. (Hundreds? Thousands?)

Sending the files to you
========================
     Google Answers does not allow us to directly contact our
customers. So I can't just ask for your email and mail you the exe
files. So I have taken the following (elaborate?) steps :
     - I have created a temporary Yahoo! Briefcase account, which is
accessible at : http://briefcase.yahoo.com/theta_ga
     - Once I have finished my program, I will upload the files there,
and you can download them. Before you can download the files, I will
have to add you to the users list for the folder. So, please do the
following :
     - Go to Yahoo! (www.yahoo.com) and create a temporary mail id.
     - Once this is done, mail this id to me at theta_ga@yahoo.com.
Please do not post your id on Google Answers.
     - Once I receive this id, I will add it to the folder user list,
and you can download from this folder.

Please Note : The mail id given above is a temporary id. Please use
Google Answers for any comments/clarifications you need.

Well, off I go. I hope to have something for you (at least an update
;-) ) by tomorrow.
Thanks,
Theta-ga

Clarification of Question by nhdw-ga on 30 Nov 2002 13:54 PST
theta-ga,

- Downloaded Spider
- Dumped previous URLs to a text file... (There's about 42,000 lines
  of URLs... I'll also be repeating this process with history from
  even further in the past if we get this working)
- Sent you a message from my yahoo account...

I hope you find that IE function that lets you write to the dat's!
That would be excellent!!

Thanks very much ...
-- nhdw-ga

Clarification of Question by nhdw-ga on 30 Nov 2002 14:14 PST
By the way -- Don't kill your weekend! :-) There's no rush on this...

Thanks,
--nhdw-ga

Request for Question Clarification by theta-ga on 01 Dec 2002 07:59 PST
Hi nhdw-ga,
    - Got your mail. Added you to the access list. You should be able
to see the (currently empty) folder at :
http://briefcase.yahoo.com/theta_ga
      Hope to have it filled soon! :-D
    - Have had some success in importing History from one computer to
another. Will test it on a Win2K server tomorrow, as I only have a Win
Me machine(Have to reboot every time I want to get the index.dat
file!).
    - I was also looking at the Favourites idea ( as sparky4ca-ga has
also suggested in his comment ) since that would eliminate your
problem of reimporting History items every 999 days.
    - I have also found a commercial program (for IE 4-6) that should
help you import your backed up History items. However, since you have
indicated that you are not in any current hurry for the answer, I am
still working on my (hopefully better) solution. If I have not
succeeded by Tuesday, I will post the commercial program as the
answer.
Regards,
Theta-ga

Clarification of Question by nhdw-ga on 01 Dec 2002 08:25 PST
theta-ga,

Great! ... I can access the yahoo briefcase folder... 
The favorites idea would work out... sparky4ca-ga's idea of bringing
them into the sidebar would work also... I would still prefer them to
be in history if possible...
OK, on the commercial product ... 

Thanks again,
--nhdw-ga

Request for Question Clarification by theta-ga on 04 Dec 2002 08:39 PST
> - You will have the answer by tomorrow
   Famous last words ?  :-D

  Anyways, I have uploaded a preliminary version for you to test out,
while I continue my never ending quest for a Win2K Server to test it
out on. :-)
  The link : http://briefcase.yahoo.com/theta_ga
  So try it out and tell me if it works.
  Regards,
  Theta-ga

Clarification of Question by nhdw-ga on 04 Dec 2002 22:47 PST
It works ... I'm able to save them in IE format and dump the files it
creates into a favorites folder... This is perfect ... If you feel
compelled to finish/touch up the app you wrote, feel free -- But just
what you have now is beyond sufficient :-) ... Hmm... I guess we can
uuencode it as an answer? :-)

Many, MANY thanks theta... This is above and beyond my expectations...
Answer  
Subject: Re: Internet Explorer History Merge (Repost)
Answered By: theta-ga on 09 Dec 2002 06:56 PST
Rated:5 out of 5 stars
 
Hi nhdw-ga,
   Glad to know that the solution works. :-D
   Here is a description of how it works :
DISCLAIMER : The Internet Explorer History file format is undocumented
and the following information is largely a result of my own research.
I can make no guarantee as to its accuracy. It works for me though!
:-)
      

The IE History Index.dat files v5.2
===================================
    The IE5 History file format has not been documented anywhere, but
it is quite similar to the IE4 cache file format documented at  :
     - MyFileFormats : Internet Explorer Temporary Internet Files
Cache
       ( http://www.myfileformats.com/download_info.php?id=5670 )

    Basically, the file is divided into three main sections : 
    - The File Header : This section begins with the identifier string
"Client UrlCache MMF Ver 5.2" and contains information such as the
size of the file and the offset from which the HASH(see below) begins.
    - The HASH record : This is a 4KB record which precedes the URL
records in the file. This hash is used by Internet Explorer to speed up
its traversal/sorting of all the URL records contained in the file. A
file can contain more than one HASH record. This hash is generated by
IE using an undocumented algorithm, and hence cannot be easily (if at
all) duplicated.
    - The URL record : Each HASH record is followed by one or more URL
records. Each URL record has a size which is a multiple of 128 bytes.
The actual URL starts in the URL record at the 97th byte, and is
stored as a null terminated string. Each URL is preceded by a prefix
which is usually 19 bytes long.
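
As an illustration of the layout described above, here is a minimal Python sketch that pulls the raw URL strings out of an index.dat image. It is not the program discussed in this thread. It takes the 128-byte record granularity and the "97th byte" URL position from the description above, and it additionally assumes that records sit on 128-byte boundaries and that each URL record begins with the ASCII tag "URL ". Since the format is undocumented, the offset is a parameter rather than a fixed fact, and may well need adjusting for real files.

```python
SIGNATURE = b"Client UrlCache MMF Ver 5.2"
BLOCK_SIZE = 128   # URL records are a multiple of 128 bytes (per the answer)
URL_OFFSET = 96    # "97th byte" of the record, counted from 1 (per the answer)


def read_url_strings(data, url_offset=URL_OFFSET):
    """Return the raw URL strings (prefix included) found in an index.dat image."""
    if not data.startswith(SIGNATURE):
        raise ValueError("not a v5.2 index.dat file")
    urls = []
    # Assumption: records are aligned on 128-byte boundaries and each URL
    # record starts with the ASCII tag "URL ".
    for pos in range(0, len(data), BLOCK_SIZE):
        if data[pos:pos + 4] != b"URL ":
            continue
        start = pos + url_offset
        end = data.find(b"\x00", start)   # the URL is null terminated
        if end == -1:
            end = len(data)
        if end > start:
            urls.append(data[start:end].decode("latin-1"))
    return urls
```

The strings returned still carry the prefix mentioned above (cache prefix and any other prefix); removing it is a separate step. The sketch only reads the file, which sidesteps the HASH record entirely.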

Creating an index.dat
=====================
   There are two main problems we face when trying to create an
index.dat file by ourselves. Firstly, large parts of the file header
are undocumented and we have no idea what information they store.
Secondly, for every URL we add to the file, we must modify the HASH
record accordingly so that IE can access the URLs. But we cannot do
this, as we do not know the algorithm being used to generate the HASH.
If IE detects an inconsistent index.dat, it deletes the file and
replaces it with an empty index.dat containing only the file header.
   Since my attempts to create index.dat files ended in failure, I
decided to attack the problem in another way.

Internet Explorer - Registry settings
=====================================
   A list of registry accesses made by Internet Explorer showed an
interesting thing: IE was reading the path to the History files from
the registry! This led me to the following registry key, where IE
stores its list of History files :
  HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\Cache\Extensible Cache\
   This registry key contains a subkey, which usually has a name in
the format : MSHist01yyyymmddYYYYMMDD
   where yyyymmdd : The date from which the URLs are indexed (From:)
         YYYYMMDD : The date to which the URLs are indexed (To:)
You can see that this format is the same as the subfolder names in
your \History\History.IE5 folder. One such subkey is present for every
subfolder.
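
The naming scheme can be decoded mechanically. A small Python sketch, assuming the name is exactly "MSHist01" followed by the two eight-digit dates described above (the function name parse_mshist_name is made up for this example):

```python
import re
from datetime import date

# "MSHist01" followed by the From date and the To date, each as yyyymmdd.
NAME_PATTERN = re.compile(r"^MSHist01(\d{8})(\d{8})$", re.IGNORECASE)


def parse_mshist_name(name):
    """Split a name such as MSHist012002111420021115 into (from_date, to_date)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("not an MSHist01 name: %r" % name)

    def to_date(digits):
        return date(int(digits[:4]), int(digits[4:6]), int(digits[6:]))

    return to_date(match.group(1)), to_date(match.group(2))
```

For the folder name quoted in the question, parse_mshist_name("MSHist012002111420021115") gives 14 Nov 2002 as the From date and 15 Nov 2002 as the To date.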
Each subkey contains the following data :
  - CacheLimit : usually 8192 (file limit in KB or no. of URLs ?)
  - CacheOptions : usually 11 (unknown)
  - CachePath : path to the folder which contains the index.dat file
for this  duration. This path can point to any folder on disk which
contains a valid History index.dat file.
  - CachePrefix : This stores the prefix string which precedes the
actual URLs in the URL records (see description above).

So, in order to add a particular index.dat to the current IE History,
we have to create the relevant subkey for it and provide the time
interval for which it contains the History, and the path to the file.
If the 'To' date for the time interval falls out of the time limit
specified in Internet Explorer's Options, IE deletes the specified
file and the registry entry.
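
For readers who want to experiment, the subkey described above could be created along these lines. This is a hedged Python sketch, not the method used by the program in this thread. The value names and the usual values 8192 and 11 come from the list above; storing the two numbers as DWORDs and the two strings as plain REG_SZ strings is an assumption, as is the exact prefix text, which the caller has to supply. The root key is left as a parameter. The write step uses the standard winreg module and so runs on Windows only. Back up the registry before trying anything like this.

```python
# Path of the key named in the answer, relative to the registry root.
EXTENSIBLE_CACHE = (
    r"Software\Microsoft\Windows\CurrentVersion"
    r"\Internet Settings\5.0\Cache\Extensible Cache"
)


def build_subkey_values(cache_path, cache_prefix,
                        cache_limit=8192, cache_options=11):
    """Collect the four values listed in the answer for one MSHist01 subkey."""
    return {
        "CacheLimit": cache_limit,      # "usually 8192" per the answer
        "CacheOptions": cache_options,  # "usually 11" per the answer
        "CachePath": cache_path,        # folder that holds the index.dat
        "CachePrefix": cache_prefix,    # prefix text, supplied by the caller
    }


def write_subkey(root, subkey_name, values):
    """Create the subkey under root and store the values (Windows only)."""
    import winreg  # imported here so the rest of the sketch runs anywhere

    with winreg.CreateKey(root, EXTENSIBLE_CACHE + "\\" + subkey_name) as key:
        for name, value in values.items():
            # Assumption: numbers are DWORDs, everything else a plain string.
            kind = winreg.REG_DWORD if isinstance(value, int) else winreg.REG_SZ
            winreg.SetValueEx(key, name, 0, kind, value)
```

A call would look like write_subkey(winreg.HKEY_LOCAL_MACHINE, "MSHist012002111420021115", values), using the root named in the answer. Remember the expiry rule above: a subkey whose To date is older than the history limit is removed by IE together with its file.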

The two History Index.dat files
===============================
You will notice that your History.IE5 folder contains an index.dat
file. This file is different from those contained in the subfolders of
History.IE5. This main index.dat is not specified in the registry, and
is unique, i.e. there can only be one such index.dat. This also means
that the data contained in the main index.dat that you backed up,
cannot be imported by adding it to the registry. If you try to do
that, IE detects it as a corrupt file and deletes it. The only way
(that I was able to determine) to get the data out of this file is to
extract the URLs from it and save them as URL shortcuts.


Creating URL shortcuts
=======================
  The URL shortcuts that appear in the Favourites folder are simple
text files containing data in a fixed format. In order to create a URL
shortcut, follow these steps :
        - Create a new text file
        - Enter the following text in the file :
            [InternetShortcut]
            URL=http://www.someaddress.com/
        - Now save the file, and change its extension from .txt to
.url
Windows automatically treats files with  a .url extension as Internet
shortcuts and extracts the URL from it. You can find detailed info on
the URL file format at :
    - MyFileFormats : Internet Explorer URL files
      ( http://www.myfileformats.com/download_info.php?id=7140 )
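
The manual steps above amount to writing two lines of text into a file with a .url extension. A minimal Python sketch of the same thing (the function names are invented for this example, and only the two lines shown above are written):

```python
def shortcut_text(url):
    """Return the contents of a .url file pointing at url."""
    # Windows text files conventionally use CR+LF line endings.
    return "[InternetShortcut]\r\nURL=%s\r\n" % url


def write_shortcut(path, url):
    """Write an Internet shortcut to path, which should end in .url."""
    # newline="" stops Python from translating the line endings again.
    with open(path, "w", newline="") as handle:
        handle.write(shortcut_text(url))
```

For example, write_shortcut("someaddress.url", "http://www.someaddress.com/") produces the same file as the steps above.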

To create all the URL files, my program parses the index.dat files,
looking for URL records. From them, it extracts the URL string, and
strips away the cache prefix string and other prefixes from it. It
then writes this data into a URL file in the specified format. Naming
the URL file creates a slight problem, since URLs contain many
characters (such as / ? :) which are not allowed in a Windows
filename. I get around this by replacing all the invalid characters
with a dash (-) character.
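
The prefix stripping and file naming described above might look roughly like this in Python. This is a sketch of the idea, not the actual code of the program. It assumes that everything before the first scheme (the "http://" part) is prefix, and it replaces the full set of characters Windows forbids in filenames rather than just the three named above. The max_length cut-off is an addition of this sketch, to keep very long URLs from producing unusable filenames.

```python
import re

# Characters that Windows does not allow in a filename.
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')
# Assumption: the real URL begins at the first "scheme://" in the string.
SCHEME = re.compile(r"[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def strip_prefix(raw):
    """Drop whatever precedes the URL proper, e.g. 'Visited: user@'."""
    match = SCHEME.search(raw)
    return raw[match.start():] if match else raw


def shortcut_filename(url, max_length=100):
    """Turn a URL into a filename by replacing invalid characters with '-'."""
    name = INVALID_CHARS.sub("-", url)
    return name[:max_length] + ".url"
```

With these, a raw string such as "Visited: user@http://www.example.com/a?b=1" first becomes "http://www.example.com/a?b=1" and then the filename "http---www.example.com-a-b=1.url". Two different URLs can map to the same filename this way, so a real tool would need some way of handling collisions.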

===================================
Hope this helps.  :-)
If you need any clarifications, just ask!
I will be putting up a new version of my app for download in a few
days, so do check it out.

Regards
Theta-ga
nhdw-ga rated this answer:5 out of 5 stars and gave an additional tip of: $20.00
Absolutely the best answer I could ask for... Theta-ga goes so far, in
this answer, as to code an application which performs the task I
originally needed to accomplish... What more could anyone possibly ask
for? THANKS!!

Comments  
Subject: Re: Internet Explorer History Merge (Repost)
From: sparky4ca-ga on 30 Nov 2002 13:53 PST
 
Just a thought here...

Once these URLs are back into the history, will we not again run into
the 999 days limit? I mean, won't IE eventually decide that these new
URLs are too old?

Perhaps they would be better in the favourites sidebar, in folders
that match the hierarchy of the history sidebar.
Subject: Re: Internet Explorer History Merge (Repost)
From: nhdw-ga on 30 Nov 2002 14:16 PST
 
As of right now, they're in a text file, 1 URL per line (not
date/time stamped)... I think theta's thinking of a different method --
of putting them into a more recent folder, where they wouldn't expire.
Subject: Re: Internet Explorer History Merge (Repost)
From: nhdw-ga on 30 Nov 2002 14:17 PST
 
(At least not for another 999 days)
Subject: Re: Internet Explorer History Merge (Repost)
From: theta-ga on 02 Dec 2002 07:13 PST
 
Hi nhdw-ga,
  Here's An Update :
        - Imported years old history into IE6. Yay!
        - Tested the above on  Win ME, couldn't get the Win2K Server
today. Will try again tomorrow.
        - Will start work today on importing them into favourites.
This should be easy.
        - You will have the answer by tomorrow. :-)

Time to get back to work!
Regards,
Theta-ga
