I'm writing an application which will compare (n) similar text files.
This can be accomplished by comparing two at a time. If you know of a
more efficient way of comparing (n) files where n > 2, I'm all ears.
These files will be compared in order i.e.: file1 <--> file2 <-->
file3 <--> ...
I need a textual compare algorithm (or a choice of algorithms) which will
allow me to produce results similar to WinDiff (included with Visual
Studio), where I can detect additions, differences and deletions of
lines -- and for each different line, I need to show the word
differences. This must be displayed in a RichTextBox or equivalent -- to
visually show the differences (for (n) files).
Currently, I load these text files into RichTextBox controls, and use
the .Lines array property to first compare lines -- then for lines
which are similar, compare the line word by word. For lines and words
which are different, I use the .Select() method to select the text in
question, then I set the .SelectionColor property to color the
different text. This approach is too slow for larger files.
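
To make the line-then-word comparison concrete, here is a rough sketch
of the idea, done on plain strings before anything touches a control.
It is in Python only for brevity (my application is .NET), it uses the
standard difflib module rather than whatever WinDiff does internally,
and the names line_diff and word_diff are made up for illustration.

```python
import difflib


def word_diff(old_line, new_line):
    """Compare two similar lines word by word.

    Returns (tag, old_words, new_words) tuples, where tag is one of
    'equal', 'replace', 'insert' or 'delete'.
    """
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    return [(tag, old_words[i1:i2], new_words[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()]


def line_diff(old_lines, new_lines):
    """Classify runs of lines as equal, inserted, deleted or replaced.

    For a 'replace' run with the same number of lines on both sides,
    the lines are paired up and a word-level diff is attached.
    """
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        entry = {"tag": tag, "old": old_lines[i1:i2], "new": new_lines[j1:j2]}
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            entry["words"] = [word_diff(o, n)
                              for o, n in zip(entry["old"], entry["new"])]
        result.append(entry)
    return result


if __name__ == "__main__":
    file1 = ["alpha", "the quick brown fox", "gamma"]
    file2 = ["alpha", "the slow brown fox", "gamma", "delta"]
    for entry in line_diff(file1, file2):
        print(entry["tag"], entry["old"], entry["new"])
```

For (n) files in order, the same function could be applied to each
neighbouring pair, e.g. zip(files, files[1:]).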
In addition to finding a more efficient compare algorithm, I believe if
I can compare the text before I load the text into the control, and
embed RTF codes in the text (to denote differences) I may be able to
further optimize the compare. I'm a little lost when it comes to
embedding RTF codes in the text, so the answer would have to touch on
the subject of embedding RTF codes in the text before or after loading
the RichTextBox (or equivalent) control.
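
As far as I understand it, embedding RTF codes would look roughly like
the sketch below: a color table in the header, \cfN to switch to
color N, and \par to end each line. This is only my guess at a minimal
document, again in Python for brevity; build_rtf, rtf_escape and the
color choices are hypothetical, and the input format (a list of
(tag, text) segments per line) is an assumption.

```python
# Color table: index 0 is the default color, then red, green, blue.
COLOR_TABLE = r"{\colortbl;\red255\green0\blue0;\red0\green128\blue0;\red0\green0\blue255;}"
COLOR_INDEX = {"equal": 0, "delete": 1, "insert": 2, "replace": 3}


def rtf_escape(text):
    """Escape backslash and braces; emit non-ASCII as \\uN? escapes."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # RTF expects signed 16-bit values, one per UTF-16 code unit.
            data = ch.encode("utf-16-le")
            for k in range(0, len(data), 2):
                unit = int.from_bytes(data[k:k + 2], "little")
                out.append("\\u%d?" % (unit - 65536 if unit > 32767 else unit))
    return "".join(out)


def build_rtf(lines):
    """Build one RTF document from per-line lists of (tag, text) segments."""
    body = []
    for segments in lines:
        parts = [r"\cf%d %s" % (COLOR_INDEX[tag], rtf_escape(text))
                 for tag, text in segments]
        body.append("".join(parts) + r"\cf0\par")
    return r"{\rtf1\ansi" + COLOR_TABLE + "\n" + "\n".join(body) + "\n}"


if __name__ == "__main__":
    print(build_rtf([
        [("equal", "the "), ("replace", "slow"), ("equal", " brown fox")],
        [("insert", "delta")],
    ]))
```

The hope is that the finished string could then be assigned to the
control's .Rtf property in one step, instead of calling .Select() and
setting .SelectionColor once per difference.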