Q: Software program that de-duplicates keyword lists (Answered, 1 Comment)
Question  
Subject: Software program that de-duplicates keyword lists
Category: Computers > Software
Asked by: ccblogster-ga
List Price: $50.00
Posted: 28 Mar 2006 08:45 PST
Expires: 27 Apr 2006 09:45 PDT
Question ID: 712782
I need to find a software program that can take two text
files, representing search term keyword lists, and output a
third text file that's an alphabetized list of the
difference between the two files.

For example, if my April keyword list (APR06-keywords.txt)
has 830 search term entries and my March search term keyword
list (MAR06-keywords.txt) has 730 entries, I need this
program to tell me which are the 100 keyword search terms
that are new (preferably in alphabetical order).

I know there must be some kind of inexpensive shareware or
quasi-shareware program that can do this; however, to date I
haven't found one that can simply take two text files and
output a third text file with the "difference"
between the files (the new keyword search phrases for that
month).

Request for Question Clarification by bobbie7-ga on 28 Mar 2006 11:59 PST
Hello Ccblogster,

Please take a look at the software programs below and let me know if
you find them suitable for your purpose.

Thanks, 
Bobbie7



Beyond Compare
http://www.scootersoftware.com/moreinfo.php

UltraCompare Professional 3.0
http://www.ultraedit.com/index.php?name=Content&pa=showpage&pid=34

Diff Doc
http://www.softinterface.com/MD/MD.htm

Request for Question Clarification by hummer-ga on 28 Mar 2006 13:21 PST
Hi ccblogster,

I don't think you'll find a program that will do everything that you
want; it will need to be a two- or three-step process. With the
following program (freeware), you can:

# Compare text or binary files
# Show differences
# Copy differences to the Clipboard 

Servant Salamander 2.5 
http://www.altap.cz/salam_en/features/file_comparator.html

After you have it in the Clipboard, it would then be a matter of
pasting it into a program such as Word or Excel to alphabetize your
keywords.

Let Word Alphabetize Lists for You
Do you occasionally need to alphabetize a list of names? You could
waste an entire hour on that single task, but Word can sort the list
instantly:
   1. Type a list of names, pressing the Enter key after each name.
Your list should look something like:
     Mary Koepke
     Bill Coan
     Paul Jones
   2. Select the entire list.
   3. On the Table menu, choose Sort.
   4. Click Options, then click Other, then press the spacebar and click OK.
   5. Choose "Sort by Word 2", then click OK.
Now your list looks like this:
     Bill Coan
     Paul Jones
     Mary Koepke
Repeat Steps 1-5, but choose "Sort by Word 1" and your list will look like this:
     Bill Coan
     Mary Koepke
     Paul Jones
http://pubs.logicalexpressions.com/pub0009/LPMArticle.asp?ID=31

Please let me know if you are happy with this solution.
hummer

Request for Question Clarification by hammer-ga on 29 Mar 2006 07:06 PST
ccblogster,

What should happen if your March list contains more (or different)
entries than the April list?

- Hammer

Request for Question Clarification by hammer-ga on 31 Mar 2006 08:23 PST
CCBlogster,

Are you willing to use command line tools? I can point to some that
can do what you want quite nicely, but there aren't any pretty buttons
to click.

- Hammer
Answer  
Subject: Re: Software program that de-duplicates keyword lists
Answered By: leapinglizard-ga on 04 Apr 2006 14:10 PDT
 
Dear ccblogster,


I have written a Python script that carries out the task you describe
with a simple graphical interface. Instructions for installing and
executing it under Windows follow. Please inform me of any bugs you
may find and give me a chance to fix them before you rate this answer.
Similarly, let me know if you run into difficulties with the Windows
instructions, or if you need help installing and running it under a
different operating system. (Please forgive the Windows assumption if
it is wrong; I am a Linux user myself.)


First, you will need to install Python, a popular open-source
scripting environment, on your computer. You will need 28 MB of free
space for the full installation. Left-click on the link below and save
the Python installer to disk, then double-click on its icon. Choose
"Install for all users" in the wizard, and click "Next" repeatedly to
accept all defaults. Finally, click "Finish".

python.org: Download: Windows Installer
http://www.python.org/ftp/python/2.4.3/python-2.4.3.msi


Now right-click on the link below and choose "Save Link As..." or
"Save Target As..." to download the script to your computer. You
should end up with an icon in the shape of a cartoon snake (meant to
suggest a python) with the name "fresh". The full name of the file is
actually "fresh.py", but of course the extension won't be visible in
icon form.

fresh.py
http://plg.uwaterloo.ca/~mlaszlo/answers/fresh.py


Double-click on fresh.py to launch the graphical interface. To quit,
press Esc or Q. To load your old and new keyphrase files, use the
leftmost buttons at top. My script in its current configuration
assumes that you have one keyphrase on each line, and that each one
should be rendered in lower case, with all extraneous spaces removed,
for purposes of comparison. If you don't like these rules, I can
reconfigure them to your own specifications.
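For readers curious what such comparison rules might look like in code, here is a minimal sketch. It is not the source of fresh.py, which is not reproduced in this answer. The function names are hypothetical, and reading "extraneous spaces" as leading, trailing, and repeated internal whitespace is an assumption.

```python
def normalize(phrase: str) -> str:
    """Lower-case a keyphrase and collapse whitespace.

    Assumption: "extraneous spaces" means leading/trailing whitespace
    plus runs of more than one space between words.
    """
    return " ".join(phrase.lower().split())


def load_keyphrases(lines):
    """Return the set of normalized keyphrases, one per input line.

    Blank lines are skipped; duplicates collapse because a set is used.
    """
    return {normalize(line) for line in lines if line.strip()}
```

Under these rules "Blue  Widgets " and "blue widgets" count as the same keyphrase. Passing an open text file to load_keyphrases works because iterating over a file yields its lines.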

Once the old file and new file have been loaded, the new keyphrases
should automatically appear in the rightmost pane. In this context,
"new" means that a keyphrase occurs in the new file but not in the old
file. The same keyphrases are displayed in red in the middle pane. In
the leftmost pane, keyphrases that occur in the old file but not in
the new file are displayed in blue.
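The two comparisons described above amount to set differences. The sketch below illustrates the idea only and is not taken from fresh.py; the function name and the sample keyphrases are made up for the example.

```python
def compare(old_phrases, new_phrases):
    """Return (added, dropped) for two collections of keyphrases.

    added:   phrases in the new list but not in the old one
    dropped: phrases in the old list but not in the new one
    """
    old_set, new_set = set(old_phrases), set(new_phrases)
    return new_set - old_set, old_set - new_set


# Hypothetical sample data standing in for the March and April lists.
march = ["blue widgets", "red widgets", "cheap gadgets"]
april = ["blue widgets", "green widgets", "cheap gadgets", "widget repair"]

added, dropped = compare(march, april)
```

Because both directions are computed, the comparison also covers the case where the old list holds entries that the new list lacks.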

Click on the rightmost button at top, or just type Enter or Return, to
sort all keyphrases in alphabetic order.
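If you would rather have the sorted result in a third text file, as the question asks, something along these lines should do. It is a sketch under assumptions rather than a feature of fresh.py: the function name is hypothetical, and plain sorted() ordering is assumed to be acceptable once the phrases are lower-cased.

```python
from pathlib import Path


def write_sorted(phrases, path):
    """Write keyphrases to `path`, one per line, in alphabetical order.

    Returns the sorted list so the caller can also display it.
    """
    ordered = sorted(phrases)
    Path(path).write_text("".join(p + "\n" for p in ordered), encoding="utf-8")
    return ordered
```

For example, write_sorted(added, "new-keywords.txt") would save the new keyphrases, where both the variable and the file name are placeholders.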


It has been a pleasure to address your question. I remind you that I
am at your service to fix any bugs in fresh.py before you rate my
answer.

Regards,

leapinglizard
Comments  
Subject: Re: Software program that de-duplicates keyword lists
From: myexpertsonline-ga on 30 Mar 2006 11:43 PST
 
I know people who can make this happen using Microsoft Excel, even
with your text files, assuming they have a certain syntax, such as:

word, word, word

or


word
word
word

You could likely get some free help with this if you post at
www.officearticles.com/forum

And the only reason I'd use Excel over Word is because there are many
more people who can easily code Excel than there are those that can
code in Word.
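Whichever tool is used, the two file layouts shown above can be read the same way. Here is a rough Python sketch, with a hypothetical function name, that assumes commas and line breaks are the only separators and that no keyword contains a comma itself.

```python
import re


def split_keywords(text):
    """Split text into keywords on commas or line breaks.

    Surrounding whitespace is trimmed and empty entries are dropped,
    so "word, word, word" and one-word-per-line files both work.
    """
    parts = re.split(r"[,\n]", text)
    return [p.strip() for p in parts if p.strip()]
```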
