Subject: Off the shelf application to compare 2 text lists
Category: Computers
Asked by: vicsing-ga
List Price: $20.00
Posted: 30 Aug 2003 09:55 PDT
Expires: 29 Sep 2003 09:55 PDT
Question ID: 250512
I am involved in a data migration project at my company and have to combine/link client records from 5-6 different product systems into one master system. As you can imagine, there has historically been no uniformity in the "naming convention" of clients across the 5-6 product systems, i.e., a client could be spelt "ABC Limited", "ABC Ltd.", "ABC" or "The ABC" across the various systems. To overcome this we created a "clean" master list of client names and now want to compare the 5-6 "dirty" lists to this clean list.

We found some bulletin boards where users have solved similar problems using SQL functions such as "LIKE" or some PHP text functions. However, rather than reinvent the wheel, we were wondering whether there are any open-source or commercial applications out there that do this.

Here is what the dream application would do: The user would input the "clean list" and the "dirty list(s)". The application would cycle through each entry in each of the dirty lists, match it against the clean list and compute a score (i.e., the higher the score, the closer the match). Then, for all matches above a certain "threshold" score (say 90%), it would automatically rename/overwrite the dirty record with the clean one. For the others, it would present the user with a dialog box showing the closest matches from the clean list in descending order of score (i.e., "The closest matches are: ABC 82%, Aaa 75%, A^&%$ 65%"). The user would then just click one of these matches, and the application would overwrite the dirty record with the clean record that the user just chose. And so on and so forth.

The application should ideally also be customizable (i.e., the ability to fine-tune the algorithm that computes the score, or to choose the "threshold" score over which the record is automatically overwritten, etc.).

Anything out there?
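For readers who want to see what the score-and-threshold workflow described above amounts to, here is a minimal Python sketch. It is an illustration only, not any of the products mentioned in this thread. It assumes that similarity is measured with the standard library's `difflib.SequenceMatcher`, which is one of many possible scoring methods. The suffix list and the function names (`normalize`, `score`, `rank_matches`, `reconcile`) are hypothetical and chosen for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore; extend it for real data.
LEGAL_SUFFIXES = {"limited", "ltd", "inc", "corp", "co"}


def normalize(name):
    """Lowercase, drop punctuation, a leading 'the', and trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    if words and words[0] == "the":
        words = words[1:]
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def score(dirty, clean):
    """Similarity of the normalized names, from 0.0 (no match) to 1.0 (identical)."""
    a, b = normalize(dirty), normalize(clean)
    if not a or not b:
        return 0.0  # nothing left to compare after normalization
    return SequenceMatcher(None, a, b).ratio()


def rank_matches(dirty, clean_list, top_n=3):
    """Return the top_n clean names with their scores, best match first."""
    scored = [(clean, score(dirty, clean)) for clean in clean_list]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def reconcile(dirty_list, clean_list, threshold=0.90):
    """Split dirty names into automatic renames and cases needing manual review."""
    auto, review = {}, {}
    for dirty in dirty_list:
        ranked = rank_matches(dirty, clean_list)
        if ranked and ranked[0][1] >= threshold:
            auto[dirty] = ranked[0][0]   # overwrite with the clean name
        else:
            review[dirty] = ranked       # candidates to show the user
    return auto, review


if __name__ == "__main__":
    clean = ["ABC Limited", "Acme Corp"]
    dirty = ["ABC Ltd.", "The ABC", "XYZ"]
    auto, review = reconcile(dirty, clean)
    print("Renamed automatically:", auto)
    print("Needs review:", review)
```

The `threshold` argument plays the role of the 90% cut-off in the question, and the `review` dictionary holds the ranked candidates that a dialog box would present. Swapping in a different scoring function is the "fine-tune the algorithm" knob.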
There is no answer at this time.
Subject: Re: Off the shelf application to compare 2 text lists
From: answerforce-ga on 02 Sep 2003 09:15 PDT

I found this program on the internet: examdiff.exe (for Windows). It does some or most of the things you've requested, and it has saved me countless hours. You can find it at this site: http://www.prestosoft.com/ps.asp?page=edp_examdiff

I hope it serves you as it has served me.

Good luck,
Raymond
Subject: Re: Off the shelf application to compare 2 text lists
From: jgraves-ga on 03 Sep 2003 06:27 PDT

Look for MaxDup on http://www.anchorcomputersoftware.com/. Pricing is not listed on the website.

I wanted to become an official researcher before I posted this answer (to get the $$$) but they are not accepting new applicants. But I thought it was more important to answer you than play their political games. I'm not sure why someone needs to be 'official' to answer the question.

Jay
Subject: Re: Off the shelf application to compare 2 text lists
From: yosarian-ga on 04 Sep 2003 03:23 PDT

Hi vicsing-ga.

If I understand you correctly, you are looking for a record linkage program. Here's one such program: http://www.linkagewiz.com/

Here is a list of several others: http://datamining.anu.edu.au/projects/linkage.html#record_linkage_software

I have no experience with any of the above programs, but they look like a step in the right direction. (They may be overkill, as you mention only one field to be matched.)

My search words in Google were: "record linkage" software

Good luck,
yosarian-ga

P.S. jgraves-ga, as you said, nobody has to be 'official' to answer the question. However, as someone who also wishes someday to become an official GA researcher, here's my 2 cents: What Google Answers sells is not only the answers, but the answers with a certain reputation. Google Answers staff test their researchers for correctness, promptness and communication skills; later they check them according to user ratings. While anybody with good intentions may answer a question, I think non-researchers have less of a 'quality guarantee'. Is this a political game? I do not think so.
Subject: OT
From: jgraves-ga on 04 Sep 2003 09:21 PDT

Hi yosarian-ga;

Thanks for your comments. I'm going to look into your links (it's a business need for one of the companies I work with).

As far as the political portion of my answer goes, I guess I am questioning why Google Answers exists in such a limited capacity. Are the official researchers so omnipotent that they can answer any question? Apparently not, because all three of the suggestions have come from non-official users. Do any of the suggestions so far help the user? I don't know, because he hasn't replied, but what is the guarantee that the 'official researcher' comes up with an acceptable answer? There seem to be an awful lot of unanswered questions on the site (I've only looked through the computer section, though). It seems to me that opening it up a little more would help all sides. I see your point but I hope you see mine too.

Jay
Subject: Re: Off the shelf application to compare 2 text lists
From: yosarian-ga on 07 Sep 2003 09:26 PDT

Hi jgraves-ga,

Having read your response, I guess I agree with your reasoning: there is no guarantee the official researcher's answer is adequate. I have sampled an (unrepresentative) number of unanswered questions. Most of them have had comments or clarifications that made a complete answer redundant, but the case is really not open and shut.

I believe if you keep answering these questions for free, the GA establishment will eventually take you into account, and make you official :-) (that was the story told by pinkfreud-ga, one of my favourite researchers.)

Good luck,
yosarian-ga