This is a question that has been bothering me for some time. I just
recently started as part of the fundraising staff at a small nonprofit
with a database of about 50k donors. The database software is called
Paradigm 4.1 by MIP Software, running on Sybase Adaptive Server
Anywhere.
My problem is this, we have had problems in the past entering data in
a organized fashion so that some data would be entered in one way and
some in another way. The case in point is the way addresses have been
entered.
There are 1000s of addresses that will be in the style of '125
Magazine Avenue' and 1000s in the style of '125 Magazine Ave.'
What I am trying to do is find and replace specific text within a
specific string. I have no problem loacating the candidates for
correction and have run some SQL scripts through ISQL to clean up
other parts of the database, but cannot figure out how to do a find
and replace within a string. Some addresses will be '125 Magazine
Ave.' and some '125 Magazine Ave., #3', so the street address is not
always the end of the string.
I'd actually like to be able to have a general find and replace script
for this database to change other things, like Apt. --> #, Ln. -->
Lane and eventually St. --> Street
The reason for this change is that our db software often allows two
very similar addresses to be entered which causes duplicates which
means we spend a lot of time and money fixing them. And it just looks
more professional for us to have the entire adddress.
So, the database is obviously a bunch of tables, the relevant table is
'address' and row is 'address1' (and rarely 'address2' has the street
address if business address is on 'address1')
I guess it's something like this (or maybe this is much more complex)
UPDATE Address
SET address1 = ???Avenue???
WHERE address1 LIKE '%Ave.%'
I really appreciate your help & apologize for not being able to offer
more $$$
Thanks |
Request for Question Clarification by
mathtalk-ga
on
06 Sep 2003 21:40 PDT
Hi, nate_marsh-ga:
I suppose that a great deal of practical advice might be given about
the task of cleaning up address data. However you've focused on a
fairly narrow Sybase SQL topic, i.e. how to do a "find & replace" on
string fields (across rows in a table).
The LIKE operator does not have enough power to accomplish what you
want. You need a function such as charindex or patindex which tells
exactly where a search string begins. The input string can then be
disassembled by the substring function.
However, after doing a fair number of experiments, I have come to
believe that a number of subtleties intrude upon this simple outline.
Let's consider the search for an abbreviation for Avenue, which you
would like to replace with Avenue as a full word. The abbreviation
may be followed by a period or a space, or neither if it falls at the
end of the string, for example.
Before attempting to post a SQL UPDATE statement that "addresses"
these variations, I thought it best to gauge your interest in such
fine points. If you are interested only in the basic tools of the
trade, I don't want to belabor you with minutiae.
It might also be a good idea to identify the version number of your
Sybase implementation, in case what I'm using for testing is
different.
regards, mathtalk-ga
|
Clarification of Question by
nate_marsh-ga
on
08 Sep 2003 06:26 PDT
mathtalk-ga,
Thanks for looking over my question.
First of all, the software I am using is: Sybase ISQL Version
6.0.4.399
I am aware that this is a pretty complicated process, as there is no
way to simply replace one part of a string with another segment.
Furthermore, I am pretty much stuck with the system as it is now.
Although it might make sense to put street name & number in one field
and apartment/unit/floor in the second field, I don't have that option
now. In fact, only a small percentage (about 1%) close with just Ave.
That said, if there is a way to specify how many extra characters
(i.e. '%Ave.____' for and address like "125 Magazine Ave., #4" or the
other way '_______________ Ave. ____' although that seems to be a bit
excessive, is it possible to count from the right side?) and then run
the the change. It might be more unwieldy, but it's easier than me
either doing it piece by piece or exporting it --> Excel and back an
imperfect and frustrating process.
I am very interesting in restructuring this database in a more useful
and coherent way, any advice is helpful. I do have a practice database
that I could sent to you if you wanted to see the exact set up of this
db.
Thanks!
|
Clarification of Question by
nate_marsh-ga
on
11 Sep 2003 10:56 PDT
I wish I could simply export the addresses and fix them up in emacs or
excel, etc. but it is a rather complicated process with paradigm, the
db frontend we have. I would need to reimport everything in the
address table (address1, address2,address3, email, phone #s etc.) and
then I would need to erase the previous address entries, otherwise
they'd just cause duplicates. Plus, the import utility is less then
perfect and I'm not confident it'll handle these well.
|