Im currently using Microsoft Excel 2002 and here is my problem:
I currently import a lot of data from a comma delimited source;
however, the information is being given to me in a manner which
doesnt allow me to begin manipulating the data until a considerable
amount of effort has been expended on my part first. To elaborate, I
get lists (...and lists) of customer names, which I have to sort into
two columns by last name, first name order. Unfortunately, I receive
this information the opposite way to which I need it (first name, last
name) condensed in just one column. In a long winded attempt to
rectify the problem, I currently copy the Customer Name column into
a notepad (*.txt) file, then re-import this as a space delimited file
directly into the original comma delimited (now Excel of course) file.
The problem I then have is...before I can actually begin manipulating
this data, I must compress the space delimited info, sometimes from up
to as many as 8 columns, into only 2. Last name and first name.
For example:
A customer name such as John Doe would import perfectly into the 2
column format Im looking for, but
John A. Doe would import into 3 columns, so I would manually have to
move the middle initial in with John to create John A. column.
Naturally things get more complicated, not to mention more time
consuming when you encounter names such as:
John A. & Mary B. La Penne Jr.
...because the re-imported information, which spans 8 Excel columns,
needs to be manually changed into the two column format:
La Penne Jr. John A. & Mary B.
When youre dealing with tens of thousands of records, it becomes a
tedious pursuit...so I hope Ive explained my conundrum clearly enough
for you to be able to offer some kind of solution.
Thank you for reading. |
Request for Question Clarification by
jeanluis-ga
on
18 Jun 2002 16:14 PDT
I have come up with a simple perl script that will do what you want
automatically. However it is very crude, and does not work in the case
that the name has a suffix (such as John A. & Mary B. Le Penne Jr.)
Basically what it does is take an input file with the name list and
reads it, then outputs the list of names with the last word on each
line first.
I would like to work on this more, and improve it so that it will be
more robust, and work in all cases. But I would like to know if this
is an acceptable solution to you. Please note that you can run perl
scripts on Windows machines for free by downloading perl here:
"Downloading the latest version of Perl"
http://www.perl.com/pub/a/language/info/software.html#win32
Please let me know if this is ok, if you give me the thumbs up I will
spend some more time on my script, make it work in all situations, and
post it here.
Thanks
--jld
|
Request for Question Clarification by
jeanluis-ga
on
18 Jun 2002 16:16 PDT
Also if my idea is ok, I would like to know the exact format of the
input data, and the output data.
For example something like this:
input:
john a doe,
output:
does john a,
With whatever the actual formats are.
Thanks
--jld
|
I took the liberty of finishing my script, here is what you need to do
in order to use it:
Install Perl for windows, and make sure perl.exe is in your path.
Then cut and paste the script below into a text editor (i.e. notepad),
and save it to a file (ex: list.pl)
To run the script here is what you want to do, lets say you have a
directory called C:\lists, and in this directory you have the perl
script, and the input list <firstname lastname>.
You simply have to run the following command:
C:\Lists> perl list.pl input.csv output.csv
Where input.csv is the name of the file that contains the <firstname
lastname> list, and output.csv is a file that will contain the
<lastname firstname> list.
I have broken the problem down into 2 cases as follows:
Simple case: persons name is either 3 words or less. ex: John A. Doe,
In this case the output is: Doe John A.,
Hard case: The hard case is when a name contains more than 3 words.
ex: John A. & Mary B. La Penne Jr.,
In the hard case the program will prompt you with the following:
-----
PROBLEM: John A. & Mary B. La Penne Jr.
Please enter number of words in the last name:
-----
Here you are to give the number of words in the last name of this
person, in this case 3, just type in 3<enter>, and it will produce
this output:
La Penne Jr. John A. & Mary B.,
All of the output is placed in the output file you gave on the command
line.
I assume the input is exactly as follows:
John Doe,
Fred Smith,
George W. Bush,
Jean Luis Desjardins The Third,
And the output is exactly as follows:
Doe John,
Smith Fred,
Bush George W.,
Desjardins The Third Jean Luis,
If the format of either the input or output is incorrect please let me
know, and I can fix it.
If you have any questions, or need any other help please just let me
know.
--jld
The script starts on the next line, cut and paste from the next line
on down:
#!/usr/bin/perl
open(fInFile, $ARGV[0]) || die "Cant open input file\nUseage: list.pl
<inflie.csv> <outfile.csv>\n"; # opens the file
@data = <fInFile>; # reads the file
close(fInFile); # closes the file
open(fOutFile, ">$ARGV[1]") || die "Cant open output file\nUseage:
list.pl <inflie.csv> <outfile.csv>\n";
foreach $line (@data) { # goes through each line in the file
@words = split(/ /,$line); # splits each line on a white space
$size = @words; # find how many words are on each
line
$words[$size-1] =~ s/,\s+$//; # remove comma and any trailing white
space (newline)
$end = 1;
if ($size > 3) {
printf "PROBLEM: @words\n";
printf "Please enter number of words in the last name: ";
$end = <stdin>;
}
for ($i = 0; $i < $end; $i++) {
print fOutFile "$words[$size - $end + $i]"; # print the last
name
if ($i != ($end - 1)) {
print fOutFile " ";
}
}
for ($i = 0; $i < $size - $end; $i++) { # print the rest of the
name
print fOutFile " $words[$i]"; # print the rest of the
name
}
print fOutFile ",\n"; # print a comma and a
newline
}
close($fOutFile); |
Clarification of Answer by
jeanluis-ga
on
18 Jun 2002 20:16 PDT
It appears the some of the lines have wrapped which should not have,
this will cause problems. Here I think I have fixed it by making some
lines shorter, please use this script:
#!/usr/bin/perl
open(fInFile, $ARGV[0]) || die "Use: list.pl <in.csv> <out.csv>\n";
@data = <fInFile>; # reads the file
close(fInFile); # closes the file
open(fOutFile, ">$ARGV[1]")||die "Use: list.pl <in.csv> <out.csv>\n";
foreach $line (@data) { # goes through each line in the file
@words = split(/ /,$line); # splits each line on a white space
$size = @words; # find how many words are on each line
$words[$size-1] =~ s/,\s+$//; # remove comma and any trailing WS
$end = 1;
if ($size > 3) {
printf "PROBLEM: @words\n";
printf "Please enter number of words in the last name: ";
$end = <stdin>;
}
for ($i = 0; $i < $end; $i++) {
print fOutFile "$words[$size - $end + $i]";# print last name
if ($i != ($end - 1)) {
print fOutFile " ";
}
}
for ($i = 0; $i < $size - $end; $i++) {# print rest of the name
print fOutFile " $words[$i]"; # print rest of the name
}
print fOutFile ",\n"; # print a comma and a newline
}
close($fOutFile);
|
Request for Answer Clarification by
savemore-ga
on
18 Jun 2002 20:35 PDT
Hi...
I tried to answer your response, but was locked out for some reason.
This is what I wrote therefore in response to your original reply:
I must confess Im totally naïve when it comes to scripts and even
clicking onto the hyperlink you provided, I am unsure what I am
supposed to do here. Is there something I am supposed to download?
"Scripts aside, I have to tell you that you truly have a good
understanding of what Im trying to achieve, and youre doing it in a
manner that I had never ever thought about. My understanding is that
youre effectively reversing the information in the Customer Name MS
Excel cell
is that correct? I further understand that the revised
information will be provided in just one cell and not split into
two
is that correct as well? If that is the case, Im confident
thisll work just as effectively.
Now turning to the subject of name suffixes. To be honest, there
arent that many Juniors, Seniors, or 2nd, 3rd, 4th generation
listings
so it isnt that critical to proverbially make the tail wag
the dog in order to make this absolutely, positively happen.
As for the order of the data, this is always in block type, for which
I use the PROPER command in order to correct the cells upper and lower
case properties. An example of the original format the data takes is:
JOHN A DOE
If possible, it would be really handy (if were working on all the
data being formatted into just one MS Excel cell) if the output could
look something like this:
Doe, John A.
Remember, I can use the PROPER command although it would be nice to
place a comma between the last name and first name. Also is its
possible to place a period after any middle initials?
Thank you so much for your reply by the way. I hope my
clarification makes my foggy question a little clearer. :-)"
|
Clarification of Answer by
jeanluis-ga
on
19 Jun 2002 03:30 PDT
Great, lets get perl on your machine 1st: to do this you must click on
the following link:
"ActivePerl Download"
http://www.activestate.com/Products/Download/Download.plex?id=ActivePerl
That is a direct link to the download page for Perl, you should go
there, and you should see a little table with "Windows | MSI | 8.5MB"
on the 1st row of the table. You should click on MSI, and run it. That
should install perl for you (Note: I can't test it, because I don't
have Windows, but hopefully it will work without a hitch)
Once you have that done, let me know, or if you have some problem let
me know, let me know what the problem is, and we will figure it out. I
will work on getting the script up to spec as you stated. When I am
done I will post the new script here, and you can try it out.
Good luck,
--jld
|
Clarification of Answer by
jeanluis-ga
on
19 Jun 2002 03:58 PDT
Ok here is the new and improved script, it takes a list of names list
the following:
FRED A SMITH,
BILL DOE,
and produce the following:
Smith, Fred A.
Doe, Bill
Keep in mind that if the input file is in lower case, or a mixture, it
will not affect the script, the output will be the same.
Anyhow here is the script, again let me know if you have any problems:
#!/usr/bin/perl
open(fInFile, $ARGV[0]) || die "Use: list.pl <in.csv> <out.csv>\n";
@data = <fInFile>; # reads the file
close(fInFile); # closes the file
open(fOutFile, ">$ARGV[1]")||die "Use: list.pl <in.csv> <out.csv>\n";
foreach $line (@data) { # goes through each line in the file
@words = split(/ /,$line); # splits each line on a white space
$size = @words; # find how many words are on each line
$words[$size-1] =~ s/,\s+$//; # remove comma and any trailing WS
$end = 1;
if ($size > 3) {
printf "PROBLEM: @words\n";
printf "Please enter number of words in the last name: ";
$end = <stdin>;
}
for ($i = 0; $i < $end; $i++) {
$words[$size - $end + $i] =~ tr/A-Z/a-z/; # format the case
$tmp = ucfirst($words[$size - $end + $i]);# correctly: DOE->Doe
if (length($tmp) == 1) {
$tmp .= ".";
}
print fOutFile "$tmp"; # print last name
if ($i != ($end - 1)) {
print fOutFile " "; # print space after name
}
else {
print fOutFile ","; # print comma after last name
}
}
for ($i = 0; $i < $size - $end; $i++) {# print rest of the name
$words[$i] =~ tr/A-Z/a-z/; # format the case
$tmp = ucfirst($words[$i]);# correctly: DOE->Doe
if (length($tmp) == 1) {
$tmp .= ".";
}
print fOutFile " $tmp"; # print rest of the name
}
print fOutFile "\n"; # print a comma and a newline
}
close($fOutFile);
|
Request for Answer Clarification by
savemore-ga
on
19 Jun 2002 08:27 PDT
O.K., PERL is loaded, and I have created a directory on my C drive
called Lists. Into that directory I have copied and pasted your
script [Thank you by the way :-)] called list.pl and two *.csv
files, one called input.csv, which contains the data I wish to work
with and an empty *.csv file called ouput.csv.
When I go to run the script by typing-in C:\Lists> perl list.pl
input.csv output.csv, my computer (Windows 2000 by the way) gives me
the following error message:
This file does not have a program associated with it for performing
this action. Create an association in the Folder Options control
panel
Now Ive checked Folder Options and see there are already programs
associated with both file extensions:
*.pl opens with PERL command Line Interpreter
*.csv opens with MS Excel
...so Im a little flummoxed now about what to do next. Any
suggestions? :-)
P.S. I have also tried calling the scripts file "perl list.pl" and
tried running the script without an output.csv file in the directory.
|
Clarification of Answer by
jeanluis-ga
on
19 Jun 2002 09:09 PDT
Man, times like this makes me wish I had Windows2k...
Here are some ideas but they are a shot in the dark:
First of all, you dont need to have an output.csv file in the
directory, the script will create it, but it won't hurt if you do, it
will just get overwritten.
Try just running perl like this:
C:\Lists\> Perl
If it runs it should give you a prompt like the following:
#
or maybe
@
or maybe nothing, just a cursor. If it is successful, it will not give
an error message, and you should be able to type in the following:
printf "Hello World\n"
<CTRL+D>
Upon pressing CTRL+D it should say "Hello World" and return you to the
dos prompt.
If that works, then try to goto the directory where you installed
perl, under that directory you should see a subdir called "eg", in
this eg directory there should be a perl script called example.pl, try
and run that script from the eg directory by typeing the following:
C:\...\eg\> perl example.pl
If it works it should print:
"Hello from ActivePerl!"
To read more about this example see the following webpage:
"Getting Started with Perl"
http://aspn.activestate.com/ASPN/Reference/Products/ActivePerl/readme.html
If all of this works, then we are in business, and the perl script I
wrote *should* ;) work.
If this doesn't work, then I am at a loss, things I would check are to
make sure the perl.exe program is in your PATH variable, which can be
set ControlPanel->System->Environment. OR Try the ever popular windows
3-Fingered salute, aka, reboot and try again.
If it still doesn't work then this is one of those problems that
someone local geek can probably fix pretty easily, I will also ask
other researchers to chime in, maybe they can figure it out remotely.
Let me know what happens... we are gettin there...
--jld
|
Clarification of Answer by
jeanluis-ga
on
19 Jun 2002 09:59 PDT
You might also want to try and just type the following command,
because if .pl is associated with perl, then maybe you just need to
type in list.pl:
C:\list\> list.pl input.csv output.csv
Just to see what it does.
|
Request for Answer Clarification by
savemore-ga
on
19 Jun 2002 11:06 PDT
Im getting the same error prompt Im afraid when trying the
\eg\example.pl script even though the three fingered salute has
been applied.
In Edit System Variables, the Variable Name: Path, has a Variable
Value: C:\Perl\bin\;C:\WINNT\system32;C:\WINNT;C:\WINNT\System32\Wbem;C:\Program
Files\Common Files\Adaptec Shared\System.
By the way, this probably means nothing...but when I double-click the
mouse onto an actual script file in Explorer, I get a lightning fast
screen (looks like a dos prompt screen) open and close before my eyes.
Does this signify anything? :-)
One final point I wanted to make you aware of is that the error prompt
is titled C:\List> or C:\Perl\eg\> depending on which script I try
to use.
|
Clarification of Answer by
jeanluis-ga
on
19 Jun 2002 11:25 PDT
Rats... I am afraid that I am out of ideas, I suggest to get a local
resident geek to take a look at it, (i.e. someone from the IS staff,
or some computer geek office mate) If they can get perl running on
your machine then you should be able to run the script and format your
data pretty quick...
I have also been informed by Xemion-ga that you may be able to use
Excel programming and Macros to accomplish this same task,
unfortunately I know nothing about that stuff... If this doesn't work
for you then you can request a refund from google, repost the
question, and let someone else have a crack at it...
Good luck, feel free to prod me for more info on this if you think you
can make some progress...
--jld
|
Clarification of Answer by
jeanluis-ga
on
19 Jun 2002 11:58 PDT
Something else has been brought to my attention (iain-ga) when I say
execute the following command:
C:\lists> perl lists.pl input.csv output.csv
I mean open a DOS prompt, and goto the c:\Lists directory (CD
C:\Lists)
then type the following command from that directory:
perl lists.pl input.csv output.csv
You don't type in "C:\Lists\>", that is the DOS prompt. I am not sure
if this is what you are doing, but it has been known to happen. If
this is the case let me know, and we can (hopefully) move on.
|