Hi Jellybeans
I have tried to make this script as user-friendly and as easy to read
as possible.
It reads in a file called "test.txt" and writes a file called
"test2.txt". To change these names, simply change them where they
appear in the script.
The number of columns is also easily changed: add or remove the
matching @xyz_column lines in each section of the script.
To change which column is used to sort look for this line: push
@unsorted, [$fifth_column[$i], $i]; and change the $xyz_column.
Here is the code you need:
#START
#!/usr/bin/perl -w
use strict;
use warnings;
# make it web friendly (so it can be run from the web); remove the following line if it does not need to be.
print "Content-type: text/html\n\n";
# set up variables
my (@info_from_file,$LINEBYLINE,$counter);
# read in the file to change called test.txt
open (LINEBYLINE,"test.txt") or die "Cannot open test.txt: $!";
while ( <LINEBYLINE> ) {
chomp;
push(@info_from_file,$_);
$counter++;
}; # end while - end reading the file line-by-line
# each line is now in a list
# split the data using the ;
my (@put_in_lists,@first_column,@second_column,@third_column,@fourth_column,@fifth_column);
for( my $i=0; $i < $counter; $i++) {
@put_in_lists = split(";",$info_from_file[$i]);
$first_column[$i] = $put_in_lists[0];
$second_column[$i] = $put_in_lists[1];
$third_column[$i] = $put_in_lists[2];
$fourth_column[$i] = $put_in_lists[3];
$fifth_column[$i] = $put_in_lists[4];
}
#sort the arrays so they are in the correct order
my @unsorted;
for( my $i=0; $i < $counter; $i++) {
# change the $xyz_column here to whichever column you need to sort by
push @unsorted, [$fifth_column[$i], $i];
}
my @sorted = sort { $a->[0] cmp $b->[0] } @unsorted;
my @sortorder = map { $_->[1] } @sorted;
@first_column = @first_column[@sortorder];
@second_column = @second_column[@sortorder];
@third_column = @third_column[@sortorder];
@fourth_column = @fourth_column[@sortorder];
@fifth_column = @fifth_column[@sortorder];
# save the sorted file as test2.txt
open(OUT, ">test2.txt") or die "Cannot write test2.txt: $!";
for( my $i=0; $i < $counter; $i++) {
print OUT $first_column[$i] . ';' . $second_column[$i] . ';' .
$third_column[$i] . ';' . $fourth_column[$i] . ';' .
$fifth_column[$i];
print OUT "\n";
}
close OUT;
# end program
exit(0);
#END
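The sorting trick used in the script (sort a list of [value, row number] pairs, then reorder every column with an array slice) can be sketched on its own as follows. This is only a minimal illustration: the helper name sort_order is made up for this example, and only two of the five columns from the sample data are shown.

```perl
use strict;
use warnings;

# Two parallel column arrays, as in the script (values from the sample data).
my @names  = ('John', 'Chris', 'Jane');
my @emails = ('john@hotmail.com', 'chris@domain.net', 'jane@doe.com');

# Hypothetical helper: return the row numbers in the order that
# sorts the given column alphabetically.
sub sort_order {
    my @keys   = @_;
    my @pairs  = map { [ $keys[$_], $_ ] } 0 .. $#keys;   # [value, row number]
    my @sorted = sort { $a->[0] cmp $b->[0] } @pairs;
    return map { $_->[1] } @sorted;
}

my @order = sort_order(@emails);   # sort on the email column
@names  = @names[@order];          # an array slice reorders a whole column
@emails = @emails[@order];

print join(';', $names[$_], $emails[$_]), "\n" for 0 .. $#names;
```

Passing a different column array to sort_order is the equivalent of changing the $xyz_column in the push @unsorted line.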
If you need any further clarifications or have any questions let me
know and I will post the answers as soon as possible.
Request for Answer Clarification by jellybeans-ga on 16 May 2004 17:06 PDT
hi there, and thanks for answering my question. Before I review it in
depth, can you let me know if this perl script will work in a Windows
operating system environment and if so are there any necessary changes
to the script? (I noticed that the usr/bin/perl thing mainly pertains
to UNIX based operating systems).
Let me know -- thanks!
Jellybeans
Request for Answer Clarification by jellybeans-ga on 16 May 2004 17:22 PDT
hey, it's me again.
I'm lucky enough to have access to a UNIX server, so I went ahead and
tested your script.
I created a test.txt file in my /cgi-bin folder with the following
(same as provided in my example) :
John;USA;3-1-2004;Blue;john@hotmail.com
Chris;USA;1-15-2004;Red;chris@domain.net
Jane;UK;2-2-2004;Pink;jane@doe.com
I then uploaded the script, set the relevant permissions, and ran the
script from a web browser. I got a blank page (aren't I supposed to
be given the sorted data? not sure). I then refreshed my /cgi-bin
folder to see that the script had in fact created a "test2.txt" . I
then viewed this text file and this was inside the file:
;;;;
Chris;USA;1-15-2004;Red;chris@domain.net
Jane;UK;2-2-2004;Pink;jane@doe.com
John;USA;3-1-2004;Blue;john@hotmail.com
Notice how there is a ;;;; near the top - did I do something wrong, or
forget to edit something in the script?
Please get back to me on my previous Windows question - and also,
whether or not this script would run if run via a .pl extension as
opposed to .cgi
Thanks
Clarification of Answer by palitoy-ga on 17 May 2004 02:37 PDT
Hi Jellybeans
With regards to your clarifications:
1) Yes, this script will work on a Windows system as long as you have
Perl installed (I would recommend ActivePerl). The /usr/bin/perl path
is simply ignored on Windows machines. I actually developed this on a
Windows machine so I know it works!
2) If you want the data to be output to the screen as well, that is
simple: just add this line after the (( print OUT "\n"; )) line:
print $first_column[$i] . ';' . $second_column[$i] . ';' .
$third_column[$i] . ';' . $fourth_column[$i] . ';' . $fifth_column[$i]
. '<br />';
It will still create the test2.txt file but this can be stopped if it
is not necessary.
The ;;;; I suspect comes from blank lines in the test.txt file. If these
are removed the ;;;; should disappear. I have added another check to
the script below so that blank lines are skipped in future. This script
is at the bottom of this clarification (it also outputs to the screen
as you requested).
Finally the script should work as either a .pl or .cgi, this really
depends on what your system prefers. Personally I use a .pl but that
is just my choice!
I think I have answered your queries in the clarifications, if you
have any further questions please ask away and I will get back to you
as soon as possible (I am in the UK).
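To show where the ;;;; comes from, here is a small sketch (separate from the script below, and not the check the script itself uses). It assumes the same five-column layout; the helper name non_blank_lines is invented for the illustration. A blank line splits into no fields at all, so all five columns end up empty, and an alternative to testing the columns later is to drop blank lines as soon as they are read.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines that contain something
# other than whitespace.
sub non_blank_lines {
    return grep { /\S/ } @_;
}

# A blank line splits into an empty list, so every column is undefined;
# writing five empty values separated by ';' gives exactly ";;;;".
my @fields  = split /;/, '';
my @columns = map { defined $fields[$_] ? $fields[$_] : '' } 0 .. 4;
my $joined  = join ';', @columns;

my @raw = (
    'John;USA;3-1-2004;Blue;john@hotmail.com',
    '',                                          # stray blank line
    'Chris;USA;1-15-2004;Red;chris@domain.net',
);
my @kept = non_blank_lines(@raw);

print "blank line becomes: $joined\n";
print scalar(@kept), " of ", scalar(@raw), " lines kept\n";
```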
#START REVISED SCRIPT
#!/usr/bin/perl -w
use strict;
use warnings;
# make it web friendly (so it can be run from the web); remove the following line if it does not need to be.
print "Content-type: text/html\n\n";
# set up variables
my (@info_from_file,$LINEBYLINE,$counter);
# read in the file to change called test.txt
open (LINEBYLINE,"test.txt") or die "Cannot open test.txt: $!";
while ( <LINEBYLINE> ) {
chomp;
push(@info_from_file,$_);
$counter++;
}; # end while - end reading the file line-by-line
# each line is now in a list
# split the data using the ;
my (@put_in_lists,@first_column,@second_column,@third_column,@fourth_column,@fifth_column);
for( my $i=0; $i < $counter; $i++) {
@put_in_lists = split(";",$info_from_file[$i]);
$first_column[$i] = $put_in_lists[0];
$second_column[$i] = $put_in_lists[1];
$third_column[$i] = $put_in_lists[2];
$fourth_column[$i] = $put_in_lists[3];
$fifth_column[$i] = $put_in_lists[4];
}
#sort the arrays so they are in the correct order
my @unsorted;
for( my $i=0; $i < $counter; $i++) {
# change the $xyz_column here to whichever column you need to sort by
push @unsorted, [$fifth_column[$i], $i] if (
defined($fifth_column[$i]) && length($fifth_column[$i]) > 0 );
}
my @sorted = sort { $a->[0] cmp $b->[0] } @unsorted;
my @sortorder = map { $_->[1] } @sorted;
@first_column = @first_column[@sortorder];
@second_column = @second_column[@sortorder];
@third_column = @third_column[@sortorder];
@fourth_column = @fourth_column[@sortorder];
@fifth_column = @fifth_column[@sortorder];
# save the sorted file as test2.txt
open(OUT, ">test2.txt") or die "Cannot write test2.txt: $!";
for( my $i=0; $i < $counter; $i++) {
if (defined($first_column[$i]) && length($first_column[$i]) > 0) {
print OUT $first_column[$i] . ';' . $second_column[$i] . ';' .
$third_column[$i] . ';' . $fourth_column[$i] . ';' .
$fifth_column[$i];
print OUT "\n";
print $first_column[$i] . ';' . $second_column[$i] . ';' .
$third_column[$i] . ';' . $fourth_column[$i] . ';' . $fifth_column[$i]
. '<br />';
}
}
close OUT;
# end program
exit(0);
#END
Request for Answer Clarification by jellybeans-ga on 17 May 2004 12:40 PDT
hi again,
when you say that "To change which column is used to sort look for
this line: push @unsorted, [$fifth_column[$i], $i]; and change the
$xyz_column."
Isn't it the 3rd column that is being sorted in the example though?
The fifth column is the last one?
e.g.
Chris;USA;1-15-2004;Red;chris@domain.net <<--- date occurs in 3rd column?
Request for Answer Clarification by jellybeans-ga on 17 May 2004 14:22 PDT
hi, me again.
Can you please check if your script works. I have a feeling you are
sorting by the fifth column in my example, which is the email address?
Here is my original test data set:
John;USA;3-1-2004;Blue;john@hotmail.com
Chris;USA;1-15-2004;Red;chris@domain.net
Jane;UK;2-2-2004;Pink;jane@doe.com
The third field here is what needs to be sorted, and the script needs
to be able to sort by date. The fifth field is the email address.
After sorting, the resulting test2.txt file should look like:
Chris;USA;1-15-2004;Red;chris@domain.net
Jane;UK;2-2-2004;Pink;jane@doe.com
John;USA;3-1-2004;Blue;john@hotmail.com
But here's the thing: it just so happens that this is sorted by date
AND also sorted by email address (pure luck).
If we change the base data set to :
John;USA;3-1-2002;Blue;john@hotmail.com
Chris;USA;1-15-2001;Red;chris@domain.net
Jane;UK;2-2-2004;Pink;jane@doe.com
Your script incorrectly returns the following sorted text:
Chris;USA;1-15-2001;Red;chris@domain.net
Jane;UK;2-2-2004;Pink;jane@doe.com
John;USA;3-1-2002;Blue;john@hotmail.com
Note that this is not sorted by the date (third column), it's sorted
by the fifth field (email address).
I need the script to sort by a specified column in the case that data
in that column is a date.
Please advise,
Thanks
jellybeans
Request for Answer Clarification by jellybeans-ga on 17 May 2004 15:55 PDT
hey - another thing...
Since we're dealing with dates in the format 3-1-2004 and 10-12-2001
type (first digit is month, second is day, third is year) then it
needs to know how to sort those.
I think you missed this in your first attempt. The formatting of the
date is important, in terms of the script recognizing it and sorting
accordingly.
thanks,
Clarification of Answer by palitoy-ga on 18 May 2004 01:17 PDT
You are right, the script is currently being sorted by the fifth
column but this is easily changed by altering this line:
push @unsorted, [$fifth_column[$i], $i] if (
defined($fifth_column[$i]) && length($fifth_column[$i]) > 0 );
to:
push @unsorted, [$fifth_column[$i], $i] if (
defined($third_column[$i]) && length($third_column[$i]) > 0 );
I initially thought that you wanted something that could be
sorted on any term in any column, which is what the script does at the
moment. In my last revision I must have been checking that the sorting
function was working correctly.
I will post the fix for the date later today. It is a relatively simple fix.
Clarification of Answer by palitoy-ga on 18 May 2004 01:45 PDT
Here is the revised script with the correct date checking AND sorting
on column 3. If you need to alter the column that the date is in then
please read the comments in the script.
You CAN alter the script to check any column by simply changing a couple of lines.
If you are still unsure of how to do this I can comment the whole
script more (I am unsure how much you know of Perl/programming as to
how much I need to do this).
If you have any further questions or queries, please ask again.
#START
#!/usr/bin/perl -w
use strict;
use warnings;
# make it web friendly (so it can be run from the web); remove the following line if it does not need to be.
print "Content-type: text/html\n\n";
# set up variables
my (@info_from_file,$LINEBYLINE,$counter);
# read in the file to change called test.txt
open (LINEBYLINE,"test.txt") or die "Cannot open test.txt: $!";
while ( <LINEBYLINE> ) {
chomp;
push(@info_from_file,$_);
$counter++;
}; # end while - end reading the file line-by-line
# each line is now in a list
# split the data using the ;
my (@put_in_lists,@first_column,@second_column,@third_column,@fourth_column,@fifth_column);
for( my $i=0; $i < $counter; $i++) {
@put_in_lists = split(";",$info_from_file[$i]);
$first_column[$i] = $put_in_lists[0];
$second_column[$i] = $put_in_lists[1];
$third_column[$i] = $put_in_lists[2];
$fourth_column[$i] = $put_in_lists[3];
$fifth_column[$i] = $put_in_lists[4];
# change here the $xyz_column to the column that the date is in
# this puts the date in a form that is easily sorted (yearmonthday) with leading zeros
my @split_date = split("-",$third_column[$i]);
$split_date[0] = '0' . $split_date[0] if $split_date[0] < 10;
$split_date[1] = '0' . $split_date[1] if $split_date[1] < 10;
$third_column[$i] = $split_date[2] . $split_date[0] . $split_date[1];
}
#sort the arrays so they are in the correct order
my @unsorted;
for( my $i=0; $i < $counter; $i++) {
# change the $xyz_column here to whichever column you need to sort by
push @unsorted, [$third_column[$i], $i] if (
defined($third_column[$i]) && length($third_column[$i]) > 0 );
}
my @sorted = sort { $a->[0] cmp $b->[0] } @unsorted;
my @sortorder = map { $_->[1] } @sorted;
@first_column = @first_column[@sortorder];
@second_column = @second_column[@sortorder];
@third_column = @third_column[@sortorder];
@fourth_column = @fourth_column[@sortorder];
@fifth_column = @fifth_column[@sortorder];
# save the sorted file as test2.txt
open(OUT, ">test2.txt") or die "Cannot write test2.txt: $!";
my ($theday,$themonth,$theyear);
for( my $i=0; $i < $counter; $i++) {
if (defined($first_column[$i]) && length($first_column[$i]) > 0) {
# replace the $xyz_column with the column the date is in for the next 3 lines
$theday = substr($third_column[$i],6,2);
$themonth = substr($third_column[$i],4,2);
$theyear = substr($third_column[$i],0,4);
$theday = substr($theday,1,1) if substr($theday,0,1) == 0;
$themonth = substr($themonth,1,1) if substr($themonth,0,1) == 0;
# replace $xyz_column that contains the date with $themonth . '-' . $theday . '-' . $theyear in the next 3 print statements
print OUT $first_column[$i] . ';' . $second_column[$i] . ';' .
$themonth . '-' . $theday . '-' . $theyear . ';' . $fourth_column[$i]
. ';' . $fifth_column[$i];
print OUT "\n";
print $first_column[$i] . ';' . $second_column[$i] . ';' .
$themonth . '-' . $theday . '-' . $theyear . ';' . $fourth_column[$i]
. ';' . $fifth_column[$i] . '<br />';
}
}
close OUT;
# end program
exit(0);
#END
Clarification of Answer by palitoy-ga on 18 May 2004 01:47 PDT
I have just noticed there was a typo in my earlier clarification. The
alteration should of course be:
push @unsorted, [$fifth_column[$i], $i] if (
defined($fifth_column[$i]) && length($fifth_column[$i]) > 0 );
to:
push @unsorted, [$third_column[$i], $i] if (
defined($third_column[$i]) && length($third_column[$i]) > 0 );
ALL INSTANCES of $fifth_column should be changed to $third_column for
the sorting to take place properly!
Request for Answer Clarification by jellybeans-ga on 18 May 2004 05:16 PDT
Have you checked if this revised script works? I tested it and
received the following message when running it via a browser:
'E:\Inetpub\wwwroot\cgi-bin\google.pl' script produced no output
I then ran the script through a command prompt and it contains several
syntax errors. Can you re-test your script and clear these up?
An example of the errors are:
(yearmonthday) with leading zeros was not commented out in the script
causing an error
it's also having problems with the
. $theday . '-' . $theyear in the next 3 print statements
due to the "." symbol
Clarification of Answer by palitoy-ga on 18 May 2004 05:54 PDT
The script does work. The problem you are experiencing is caused by
the way Google has wrapped the long lines of the script and how you
have copied it into your text editor. The lines you are referring to
are actually comments (they start with a #). In Perl a statement is
finished with a ; symbol and a . joins (concatenates) two strings.
These lines, which I guess are causing the errors, should each be on one line:
# make it web friendly (so it can be run from the web); remove the following line if it does not need to be.
# replace $xyz_column that contains the date with $themonth . '-' . $theday . '-' . $theyear in the next 3 print statements
# this puts the date in a form that is easily sorted (yearmonthday) with leading zeros
Clarification of Answer by palitoy-ga on 18 May 2004 05:57 PDT
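Similarly, each of the 3 print statements is a single statement and is
easiest to read on a line of its own (Perl itself accepts a statement
split over several lines, as long as the break does not fall inside a
comment or a quoted string).
For example:
print $first_column[$i] . ';' . $second_column[$i] . ';' . $themonth . '-' . $theday . '-' . $theyear . ';' . $fourth_column[$i] . ';' . $fifth_column[$i] . '<br />';
THIS IS ONE LINE, NOT 3!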
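Strictly speaking it is only the wrapped comments that produce syntax errors. The short sketch below, with made-up sample values, shows the difference: a comment ends at the end of its line, whereas a statement carries on until its semicolon.

```perl
use strict;
use warnings;

my ($month, $day, $year) = (1, 15, 2004);   # sample values only

# A comment runs only to the end of its line, so a wrapped comment
# needs a # at the start of every continuation line, like this one.

# A statement may be spread over several lines;
# it ends at the semicolon, not at the line break.
my $date = $month . '-'
         . $day   . '-'
         . $year;

print "$date\n";
```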
Request for Answer Clarification by jellybeans-ga on 18 May 2004 09:24 PDT
Hi, what is the significance of the "6,2" and "4,2" and "0,4" in this
part of your script below?
$theday = substr($third_column[$i],6,2);
$themonth = substr($third_column[$i],4,2);
$theyear = substr($third_column[$i],0,4);
Request for Answer Clarification by jellybeans-ga on 18 May 2004 09:34 PDT
hi, me again-
Also, what is the significance of
$split_date[0] = '0' . $split_date[0] if $split_date[0] < 10;
$split_date[1] = '0' . $split_date[1] if $split_date[1] < 10;
specifically the "10" number? Is this something related to the
properties of a date?
I am checking because I need to know if I need to edit any of the
above numbers when I introduce, say, 50 columns to a text file.
I have tested the script and it appears to be working, but just want
clarification on the above
Clarification of Answer by palitoy-ga on 18 May 2004 09:42 PDT
When comparing dates I read in what is in the file, say
1-15-2004, and convert it to yearmonthday (20040115 in this
case, note the extra zero). This is always an 8-digit number.
It is this yearmonthday number that I use when sorting the items, as it
is always larger for later dates.
When I come to convert the date back to your format I use the lines
you stated in your clarification request:
$theday = substr($third_column[$i],6,2);
$themonth = substr($third_column[$i],4,2);
$theyear = substr($third_column[$i],0,4);
The Perl function substr returns a section of a string beginning at a
set point. Therefore the $theday line returns the 2 characters
starting at position 6 in $third_column[$i] (the substr function
begins counting at zero).
So if you had the date 20040115, $theday would be 15 (2 characters from
position 6 - remember to start counting from zero!), $themonth would be
01 (2 characters from position 4) and $theyear would be 2004 (4
characters from position 0).
The "6,2", "4,2" and "0,4" should therefore never need to be changed.
I hope this makes sense!
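As a small stand-alone illustration of those substr offsets (using the example date from above rather than a value read from the file), the conversion back to your format can be sketched like this. Adding 0 to drop the leading zero is a shortcut chosen for this sketch; the script itself uses a second substr for that step.

```perl
use strict;
use warnings;

my $key = '20040115';               # YYYYMMDD, the example date from above

my $theyear  = substr($key, 0, 4);  # 4 characters starting at position 0
my $themonth = substr($key, 4, 2);  # 2 characters starting at position 4
my $theday   = substr($key, 6, 2);  # 2 characters starting at position 6

# Adding 0 treats the text as a number, which drops any leading zero.
my $original = join('-', $themonth + 0, $theday + 0, $theyear);

print "$theyear $themonth $theday\n";
print "$original\n";
```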
Clarification of Answer by palitoy-ga on 18 May 2004 09:58 PDT
The significance of the 10 number is date related as you guessed.
As I alluded to in my previous clarification I change the date to an
eight figure value of the format YYYYMMDD. If the month or day is
less than TEN then I need to add a zero to the front of that value.
This means that 9 would become 09, 5 would become 05 etc.
The following 2 lines simply do that:
$split_date[0] = '0' . $split_date[0] if $split_date[0] < 10;
$split_date[1] = '0' . $split_date[1] if $split_date[1] < 10;
I hope this explains this.
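To see why the leading zeros matter, here is a minimal sketch that compares a plain text sort of the dates with a sort on the padded YYYYMMDD value. It uses sprintf with %02d for the padding instead of the two "< 10" lines, and the helper name date_key is made up for this example; the dates are ones mentioned earlier in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: build a YYYYMMDD key from M-D-YYYY.
# sprintf's %02d pads the month and day to two digits with a zero.
sub date_key {
    my ($date) = @_;
    my ($month, $day, $year) = split /-/, $date;
    return sprintf('%04d%02d%02d', $year, $month, $day);
}

my @dates = ('3-1-2002', '10-12-2001', '2-2-2004');

my @as_text = sort @dates;                                    # plain text sort
my @by_date = sort { date_key($a) cmp date_key($b) } @dates;  # sort on padded key

print "text order: @as_text\n";
print "date order: @by_date\n";
```

The text sort only looks at the characters from left to right, so it puts 2-2-2004 before 3-1-2002; the padded key puts the year first and gives the real date order.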
Request for Answer Clarification by jellybeans-ga on 18 May 2004 10:49 PDT
Hello, thanks for your clarification. I had problems having the script
work when modifying it to handle 99 columns. In my specific case, I
need to have it search by the date which will be in the 9th column.
I tried modifying your script without success. Here was my script (unsuccessful)
http://www.mhtinc.com/cgi-bin/test2.txt (text of the perl script)
Please take a look and let me know what you think. I received all
sorts of error messages when running the script through the DOS
command on the server.
If you can help me on this one final step this question is closed and
I will provide you with a very generous tip.
Kind regards
Request for Answer Clarification by jellybeans-ga on 18 May 2004 10:53 PDT
note that I decided to shorten "column" with "c" to conserve space in
the script. I also commented out the very last items since I wasn't
too concerned with having the results displayed as I was with the
script creating the sorted file.
Clarification of Answer by palitoy-ga on 18 May 2004 10:59 PDT
I have had a quick look at the script and everything looks OK.
The only thing I am not sure about is the naming of the lists/arrays.
I am not sure whether in Perl you can call something @5_c or $5_c (but
@c_5 or $c_5 would be OK), it may be one of the few rules in Perl. I
will have to check this.
Do you have a copy of the test.txt file you are using? This will make
it quicker for me to debug the script.
I probably will not have time to work on this tonight (UK time) but
should have a solution for you by tomorrow (04:00PDT).
Clarification of Answer by palitoy-ga on 18 May 2004 11:07 PDT
I have just tried my old script (the one that worked) and changed
$first_column to $1_c and it produced lots of errors. If I then
changed it to $c_1 it all worked again.
Perl variable names cannot begin with a digit, so the solution is to
change all your $1_c or @1_c to $c_1 and @c_1.
Let me know if this works as everything else looks fine.
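As an aside, with this many columns it may be easier not to name each column at all. The sketch below is an alternative approach, not the script used in this answer: each line is kept as one list of fields and the lines are sorted on the date in a chosen column. The helper names date_key and sort_lines_by_date and the two sample rows are invented for the illustration, and it assumes every row has a date in M-D-YYYY form in that column.

```perl
use strict;
use warnings;

my $sort_column = 9;   # 1-based number of the column holding the date

# Hypothetical helper: turn M-D-YYYY into YYYYMMDD so that a text
# comparison puts the dates in order.
sub date_key {
    my ($date) = @_;
    my ($month, $day, $year) = split /-/, $date;
    return sprintf('%04d%02d%02d', $year, $month, $day);
}

# Hypothetical helper: sort semicolon-separated lines by the date
# found in the given column.
sub sort_lines_by_date {
    my ($column, @lines) = @_;
    # The -1 limit makes split keep trailing empty fields.
    my @rows   = map { [ split /;/, $_, -1 ] } @lines;
    my @sorted = sort {
        date_key($a->[$column - 1]) cmp date_key($b->[$column - 1])
    } @rows;
    return map { join(';', @$_) } @sorted;
}

# Made-up sample rows with the date in column 9 and some blank fields.
my @lines = (
    'a;b;c;d;e;f;g;h;3-1-2002;;',
    'a;b;c;d;e;f;g;h;1-15-2001;x;',
);
print "$_\n" for sort_lines_by_date($sort_column, @lines);
```

Because whole rows are moved around, the same code works for 5 columns or 99, and only $sort_column needs changing.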
Request for Answer Clarification by jellybeans-ga on 18 May 2004 13:26 PDT
hey there,
thanks for your responses. Please take a look at my 'updated' script
based on your observations regarding @1_c and $1_c (to @c_1 and $c_1,
etc...)
http://www.mhtinc.com/cgi-bin/perl_script.txt
My test file, test.txt, is located at
http://www.mhtinc.com/cgi-bin/test.txt
I noticed, when running my script via the command prompt on the
server, that I did NOT receive any errors after I performed the
changes. The script created a "test2.txt", but it was empty. Now of
course this is because if you look at the perl_script.txt file above,
I've commented out those PRINT commands :
#print OUT $first_column[$i] . ';' . $second_column[$i] . ';'
.$themonth . '-' . $theday . '-' . $theyear . ';' .
$fourth_column[$i]. ';' . $fifth_column[$i];
#print OUT "\n";
#print $first_column[$i] . ';' . $second_column[$i] . ';'
.$themonth . '-' . $theday . '-' . $theyear . ';' .
$fourth_column[$i]. ';' . $fifth_column[$i] . '<br />';
I did not put in the relevant $1_c , $2_c etc because I was completely
confused by the format I needed to enter them in.
so I guess the script is working, but it's just that presently it's
not creating an output file.
Is the first of the print commands the command that actually writes to
the data to the test2.txt file? I assume the third print command is
the command that writes the data to the users screen when they run it
via a web browser. This is not so important as having the data written
to the actual test2.txt file, of course.
Either way, can you check it out -- and let me know what I can do to
make introducing my variables into those print commands easier. Maybe
you have a quick macro or something to paste it in there without
messing up the formatting.
Also note that in my test.txt file above (see URL), some of the fields
are blank (however, none of the date fields in the 9th column are
blank). Will this cause any problems with the script?
I appreciate your efforts with this
jellybeans
Request for Answer Clarification by jellybeans-ga on 18 May 2004 13:58 PDT
Me again.
Please look at the revised perl script with the first of the print
commands taken into account:
http://www.mhtinc.com/cgi-bin/perl_script_revised.txt
Running the above script created a test2.txt which you can see at
http://www.mhtinc.com/cgi-bin/test2.txt
It's all jumbled up and doesn't appear to be working.
Can you debug this and see what you can do ? I have no clue why it's not working.
Request for Answer Clarification by jellybeans-ga on 18 May 2004 18:32 PDT
hi, me again.
Now it appears to be working -- that is, it creates a test2.txt which
appears to be the sorted equivalent of test.txt
The only thing is, I tried a test.txt file which was of size 217KB and
ran the script on that, and then the script creates a file which is of
size 230KB. Shouldn't the sizes be exactly the same? (I compared the
number of lines in each, which appeared to be the same)
Also, when I access the script directly using a web browser, I get the
following message:
'E:\Inetpub\wwwroot\cgi-bin\google.pl' script produced no output
How can I have it so that it returns something on screen?
Yet, the script does create the test2.txt file as instructed and
appears to have its values sorted.
I also checked running the script via the command line on the server,
and I received a bunch of error (?) messages, stating "Use of
uninitialized value in concatenation" at the beginning of the print
OUT line after we save the file as test2.txt.
Is there any way to check into the syntax of the script?
Clarification of Answer by palitoy-ga on 19 May 2004 02:28 PDT
Hi Jellybeans
I will start with the "script produced no output" error. This
appears on Windows systems when trying to run a CGI/Perl script that
is not correctly set up, and there can be a few reasons for it. This
webpage addresses some of these problems:
http://support.discusware.com/center/resources/errors/spno.php
Although this page is geared toward their product it does give you the
solutions in quite an easy manner.
I suspect adding this line to the script will sort it out (you may
need to alter the file path depending on where your copy
of perl.exe lies on your server):
#!c:/perl/bin/perl.exe
The "Use of uninitialized value in concatenation" message is a simple
Perl warning. It occurs when you try to use a variable that has no
value; as some of the fields/columns in your text file are blank, this
warning will occur. It DOES NOT stop the program from running or make
it produce false values; it is simply a warning to the user.
It can be resolved in two ways:
1) Remove the line "use warnings;"
2) Address the problem by ensuring that the variables always have a
value. This can be done by adding the following to the lines where the
variable is initialised - || ''; This means "or make the value ''".
So:
$c_1[$i] = $put_in_lists[0];
would become:
$c_1[$i] = $put_in_lists[0] || '';
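One caveat with || '' is that it also replaces a field containing just 0, because Perl treats 0 as false. If your data can contain a plain 0, a test with defined is safer. The sketch below shows the difference on a made-up line; the helper name or_blank is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the field if it is defined,
# otherwise an empty string.
sub or_blank {
    my ($value) = @_;
    return defined $value ? $value : '';
}

# split drops trailing empty fields, so this gives only ('Jane', '0').
my @fields = split /;/, 'Jane;0;;';

my $with_or      = $fields[1] || '';       # '0' is false, so this becomes ''
my $with_defined = or_blank($fields[1]);   # stays '0'
my $missing      = or_blank($fields[3]);   # $fields[3] is undefined, becomes ''

print "[$with_or] [$with_defined] [$missing]\n";
```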
The difference in file sizes was a little puzzling... but then I
checked your print statement towards the end and you repeat
$c_27[$i] a few times in it. Removing these repeats solves
this problem. (The test2.txt file will be 2 bytes longer
because we add a new line at the end of the data.)
I hope this solves your problems. Let me know if there is anything
else that I haven't addressed.
#START
use strict;
use warnings;
# set up variables
my (@info_from_file,$LINEBYLINE,$counter);
# read in the file to change called test.txt
open (LINEBYLINE,"test.txt") or die "Cannot open test.txt: $!";
while ( <LINEBYLINE> ) {
chomp;
push(@info_from_file,$_);
$counter++;
}; # end while - end reading the file line-by-line
# each line is now in a list
# split the data using the ;
my (@put_in_lists,
@c_1,@c_2,@c_3,@c_4,@c_5,@c_6,@c_7,@c_8,@c_9,@c_10,
@c_11,@c_12,@c_13,@c_14,@c_15,@c_16,@c_17,@c_18,@c_19,@c_20,
@c_21,@c_22,@c_23,@c_24,@c_25,@c_26,@c_27,@c_28,@c_29,@c_30,
@c_31,@c_32,@c_33,@c_34,@c_35,@c_36,@c_37,@c_38,@c_39,@c_40,
@c_41,@c_42,@c_43,@c_44,@c_45,@c_46,@c_47,@c_48,@c_49,@c_50,
@c_51,@c_52,@c_53,@c_54,@c_55,@c_56,@c_57,@c_58,@c_59,@c_60,
@c_61,@c_62,@c_63,@c_64,@c_65,@c_66,@c_67,@c_68,@c_69,@c_70,
@c_71,@c_72,@c_73,@c_74,@c_75,@c_76,@c_77,@c_78,@c_79,@c_80,
@c_81,@c_82,@c_83,@c_84,@c_85,@c_86,@c_87,@c_88,@c_89,@c_90,
@c_91,@c_92,@c_93,@c_94,@c_95,@c_96,@c_97,@c_98,@c_99);
for( my $i=0; $i < $counter; $i++) {
@put_in_lists = split(";",$info_from_file[$i]);
$c_1[$i] = $put_in_lists[0] || '';
$c_2[$i] = $put_in_lists[1] || '';
$c_3[$i] = $put_in_lists[2] || '';
$c_4[$i] = $put_in_lists[3] || '';
$c_5[$i] = $put_in_lists[4] || '';
$c_6[$i] = $put_in_lists[5] || '';
$c_7[$i] = $put_in_lists[6] || '';
$c_8[$i] = $put_in_lists[7] || '';
$c_9[$i] = $put_in_lists[8] || '';
$c_10[$i] = $put_in_lists[9] || '';
$c_11[$i] = $put_in_lists[10] || '';
$c_12[$i] = $put_in_lists[11] || '';
$c_13[$i] = $put_in_lists[12] || '';
$c_14[$i] = $put_in_lists[13] || '';
$c_15[$i] = $put_in_lists[14] || '';
$c_16[$i] = $put_in_lists[15] || '';
$c_17[$i] = $put_in_lists[16] || '';
$c_18[$i] = $put_in_lists[17] || '';
$c_19[$i] = $put_in_lists[18] || '';
$c_20[$i] = $put_in_lists[19] || '';
$c_21[$i] = $put_in_lists[20] || '';
$c_22[$i] = $put_in_lists[21] || '';
$c_23[$i] = $put_in_lists[22] || '';
$c_24[$i] = $put_in_lists[23] || '';
$c_25[$i] = $put_in_lists[24] || '';
$c_26[$i] = $put_in_lists[25] || '';
$c_27[$i] = $put_in_lists[26] || '';
$c_28[$i] = $put_in_lists[27] || '';
$c_29[$i] = $put_in_lists[28] || '';
$c_30[$i] = $put_in_lists[29] || '';
$c_31[$i] = $put_in_lists[30] || '';
$c_32[$i] = $put_in_lists[31] || '';
$c_33[$i] = $put_in_lists[32] || '';
$c_34[$i] = $put_in_lists[33] || '';
$c_35[$i] = $put_in_lists[34] || '';
$c_36[$i] = $put_in_lists[35] || '';
$c_37[$i] = $put_in_lists[36] || '';
$c_38[$i] = $put_in_lists[37] || '';
$c_39[$i] = $put_in_lists[38] || '';
$c_40[$i] = $put_in_lists[39] || '';
$c_41[$i] = $put_in_lists[40] || '';
$c_42[$i] = $put_in_lists[41] || '';
$c_43[$i] = $put_in_lists[42] || '';
$c_44[$i] = $put_in_lists[43] || '';
$c_45[$i] = $put_in_lists[44] || '';
$c_46[$i] = $put_in_lists[45] || '';
$c_47[$i] = $put_in_lists[46] || '';
$c_48[$i] = $put_in_lists[47] || '';
$c_49[$i] = $put_in_lists[48] || '';
$c_50[$i] = $put_in_lists[49] || '';
$c_51[$i] = $put_in_lists[50] || '';
$c_52[$i] = $put_in_lists[51] || '';
$c_53[$i] = $put_in_lists[52] || '';
$c_54[$i] = $put_in_lists[53] || '';
$c_55[$i] = $put_in_lists[54] || '';
$c_56[$i] = $put_in_lists[55] || '';
$c_57[$i] = $put_in_lists[56] || '';
$c_58[$i] = $put_in_lists[57] || '';
$c_59[$i] = $put_in_lists[58] || '';
$c_60[$i] = $put_in_lists[59] || '';
$c_61[$i] = $put_in_lists[60] || '';
$c_62[$i] = $put_in_lists[61] || '';
$c_63[$i] = $put_in_lists[62] || '';
$c_64[$i] = $put_in_lists[63] || '';
$c_65[$i] = $put_in_lists[64] || '';
$c_66[$i] = $put_in_lists[65] || '';
$c_67[$i] = $put_in_lists[66] || '';
$c_68[$i] = $put_in_lists[67] || '';
$c_69[$i] = $put_in_lists[68] || '';
$c_70[$i] = $put_in_lists[69] || '';
$c_71[$i] = $put_in_lists[70] || '';
$c_72[$i] = $put_in_lists[71] || '';
$c_73[$i] = $put_in_lists[72] || '';
$c_74[$i] = $put_in_lists[73] || '';
$c_75[$i] = $put_in_lists[74] || '';
$c_76[$i] = $put_in_lists[75] || '';
$c_77[$i] = $put_in_lists[76] || '';
$c_78[$i] = $put_in_lists[77] || '';
$c_79[$i] = $put_in_lists[78] || '';
$c_80[$i] = $put_in_lists[79] || '';
$c_81[$i] = $put_in_lists[80] || '';
$c_82[$i] = $put_in_lists[81] || '';
$c_83[$i] = $put_in_lists[82] || '';
$c_84[$i] = $put_in_lists[83] || '';
$c_85[$i] = $put_in_lists[84] || '';
$c_86[$i] = $put_in_lists[85] || '';
$c_87[$i] = $put_in_lists[86] || '';
$c_88[$i] = $put_in_lists[87] || '';
$c_89[$i] = $put_in_lists[88] || '';
$c_90[$i] = $put_in_lists[89] || '';
$c_91[$i] = $put_in_lists[90] || '';
$c_92[$i] = $put_in_lists[91] || '';
$c_93[$i] = $put_in_lists[92] || '';
$c_94[$i] = $put_in_lists[93] || '';
$c_95[$i] = $put_in_lists[94] || '';
$c_96[$i] = $put_in_lists[95] || '';
$c_97[$i] = $put_in_lists[96] || '';
$c_98[$i] = $put_in_lists[97] || '';
$c_99[$i] = $put_in_lists[98] || '';
# change here the $xyz_column to the column that the date is in
# this puts the date in a form that is easily sorted (yearmonthday) with leading zeros
if ( length($c_9[$i]) > 0 ) {
my @split_date = split("-",$c_9[$i]);
$split_date[0] = '0' . $split_date[0] if $split_date[0] < 10;
$split_date[1] = '0' . $split_date[1] if $split_date[1] < 10;
$c_9[$i] = $split_date[2] . $split_date[0] . $split_date[1];
};
}
#sort the arrays so they are in the correct order
my @unsorted;
for( my $i=0; $i < $counter; $i++) {
# change the $xyz_column here to whichever column you need to sort by
push @unsorted, [$c_9[$i], $i] if (
defined($c_9[$i]) && length($c_9[$i]) > 0 );
}
my @sorted = sort { $a->[0] cmp $b->[0] } @unsorted;
my @sortorder = map { $_->[1] } @sorted;
@c_1 = @c_1[@sortorder];
@c_2 = @c_2[@sortorder];
@c_3 = @c_3[@sortorder];
@c_4 = @c_4[@sortorder];
@c_5 = @c_5[@sortorder];
@c_6 = @c_6[@sortorder];
@c_7 = @c_7[@sortorder];
@c_8 = @c_8[@sortorder];
@c_9 = @c_9[@sortorder];
@c_10 = @c_10[@sortorder];
@c_11 = @c_11[@sortorder];
@c_12 = @c_12[@sortorder];
@c_13 = @c_13[@sortorder];
@c_14 = @c_14[@sortorder];
@c_15 = @c_15[@sortorder];
@c_16 = @c_16[@sortorder];
@c_17 = @c_17[@sortorder];
@c_18 = @c_18[@sortorder];
@c_19 = @c_19[@sortorder];
@c_20 = @c_20[@sortorder];
@c_21 = @c_21[@sortorder];
@c_22 = @c_22[@sortorder];
@c_23 = @c_23[@sortorder];
@c_24 = @c_24[@sortorder];
@c_25 = @c_25[@sortorder];
@c_26 = @c_26[@sortorder];
@c_27 = @c_27[@sortorder];
@c_28 = @c_28[@sortorder];
@c_29 = @c_29[@sortorder];
@c_30 = @c_30[@sortorder];
@c_31 = @c_31[@sortorder];
@c_32 = @c_32[@sortorder];
@c_33 = @c_33[@sortorder];
@c_34 = @c_34[@sortorder];
@c_35 = @c_35[@sortorder];
@c_36 = @c_36[@sortorder];
@c_37 = @c_37[@sortorder];
@c_38 = @c_38[@sortorder];
@c_39 = @c_39[@sortorder];
@c_40 = @c_40[@sortorder];
@c_41 = @c_41[@sortorder];
@c_42 = @c_42[@sortorder];
@c_43 = @c_43[@sortorder];
@c_44 = @c_44[@sortorder];
@c_45 = @c_45[@sortorder];
@c_46 = @c_46[@sortorder];
@c_47 = @c_47[@sortorder];
@c_48 = @c_48[@sortorder];
@c_49 = @c_49[@sortorder];
@c_50 = @c_50[@sortorder];
@c_51 = @c_51[@sortorder];
@c_52 = @c_52[@sortorder];
@c_53 = @c_53[@sortorder];
@c_54 = @c_54[@sortorder];
@c_55 = @c_55[@sortorder];
@c_56 = @c_56[@sortorder];
@c_57 = @c_57[@sortorder];
@c_58 = @c_58[@sortorder];
@c_59 = @c_59[@sortorder];
@c_60 = @c_60[@sortorder];
@c_61 = @c_61[@sortorder];
@c_62 = @c_62[@sortorder];
@c_63 = @c_63[@sortorder];
@c_64 = @c_64[@sortorder];
@c_65 = @c_65[@sortorder];
@c_66 = @c_66[@sortorder];
@c_67 = @c_67[@sortorder];
@c_68 = @c_68[@sortorder];
@c_69 = @c_69[@sortorder];
@c_70 = @c_70[@sortorder];
@c_71 = @c_71[@sortorder];
@c_72 = @c_72[@sortorder];
@c_73 = @c_73[@sortorder];
@c_74 = @c_74[@sortorder];
@c_75 = @c_75[@sortorder];
@c_76 = @c_76[@sortorder];
@c_77 = @c_77[@sortorder];
@c_78 = @c_78[@sortorder];
@c_79 = @c_79[@sortorder];
@c_80 = @c_80[@sortorder];
@c_81 = @c_81[@sortorder];
@c_82 = @c_82[@sortorder];
@c_83 = @c_83[@sortorder];
@c_84 = @c_84[@sortorder];
@c_85 = @c_85[@sortorder];
@c_86 = @c_86[@sortorder];
@c_87 = @c_87[@sortorder];
@c_88 = @c_88[@sortorder];
@c_89 = @c_89[@sortorder];
@c_90 = @c_90[@sortorder];
@c_91 = @c_91[@sortorder];
@c_92 = @c_92[@sortorder];
@c_93 = @c_93[@sortorder];
@c_94 = @c_94[@sortorder];
@c_95 = @c_95[@sortorder];
@c_96 = @c_96[@sortorder];
@c_97 = @c_97[@sortorder];
@c_98 = @c_98[@sortorder];
@c_99 = @c_99[@sortorder];
# save the sorted file as test2.txt
open(OUT, ">test2.txt") or die "Cannot open test2.txt for writing: $!";
my ($theday,$themonth,$theyear);
for( my $i=0; $i < $counter; $i++) {
if (defined($c_1[$i]) && length($c_1[$i]) > 0) {
# replace the $xyz_column with the column the date is in for the next 3 lines
$theday = substr($c_9[$i],6,2);
$themonth = substr($c_9[$i],4,2);
$theyear = substr($c_9[$i],0,4);
$theday = substr($theday,1,1) if substr($theday,0,1) == 0;
$themonth = substr($themonth,1,1) if substr($themonth,0,1) == 0;
# replace $xyz_column that contains the date with
# $themonth . '-' . $theday . '-' . $theyear in the print statements below
print OUT $c_1[$i] . ';' . $c_2[$i] . ';' . $c_3[$i] . ';' .
$c_4[$i] . ';' . $c_5[$i] . ';' . $c_6[$i] . ';' . $c_7[$i] . ';' .
$c_8[$i] . ';' .$themonth . '-' . $theday . '-' . $theyear . ';' .
$c_10[$i] . ';' . $c_11[$i] . ';' . $c_12[$i] . ';' . $c_13[$i] . ';'
. $c_14[$i] . ';' . $c_15[$i] . ';' . $c_16[$i] . ';' . $c_17[$i] .
';' . $c_18[$i] . ';' . $c_19[$i] . ';' . $c_20[$i] . ';' . $c_21[$i]
. ';' . $c_22[$i] . ';' . $c_23[$i] . ';' . $c_24[$i] . ';' .
$c_25[$i] . ';' . $c_26[$i] . ';' . $c_27[$i] . ';' . $c_28[$i] . ';'
. $c_29[$i] . ';' . $c_30[$i] . ';' . $c_31[$i] . ';' . $c_32[$i] .
';' . $c_33[$i] . ';' . $c_34[$i] . ';' . $c_35[$i]
. ';' . $c_36[$i] . ';' . $c_37[$i] . ';' . $c_38[$i] . ';' .
$c_39[$i] . ';' . $c_40[$i] . ';' . $c_41[$i] . ';' . $c_42[$i] . ';'
. $c_43[$i] . ';' . $c_44[$i] . ';' . $c_45[$i] . ';' . $c_46[$i] .
';' . $c_47[$i] . ';' . $c_48[$i] . ';' . $c_49[$i] . ';' . $c_50[$i]
. ';' . $c_51[$i] . ';' . $c_52[$i] . ';' . $c_53[$i] . ';' .
$c_54[$i] . ';' . $c_55[$i] . ';' . $c_56[$i] . ';' . $c_57[$i] . ';'
. $c_58[$i] . ';' . $c_59[$i] . ';' . $c_60[$i] . ';' . $c_61[$i] .
';' . $c_62[$i] . ';' . $c_63[$i] . ';' . $c_64[$i] . ';' . $c_65[$i]
. ';' . $c_66[$i] . ';' . $c_67[$i] . ';' . $c_68[$i] . ';' .
$c_69[$i] . ';' . $c_70[$i] . ';' . $c_71[$i] . ';' . $c_72[$i] . ';'
. $c_73[$i] . ';' . $c_74[$i] . ';' . $c_75[$i] . ';' . $c_76[$i] .
';' . $c_77[$i] . ';' . $c_78[$i] . ';' . $c_79[$i] . ';' . $c_80[$i]
. ';' . $c_81[$i] . ';' . $c_82[$i] . ';' . $c_83[$i] . ';' .
$c_84[$i] . ';' . $c_85[$i] . ';' . $c_86[$i] . ';' . $c_87[$i] . ';'
. $c_88[$i] . ';' . $c_89[$i] . ';' . $c_90[$i] . ';' . $c_91[$i] .
';' . $c_92[$i] . ';' . $c_93[$i] . ';' . $c_94[$i] . ';' . $c_95[$i]
. ';' . $c_96[$i] . ';' .$c_97[$i] . ';' .$c_98[$i] . ';' .$c_99[$i];
print OUT "\n";
# optional: uncomment the statement below to also print each row to the browser
#print $c_1[$i] . ';' . $c_2[$i] . ';' . $c_3[$i] . ';' .
# $c_4[$i] . ';' . $c_5[$i] . ';' . $c_6[$i] . ';' . $c_7[$i] . ';' .
# $c_8[$i] . ';' .$themonth . '-' . $theday . '-' . $theyear . ';' .
# $c_10[$i] . ';' . $c_11[$i] . ';' . $c_12[$i] . ';' . $c_13[$i] . ';'
# . $c_14[$i] . ';' . $c_15[$i] . ';' . $c_16[$i] . ';' . $c_17[$i] .
# ';' . $c_18[$i] . ';' . $c_19[$i] . ';' . $c_20[$i] . ';' . $c_21[$i]
# . ';' . $c_22[$i] . ';' . $c_23[$i] . ';' . $c_24[$i] . ';' .
# $c_25[$i] . ';' . $c_26[$i] . ';' . $c_27[$i] . ';'
# . $c_28[$i] . ';' . $c_29[$i] .
# ';' . $c_30[$i] . ';' . $c_31[$i] . ';' . $c_32[$i]
# . ';' . $c_33[$i] . ';' . $c_34[$i] . ';' . $c_35[$i] . ';' .
# $c_36[$i] . ';' . $c_37[$i] . ';' . $c_38[$i] . ';' . $c_39[$i] . ';'
# . $c_40[$i] . ';' . $c_41[$i] . ';' . $c_42[$i] . ';' . $c_43[$i] .
# ';' . $c_44[$i] . ';' . $c_45[$i] . ';' . $c_46[$i] . ';' . $c_47[$i]
# . ';' . $c_48[$i] . ';' . $c_49[$i] . ';' . $c_50[$i] . ';' .
# $c_51[$i] . ';' . $c_52[$i] . ';' . $c_53[$i] . ';' . $c_54[$i] . ';'
# . $c_55[$i] . ';' . $c_56[$i] . ';' . $c_57[$i] . ';' . $c_58[$i] .
# ';' . $c_59[$i] . ';' . $c_60[$i] . ';' . $c_61[$i] . ';' . $c_62[$i]
# . ';' . $c_63[$i] . ';' . $c_64[$i] . ';' . $c_65[$i] . ';' .
# $c_66[$i] . ';' . $c_67[$i] . ';' . $c_68[$i] . ';' . $c_69[$i] . ';'
# . $c_70[$i] . ';' . $c_71[$i] . ';' . $c_72[$i] . ';' . $c_73[$i] .
# ';' . $c_74[$i] . ';' . $c_75[$i] . ';' . $c_76[$i] . ';' . $c_77[$i]
# . ';' . $c_78[$i] . ';' . $c_79[$i] . ';' . $c_80[$i] . ';' .
# $c_81[$i] . ';' . $c_82[$i] . ';' . $c_83[$i] . ';' . $c_84[$i] . ';'
# . $c_85[$i] . ';' . $c_86[$i] . ';' . $c_87[$i] . ';' . $c_88[$i] .
# ';' . $c_89[$i] . ';' . $c_90[$i] . ';' . $c_91[$i] . ';' . $c_92[$i]
# . ';' . $c_93[$i] . ';' . $c_94[$i] . ';' . $c_95[$i] . ';' .
# $c_96[$i] . ';' .$c_97[$i] . ';' .$c_98[$i] . ';' .$c_99[$i] . '<br />';
}
}
close OUT;
# end program
exit(0);
#END
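
If the 99 separate column arrays become hard to maintain, the same idea (build a sortable yearmonthday key from the date column, sort on it, write the lines back out) can be sketched more compactly by keeping each line whole. This is only a rough sketch, not a drop-in replacement: the helper names date_to_key and sort_lines_by_date are made up for illustration, it assumes dates look like month-day-year with a four-digit year, and unlike the script above it leaves the original date text untouched instead of rebuilding it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): turn "month-day-year"
# into a sortable "yyyymmdd" string. Returns undef if the field is not a date.
sub date_to_key {
    my ($date) = @_;
    return undef unless defined $date && $date =~ /^(\d{1,2})-(\d{1,2})-(\d{4})$/;
    return sprintf('%04d%02d%02d', $3, $1, $2);
}

# Hypothetical helper: sort semicolon-separated lines by the date held in
# column $date_col (0-based, so 8 is the ninth column). Lines without a
# usable date are dropped, as they are in the script above.
sub sort_lines_by_date {
    my ($date_col, @lines) = @_;
    my @keyed;
    for my $line (@lines) {
        my @fields = split /;/, $line, -1;   # -1 keeps trailing empty columns
        my $key = date_to_key($fields[$date_col]);
        push @keyed, [$key, $line] if defined $key;
    }
    # String comparison is enough because every key has the same width.
    return map { $_->[1] } sort { $a->[0] cmp $b->[0] } @keyed;
}

# Made-up sample data: three lines with the date in the ninth column.
my @sorted = sort_lines_by_date(8,
    'a;b;c;d;e;f;g;h;12-25-2004;x',
    'a;b;c;d;e;f;g;h;1-5-2004;y',
    'a;b;c;d;e;f;g;h;;z',
);
print "$_\n" for @sorted;
```

To use this on real data you would read the lines from test.txt into a list, pass them in with the right column number, and print the result to test2.txt in place of the sample lines.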