Hello Mick
I have made one assumption in solving this problem that I could not
confirm from your data set: that the columns are separated by tabs,
that is, columb1{tab}columb2{tab} and so on. If this is not the case,
please ask for clarification and I can sort this out for you.
I have tried to write the script so that it is as easy to read as
possible. It could probably be made shorter, but you asked for it to
be as user-friendly as possible. Perl is notorious for how easily it
can be made unreadable!
Here is the script:
# START
#!/usr/bin/perl
# make it web-friendly so that it can be run on the internet
print "Content-type: text/html\n\n";
# open file and read it into an array to process
# note you may need to alter the position and name of the test.txt file
open (TXTFILE, "test.txt");
my @lines = <TXTFILE>;
close(TXTFILE);
# sort the array alphabetically - this is done so that each columb2 is
# in alphabetical order
@lines = sort(@lines);
# set up a counter to count the number of loops
my $counter = 0;
# create two hashes to hold the information about the highest and lowest
# numbers
my (%mincolumbs, %maxcolumbs);
# create an array that holds all the different columb2 names
my @namesofcolumbs;
# create a variable that holds the last columb2 name found
my $columnname = "";
# loop through the lines in the file
foreach $newline (@lines) {
    # if it is the first line then ignore it as this is just column headers
    if ( $counter != 0 ) {
        # start processing the line into a new array
        # this splits the line at every tab character and assumes that
        # the columns are separated by a tab character
        my @splitline = split("\t",$newline);
        # this is where the data is held that we need to process
        # $splitline[1] = columb2
        # $splitline[2] = columb3
        # $splitline[3] = columb4
        # if this number is smaller than the stored value we have
        # (or we have no value stored) then store it
        if ( $splitline[2] < $mincolumbs[$splitline[1]] ||
             $mincolumbs{$splitline[1]} == '' ) {
            $mincolumbs{$splitline[1]} = $splitline[2];
        }
        # if this number is smaller than the stored value we have
        # (or we have no value stored) then store it
        if ( $splitline[3] < $maxcolumbs[$splitline[1]] ||
             $maxcolumbs{$splitline[1]} == '' ) {
            $maxcolumbs{$splitline[1]} = $splitline[3];
        }
        # if we have a new columb2 name then store it in
        # the namesofcolumbs array
        if ( $columnname eq "" || $columnname ne $splitline[1] ) {
            push @namesofcolumbs, $splitline[1];
            $columnname = $splitline[1];
        }
    }
    # add one to the counter
    $counter++;
}
# outputting the data
# for each name in the namesofcolumbs array get the highest and lowest values
# stored and output it to a text file named output.txt
open (TXTFILE, ">output.txt");
foreach $printout ( @namesofcolumbs ) {
    print TXTFILE $printout . "\t" . $mincolumbs{$printout} .
        "\t" . $maxcolumbs{$printout} . "\n";
}
close(TXTFILE);
# END
The script reads a file called test.txt, which should be placed in the
directory holding the script. It writes its output to a file called
output.txt in the same directory.
If you have any questions or require any additional help, please ask
for clarification and I will do all I can to help.
|
Clarification of Answer by palitoy-ga on 22 Jun 2004 11:36 PDT
Hello Mick
I must apologise: there is a small typo in the script I posted.
You need to change this line:
if ( $splitline[2] < $mincolumbs[$splitline[1]] ||
$mincolumbs{$splitline[1]} eq '' ) {
TO:
if ( $splitline[2] < $mincolumbs{$splitline[1]} ||
$mincolumbs{$splitline[1]} eq '' ) {
(NOTE THE { instead of [ and } instead of ] ).
Similarly change:
if ( $splitline[3] < $maxcolumbs[$splitline[1]] ||
$maxcolumbs{$splitline[1]} eq '' ) {
TO:
if ( $splitline[3] < $maxcolumbs{$splitline[1]} ||
$maxcolumbs{$splitline[1]} eq '' ) {
Sorry for the small typo once again. If there is anything else please
let me know by asking for clarification.
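
The difference between the two bracket styles can be seen in a small
sketch. The names %min and @min and the sample key below are made up
for illustration and are not part of the script above; the point is
only that curly braces look a key up in a hash, while square brackets
index an array of the same name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data: a hash keyed by name, and an unrelated array
# that happens to share the same name.
my %min = ( 'here/1/2' => -1.256 );
my @min = ( 42 );

my $key = 'here/1/2';

# Curly braces look the key up in the hash %min.
my $from_hash = $min{$key};

# Square brackets index the array @min instead. A non-numeric string
# used as an index is treated as 0, so this silently reads $min[0].
# The numeric warning is switched off only to keep the demo quiet.
my $from_array = do { no warnings 'numeric'; $min[$key] };

print "hash lookup:  $from_hash\n";
print "array lookup: $from_array\n";
```

In the original line the array form meant the comparison was never
made against the value stored in the hash.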
|
Request for Answer Clarification by mickr-ga on 22 Jun 2004 13:34 PDT
Hi,
Can't try this till tomorrow UK time, but it looks great and is just what I
asked for. However, I forgot to say I would like the output file to be sorted
on columb 3 (smallest first), so as -1.256 is less than 8.23 the order is still
here/1/2 -1.256 12.56
here/3/2 8.23 1.56
$10 bonus for the extra work. If it can't be done easily with the
built in perl sort command I would be just as happy with an example
of calling the unix sort command within perl to do it.
PS rather than tabs, the input file is separated by multiple spaces,
so I think I can just use
my @splitline = split(" ",$newline);
Is that correct?
Thanks,
Mick
|
Request for Answer Clarification by mickr-ga on 23 Jun 2004 00:57 PDT
Hi,
Tried it this morning and it worked great - Thanks!
I am still on for the $10 bonus for sorting
the output.
Thanks,
Mick
|
Clarification of Answer by palitoy-ga on 23 Jun 2004 01:26 PDT
Hi Mick
I will work on those corrections for you and post the solution nearer
lunchtime (I am in the UK also).
|
Clarification of Answer by palitoy-ga on 23 Jun 2004 03:02 PDT
Hello Mick
Here is the solution you require. If you have any further questions
on this please ask for further clarification.
# START
#!/usr/bin/perl
# make it web-friendly
print "Content-type: text/html\n\n";
# open file and read it into an array to process
# note you may need to alter the position and name of the test.txt file
open (TXTFILE, "test.txt") or die "Cannot open test.txt: $!";
my @lines = <TXTFILE>;
close(TXTFILE);
# remove the first line, as this is just column headers - this is done
# before sorting so that it is always the header that is dropped
shift(@lines);
# sort the array alphabetically - this is done so that each columb2 is
# in alphabetical order
@lines = sort(@lines);
# create two hashes to hold the information about the highest and lowest numbers
my (%mincolumbs, %maxcolumbs);
# create an array that holds all the different columb2 names
my @namesofcolumbs;
# create a variable that holds the last columb2 name found
my $columnname = "";
# loop through the lines in the file
foreach my $newline (@lines) {
    # start processing the line into a new array
    # collapse any run of whitespace into a single space
    $newline =~ s/\s{1,}/ /g ;
    # this splits the line on whitespace, as the columns are
    # separated by multiple spaces
    my @splitline = split(" ",$newline);
    # this is where the data is held that we need to process
    # $splitline[1] = columb2
    # $splitline[2] = columb3
    # $splitline[3] = columb4
    # if this number is smaller than the stored value we have
    # (or we have no value stored) then store it
    if ( $splitline[2] < $mincolumbs{$splitline[1]} ||
         $mincolumbs{$splitline[1]} eq '' ) {
        $mincolumbs{$splitline[1]} = $splitline[2];
    }
    # if this number is larger than the stored value we have
    # (or we have no value stored) then store it
    if ( $splitline[3] > $maxcolumbs{$splitline[1]} ||
         $maxcolumbs{$splitline[1]} eq '' ) {
        $maxcolumbs{$splitline[1]} = $splitline[3];
    }
    # if we have a new columb2 name then store it in
    # the namesofcolumbs array
    if ( $columnname eq "" || $columnname ne $splitline[1] ) {
        push @namesofcolumbs, $splitline[1];
        $columnname = $splitline[1];
    }
}
# sort the keys on the stored minimum values and output the data
# to a file
my @keys = sort { $mincolumbs{$a} cmp $mincolumbs{$b} } ( keys %mincolumbs );
open (TXTFILE, ">output.txt") or die "Cannot open output.txt: $!";
foreach my $key ( @keys ) {
    print TXTFILE $key . " " . $mincolumbs{$key} . " " . $maxcolumbs{$key} . "\n";
}
close(TXTFILE);
# close the program
exit(0);
#END
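
On the question of splitting on multiple spaces: a pattern given as a
single-space string is a special case in Perl's split. It skips
leading whitespace and splits on any run of whitespace, so it should
cope with multiple spaces on its own. The sketch below uses a made-up
sample line (not taken from your data) to compare it with a regex
containing one space.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample line with leading spaces and runs of spaces.
my $line = "  a1   here/1/2    -1.256   12.56\n";

# The string " " is a special case: leading whitespace is skipped
# and the line is split on any run of whitespace.
my @special = split(" ", $line);

# A regex of exactly one space splits at every single space, which
# leaves empty fields between consecutive spaces.
my @literal = split(/ /, $line);

print scalar(@special), " fields with split(\" \"): @special\n";
print scalar(@literal), " fields with split(/ /)\n";
```

With the string form, the columb2 value ends up in $special[1], which
is the same position the script relies on.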
|
Request for Answer Clarification by mickr-ga on 23 Jun 2004 05:01 PDT
Hi,
Thanks very much. I will add the $10 as a tip when I rate the question.
The sort almost worked, but by default it gives the larger number first,
so I got
here/3/2 8.23 1.56
here/1/2 -1.256 12.56
instead of
here/1/2 -1.256 12.56
here/3/2 8.23 1.56
I couldn't find a sort -r switch in perl so I just did
reverse (sort { $mincolumbs{$a} cmp $mincolumbs{$b} } ( keys %mincolumbs ) ) ;
Is that OK, or is there a better way?
Thanks,
Mick
|
Clarification of Answer by palitoy-ga on 23 Jun 2004 05:18 PDT
Thanks for the 5-star rating and tip, they are much appreciated.
If you need any further help, please ask. The reverse() solution you
came up with is the method I would have used too, as it is the easiest
and most common-sense one when reading the script through. I always
try to write my scripts so that they are as readable as possible, as
this makes them much easier for anyone to edit later. I am glad you
appreciate this!
Thanks again!
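
One thing that may be worth checking with the sort: cmp compares the
values as strings, while <=> compares them as numbers, and the two can
disagree on numeric data. The sketch below is only an illustration;
the hash %min and its values are made up and are not from your file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up minimum values keyed by name.
my %min = (
    'here/1/2' => -1.256,
    'here/3/2' => 8.23,
    'here/5/2' => 10.5,
);

# cmp compares the values character by character, as strings.
my @by_string = sort { $min{$a} cmp $min{$b} } keys %min;

# <=> compares the values as numbers, smallest first.
my @by_number = sort { $min{$a} <=> $min{$b} } keys %min;

# Swapping $a and $b gives largest first without calling reverse().
my @descending = sort { $min{$b} <=> $min{$a} } keys %min;

print "string order:  @by_string\n";
print "numeric order: @by_number\n";
print "descending:    @descending\n";
```

With these sample values the string comparison places 10.5 before
8.23, whereas the numeric comparison keeps them in numeric order.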
|
Request for Answer Clarification by mickr-ga on 19 Jul 2004 02:29 PDT
Hi,
I would like to get another perl script please. I have posted the
question if you would like to do it.
Thanks,
Mick
|
Clarification of Answer by palitoy-ga on 19 Jul 2004 05:14 PDT
Hi Mick
I have just completed your other script question. Hopefully you will
find that it meets your needs. If not, just ask for clarification on
that question and I will work it through with you again.
|