Hello Spangler,
The program listed at the end will combine the two files as you
specified and generate output to "standard output" which can be
directed to a new file. I kept the program simple to make it easier to
explain, but I am also describing several "improvements" that may be
helpful.
The first line
#!/usr/bin/perl
tells a Unix (or Linux) system that you want perl to process the
following commands. This assumes that perl is installed in a system
directory (/usr/bin) - the typical location. On systems that do not
support this, it should be processed as a comment and ignored.
The next few lines are comments to help explain what the program does.
The line starting with "die" is executed only if there are fewer than
two parameters (the file names) on the command line. The $0 is
replaced by the name of your script. In a numeric comparison like this
one, @ARGV gives the number of elements in the array (the last index
would be $#ARGV). If die is executed, it puts out an
informative message to help the user run the perl script and the
script quits. If you want a third parameter (say the output file) -
change the 2 to a 3.
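To make the argument check a little more concrete, here is a small sketch of the difference between an array used as a number (the element count) and the $# form (the last index). The @args list and its file names are made up for the example and simply stand in for the real @ARGV.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical argument list, standing in for the real @ARGV.
my @args = ("file1.txt", "file2.txt");

# An array used where a number is expected gives its element count.
my $count = @args;

# $#args is the index of the last element (count minus one).
my $last_index = $#args;

print "count=$count last_index=$last_index\n";

# Same shape as the check in the script: complain when fewer than 2.
print "too few arguments\n" if @args < 2;
```

With two entries in the list, this prints count=2 last_index=1 and the "too few arguments" line is skipped.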
The line starting with $tab creates a string variable and assigns the
"tab" character to it. This is used only to aid in readability in the
print statement later.
The two open statements create file handles for FILE1 and FILE2 using
the first two parameters of the command line. As-is these do not do
any error checking - if that is desired, rewrite to something like this
open(FILE1, $ARGV[0]) || die "Cannot open $ARGV[0]: $!\n";
which will execute "die" only if the open fails, and prints a message
explaining the error (usually something like "file not found"). Note that the array
starts with zero (0), so the references are to $ARGV[0] for the first
argument and $ARGV[1] for the second argument.
If you need to generate output into a new file, add a statement like
open(FILE3, "> " . $ARGV[2]);
[or with an added "die"] to create a file using the third argument.
The open function is defined such that
"filename" - open for reading
"> filename" - open for writing
">> filename" - open for append
and so on.
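The three modes above can be tried out with a short sketch like the one below. Note that it uses the newer three-argument form of open (mode and file name as separate arguments) and "my" file handles, which is a different style from the script at the end, not what that script does. The scratch file comes from the standard File::Temp module only so the example can run anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file so the sketch runs anywhere; its name is arbitrary.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# ">" creates (or truncates) the file for writing.
open(my $out, ">", $path) || die "Cannot write $path: $!\n";
print {$out} "first\n";
close($out);

# ">>" appends to whatever is already in the file.
open(my $app, ">>", $path) || die "Cannot append to $path: $!\n";
print {$app} "second\n";
close($app);

# "<" opens the file for reading.
open(my $in, "<", $path) || die "Cannot read $path: $!\n";
my @lines = <$in>;
close($in);

print @lines;
```

The file ends up holding the line written with ">" followed by the line added with ">>".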
The while loop repeats, reading a line at a time from the filehandle
FILE1 until that first file is exhausted. The next statement strips
off the newline from the end of the line - otherwise the output would
look something like
File1
File2
File1
File2
and so on. The last statement concatenates the line from FILE1, a tab,
and a line from FILE2 and prints the output to the standard output. If
you added a third open statement to create an output file, change
STDOUT to FILE3 [that file will be closed automatically when the
script exits - or use "close" to close it].
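Here is a sketch of the same loop sending its output to a file handle of your choice instead of STDOUT. The merge_lines name is made up for the example, and the in-memory "files" (strings opened as file handles) are only there so it can run without any input files. It assumes both inputs have the same number of lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: join each line of $in1 with the matching line of $in2.
# Assumes both inputs have the same number of lines.
sub merge_lines {
    my ($in1, $in2, $out) = @_;
    while (my $line = <$in1>) {
        chomp($line);          # strip the newline from the first file's line
        my $other = <$in2>;    # matching line, newline still attached
        print {$out} $line . "\t" . $other;
    }
}

# In-memory "files" so the sketch runs without any input files.
my $data1  = "a1\na2\n";
my $data2  = "b1\nb2\n";
my $result = "";
open(my $f1, "<", \$data1)  || die "Cannot open first input: $!\n";
open(my $f2, "<", \$data2)  || die "Cannot open second input: $!\n";
open(my $f3, ">", \$result) || die "Cannot open output: $!\n";

merge_lines($f1, $f2, $f3);
close($f3);

print $result;
```

To write to a real file, open $f3 on a file name (for example the third command line argument) rather than on a string.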
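As a side comment, the program should also run to completion if FILE1
has more lines than FILE2. In this case, the extra lines from FILE1 are
printed with a tab but no newline (the newline normally comes from the
FILE2 line), so they run together at the end of the output. If FILE2 is
longer than FILE1, and it is important that all the
lines get output - add something like this to the end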
while ($line = <FILE2>) {
    print STDOUT $tab . $line;
}
which will walk through the rest of FILE2 and precede the output of
each line with a tab (use FILE3 instead of STDOUT if you added the
output file).
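If the two files may differ in length either way, one possible approach is to read from both in a single loop and treat a missing line as empty. This is only a sketch of that idea, not part of the script below; the merge_uneven name and the sample data are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: merge two inputs of possibly different lengths.
# A missing line on either side is treated as an empty string.
sub merge_uneven {
    my ($in1, $in2, $out) = @_;
    while (1) {
        my $left  = <$in1>;
        my $right = <$in2>;
        # Stop only when both inputs are exhausted.
        last unless defined($left) || defined($right);
        $left  = "" unless defined($left);
        $right = "" unless defined($right);
        chomp($left, $right);
        print {$out} $left . "\t" . $right . "\n";
    }
}

# In-memory "files": the second input is longer than the first.
my $data1  = "a1\n";
my $data2  = "b1\nb2\nb3\n";
my $result = "";
open(my $f1, "<", \$data1)  || die "Cannot open first input: $!\n";
open(my $f2, "<", \$data2)  || die "Cannot open second input: $!\n";
open(my $f3, ">", \$result) || die "Cannot open output: $!\n";

merge_uneven($f1, $f2, $f3);
close($f3);

print $result;
```

Here the extra lines from the second input come out with a leading tab, and extra lines from the first input would come out with a trailing tab and their own newline.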
Let me also provide you with some good text and online references.
"Programming Perl" by Larry Wall and Randal L. Schwartz has an excellent
explanation of perl. I used a first edition of that book extensively
in preparing this answer. For example:
- use of the die statement in the chapter titled "Real Perl Programs"
- the chapters on "The Gory Details" and "Functions" for operator and function use
- the chapter on "An Overview of Perl" for a refresher on the
language to put it together
Online, see
http://www.comp.leeds.ac.uk/Perl/
or
http://www.ebb.org/PickingUpPerl/ (this link did not work, but the next one did)
http://www.linuxtopia.org/online_books/perl/index.html
for a couple of good Perl tutorials, or search with phrases such as
perl tutorial
perl sample scripts
or check out the learning resource at perl.org at
http://learn.perl.org/
If any part of the answer is incomplete or unclear, please make a
clarification request.
Good luck with your work.
--Maniac
#!/usr/bin/perl
# A program to concatenate two files.
# For files FILE1 & FILE2, the output is
# FILE1 <tab> FILE2
# on each line.
die "Usage: $0 file1 file2\nwhere file1 and file2 are files with equal
number of lines.\n" if @ARGV < 2;
$tab = "\t";
open(FILE1, $ARGV[0]);
open(FILE2, $ARGV[1]);
while ($line = <FILE1>) {
    $line =~ s/\n//;
    print STDOUT $line . $tab . <FILE2>;
}