Hello Kr
This solution is crying out for a small Perl script so I have tackled
it this way. If you do not have Perl installed on your system you can
download it at http://www.activestate.com. You will also need to have
installed the LWP and HTML::TokeParser modules.
ActivePerl Download: http://www.activestate.com/Products/ActivePerl/
How to install modules: http://forums.devshed.com/archive/t-143549
The source code for the Perl script is:
#========================================================
#!/usr/bin/perl
# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;
# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$text,$output);
# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();
# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";
while( <FILE> ) {
    chomp($_);
    my ($stockname,$code) = split("\t",$_);
    # where are we searching?
    $url = "http://localhost/" . $code;
    #$url = "http://finance.yahoo.com/q/ks?s=" . $code;
    print $code;
    # ok, get the page we are searching
    $response = $browser->get($url);
    # parse the page tag by tag
    $stream = HTML::TokeParser->new( $response->content_ref );
    $stream->{'textify'} = {}; # remove [img] etc entities
    # current price
    $response->content =~ /<big><b>(.*)<\/b><\/big>/i
        || warn "Not matched";
    $current = $1;
    # get the other data
    while ( $tag = $stream->get_tag('td') ) {
        $text = $stream->get_trimmed_text('/td');
        if ( $text eq 'Price/Sales (ttm):' ) {
            $tag = $stream->get_tag('td');
            $ttm = $stream->get_trimmed_text('/td');
        }
        if ( substr($text,0,11) eq '52-Week Low' ) {
            $tag = $stream->get_tag('td');
            $low = $stream->get_trimmed_text('/td');
        }
    }; # end while
    # calculate appreciation
    $appr = $current-$low;
    $appr = $appr/$low;
    $appr = sprintf("%.2f",$appr*100);
    $output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
        $low . "\t" . $current . "\t" . $appr . "\n";
}
close FILE;
open (FILEHANDLE, ">output.txt") or die "no such file";
print FILEHANDLE $output;
close (FILEHANDLE);
exit(0);
#========================================================
It reads a file called stocks.txt, which contains the stocks you
wish to check on separate lines in the format STOCKNAME<tab>STOCKCODE.
It then writes a file called output.txt.
stocks.txt should be placed in the same folder as the above Perl
script; output.txt will also be created in this same folder.
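For reference, the appreciation figure written to output.txt is the percentage gain of the current price over the 52-week low. Below is a minimal sketch of that calculation on its own, assuming both prices are plain numbers. The appreciation helper and its check for a missing or zero low are illustrative only and are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage gain of the current price over the
# 52-week low, formatted to two decimal places. Returns undef when either
# value is missing or the low is zero, so no division by zero can occur.
sub appreciation {
    my ($current, $low) = @_;
    return undef unless defined $current && defined $low && $low != 0;
    return sprintf("%.2f", ($current - $low) / $low * 100);
}

# A stock at 55.00 with a 52-week low of 50.00 has gained 10 percent.
print appreciation(55.00, 50.00), "\n";   # prints 10.00
```

Returning undef instead of dying lets the caller decide whether to leave the column blank or skip the stock.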
If you have any questions, please ask for clarification and
I will do my best to help you.
|
Request for Answer Clarification by kr-ga on 06 Jun 2004 10:52 PDT
Hello Palitoy,
Thank you for the quick answer. However, I don't want to do a 100 MB
Perl installation for such a simple solution. Maybe I made a mistake
in not mentioning that I need a simple .exe file which can take
command line input, or ask for the input and output files, and do the
required parsing. Is it possible to provide me with a simple .exe file
that does the job? Maybe this needs to be done in Visual Basic or Delphi?
Thanks,
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 11:09 PDT
The solution I have provided matches your question requirements
and runs from the command line. The ActivePerl installation file is
only 8.3 MB (not 100 MB).
Sorry, but I do not have access to VB or Delphi to provide you with an
.exe solution, and I am afraid no-one else at Google Answers would be
able to provide you with an .exe file either, as we can only post
textual answers or links to items on the internet. We cannot post
attachment-type replies...
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 11:20 PDT
As another thought, there is also a program that can create an .exe
file from a Perl script at http://www.indigostar.com/perl2exe.htm
I have this program and can recommend it (but as I indicated before, I
have no means of getting the .exe file to you, as email contact is
forbidden at Google Answers).
|
Request for Answer Clarification by kr-ga on 06 Jun 2004 11:26 PDT
Hello Palitoy
It is an 8.3 MB download that expands to a 100 MB installation with
Windows Installer. Anyway, I installed the
package and the modules as suggested, and I get the following output
when I try to run the script.
C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 24,
<FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.
Let me know how to proceed.
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 11:34 PDT
You will not regret the installation I assure you; Perl is a very
useful language for this type of thing.
Sorry, there is a leftover test line in the script that I did not remove
prior to posting it. Look for these two lines:
$url = "http://localhost/" . $code;
#$url = "http://finance.yahoo.com/q/ks?s=" . $code;
Change them to:
#$url = "http://localhost/" . $code;
$url = "http://finance.yahoo.com/q/ks?s=" . $code;
You could in fact delete the "localhost" line completely,
as it only applies to my testing system. The script was trying to fetch
the pages from the wrong location.
Doh! Sorry about that!
|
Request for Answer Clarification by kr-ga on 06 Jun 2004 11:39 PDT
I have changed the URL lines as suggested. After changing the lines, I
get the following message:
C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 25,
<FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.
Please advise
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 11:41 PDT
Can you tell me what you have in your stocks.txt file?
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 11:44 PDT
Please also ensure that this is on one line not two:
$output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
$low . "\t" . $current . "\t" . $appr . "\n";
(It is near the bottom of the script.)
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 11:56 PDT
The only way I can replicate your error messages at my end is by NOT
having the STOCKNAME and STOCKCODE separated by a tab. In your
initial question, this was the format you suggested... have you
followed this in your stocks.txt file?
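To show why a missing tab produces those "uninitialized value" messages, here is a small sketch. When a line has no tab, split returns a single field, so the stock code is never set and everything built from it is undefined. The parse_stock_line helper is hypothetical and not part of the posted script; it only illustrates the check.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line of stocks.txt into (name, code).
# Returns an empty list when the line has no tab-separated code.
sub parse_stock_line {
    my ($line) = @_;
    chomp $line;
    my ($name, $code) = split /\t/, $line;
    return unless defined $name && length $name
               && defined $code && length $code;
    return ($name, $code);
}

# The first line uses a real tab, the second only a space.
for my $line ("Walmart\tWMT", "Walmart WMT") {
    my @fields = parse_stock_line($line);
    print @fields
        ? "ok: name=$fields[0] code=$fields[1]\n"
        : "no tab-separated code in '$line'\n";
}
```

A text editor that converts tabs to spaces on save would produce exactly the second kind of line.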
|
Request for Answer Clarification by kr-ga on 06 Jun 2004 11:56 PDT
I made sure that the following is on one line.
$output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
$low . "\t" . $current . "\t" . $appr . "\n";
However, I still get the same response
C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 25,
<FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.
Here is the content of test stocks.txt
Walmart WMT
Microsoft MSFT
There is a <tab> between stock name and stock symbol
|
Request for Answer Clarification by kr-ga on 06 Jun 2004 12:13 PDT
Got it! My mistake - the tabs were missing in the stocks.txt file. One
request: could you add one more variable to the parsing module? I also
need the market capitalization of the company. The information is
at the same URL as provided earlier and is labelled "Market Cap:".
I would appreciate it if you could update the script to produce the
following output (delimited with tabs):
Stock Name
Stock Symbol
Market Cap
Price Sales Ratio
52 Week Low Price
Current Price
Thanks
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:19 PDT
Try adding this line after chomp($_);
$_ =~ s/\t+/\t/g;
This collapses any run of multiple tabs in the stocks.txt file into a
single tab. I still believe the error comes from the tabs in that file,
as that is the only way I can replicate it on my computer.
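As a quick illustration of what that substitution changes: split keeps empty fields between consecutive tabs, so with two tabs the second field (the one the script treats as the stock code) is an empty string. The split_fields helper below is a hypothetical wrapper written only for this comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on single tabs, optionally collapsing
# runs of tabs into one tab first (the same s/\t+/\t/g as above).
sub split_fields {
    my ($line, $collapse) = @_;
    $line =~ s/\t+/\t/g if $collapse;
    return split /\t/, $line;
}

my $line  = "Walmart\t\tWMT";          # two tabs between name and code
my @raw   = split_fields($line, 0);    # ("Walmart", "", "WMT")
my @fixed = split_fields($line, 1);    # ("Walmart", "WMT")

printf "without collapse: code='%s'\n", $raw[1];
printf "with collapse:    code='%s'\n", $fixed[1];
```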
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:33 PDT
Another way to replicate those error messages is if you have any blank
lines in the stocks.txt file.
I have added a couple of lines to check for this eventuality.
Try the code in my next clarification (it is all code - no waffle from me!).
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:33 PDT
#!/usr/bin/perl
# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;
# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$text,$output);
# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();
# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";
while( <FILE> ) {
    chomp($_);
    $_ =~ s/\t+/\t/g;
    my ($stockname,$code) = split("\t",$_);
    if ( defined($stockname) && defined($code) ) {
        # where are we searching?
        $url = "http://finance.yahoo.com/q/ks?s=" . $code;
        # ok, get the page we are searching
        $response = $browser->get($url);
        # parse the page tag by tag
        $stream = HTML::TokeParser->new( $response->content_ref );
        $stream->{'textify'} = {}; # remove [img] etc entities
        # current price
        $response->content =~ /<big><b>(.*)<\/b><\/big>/i
            || warn "Not matched";
        $current = $1;
        # get the other data
        while ( $tag = $stream->get_tag('td') ) {
            $text = $stream->get_trimmed_text('/td');
            if ( $text eq 'Price/Sales (ttm):' ) {
                $tag = $stream->get_tag('td');
                $ttm = $stream->get_trimmed_text('/td');
            }
            if ( substr($text,0,11) eq '52-Week Low' ) {
                $tag = $stream->get_tag('td');
                $low = $stream->get_trimmed_text('/td');
            }
        }; # end while
        # calculate appreciation
        $appr = $current-$low;
        $appr = $appr/$low;
        $appr = sprintf("%.2f",$appr*100);
        $output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
            $low . "\t" . $current . "\t" . $appr . "\n";
    };
}
close FILE;
open (FILEHANDLE, ">output.txt") or die "no such file";
print FILEHANDLE $output;
close (FILEHANDLE);
exit(0);
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:37 PDT
Our last clarifications crossed in the ether I think! I would still
go with my updated script as it adds that extra check for null data.
Do you still want the Appreciation in the output file?
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:47 PDT
I am going offline now for 12 hours, so I will post the clarification
WITH the appreciation and the new market cap information.
If you are still having problems, please post for clarification and I
will deal with it as soon as I am back online.
I hope you agree the Perl solution is fast and efficient once you have it working!
|
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:47 PDT
#!/usr/bin/perl
# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;
use Scalar::Util qw(looks_like_number);
# initialise the pretend browser and the output buffer
my $browser = LWP::UserAgent->new();
my $output  = '';
# read in contents of file
my $filename = "stocks.txt";
open( my $in, '<', $filename ) or die "Can't open $filename : $!";
while ( my $line = <$in> ) {
    chomp($line);
    $line =~ s/\t+/\t/g;    # collapse runs of tabs into a single tab
    my ($stockname,$code) = split("\t",$line);
    # skip blank lines and lines without a tab-separated stock code
    next unless defined($stockname) && defined($code);
    # reset the values so nothing carries over from the previous stock
    my ($ttm,$low,$current,$cap,$appr) = ('','','','','');
    # where are we searching?
    my $url = "http://finance.yahoo.com/q/ks?s=" . $code;
    # ok, get the page we are searching
    my $response = $browser->get($url);
    warn "Could not fetch $url : " . $response->status_line . "\n"
        unless $response->is_success;
    # parse the page tag by tag
    my $stream = HTML::TokeParser->new( $response->content_ref );
    $stream->{'textify'} = {}; # remove [img] etc entities
    # current price (only use $1 when the pattern actually matched)
    if ( $response->content =~ /<big><b>(.*?)<\/b><\/big>/i ) {
        $current = $1;
    } else {
        warn "Current price not matched for $code\n";
    }
    # get the other data: each label cell is followed by its value cell
    while ( $stream->get_tag('td') ) {
        my $text = $stream->get_trimmed_text('/td');
        if ( $text eq 'Price/Sales (ttm):' ) {
            $stream->get_tag('td');
            $ttm = $stream->get_trimmed_text('/td');
        }
        if ( substr($text,0,11) eq '52-Week Low' ) {
            $stream->get_tag('td');
            $low = $stream->get_trimmed_text('/td');
        }
        if ( substr($text,0,10) eq 'Market Cap' ) {
            $stream->get_tag('td');
            $cap = $stream->get_trimmed_text('/td');
        }
    } # end while
    # calculate appreciation over the 52-week low as a percentage,
    # leaving it blank if either price is missing or the low is zero
    if ( looks_like_number($current) && looks_like_number($low) && $low != 0 ) {
        $appr = sprintf("%.2f", ($current-$low)/$low*100);
    }
    $output .= $stockname . "\t" . $code . "\t" . $cap . "\t" .
        $ttm . "\t" . $low . "\t" . $current . "\t" . $appr . "\n";
}
close $in;
open( my $out, '>', "output.txt" ) or die "Can't write output.txt : $!";
print $out $output;
close $out;
exit(0);
|