Q: Simple program needed to read and parse information from finance.yahoo.com (Answered, 5 out of 5 stars, 0 comments)
Question  
Subject: Simple program needed to read and parse information from finance.yahoo.com
Category: Computers > Programming
Asked by: kr-ga
List Price: $10.00
Posted: 06 Jun 2004 01:07 PDT
Expires: 06 Jul 2004 01:07 PDT
Question ID: 357048
I need a very simple programming project to be completed. Here are the
requirements:

The program will read a list of stock names and stock symbols from a
tab-delimited text file
and output the following data for each stock:

Stock Name
Stock Symbol
Price Sales Ratio
52 Week Low Price
Current Price 
Appreciation of Stock Price (current price compared to 52 Week Low
Price), expressed in %. The equation for this is
(current price - 52 Week Low)/52 Week Low * 100

The output file needs to be tab delimited.

Example 

the input file is:

Walmart(TAB)WMT

the output file should be

Walmart(TAB)WMT(TAB)0.94(TAB)50.50(TAB)56.59(TAB)12.06
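
The last column of the example can be checked with a few lines of Perl. This is only a minimal sketch of the formula given above; the function name appreciation is made up for illustration, and the guard against a missing or zero 52 Week Low is an added assumption, not part of the stated requirements.

```perl
use strict;
use warnings;

# Percentage gain of the current price over the 52 Week Low.
# Returns nothing (undef in scalar context) when either value is missing
# or the low is zero, so the caller never divides by zero.
sub appreciation {
    my ($current, $low) = @_;
    return unless defined $current && defined $low && $low != 0;
    return sprintf("%.2f", ($current - $low) / $low * 100);
}

# Walmart figures from the example above
print appreciation(56.59, 50.50), "\n";   # prints 12.06
```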

Basically this data is all there on http://finance.yahoo.com. All that
is required is
a program to read the data off finance.yahoo.com and parse the
relevant information. For example, above information on Walmart is
available at http://finance.yahoo.com/q?s=WMT and
http://finance.yahoo.com/q/ks?s=WMT. Notice that WMT is stock symbol
for Walmart.

The program needs to run under Windows 98 & above. Elaborate interface
is not required. It can be a simple command line program. I will need
the source code for the program.
Answer  
Subject: Re: Simple program needed to read and parse information from finance.yahoo.com
Answered By: palitoy-ga on 06 Jun 2004 04:11 PDT
Rated:5 out of 5 stars
 
Hello Kr

This problem is crying out for a small Perl script, so I have tackled
it this way.  If you do not have Perl installed on your system you can
download it at http://www.activestate.com.  You will also need to
install the LWP and HTML::TokeParser modules.

ActivePerl Download: http://www.activestate.com/Products/ActivePerl/
How to install modules: http://forums.devshed.com/archive/t-143549

The source code for the Perl script is:

#========================================================
#!/usr/bin/perl

# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;

# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$text,$output);

# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();

# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";
    while( <FILE> ) {
        chomp($_);
        my ($stockname,$code) = split("\t",$_);

        # where are we searching?
        $url = "http://localhost/" . $code;
        #$url = "http://finance.yahoo.com/q/ks?s=" . $code;
        print $code;
        # ok, get the page we are searching
        $response = $browser->get($url);

        # parse the page tag by tag
        $stream = HTML::TokeParser->new( $response->content_ref );
        $stream->{'textify'} = {}; # remove [img] etc entities

        # current price
        $response->content =~ /<big><b>(.*)<\/b><\/big>/i
          || warn "Not matched";
        $current = $1;

        # get the other data
        while ( $tag = $stream->get_tag('td') ) {
          $text = $stream->get_trimmed_text('/td');
          if ( $text eq 'Price/Sales (ttm):' ) {
                $tag = $stream->get_tag('td');
                $ttm = $stream->get_trimmed_text('/td');
          }
          if ( substr($text,0,11) eq '52-Week Low' ) {
                $tag = $stream->get_tag('td');
                $low = $stream->get_trimmed_text('/td');
          }
        }; # end while

        # calculate appreciation
        $appr = $current-$low;
        $appr = $appr/$low;
        $appr = sprintf("%.2f",$appr*100);

        $output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
$low . "\t" . $current . "\t" . $appr . "\n";

    }
close FILE;

open (FILEHANDLE, ">output.txt") or die "Can't write output.txt : $!";
print FILEHANDLE $output;
close (FILEHANDLE);

exit(0);
#========================================================

It will read in a file called stocks.txt which contains the stocks you
wish to check on separate lines in the format STOCKNAME<tab>STOCKCODE.

It then outputs a file called output.txt.

Stocks.txt should be placed in the same folder as the above Perl
script; output.txt will also be created in this same folder.

If you have any questions or queries please ask for clarification and
I will do my best to help you.

Request for Answer Clarification by kr-ga on 06 Jun 2004 10:52 PDT
Hello Palitoy,

Thank you for the quick answer. However, I don't want to do a 100 MB
Perl installation for such a simple solution. Maybe I made a mistake
by not mentioning that I need a simple .exe file which can take
command line input, or ask for the input and output files, and do the
required parsing. Is it possible to provide me with a simple .exe file
that does the job? Maybe this needs to be done in Visual Basic or Delphi?

Thanks,

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:09 PDT
The solution I have provided you matches your question requirements
and runs from the command line.  The ActivePerl installation file is
only 8.3 MB (not 100 MB).

Sorry but I do not have access to VB or Delphi to provide you with a
.exe solution and I am afraid no-one else at Google Answers would be
able to provide you with an .exe file either as we can only post
textual answers or links to items on the internet.  We cannot post
attachment-type replies...

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:20 PDT
As another thought, there is also a program that can create an .exe
file from a perl script at http://www.indigostar.com/perl2exe.htm

I have this program and can recommend it (but as I indicated before I
have no means of getting the .exe file to you as email contact is
forbidden at Google Answers).

Request for Answer Clarification by kr-ga on 06 Jun 2004 11:26 PDT
Hello Palitoy

It is an 8.3 MB download that expands to a 100 MB installation with
Windows Installer. Anyway, I installed the
package and the modules as suggested and I get the following output
when I try to run the script.

C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 24,
<FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.

Let me know how to proceed.

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:34 PDT
You will not regret the installation, I assure you. Perl is a very
useful language for this type of thing.

Sorry, there is a leftover test line in the script that I did not
remove before posting it.  Look for these two lines:

$url = "http://localhost/" . $code;
#$url = "http://finance.yahoo.com/q/ks?s=" . $code;

Change them to:

#$url = "http://localhost/" . $code;
$url = "http://finance.yahoo.com/q/ks?s=" . $code;

You could in fact remove the "localhost" line completely by deleting
it as it only applies to my testing system.  It was trying to look for
the files in the wrong location.

Doh!  Sorry about that!

Request for Answer Clarification by kr-ga on 06 Jun 2004 11:39 PDT
I have changed the url lines as suggested. After changing the lines, I
get the following message:

C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 25,
<FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.

Please advise

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:41 PDT
Can you tell me what you have in your stocks.txt file?

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:44 PDT
Please also ensure that this is on one line not two:

$output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
$low . "\t" . $current . "\t" . $appr . "\n";

(It is near the bottom of the script.)
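
One way to avoid depending on how that long line wraps is to build the row with join. The following is only a sketch, not part of the posted script: the helper name output_row is made up, and turning undefined fields into empty strings is an assumption about what you would want in the output.

```perl
use strict;
use warnings;

# Build one tab-delimited output row. Undefined fields become empty strings
# so a missing value does not trigger an "uninitialized value" warning.
sub output_row {
    my @fields = @_;
    return join("\t", map { defined $_ ? $_ : '' } @fields) . "\n";
}

# Values taken from the Walmart example in the question
print output_row('Walmart', 'WMT', '0.94', '50.50', '56.59', '12.06');
```

In the script this would stand in for the long concatenation, for example $output .= output_row($stockname, $code, $ttm, $low, $current, $appr);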

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:56 PDT
The only way I can replicate your error messages at my end is by NOT
having the STOCKNAME and STOCKCODE separated by a tab.  In your
initial question, this was the format you suggested... have you
followed this in your stocks.txt file?

Request for Answer Clarification by kr-ga on 06 Jun 2004 11:56 PDT
I made sure that the following is on one line.

$output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
$low . "\t" . $current . "\t" . $appr . "\n";

However, I still get the same response


C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 25,
<FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line
1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.


Here is the content of test stocks.txt

Walmart WMT
Microsoft       MSFT

There is a <tab> between stock name and stock symbol

Request for Answer Clarification by kr-ga on 06 Jun 2004 12:13 PDT
Got it! My mistake: the tabs were missing in the stocks.txt file. One
request: could you add one more variable to the parsing module? I also
need the market capitalization of the company. The information is
at the same URL as provided earlier and is indicated as Market Cap:

I would appreciate it if you could update the script to produce the
following output (tab delimited):

Stock Name
Stock Symbol
Market Cap
Price Sales Ratio
52 Week Low Price
Current Price 

Thanks

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:19 PDT
Try adding this line after chomp($_);

$_ =~ s/\t+/\t/g;

This removes the possibility of multiple tabs in the stocks.txt file.

I still believe this is the cause of the error, as it is the only way
I can replicate it on my computer.
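
To make the tab handling concrete, here is a small stand-alone sketch. The function name parse_stock_line is made up for illustration, and stripping a Windows carriage return is an added assumption. It collapses repeated tabs as above and returns nothing when a line has no tab-separated symbol.

```perl
use strict;
use warnings;

# Split one line of stocks.txt into (name, symbol).
# Returns an empty list for blank lines or lines without a tab-separated symbol.
sub parse_stock_line {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;    # strip the line ending, including Windows CRLF
    $line =~ s/\t+/\t/g;       # collapse runs of tabs into a single tab
    my ($name, $code) = split /\t/, $line;
    return unless defined $name && length $name
               && defined $code && length $code;
    return ($name, $code);
}

my @ok  = parse_stock_line("Microsoft\t\tMSFT\n");   # ("Microsoft", "MSFT")
my @bad = parse_stock_line("Walmart WMT\n");         # () because a space is not a tab
print scalar(@ok), " fields, ", scalar(@bad), " fields\n";
```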

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:33 PDT
Another way to replicate those error messages is if you have any blank
lines in the stocks.txt file.

I have added a couple of lines to check for this eventuality.

Try the code in my next clarification (it is all code - no waffle from me!).

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:33 PDT
#!/usr/bin/perl

# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;

# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$text,$output);

# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();

# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";
    while( <FILE> ) {
        chomp($_);
        $_ =~ s/\t+/\t/g;
        my ($stockname,$code) = split("\t",$_);

        if ( defined($stockname) && defined($code) ) {
        # where are we searching?
        $url = "http://finance.yahoo.com/q/ks?s=" . $code;

        # ok, get the page we are searching
        $response = $browser->get($url);

        # parse the page tag by tag
        $stream = HTML::TokeParser->new( $response->content_ref );
        $stream->{'textify'} = {}; # remove [img] etc entities

        # current price
        $response->content =~ /<big><b>(.*)<\/b><\/big>/i
          || warn "Not matched";
        $current = $1;

        # get the other data
        while ( $tag = $stream->get_tag('td') ) {
          $text = $stream->get_trimmed_text('/td');
          if ( $text eq 'Price/Sales (ttm):' ) {
                $tag = $stream->get_tag('td');
                $ttm = $stream->get_trimmed_text('/td');
          }
          if ( substr($text,0,11) eq '52-Week Low' ) {
                $tag = $stream->get_tag('td');
                $low = $stream->get_trimmed_text('/td');
          }
        }; # end while

        # calculate appreciation
        $appr = $current-$low;
        $appr = $appr/$low;
        $appr = sprintf("%.2f",$appr*100);

        $output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" .
$low . "\t" . $current . "\t" . $appr . "\n";
        };

    }
close FILE;

open (FILEHANDLE, ">output.txt") or die "Can't write output.txt : $!";
print FILEHANDLE $output;
close (FILEHANDLE);

exit(0);

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:37 PDT
Our last clarifications crossed in the ether I think!  I would still
go with my updated script as it adds that extra check for null data.

Do you still want the Appreciation in the output file?

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:47 PDT
I am going offline now for 12 hours so I will post the clarification
WITH the appreciation and new market cap information.

If you are still having problems please post for clarification and I
will deal with it as soon as I am back online.

I hope you agree the perl solution is fast and efficient once you have it working!

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:47 PDT
#!/usr/bin/perl

# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;

# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$cap,$text,$output);

# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();

# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";
    while( <FILE> ) {
        chomp($_);
        $_ =~ s/\t+/\t/g;
        my ($stockname,$code) = split("\t",$_);

        if ( defined($stockname) && defined($code) ) {
        # where are we searching?
        $url = "http://finance.yahoo.com/q/ks?s=" . $code;

        # ok, get the page we are searching
        $response = $browser->get($url);

        # parse the page tag by tag
        $stream = HTML::TokeParser->new( $response->content_ref );
        $stream->{'textify'} = {}; # remove [img] etc entities

        # current price
        $response->content =~ /<big><b>(.*)<\/b><\/big>/i
          || warn "Not matched";
        $current = $1;

        # get the other data
        while ( $tag = $stream->get_tag('td') ) {
          $text = $stream->get_trimmed_text('/td');
          if ( $text eq 'Price/Sales (ttm):' ) {
                $tag = $stream->get_tag('td');
                $ttm = $stream->get_trimmed_text('/td');
          }
          if ( substr($text,0,11) eq '52-Week Low' ) {
                $tag = $stream->get_tag('td');
                $low = $stream->get_trimmed_text('/td');
          }
          if ( substr($text,0,10) eq 'Market Cap' ) {
                $tag = $stream->get_tag('td');
                $cap = $stream->get_trimmed_text('/td');
          }

        }; # end while

        # calculate appreciation
        $appr = $current-$low;
        $appr = $appr/$low;
        $appr = sprintf("%.2f",$appr*100);

        $output .= $stockname . "\t" . $code . "\t" . $cap . "\t" .
$ttm . "\t" . $low . "\t" . $current . "\t" . $appr . "\n";
        };

    }
close FILE;

open (FILEHANDLE, ">output.txt") or die "Can't write output.txt : $!";
print FILEHANDLE $output;
close (FILEHANDLE);

exit(0);
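
For anyone adapting the script, the label/value lookup in the inner while loop can be tried offline on a small HTML fragment. This is only a sketch under stated assumptions: the function name extract_fields and the sample fragment are made up, the market cap figure and the date in the 52-Week Low label are placeholders, and the real Yahoo page layout may differ. It uses the same HTML::TokeParser calls as the script but returns a fresh hash for each page, so a value that is missing for one stock cannot be carried over from the previous one.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Collect label/value pairs from consecutive <td> cells into a fresh hash.
sub extract_fields {
    my ($html) = @_;
    my %field;
    my $stream = HTML::TokeParser->new(\$html) or die "Cannot parse HTML";
    while ( $stream->get_tag('td') ) {
        my $label = $stream->get_trimmed_text('/td');
        for my $want ('Market Cap', 'Price/Sales', '52-Week Low') {
            next unless index($label, $want) == 0;   # label starts with $want
            $stream->get_tag('td');                  # move on to the value cell
            $field{$want} = $stream->get_trimmed_text('/td');
            last;
        }
    }
    return %field;
}

# Hypothetical fragment; the market cap value and the date are placeholders.
my $sample = '<table>'
  . '<tr><td>Market Cap (intraday):</td><td>100.0B</td></tr>'
  . '<tr><td>Price/Sales (ttm):</td><td>0.94</td></tr>'
  . '<tr><td>52-Week Low (1-Jan-04):</td><td>50.50</td></tr>'
  . '</table>';

my %f = extract_fields($sample);
print "$_ = $f{$_}\n" for sort keys %f;
```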
kr-ga rated this answer:5 out of 5 stars
Perfect!!
