Google Answers: Simple program needed to read and parse information from finance.yahoo.com

Q: Simple program needed to read and parse information from finance.yahoo.com (Answered, 5 out of 5 stars, 0 Comments)

Question

Subject: Simple program needed to read and parse information from finance.yahoo.com
Category: Computers > Programming
Asked by: kr-ga
List Price: $10.00

Posted: 06 Jun 2004 01:07 PDT
Expires: 06 Jul 2004 01:07 PDT
Question ID: 357048

I need a very simple programming project to be completed. Here are the
requirements:

The program will read a list of stock names and stock symbols from a
tab-delimited text file and output the following data for each stock:

Stock Name
Stock Symbol
Price Sales Ratio
52 Week Low Price
Current Price 
Appreciation of Stock Price (current price compared to 52 Week Low
Price), expressed in %. The equation for this is
(current price - 52 Week Low) / 52 Week Low * 100

The output file needs to be tab delimited.

Example 

the input file is:

Walmart(TAB)WMT

the output file should be

Walmart(TAB)WMT(TAB)0.94(TAB)50.50(TAB)56.59(TAB)12.06
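
As a quick check of the requested formula against the Walmart example above, here is a minimal Perl sketch. The function name appreciation and the guard against a missing or zero 52-week low are assumptions added for illustration; they are not part of the original request.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: percentage appreciation of the current price
# over the 52-week low, formatted to two decimal places.
# Returns undef when either value is missing or the low is zero, so the
# caller can decide what to write instead of dividing by zero.
sub appreciation {
    my ($current, $low) = @_;
    return undef unless defined($current) && defined($low) && $low != 0;
    return sprintf("%.2f", ($current - $low) / $low * 100);
}

# Values taken from the Walmart example line above.
print appreciation(56.59, 50.50), "\n";    # prints 12.06
```

With the example figures, (56.59 - 50.50) / 50.50 * 100 comes to roughly 12.059, which rounds to the 12.06 shown in the sample output line.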

Basically this data is all there on http://finance.yahoo.com. All that
is required is
a program to read the data off finance.yahoo.com and parse the
relevant information. For example, above information on Walmart is
available at http://finance.yahoo.com/q?s=WMT and
http://finance.yahoo.com/q/ks?s=WMT. Notice that WMT is the stock symbol
for Walmart.

The program needs to run under Windows 98 & above. Elaborate interface
is not required. It can be a simple command line program. I will need
the source code for the program.

Answer

Subject: Re: Simple program needed to read and parse information from finance.yahoo.com
Answered By: palitoy-ga on 06 Jun 2004 04:11 PDT
Rated: 5 out of 5 stars

Hello Kr,

This solution is crying out for a small Perl script, so I have tackled it this way. If you do not have Perl installed on your system you can download it at http://www.activestate.com. You will also need to have installed the LWP and HTML::TokeParser modules.

ActivePerl Download: http://www.activestate.com/Products/ActivePerl/
How to install modules: http://forums.devshed.com/archive/t-143549

The source code for the Perl script is:

#========================================================
#!/usr/bin/perl

# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;

# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$text,$output);

# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();

# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";

while( <FILE> ) {
  chomp($_);
  my ($stockname,$code) = split("\t",$_);

  # where are we searching?
  $url = "http://localhost/" . $code;
  #$url = "http://finance.yahoo.com/q/ks?s=" . $code;
  print $code;

  # ok, get the page we are searching
  $response = $browser->get($url);

  # parse the page tag by tag
  $stream = HTML::TokeParser->new( $response->content_ref );
  $stream->{'textify'} = {}; # remove [img] etc entities

  # current price
  $response->content =~ /<big><b>(.*)<\/b><\/big>/i || warn "Not matched";
  $current = $1;

  # get the other data
  while ( $tag = $stream->get_tag('td') ) {
    $text = $stream->get_trimmed_text('/td');
    if ( $text eq 'Price/Sales (ttm):' ) {
      $tag = $stream->get_tag('td');
      $ttm = $stream->get_trimmed_text('/td');
    }
    if ( substr($text,0,11) eq '52-Week Low' ) {
      $tag = $stream->get_tag('td');
      $low = $stream->get_trimmed_text('/td');
    }
  }; # end while

  # calculate appreciation
  $appr = $current-$low;
  $appr = $appr/$low;
  $appr = sprintf("%.2f",$appr*100);
  $output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" . $low . "\t" . $current . "\t" . $appr . "\n";
}
close FILE;

open (FILEHANDLE, ">output.txt") or die "no such file";
print FILEHANDLE $output;
close (FILEHANDLE);
exit(0);
#========================================================

It will read in a file called stocks.txt, which contains the stocks you wish to check on separate lines in the format

STOCKNAME<tab>STOCKCODE

It then outputs a file called output.txt. Stocks.txt should be placed in the same folder as the above Perl script; output.txt will also be created in this same folder.

If you have any questions or queries please ask for clarification and I will do my best to help you.

Request for Answer Clarification by kr-ga on 06 Jun 2004 10:52 PDT
Hello Palitoy,

Thank you for a quick answer. However, I don't want to do a 100 MB Perl installation for such a simple solution. Maybe I made a mistake in not mentioning that I need a simple .exe file which can take command line input, or ask for the input and output files, and do the required parsing.

Is it possible to provide me with a simple .exe file that does the job? Maybe this needs to be done in Visual Basic or Delphi?

Thanks,

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:09 PDT
The solution I have provided matches your question requirements and runs from the command line. The ActivePerl installation file is only 8.3 MB (not 100 MB).

Sorry, but I do not have access to VB or Delphi to provide you with an .exe solution, and I am afraid no one else at Google Answers would be able to provide you with an .exe file either, as we can only post textual answers or links to items on the internet. We cannot post attachment-type replies...

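
On the request for command line input: the posted script hard-codes stocks.txt and output.txt, but Perl exposes command line arguments in @ARGV, so the file names could be passed in instead. A minimal sketch, where the helper name file_names and the fallback to the original file names are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: take the input and output file names from the
# argument list, falling back to the names used in the posted script.
sub file_names {
    my @args = @_;
    my $in  = defined($args[0]) ? $args[0] : 'stocks.txt';
    my $out = defined($args[1]) ? $args[1] : 'output.txt';
    return ($in, $out);
}

my ($filename, $outname) = file_names(@ARGV);
print "reading $filename, writing $outname\n";
```

Assuming the script is saved as stocks.pl, it could then be run as "perl stocks.pl mylist.txt results.txt", or with no arguments to keep the original behaviour.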
Clarification of Answer by palitoy-ga on 06 Jun 2004 11:20 PDT
As another thought, there is also a program that can create an .exe file from a Perl script at http://www.indigostar.com/perl2exe.htm

I have this program and can recommend it (but, as I indicated before, I have no means of getting the .exe file to you, as email contact is forbidden at Google Answers).

Request for Answer Clarification by kr-ga on 06 Jun 2004 11:26 PDT
Hello Palitoy,

It is an 8.3 MB download and expands to 100 MB of installation with Windows Installer. As for your response of

Anyway, I installed the package and the modules as suggested and I get the following output when I try to run the script.

C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 24, <FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line 1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.

Let me know how to proceed.

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:34 PDT
You will not regret the installation, I assure you; Perl is a very useful language for this type of thing.

Sorry, there is a small typo in the script that I did not remove prior to posting it. Look for these two lines:

$url = "http://localhost/" . $code;
#$url = "http://finance.yahoo.com/q/ks?s=" . $code;

Change them to:

#$url = "http://localhost/" . $code;
$url = "http://finance.yahoo.com/q/ks?s=" . $code;

You could in fact delete the "localhost" line completely, as it only applies to my testing system. The script was trying to look for the files in the wrong location. Doh! Sorry about that!

Request for Answer Clarification by kr-ga on 06 Jun 2004 11:39 PDT
I have changed the url lines as suggested. After changing the lines, I get the following message:

C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 25, <FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line 1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.

Please advise.

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:41 PDT
Can you tell me what you have in your stocks.txt file?

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:44 PDT
Please also ensure that this is on one line, not two:

$output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" . $low . "\t" . $current . "\t" . $appr . "\n";

(It is near the bottom of the script.)

Clarification of Answer by palitoy-ga on 06 Jun 2004 11:56 PDT
The only way I can replicate your error messages at my end is by NOT having the STOCKNAME and STOCKCODE separated by a tab. In your initial question, this was the format you suggested... have you followed this in your stocks.txt file?

Request for Answer Clarification by kr-ga on 06 Jun 2004 11:56 PDT
I made sure that the following is on one line.

$output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" . $low . "\t" . $current . "\t" . $appr . "\n";

However, I still get the same response:

C:\app\ActivePerl\bin>perl stocks.pl
Use of uninitialized value in concatenation (.) or string at stocks.pl line 25, <FILE> line 1.
Use of uninitialized value in print at stocks.pl line 26, <FILE> line 1.
Not matched at stocks.pl line 35, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line 1.
Use of uninitialized value in subtraction (-) at stocks.pl line 53, <FILE> line 1.
Use of uninitialized value in division (/) at stocks.pl line 54, <FILE> line 1.
Illegal division by zero at stocks.pl line 54, <FILE> line 1.

Here is the content of the test stocks.txt:

Walmart WMT
Microsoft MSFT

There is a <tab> between stock name and stock symbol.

Request for Answer Clarification by kr-ga on 06 Jun 2004 12:13 PDT
Got it! My mistake: the tabs were missing in the stocks.txt file.

A request: could you add one more variable to the parsing module? I also need the market capitalization of the company. The information is at the same URL as provided earlier and is labeled "Market Cap:".

I would appreciate it if you could update the script to produce the following output (tab delimited):

Stock Name
Stock Symbol
Market Cap
Price Sales Ratio
52 Week Low Price
Current Price

Thanks

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:19 PDT
Try adding this line after chomp($_);

$_ =~ s/\t+/\t/g;

This removes the possibility of multiple tabs in the stocks.txt file. I still believe this is the cause of the error, as it is the only way I can replicate it on my computer.

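
To see why a missing tab produces the "uninitialized value" warnings, it helps to run the line-splitting step on its own. A small sketch follows; parse_stock_line is a hypothetical helper name, and returning an empty list for unusable lines is an assumption rather than what the posted script does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one line of stocks.txt into (name, code).
# Runs of tabs are collapsed first; an empty list is returned when the
# line is blank or has no tab-separated second field.
sub parse_stock_line {
    my ($line) = @_;
    chomp $line;
    $line =~ s/\t+/\t/g;
    my ($name, $code) = split /\t/, $line;
    return unless defined($name) && length($name)
               && defined($code) && length($code);
    return ($name, $code);
}

# One good line, one with a doubled tab, one with a space, one blank.
for my $line ("Walmart\tWMT\n", "Microsoft\t\tMSFT\n", "Walmart WMT\n", "\n") {
    my @fields = parse_stock_line($line);
    print @fields ? "ok: $fields[0] / $fields[1]\n" : "skipped\n";
}
```

The space-separated line comes back from split as a single field, so the stock code is undefined; that is the value the script then tries to concatenate into the URL.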
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:33 PDT
Another way to replicate those error messages is if you have any blank lines in the stocks.txt file. I have added a couple of lines to check for this eventuality. Try the code in my next clarification (it is all code - no waffle from me!).

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:33 PDT
#!/usr/bin/perl

# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;

# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$text,$output);

# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();

# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";

while( <FILE> ) {
  chomp($_);
  $_ =~ s/\t+/\t/g;
  my ($stockname,$code) = split("\t",$_);
  if ( defined($stockname) && defined($code) ) {

    # where are we searching?
    $url = "http://finance.yahoo.com/q/ks?s=" . $code;

    # ok, get the page we are searching
    $response = $browser->get($url);

    # parse the page tag by tag
    $stream = HTML::TokeParser->new( $response->content_ref );
    $stream->{'textify'} = {}; # remove [img] etc entities

    # current price
    $response->content =~ /<big><b>(.*)<\/b><\/big>/i || warn "Not matched";
    $current = $1;

    # get the other data
    while ( $tag = $stream->get_tag('td') ) {
      $text = $stream->get_trimmed_text('/td');
      if ( $text eq 'Price/Sales (ttm):' ) {
        $tag = $stream->get_tag('td');
        $ttm = $stream->get_trimmed_text('/td');
      }
      if ( substr($text,0,11) eq '52-Week Low' ) {
        $tag = $stream->get_tag('td');
        $low = $stream->get_trimmed_text('/td');
      }
    }; # end while

    # calculate appreciation
    $appr = $current-$low;
    $appr = $appr/$low;
    $appr = sprintf("%.2f",$appr*100);
    $output .= $stockname . "\t" . $code . "\t" . $ttm . "\t" . $low . "\t" . $current . "\t" . $appr . "\n";
  };
}
close FILE;

open (FILEHANDLE, ">output.txt") or die "no such file";
print FILEHANDLE $output;
close (FILEHANDLE);
exit(0);

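
The current-price step is the most fragile part of the script, since it depends on the exact markup around the price and reads $1 even when the match fails. The sketch below shows the same idea on a hard-coded string; the sample HTML and the helper name extract_price are assumptions for illustration, and the real page markup may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the first <big><b>...</b></big> value out of
# a page and return it, or undef if the markup is not found. Capturing
# in list context avoids reading a stale $1 after a failed match.
sub extract_price {
    my ($html) = @_;
    my ($price) = $html =~ m{<big><b>\s*([\d.,]+)\s*</b></big>}i;
    return $price;
}

# Made-up fragment in the shape the script expects.
my $sample = '<td>Last Trade:</td><td><big><b>56.59</b></big></td>';

my $current = extract_price($sample);
print defined($current) ? "current price: $current\n" : "price not found\n";
```

Checking the result for undef before doing any arithmetic is one way to skip a stock cleanly instead of running into the division error seen earlier in this thread.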
Clarification of Answer by palitoy-ga on 06 Jun 2004 12:37 PDT
Our last clarifications crossed in the ether, I think! I would still go with my updated script as it adds that extra check for null data. Do you still want the Appreciation in the output file?

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:47 PDT
I am going offline now for 12 hours, so I will post the clarification WITH the appreciation and new market cap information. If you are still having problems please post for clarification and I will deal with it as soon as I am back online. I hope you agree the Perl solution is fast and efficient once you have it working!

Clarification of Answer by palitoy-ga on 06 Jun 2004 12:47 PDT
#!/usr/bin/perl

# so what are we using?
use strict;
use warnings;
use LWP;
use HTML::TokeParser;

# set up other variables
my ($browser,$url,$stream,$tag,$response,$ttm,$low,$current,$appr,$cap,$text,$output);

# initialise variables for the pretend browser and start the browser
$browser = LWP::UserAgent->new();

# read in contents of file
my $filename = "stocks.txt";
open( FILE, "< $filename" ) or die "Can't open $filename : $!";

while( <FILE> ) {
  chomp($_);
  $_ =~ s/\t+/\t/g;
  my ($stockname,$code) = split("\t",$_);
  if ( defined($stockname) && defined($code) ) {

    # where are we searching?
    $url = "http://finance.yahoo.com/q/ks?s=" . $code;

    # ok, get the page we are searching
    $response = $browser->get($url);

    # parse the page tag by tag
    $stream = HTML::TokeParser->new( $response->content_ref );
    $stream->{'textify'} = {}; # remove [img] etc entities

    # current price
    $response->content =~ /<big><b>(.*)<\/b><\/big>/i || warn "Not matched";
    $current = $1;

    # get the other data
    while ( $tag = $stream->get_tag('td') ) {
      $text = $stream->get_trimmed_text('/td');
      if ( $text eq 'Price/Sales (ttm):' ) {
        $tag = $stream->get_tag('td');
        $ttm = $stream->get_trimmed_text('/td');
      }
      if ( substr($text,0,11) eq '52-Week Low' ) {
        $tag = $stream->get_tag('td');
        $low = $stream->get_trimmed_text('/td');
      }
      if ( substr($text,0,10) eq 'Market Cap' ) {
        $tag = $stream->get_tag('td');
        $cap = $stream->get_trimmed_text('/td');
      }
    }; # end while

    # calculate appreciation
    $appr = $current-$low;
    $appr = $appr/$low;
    $appr = sprintf("%.2f",$appr*100);
    $output .= $stockname . "\t" . $code . "\t" . $cap . "\t" . $ttm . "\t" . $low . "\t" . $current . "\t" . $appr . "\n";
  };
}
close FILE;

open (FILEHANDLE, ">output.txt") or die "no such file";
print FILEHANDLE $output;
close (FILEHANDLE);
exit(0);
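
The label-and-value lookup in the script can be tried offline by feeding HTML::TokeParser a string instead of a fetched page. The sketch below is table-driven, which is a variation on the chain of if statements above rather than the answerer's own code; the sample HTML, its market cap figure and date, and the names %wanted and extract_stats are all made up for illustration. It needs the same HTML::TokeParser module the answer already requires.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Label prefixes to look for, mapped to the key each value is stored under.
my %wanted = (
    'Market Cap'  => 'cap',
    'Price/Sales' => 'ttm',
    '52-Week Low' => 'low',
);

# Hypothetical helper: walk the <td> cells of a page; when a cell starts
# with a wanted label, take the text of the next cell as its value.
sub extract_stats {
    my ($html) = @_;
    my $stream = HTML::TokeParser->new(\$html) or die "Cannot parse HTML";
    my %stats;
    while ( $stream->get_tag('td') ) {
        my $text = $stream->get_trimmed_text('/td');
        for my $label (keys %wanted) {
            next unless index($text, $label) == 0;
            last unless $stream->get_tag('td');
            $stats{ $wanted{$label} } = $stream->get_trimmed_text('/td');
            last;
        }
    }
    return %stats;
}

# Made-up fragment in the label/value layout the script relies on.
my $sample = <<'HTML';
<table>
<tr><td>Market Cap (intraday):</td><td>100.00B</td></tr>
<tr><td>Price/Sales (ttm):</td><td>0.94</td></tr>
<tr><td>52-Week Low (1-Jan-04):</td><td>50.50</td></tr>
</table>
HTML

my %stats = extract_stats($sample);
print join("\t", map { $stats{$_} } qw(cap ttm low)), "\n";
```

Adding another field, as was done for Market Cap, then only means adding one entry to %wanted.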

kr-ga rated this answer: 5 out of 5 stars

Perfect!!

Comments

There are no comments at this time.
