Google Answers Logo
View Question
 
Q: Multiple Linear Regression Calculations, minute by minute ( Answered,   2 Comments )
Question  
Subject: Multiple Linear Regression Calculations, minute by minute
Category: Business and Money > Finance
Asked by: pianoplayer-ga
List Price: $100.00
Posted: 31 Oct 2002 18:06 PST
Expires: 30 Nov 2002 18:06 PST
Question ID: 94897
by: pianoplayer-ga 
List Price: $100.00  Posted: 31 Oct 2002 11:30 PST 
 For every single minute during the day we track the open, high, low
and close of a certain stock index like the S&P 500 and also run
calculations that involve these prices for the last 40 minutes.
Consequently we have a total of 8-40 variables and a single
independent variable(in this case the close five minutes hence) for
each minute of the day.

The goal is to forecast the price five minutes away. Here's and
example
of how the data is used: The data from minutes 2-6, the data from
minutes 7-11, the data from minutes 12-16 constitute three separate
sets and the forecast price five minutes away is calculated along with
R-squared, the coefficient of correlation.

Based upon the above three sets of data (we could use literally dozens
of sets of data)the forecast price five minutes away is calculated all
the time during the present minute.

Be advised: The program must run should run within Excel's last two
verions up to and including Office XP. And...the British program,
Analyse-It, itself an add-in to Excel does not lend itself to this
minute-by-minute type of calculation.

The answer must be MLR that will accept up to 40 independent variables
under the above conditions. This is obviously a statistics question.

All the data are in Microsoft Excel. The problem is always the same
with different numbers.
----

Request for Question Clarification by calebu2-ga on 31 Oct 2002 19:09 PST
pianoplayer,

I need to know a little bit more about what you are looking for to
know what is the best way to answer your question.

I assume that what you want at the end of the day is an excel
spreadsheet which takes your 40 observations and returns a single
forecasted price for 5 periods hence along with some diagnostic
statistics such as Rē to tell you how good of a model you have.

My question to you is what your assumption is on the behavior of stock
prices on a minute by minute basis. From your mention of a MLR, I'm
assuming you have a simple notion of momentum and just want to regress
returns on past returns to extend the best fit line out 5 more
periods. By taking into account the interaction of the 8-40 securities
with each other we can perform a joint prediction of future returns
(and hence future prices). When you say that the observations are
independent, I assume that you mean they are distinct observations
which might still affect each other statistically. Did you want to
just treat the 40 variables as unrelated stocks or did you want to
simplify the correlations between stocks by assuming some kind of
factor model for returns?

What I wasn't sure about was what you meant by splitting the data into
minutes 2-6, 7-11 and 12-16. Was the aim to use the returns from 2-6
to estimate the return in period 11 and see how well your model is
working?

Let me know a little more about your intentions and I'll see what I
can program in excel (doesn't seem like the problem would be out of
the realm of the computational power of excel).

Regards

calebu2-ga

Clarification of Question by pianoplayer-ga on 31 Oct 2002 19:54 PST
Knew this would take more clarification: OK, here's an example:

The first data set has prices and calculated numbers, all in a row and
from the first three minutes of trading, as an example.
Open1,Hi1,Low1,Close1,
Calc1,Average1,Open2,Hi2,Low2,Close2,Calc2,Average2,Open3,Hi3,Lo3,Close3,Calc3,Average3,
Future Price

All are independent variables, with the exception of future price(i.e.
the price five minutes after the time of the close3 price). This data
set along with other data sets from other periods of time during the
trading have independent variables that are NOT duplicated in any
other data sets.

What I mean by this is that another data set could run:
Open12,Hi12...such that a price or calculation from one data set is
not used in any other data set.

The MLR must calculate the future price five minutes hence based on
the last three minutes of data and calcs. During the current minute,
the future or forecast price is constantly being calculated.

I'd gladly call you on the phone, since this would be easier, but the
Google police go gaga when I use personal information.

Request for Question Clarification by calebu2-ga on 31 Oct 2002 20:40 PST
Is there a location on the web to which you could load a sample of the
data that you want to be processed. I'm still a little lost as to how
your data is arranged (could be the time of day getting to me
finally!)

And given that you have open, high, low, close and average prices - is
there a particular model you had in mind that relates these numbers to
the future price (or do you simply want to regress the prices you have
on all variables and see where that leaves you?)

If you can arrange an xls spreadsheet with "data found here" and
"insert answer here" cells that might be a good way of getting the
concept across clearly. You should be OK uploading the file to a
tripod or other free webspace as long as you don't put any personal
contact information in it.

Regards

calebu2-ga

Clarification of Question by pianoplayer-ga on 01 Nov 2002 08:05 PST
Obviously I know little about web hosting, Tripod from Lycos and the
like. And although the boobs at Google dont like personal data given
out, I could care less...some people at Google will stop you, others
will not, a real dicotomy!

But, yes, I can show you enough of a spreadsheet so you can see
exactly what I need. We'll just use the open, hi, low, close and
NC(the price five minutes hence). As an example the data for three
minutes are strung out on a row, and three minutes of data are also
sampled from previous periods during the day.

So let me show you EXACTLY

Clarification of Question by pianoplayer-ga on 01 Nov 2002 08:09 PST
Obviously I know little of web hosting and the tripod stuff through
Lycos etc. If you show me where to post a spreadsheet, Ill do so.

The Google bozos, or at least some of them, will stop me from posting
personal information which is totally ridiculous, if I want to do so.
At any rate I can give you a portion of a spreadsheet which shows
exactly what I need. I can even email you with the spreadsheet portion
as an attachment.

Let me know either or.

Request for Question Clarification by calebu2-ga on 01 Nov 2002 13:47 PST
pianoplayer,

The more I think about your question, the less convinced I am that I
can give you what you want. There are still a lot of issues that are
unresolved - one of the key ones being the motive behind doing this -
is it just for fun (to see whether stock prediction is possible), a
college project or do you actually hope to use it to make money? It
seems a little high priced (bet you never thought you'd hear someone
say that) for the first two possibilities and I think you'd be
disappointed with the results for the third possibility.

You can get a free webhosting site at http://www.tripod.lycos.com/

Regards

calebu2-ga

Clarification of Question by pianoplayer-ga on 01 Nov 2002 17:58 PST
calebu2-ga 

Our motivation is crystal clear. We trade SP index futures to make
money and nothing else. Ive been around various aspects of this
business since 1973.Right now we routinely run MLR statically. But we
cannot get forecasting to update minute by minute using the constantly
changing data...i.e. dynamically.

Once I can upload the spreadsheet to tripod, you'll see precisely what
I'm talking about. Of course the final solution has to be able to
calculate MLR for each minute. Whether the forcasts are accurate is my
problem here...just the implementation of a dynamically calculated
regression is our only goal right now.

Clarification of Question by pianoplayer-ga on 04 Nov 2002 18:56 PST
OK. Per your suggestion I have uploaded a spreadsheet to
http://pianoplayer9.tripod.com/ for you to examine. Changing the
current static method of calculation to being dynamic, minute by
minute will probably entail some macros.

Request for Question Clarification by calebu2-ga on 05 Nov 2002 06:37 PST
OK. Almost there. The only thing we need now is either the name of
your file (because http://pianoplayer9.tripod.com/ does not produce a
directory listing) or a link to it from an index.html file in the same
directory.

Sorry to make life so complicated for you - I'm just trying to do
everything in a somewhat official manner.

Once I have the file I'll see what I think and let you know.

Regards

calebu2-ga

Clarification of Question by pianoplayer-ga on 07 Nov 2002 10:23 PST
OK. Have opened and indexed web page, after much trouble. Go to
http://pianoplayer9.tripod.com/sptrading and you'll see the file
listed. At least I can see the file when I go to that site.

Let me know.
Answer  
Subject: Re: Multiple Linear Regression Calculations, minute by minute
Answered By: calebu2-ga on 15 Nov 2002 11:54 PST
 
pianoplayer-ga,

I have downloaded the spreadsheet and spent some time working on it.
You can access the spreadsheet here :

http://www2.bc.edu/~lawrenst/google/sptrading.xls

Let me give a brief explanation of how the spreadsheet works now :

You need to specify the number of observations to use for the
regression, as well as the number of periods forward to predict (in
this case 5).

The spreadsheet takes the data from your input and instantaneously
runs a MLR to provide you with the output.

The key to doing MLR in excel is the LINEST function.

It takes the following form :

=LINEST(Array of y values, Array of x values, True if constant is
needed in regression, True if require diagnostic statistics)

The output is in the form of an array (so you need to select 13 x 6
cells, type the formula and then press ctrl-shift-enter).

The cells returned are in the format :
coeff1, coeff2, ..., coeff12, const
s.e.1, s.e.2, ..., s.e.12, s.e. of const.
Rē, s.e. of y
F statistic, degrees of freedom
sum of squares (regression), sum of squares (residuals)

The majority of the answer is contained within the spreadsheet I
uploaded (let me know if you want me to explain the spreadsheet and I
will clarify further).

One final remark I should make before I am done is a word of caution
(I take it from your question that you are experienced in this area,
in which case this comment is designed more for other people reading
this answer) : past performance is no guarantee of future performance.

You've seen this comment hundreds of times on financial adverts and
documents, but it is extremely important here. You can use historical
data and create a set of statistics that seems to explain "future"
returns well - based on historical performance. However there is no
guarantee that a high Rē or reasonable coefficients from one period
should provide good estimation for the next period.

The only way you can be even slightly comfortable with your model is
to see how well the estimation works by "back-testing" your results.
So use historical returns to estimate returns and then compare your
predictions with the true value. You won't get an exact match, but you
can tell by the difference between predicted and true values how well
your model is tracking. It is up to you to then decide whether this is
suitable for your needs.

One strong recommendation I have is the book by Burton Malkiel "A
Random Walk Down Wall Street".

http://www.amazon.com/exec/obidos/tg/detail/-/0393320405/qid=1037389808/sr=8-1/ref=sr_8_1/103-9734787-1441453?v=glance&s=books&n=507846

While he is disparaging about most forms of price prediction, he
provides a fairly balanced opinion of what works, what doesn't and
where people have failed in the past. What is most disturbing about
the book is that his advice, which was originally written before the
tech-bubble talks strongly against the psychology that prevailed
during the late 1990s. So even if nobody listened to him, he has still
shown an ability to hit the nail on the head when it comes to sane
investing.

So good luck with the spreadsheet and anybody else thinking of trying
this should proceed with extreme caution!

Regards

calebu2-ga

Search strategy :
Under the excel help system :
LINEST()
FORECAST()

Google searches :
"stock price prediction" day-trading
"multiple linear regressions"
Comments  
Subject: Re: Multiple Linear Regression Calculations, minute by minute
From: asiatechnicals-ga on 11 Nov 2002 04:28 PST
 
NOT INVESTMENT ADVICE:

I'm not a Google answer-guy and am just trying to help out and steer
you both in the right direction.

I believe you are trying to forecast forward short-term stock market
prices from a variety of economic/quantitative/technical inputs.  If
so, I suggest you approach a qualified technical analyst or financial
advisor before you use this technique.  Market Technicians Association
http://www.mta.org/

Yours is a common academic approach and not one that generally
performs well in the markets.  Such an approach is reliant upon the
assumption of (inter alia):

* Constant stock price Momentum (zero acceleration of rate of change
of price)
* Constant relationship between inputs and outputs
* Constant importance (weighting) of inputs
* Randomly distributed errors, no seasonality etc.
* Plenty more requirements - any good statistical textbook will tell
you the requirements for multivariate regression forecasting.

These relationships must remain constant for the duration of the
window (historical testing) period and forecast period for your system
to have the vaguest chance of predictive power.  Markets are dynamic
and rarely behave in this way.  This will be especially true with a
market model that employs 40+ inputs.  You will also need to
walk-forward optimise your model frequently.  That will be a major
challenge for a real-time system.

An alternative, but similar, strategy/tool is to use Artificial
Intelligence (Neural Networks) to train a market model.  Ward Systems
http://www.wardsystems.com/ and other software vendors offer third
party systems that may suit your purposes.  Bear in mind that many of
the same limitations found in multiple regression apply.

Assuming that you are totally committed to testing multiple
regression, in MS Excel, you can use the TREND worksheet function from
the Analysis Tool Pack that comes with Excel. 
=TREND(known_y's,known_x's,new_x's,const).  See the Help file for more
details.

If you exceed TREND's maximum number of variables, you may have to buy
powerful mathematical software (e.g. Mathematica
http://www.wolfram.com/) or statistical software (SPSS
http://www.spss.com/, Minitab http://www.minitab.com/)

Hope this helps.


Additional WWW References:  

A guide to Multiple Regression
http://cne.gmu.edu/modules/dau/stat/regression/multregsn/multregsn_frm.html

A guide to Multiple Regression
http://www.statsoftinc.com/textbook/stmulreg.html

Modelling and Interpreting Interactions in Multiple Regression
http://www.minitab.com/resources/whitepapers/burril2.htm
Subject: Re: Multiple Linear Regression Calculations, minute by minute
From: finipi-ga on 07 Dec 2004 15:54 PST
 
I agree with the above comment - for linear regression to work you
assume that there are a finite number of inputs and that what happened
in the past will happen again in the future with the same inputs. Can
this be said for financial markets?

If you want to auto generate models in real time from live data with
many variables, these guys have a product called Response Surface
Visualizer which I have used to do instant regression analysis.

<a href="http://www.mode2.com.au/DOE.htm">http://www.mode2.com.au/DOE.htm</a>

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy