Hello Jpbischke,
For the values you provided, the dependent variable is the second
column (cash flow) and the independent variable is the first column
(year). This is based on the assumption that there is a linear
relationship between year and cash flow, so you can predict cash flow
for other values of year. There is a brief explanation of these terms
and the least squares regression method at:
http://zebu.uoregon.edu/~js/glossary/correlation.html
Note in particular the caution in the last paragraph - just because a
couple variables have a strong correlation (near + / - 1), does not
mean the variables are truly related.
Excel's Regression Tool, will do a linear least squares fit for one or
more independent variables to predict a depedent variable. This
assumes you have a linear relationship between the variables; if not,
you have to use another method. A nice explanation of the Regression
Tool is at:
http://www.jeremymiles.co.uk/regressionbook/extras/appendix2/excel/
which walks through the dialog boxes and explanations in a step by
step basis. There is a link near the top if you want to download the
data and follow along with his example as well.
Let's work through your example assuming your data is organized as follows:
Year Cash Flow
1 1000
2 1140
3 1410
4 1620
where the data is in the first two columns / five rows of a spreadsheet.
Load the analysis tool kit (if not already loaded) and using Tools ->
Data Analysis to get the dialog box with the list of analysis tools.
Scroll down to Regression and select it. You should get another dialog
box similar to that in the example page.
The range of Y (dependent variable) is $B$1:$B$5, the range of X
(independent variable) is $A$1:$A$5. Also be sure to select Labels
(since the range includes the column labels) and set the other values
as you need them. I would also take the default - put the results onto
a new sheet in the workbook. Click OK when ready to do the analysis.
The result is a big pile of data in a new sheet. When I did this, I
got a section starting with
Regression Statistics
Multiple R 0.9933876
R Square 0.986818923
Adjusted R Square 0.980228385
Standard Error 38.92300091
Observations 4
and followed by much more detail. You also get the charts (you may
have to scroll right) in a small form. I usually move / resize them to
see the results.
Getting to details on what all the values are / how to interpret them
is a pretty extensive process and appears to be beyond what you are
looking for. However, a few brief comments should point you on the
right path.
[1] Standard error gives you an idea of the range of variation in new
predictions. In this case, about two thirds of the values will be + /
- 38.92 of the predicted value. If you need better confidence (about
95%) in the estimate, multiply the standard error by 3.
[2] Look at the "Year Line Fit Plot" and "Year Residual Plot". I grew
the charts to look in more detail. With your data, I don't see any
specific trends in the residuals (or difference between the actual and
predicted values). If you did see a trend - perhaps an oscillation or
small errors at one end, larger at the other, that would indicate that
your assumption of a linear relationship is likely incorrect. Let me
show a simple example:
X X*X
1 1
2 4
3 9
4 16
5 25
6 36
7 49
8 64
9 81
Feeding this set of data to the same analysis gets a result that
illustrates my point. The residual plot goes up at both ends, down in
the middle. The predicted and actual lines don't match closely either.
To find more references, I suggest searching with phrases such as:
describe regression
regression independent dependent variable
describe excel regression tool
and similar items.
I've gone pretty quickly through this kind of analysis. If you need
more details, please make a clarification request so I can address
your needs.
--Maniac |