Hi k9queen!
Let's first run the regression in order to find the trend forecast
that includes the dummy variable. When including the dummy, the data
set becomes:
Quarter  Sales (in thousands of $)  Tax
=========================================
1        23000                      0
2        27000                      0
3        28500                      0
4        30200                      0
5        32600                      0
6        35210                      0
7        37100                      0
8        40000                      0
9        41450                      0
10       47210                      0
11       51250                      0
12       55780                      0
13       56210                      0
14       57290                      0
15       43000                      1
16       45210                      1
17       47621                      1
18       49821                      1
19       51200                      1
20       54780                      1
21       63210                      0
22       67801                      0
23       72345                      0
24       75321                      0
25       80321                      0
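The Tax column above can also be generated rather than typed in. Here is a minimal sketch in Python, assuming (as the table shows) that the tax applies in quarters 15 through 20. The names TAX_START, TAX_END and tax_dummy are made up for illustration only.

```python
# Quarters in which the tax is in effect, read off the table above.
# TAX_START, TAX_END and tax_dummy are illustrative names, not from the question.
TAX_START, TAX_END = 15, 20

def tax_dummy(quarter):
    """Return 1 if the tax applies in the given quarter, else 0."""
    return 1 if TAX_START <= quarter <= TAX_END else 0

quarters = list(range(1, 26))           # quarters 1..25
tax = [tax_dummy(q) for q in quarters]  # the dummy column
print(tax)
```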
As you can see, the tax dummy takes on the value 1 when the tax is
being applied and 0 when it's not. In the regression, the coefficient
of this variable will show by how much sales are reduced while the
tax is in effect. Now let's run the regression. We will take the dependent
variable to be sales per quarter (the 2nd column), and we'll use 2
explanatory variables: the quarter number (so this will be a trend
model, that is, one that explains the dependent variable with the
passage of time), and the tax dummy. The tax dummy will allow the
model to fit a separate trend line to the periods where the tax is
in effect. If we don't include the dummy, the trend line will try to
fit all the values equally, while it's clear that sales in the tax
period are substantially lower than sales in other periods. The result
of this regression is the following:
Y = 22993 + 2204*t - 12961*Tax
where t is the time index (the number of the quarter) and Tax is the
tax dummy. The R-squared of this regression is 0.97. Since the
coefficient of the dummy is negative, the model tells us that the tax
has a negative effect on sales. In fact, since Tax=0 when there is no
tax, and Tax=1 when the tax is in effect, we can think of this
equation as two trend lines:
Trend line without tax:
Y = 22993 + 2204*t - 12961*0 = 22993 + 2204*t
Trend line with tax:
Y = 22993 + 2204*t - 12961*1 = 10032 + 2204*t
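If you want to check the coefficients yourself, the regression of sales on t and Tax can be sketched in plain Python. This is only one possible way of doing it (a spreadsheet or any statistics package would do the same job): it solves the ordinary least squares normal equations for two explanatory variables by hand. The helper names mean and cross are illustrative.

```python
# Sales per quarter (in thousands of $), copied from the table above.
sales = [23000, 27000, 28500, 30200, 32600, 35210, 37100, 40000, 41450,
         47210, 51250, 55780, 56210, 57290, 43000, 45210, 47621, 49821,
         51200, 54780, 63210, 67801, 72345, 75321, 80321]
t = list(range(1, len(sales) + 1))            # time index: 1..25
tax = [1 if 15 <= q <= 20 else 0 for q in t]  # tax dummy

def mean(xs):
    return sum(xs) / len(xs)

def cross(xs, ys):
    """Sum of products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

# Normal equations for Y = a + b*t + c*Tax, written in centered form.
s_tt, s_tx, s_xx = cross(t, t), cross(t, tax), cross(tax, tax)
s_ty, s_xy = cross(t, sales), cross(tax, sales)
det = s_tt * s_xx - s_tx ** 2
b = (s_xx * s_ty - s_tx * s_xy) / det   # slope on t
c = (s_tt * s_xy - s_tx * s_ty) / det   # coefficient of the tax dummy
a = mean(sales) - b * mean(t) - c * mean(tax)

intercept_no_tax = a      # trend line when Tax = 0
intercept_tax = a + c     # trend line when Tax = 1
print(f"Y = {a:.2f} + {b:.2f}*t + ({c:.2f})*Tax")
print(f"Intercepts: no tax {intercept_no_tax:.2f}, tax {intercept_tax:.2f}")
```

The printed coefficients should agree with the equation above up to rounding.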
The two trend lines are substantially different: they have the same
slope, but the line for the tax periods lies 12961 below the other,
so sales really do behave differently in periods with and without the
tax. If we fit a trend line to the data without including the dummy,
the model has to fit these substantially different data points with a
single line, and it gives us an "average" trend, which is a worse fit
for both tax and non-tax periods. Let's run the regression without
the dummy in order to compare them:
Y = 23382 + 1935*t
The R-squared of this regression is 0.85. The R-squared of the
regression with the dummy was much higher (0.97), implying that the
dummy model fits the data better.
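The R-squared comparison can be reproduced with a short Python sketch. It fits both models by ordinary least squares and computes R-squared as 1 minus the ratio of the residual sum of squares to the total sum of squares. The helper names (mean, cross, r_squared) are illustrative, and this is just one way to do the calculation.

```python
sales = [23000, 27000, 28500, 30200, 32600, 35210, 37100, 40000, 41450,
         47210, 51250, 55780, 56210, 57290, 43000, 45210, 47621, 49821,
         51200, 54780, 63210, 67801, 72345, 75321, 80321]
t = list(range(1, len(sales) + 1))
tax = [1 if 15 <= q <= 20 else 0 for q in t]

def mean(xs):
    return sum(xs) / len(xs)

def cross(xs, ys):
    """Sum of products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def r_squared(actual, fitted):
    """1 - (residual sum of squares / total sum of squares)."""
    m = mean(actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    ss_tot = sum((y - m) ** 2 for y in actual)
    return 1 - ss_res / ss_tot

# Model without the dummy: Y = a0 + b0*t
b0 = cross(t, sales) / cross(t, t)
a0 = mean(sales) - b0 * mean(t)
fit_plain = [a0 + b0 * q for q in t]

# Model with the dummy: Y = a + b*t + c*Tax
s_tt, s_tx, s_xx = cross(t, t), cross(t, tax), cross(tax, tax)
s_ty, s_xy = cross(t, sales), cross(tax, sales)
det = s_tt * s_xx - s_tx ** 2
b = (s_xx * s_ty - s_tx * s_xy) / det
c = (s_tt * s_xy - s_tx * s_ty) / det
a = mean(sales) - b * mean(t) - c * mean(tax)
fit_dummy = [a + b * q + c * d for q, d in zip(t, tax)]

r2_plain = r_squared(sales, fit_plain)
r2_dummy = r_squared(sales, fit_dummy)
print(f"Without dummy: Y = {a0:.0f} + {b0:.0f}*t, R-squared = {r2_plain:.2f}")
print(f"With dummy:    R-squared = {r2_dummy:.2f}")
```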
You can also see graphically that the dummy model is better. I plotted
a graph and uploaded it to:
http://www.angelfire.com/alt/elmarto
The red circles are the data points, the blue (discontinuous) line is
the trend forecast of the model with the dummy and the green line is
the trend forecast of the model without a dummy. It's clear that the
blue line is a better fit for the data points.
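If you can't open the graph, the same comparison can be made with numbers. The sketch below plugs the two estimated equations from this answer into Python and computes the average absolute forecast error separately for tax and non-tax quarters. The function names (with_dummy, without_dummy, mean_abs_error) are illustrative, and the mean absolute error is just one convenient way of measuring fit; it was not part of the original regression output.

```python
sales = [23000, 27000, 28500, 30200, 32600, 35210, 37100, 40000, 41450,
         47210, 51250, 55780, 56210, 57290, 43000, 45210, 47621, 49821,
         51200, 54780, 63210, 67801, 72345, 75321, 80321]
quarters = list(range(1, len(sales) + 1))
tax = [1 if 15 <= q <= 20 else 0 for q in quarters]
rows = list(zip(quarters, sales, tax))

def with_dummy(t, d):
    """Trend forecast from the model with the tax dummy (rounded coefficients)."""
    return 22993 + 2204 * t - 12961 * d

def without_dummy(t, d):
    """Trend forecast from the model without the dummy; d is ignored."""
    return 23382 + 1935 * t

def mean_abs_error(predict, tax_flag):
    """Average absolute error over the quarters where Tax == tax_flag."""
    errors = [abs(y - predict(q, d)) for q, y, d in rows if d == tax_flag]
    return sum(errors) / len(errors)

for flag, label in [(0, "non-tax quarters"), (1, "tax quarters")]:
    print(label,
          "| with dummy:", round(mean_abs_error(with_dummy, flag)),
          "| without dummy:", round(mean_abs_error(without_dummy, flag)))
```

In both groups of quarters the average error of the dummy model should come out smaller, which is what the graph shows visually.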
I hope this helps! If you have any questions regarding my answer, please
don't hesitate to request a clarification before rating it. Otherwise
I await your rating and final comments.
Best wishes!
elmarto