Q: Statistical economy ( Answered 5 out of 5 stars,   1 Comment )
Question  
Subject: Statistical economy
Category: Business and Money > Economics
Asked by: vitaminc-ga
List Price: $20.00
Posted: 04 Oct 2003 12:35 PDT
Expires: 03 Nov 2003 11:35 PST
Question ID: 262740
Q. You want to do a comparative study of mortality rates for smokers
and non-smokers. You collect information on Y (mortality rate over
10000 individuals-year) and on age X. There are seven observations for
smokers and seven for non-smokers. The basic model is
                       lnY = b0 + b1X + u
Then you include a dummy variable d in this regression to separate
smokers (d=1) from non-smokers (d=0). The coefficient associated with
this variable is r(gamma). Additionally, you include an interaction
variable, equal to the product of d with X, whose coefficient is
alpha. The results obtained from running several regressions are
summarized in the following table:


Regression  b0       b1       r(gamma)  alpha    R^2     F      SSR     DF
   I        -0.8080  0.0955                      0.8924  99.5   1.5395  12
            (-1.33)  (9.98)
  II        0.0104   0.0873                      0.9922  637.7  0.0418   5
            (0.05)   (25.25)
 III        -1.6264  0.1037                      0.9868  373.1  0.1009   5
            (-4.79)  (19.32)
  IV        -1.1130  0.0955   0.61               0.9834  326.3  0.2372  11
            (-4.43)  (24.33)  (7.77)
   V        -1.6264  0.1037   1.6368    -0.0164  0.99    330.8  0.1427  10
            (-5.59)  (22.97)  (4.05)    (-2.57)

(DF = degrees of freedom.)
(Numbers in parentheses are t statistics.)
(In b0 and b1, the 0 and the 1 are subscripts of b.)

a) Explain briefly what information was used in each regression and
explain the connection between the estimated coefficients.

b) Explain the relevant hypothesis for the F-test in each regression.
Under what conditions is the t-test equivalent to the F-test? What can
you say about the relationship between mortality rate and age?

c) Is there a significant difference between smokers and non-smokers
in the relationship between mortality rate and age? Explain.

Clarification of Question by vitaminc-ga on 07 Oct 2003 00:32 PDT
Anyone? Any idea?

Request for Question Clarification by elmarto-ga on 07 Oct 2003 14:17 PDT
Hi again vitaminic!
I might be able to answer your question, but I need some clarification
first.

1) Am I right to assume that equation II is for smokers and equation
III is for non-smokers?
2) Could you please check that the -1.1130 figure in equation IV is
right? It's the only coefficient I can't connect with any other (for
question 1).

Thanks a lot,
elmarto

Clarification of Question by vitaminc-ga on 08 Oct 2003 23:53 PDT
hi, elmarto:
1) You know what, I was wondering that too. I guess that regression II
should be for smokers and regression III for non-smokers.
2) YES. It is -1.1130  

:)
vitaminC

Request for Question Clarification by elmarto-ga on 09 Oct 2003 06:26 PDT
Hi vitaminic!
OK, then I can answer your question except for one thing: I don't know
how the coefficient I mentioned is connected with the others. If you
are willing to accept an answer without explaining that, then let me
know and I'll answer it. If you are not, I'll leave it open for other
Researchers who might want to give it a try.

Best wishes,
elmarto

Clarification of Question by vitaminc-ga on 11 Oct 2003 21:03 PDT
Hi, elmarto:
Sure, I would like to see what you have done on the question so far.
  :)
vitaminC

Request for Question Clarification by elmarto-ga on 12 Oct 2003 16:27 PDT
Thanks for allowing me an opportunity to partially answer the
question! I have no time right now, but I will be posting the answer
by Tuesday.

Best wishes!
elmarto

Clarification of Question by vitaminc-ga on 12 Oct 2003 23:09 PDT
Hi, elmarto:
Thanks! But could it possibly be posted before Tuesday? I would appreciate it.
:)
vitaminC
Answer  
Subject: Re: Statistical economy
Answered By: elmarto-ga on 13 Oct 2003 19:25 PDT
Rated:5 out of 5 stars
 
Hi vitaminic!
OK, here is your answer, and on Monday night :-)

a) Regressions I-III used only age to explain the (log) mortality
rate. Regressions II and III used sub-samples: regression II used only
the group of smokers, while regression III used only the group of
non-smokers. Regression IV again used the whole sample, but now
introduced information on the smoking status of each observation, and
allowed this information to affect the constant (through a dummy for
smoker/non-smoker). Regression V used this same information, but also
allowed the coefficient of age to differ between smokers and
non-smokers.

The connection between the coefficients can be understood as follows:
regressions II and III show the mortality rate as a function of age
for smokers (reg. II) and non-smokers (reg. III). Regression I pools
these two groups. Here the coefficients in regression I are exactly
the average of the coefficients in equations II and III. For example,
the constant in equation I (-0.8080) is the average of the constants
in equations II (0.0104) and III (-1.6264). The same goes for the
coefficient of age (0.0955 is the average of 0.0873 and 0.1037). The
average is exact because the two groups have the same number of
observations and, it appears, are observed at the same ages. If one
group had more observations, or the ages differed between groups, the
coefficients in equation I would instead be a weighted combination of
the coefficients in equations II and III, with more weight on the
group with more observations. Regarding regression IV, which again
uses the whole sample, the coefficient of age is the same as in
equation I (0.0955). This happens when the dummy is uncorrelated with
age, as it is if both groups cover the same ages.
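
The averaging argument above can be checked with a few lines of Python. This is only an arithmetic sketch using the coefficients from the table. The last step is an assumption that is not in the original answer: if both groups are observed at the same ages (a balanced design), the dummy has mean 0.5 and is uncorrelated with age, and the constant of regression IV should then equal the constant of regression I minus half the dummy coefficient. The variable names are made up for illustration.

```python
# Coefficients as reported in the table: (constant, coefficient of age).
reg_I = (-0.8080, 0.0955)     # pooled sample
reg_II = (0.0104, 0.0873)     # smokers
reg_III = (-1.6264, 0.1037)   # non-smokers
reg_IV_constant = -1.1130     # pooled sample with dummy
gamma_IV = 0.61               # dummy coefficient in regression IV

# Regression I should be the simple average of regressions II and III.
avg_constant = (reg_II[0] + reg_III[0]) / 2
avg_slope = (reg_II[1] + reg_III[1]) / 2
print("average constant:", round(avg_constant, 4), "reported:", reg_I[0])
print("average slope:   ", round(avg_slope, 4), "reported:", reg_I[1])

# ASSUMPTION (balanced design): mean(d) = 0.5 and d is uncorrelated with X,
# so constant_IV = constant_I - 0.5 * gamma_IV.
implied_IV_constant = reg_I[0] - 0.5 * gamma_IV
print("implied constant IV:", round(implied_IV_constant, 4),
      "reported:", reg_IV_constant)
```

If the balanced-design assumption holds, this would also account for the -1.1130 figure in regression IV.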

Regression V allows a different constant and a different age
coefficient for smokers and non-smokers. To see this, write regression
V as:

lnY = -1.6264 + 0.1037*X + 1.6368*d - 0.0164*d*X

If we want to see the process for mortality for smokers from this
equation, we have to set d=1 (smoker). When d=1, this equation
becomes:

lnY = -1.6264 + 0.1037*X + 1.6368 - 0.0164*X
    = 0.0104 + 0.0873*X

which takes us back to equation II. The same can be seen when we want
to see the process for non-smokers, and thus we set d=0. We will get
the same equation as III.
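
The substitution of d=1 and d=0 can be written as a tiny Python function. This is a sketch for checking the arithmetic only; the function name is hypothetical and the numbers are the regression V coefficients from the table.

```python
def group_equation(d):
    """Return (constant, age coefficient) implied by regression V for dummy d."""
    b0, b1, gamma, alpha = -1.6264, 0.1037, 1.6368, -0.0164
    # Setting d collapses lnY = b0 + b1*X + gamma*d + alpha*d*X
    # into lnY = (b0 + gamma*d) + (b1 + alpha*d)*X.
    return (b0 + gamma * d, b1 + alpha * d)

smokers = group_equation(1)      # should reproduce regression II
non_smokers = group_equation(0)  # should reproduce regression III
print("smokers:    ", [round(c, 4) for c in smokers])
print("non-smokers:", [round(c, 4) for c in non_smokers])
```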


b) The idea of the F-test is to test whether the chosen variables
(besides the constant) jointly have any explanatory power over the
dependent variable. The null hypothesis is that the coefficients on
all the explanatory variables are equal to zero. The alternative
hypothesis is that at least one is different from zero. More
information on this test can be found at

The F-test
http://biosys.bre.orst.edu/BRE571/regress/f-test.doc

So for equations I-III, the null hypothesis is that b1=0.
For equation IV, the null is that b1=0 and r=0.
For equation V, the null is that b1=0, r=0 and alpha=0.

If we can't reject the null hypothesis of this test, we have no
evidence that the chosen explanatory variables are related to the
dependent variable. The F-test is equivalent to the t-test whenever
there is only one explanatory variable besides the constant, such as
in equations I-III: in that case the F statistic is the square of the
t statistic on that variable (for example, 9.98^2 is about 99.6 in
equation I). You can check that the values shown for the F-test are
such that these hypotheses are rejected (using an F-distribution table
and the directions provided in the link above). In equations I-III,
this implies that there IS indeed a relationship between the mortality
rate and age.
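
As a quick check of the equivalence between the two tests in equations I-III, the sketch below squares the reported t statistic on age and compares it with the reported F statistic. It uses only the figures from the table, so small gaps are expected from rounding; the variable names are made up for illustration.

```python
# (t statistic on age, reported F statistic) for the one-regressor equations.
reported = {"I": (9.98, 99.5), "II": (25.25, 637.7), "III": (19.32, 373.1)}

gaps = {}
for name, (t_age, f_stat) in reported.items():
    t_squared = t_age ** 2
    # Relative gap between t^2 and F; should be tiny if F = t^2 up to rounding.
    gaps[name] = abs(t_squared - f_stat) / f_stat
    print(f"Regression {name}: t^2 = {t_squared:.1f}, "
          f"reported F = {f_stat}, relative gap = {gaps[name]:.2%}")
```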


c) There is evidence of a significant difference between smokers and
non-smokers in the relationship between mortality rate and age. This
is so because the coefficient alpha in equation V is significantly
different from zero (its t-value, -2.57, is lower than -2). Why do we
have to look at this coefficient? Let's review equation V:

lnY = -1.6264 + 0.1037*X + 1.6368*d - 0.0164*d*X

From the equation, we see that ALL the coefficients are statistically
different from zero (all the t-values are outside the [-2,2] range),
so the equation stays like this. We can then rewrite this equation as:

lnY = -1.6264 + (0.1037 - 0.0164*d)*X + 1.6368*d

The relationship between age and the mortality rate is given by the
coefficient of X, which shows the effect of an extra year of age on
the log of the mortality rate. Since the coefficient -0.0164 is
statistically different from zero, the coefficient of X changes
between d=0 (0.1037) and d=1 (0.0873). This shows that the coefficient
is different for smokers and non-smokers, thus implying that the
relationship between mortality rate and age is actually different for
smokers and for non-smokers. In particular, an extra year of age has a
greater proportional impact on the mortality rate for non-smokers than
for smokers, although the fitted mortality rate of smokers remains
higher at any age below about 100 (where 1.6368 - 0.0164*X is
positive).
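
The same conclusion can be cross-checked from the SSR column. The sketch below is an addition, not part of the original answer: it computes the usual restricted-versus-unrestricted F statistic, first for alpha = 0 (regression IV against V) and then for gamma = 0 and alpha = 0 together (regression I against V, a Chow-type test). The 5% critical values are taken from standard F tables and are hard-coded here as assumptions. The last lines convert the age coefficients into approximate yearly percentage increases in the mortality rate.

```python
import math

# Sums of squared residuals and degrees of freedom from the table.
ssr_I, ssr_IV, ssr_V = 1.5395, 0.2372, 0.1427
df_V = 10

def restricted_f(ssr_restricted, ssr_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for testing linear restrictions via the change in SSR."""
    numerator = (ssr_restricted - ssr_unrestricted) / n_restrictions
    return numerator / (ssr_unrestricted / df_unrestricted)

f_slope = restricted_f(ssr_IV, ssr_V, 1, df_V)  # H0: alpha = 0
f_both = restricted_f(ssr_I, ssr_V, 2, df_V)    # H0: gamma = 0 and alpha = 0

# ASSUMPTION: 5% critical values read from a standard F table.
crit_f_1_10, crit_f_2_10 = 4.96, 4.10
t_alpha = -2.57  # reported t statistic on the interaction term

print(f"F for alpha=0: {f_slope:.2f} (t^2 = {t_alpha ** 2:.2f}), "
      f"reject at 5%: {f_slope > crit_f_1_10}")
print(f"F for gamma=alpha=0: {f_both:.2f}, reject at 5%: {f_both > crit_f_2_10}")

# Yearly proportional increase in mortality implied by each age coefficient.
growth_non_smokers = math.exp(0.1037) - 1
growth_smokers = math.exp(0.1037 - 0.0164) - 1
print(f"non-smokers: {growth_non_smokers:.1%} per year, "
      f"smokers: {growth_smokers:.1%} per year")
```

The F statistic for alpha = 0 should be close to the square of the reported t-value, which is the same one-restriction equivalence discussed in part b.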


Google search strategy:
f-test regression
://www.google.com/search?q=f-test+regression&hl=en&lr=&ie=UTF-8&oe=UTF-8&start=0&sa=N


I hope this helps! If you have any doubt regarding this answer, please
don't hesitate to request a clarification. Otherwise, I await your
rating and comments.

Best wishes!
elmarto
vitaminc-ga rated this answer:5 out of 5 stars
finally log in
5 stars 
cheers 
:)

Comments  
Subject: Re: Statistical economy
From: elmarto-ga on 19 Oct 2003 16:55 PDT
 
Thanks for the great rating! I hope to see you around Google Answers soon.

Cheers!
