Q: Calculating the change in R squared in Multiple Regression (Answered, 3 out of 5 stars, 1 Comment)
Question  
Subject: Calculating the change in R squared in Multiple Regression
Category: Business and Money
Asked by: marsbrook-ga
List Price: $50.00
Posted: 12 Nov 2003 10:05 PST
Expires: 12 Dec 2003 10:05 PST
Question ID: 275110
I am trying to determine the change in R squared across a series of
multiple regression models, in order to show the impact, if any, of
introducing a new independent variable in a succeeding model after
having built the first model with control variables. I am using SPSS
11.

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:10 PST
Hi there,

I'm not quite sure what you are looking for here - R-square values are
typically just numbers between 0 and 1 (where 1 indicates a perfect
fit). You typically have one dependent variable in your equation and
multiple independent variables.

It seems that all you need to do is introduce the new independent
variable into the equation and then compare the values from your
previous model to the new one. Perhaps I'm missing something here :)

answerguru-ga

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:12 PST
Perhaps the ambiguity is in the control models....can you elaborate on this?

answerguru-ga

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:13 PST
Err...typo - that should be control variables.

Clarification of Question by marsbrook-ga on 12 Nov 2003 14:45 PST
What you are indicating is what I considered to be the logical
approach to the problem. I entered the information for five
independent/control variables and also one dependent variable. I got
an R squared and a change in R squared for this initial model, which
we shall call model 1. I next entered an additional independent
variable and got a revised R squared, with an identical change in R
squared, in the second model, which we shall call model 2. My question
is twofold: 1) Shouldn't the change in R squared in model 2 reflect
the difference between R squared in model 2 and R squared in model 1?
2) If not, can I just subtract the value of R squared in model 1 from
R squared in model 2 and consider that the change in R squared between
the two models? This is what you seem to be implying and what I would
conclude logically. However, I was told that that was too simplistic a
way to view the results of the equation.

As a corollary, why do I get the same value for R squared and the
change in R squared each time, i.e. in both model 1 and model 2?
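
To make the arithmetic concrete, here is a minimal sketch in Python
(numpy only, with made-up data standing in for the SPSS file) that
fits the two nested models by ordinary least squares and subtracts the
two R squared values directly:

import numpy as np

rng = np.random.default_rng(0)
n = 100                                   # sample size (made up for illustration)
X5 = rng.normal(size=(n, 5))              # the five independent/control variables
x6 = rng.normal(size=n)                   # the new independent variable
y = X5 @ np.array([.3, .2, .1, 0, 0]) + 0.4 * x6 + rng.normal(size=n)

def r_squared(X, y):
    # add an intercept column, fit OLS, and return R squared
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

r2_model1 = r_squared(X5, y)                         # model 1: controls only
r2_model2 = r_squared(np.column_stack([X5, x6]), y)  # model 2: controls + new variable
print(r2_model1, r2_model2, r2_model2 - r2_model1)   # last value: the R squared change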

Request for Question Clarification by answerguru-ga on 14 Nov 2003 09:47 PST
I looked into this a little deeper and I can now say that if you have
added another independent variable it can have two possible effects:

1. Change the overall R-square value
2. Change the contribution of the other independent variables

Therefore, our initial suspicions were incorrect - you cannot directly
compare two multiple regressions this way unless you are using the
same independent variables. Adding a new independent variable
typically improves the R-square (at least a little), but it is an
optimization of sorts that may mean reducing the contribution of
another variable.

If this satisfies your question, please let me know and I'll post it
as an answer. Otherwise let me know if there is any other information
you were after in regards to this problem.

Thanks,
answerguru-ga
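
As an illustration of those two effects, here is a small Python sketch
(with hypothetical data, not the questioner's) in which the added
predictor is correlated with an existing one, so that both the overall
R squared and the first variable's coefficient move:

import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # x2 is correlated with x1
y = x1 + x2 + rng.normal(size=n)

def fit(X, y):
    # OLS with intercept; returns (slope coefficients, R squared)
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return beta[1:], r2

coef_before, r2_before = fit(x1[:, None], y)              # x1 alone
coef_after, r2_after = fit(np.column_stack([x1, x2]), y)  # x1 plus x2
print(coef_before, r2_before)   # x1's coefficient absorbs some of x2's effect
print(coef_after, r2_after)     # x1's coefficient shrinks; R squared rises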

Clarification of Question by marsbrook-ga on 15 Nov 2003 09:06 PST
O.K. We are getting there, but we are not quite home yet. What you
have still not told me is how to calculate the change in R squared
from one model to the next. Let me see if I can explain further. I
create model 1 by entering a series of 5 independent variables against
one dependent variable. I then look at the output in SPSS and, under
Model Summary, I get a figure for R Square of, say, .123. I go across
in that same table and, under the subheading 'Change Statistics', I
get a figure for R Square Change which is the same .123.

I next introduce another independent variable, run the regression
again, and get a new Model Summary for this new model, which we may
term model 2. Once again I get an R Square; say this time it is .144.
But the R Square Change in the Change Statistics area is once again
the same, viz., .144. Question: do I calculate a change in R Square
between model 2 and model 1 of .144 minus .123, which would be .021,
or is there some other more sophisticated manner of calculating the
change from one model to the next?

By the way, I understand your point that, apart from the change in R
Square, the introduction of a new independent variable would have some
effect on the other independent variables in the first model. That is
not my point; I would have thought that the Change Statistics would
reflect the change in R Square. For example, you get an F Change and a
Degrees of Freedom Change. Why no change in R Square in the model?
Obviously, I am posing this question as a non-statistician. Am I
making any sense to you yet?
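
For what it is worth, the R Square Change and F Change that SPSS
prints under Change Statistics can be reproduced by hand from the two
R squared values. A short Python sketch using the .123 and .144 above
and an assumed sample size (n is not stated in the question, so 100
here is purely illustrative):

# R squared change and F change for a two-step hierarchy of models
r2_reduced, k_reduced = 0.123, 5   # model 1: five control variables
r2_full, k_full = 0.144, 6         # model 2: controls plus the new variable
n = 100                            # assumed sample size (not given above)

r2_change = r2_full - r2_reduced   # .144 - .123 = .021
f_change = (r2_change / (k_full - k_reduced)) / ((1 - r2_full) / (n - k_full - 1))
print(r2_change, f_change)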

Request for Question Clarification by answerguru-ga on 15 Nov 2003 10:19 PST
You're making perfect sense - my last point still holds, however. The
"change in R-square value" is the same as your actual R-square value
because it's not being compared to anything at that point (this may be
why you are being misled by the program)! In other words, it's taking
the R-square from the current model and subtracting zero from it,
because that is the default value when no comparison is being made.

This makes sense and confirms my point that you cannot compare two
models unless they have the same independent variables. The "change in
R-square value" will only yield a useful value if you are comparing
apples to apples (so to speak). This is more for cases when you are
trying to compare two R-square values for models with:
1. Perfectly corresponding variables
2. Different data for those variables
(these two requirements are key)

You simply cannot compare models with different degrees of freedom.
You can compare models with the same degrees of freedom but different
meanings for each of the variables, but that would not be useful.

Do you see the mathematical restriction that exists on what you are trying to do?

answerguru-ga

Clarification of Question by marsbrook-ga on 15 Nov 2003 11:53 PST
O.K. Thanks. I think I see what you mean. In any event you have done
your part of the job. Perhaps I may need to pose another question
regarding how to get the actual result I need, which I suspect would
entail holding all previously entered independent variables at their
means. However, it is only fair that at this point you post what you
have communicated to me as your answer and claim the funds. Meanwhile,
I'll try to figure out how to achieve the result I need. Thanks once
again for responding so promptly.
Answer  
Subject: Re: Calculating the change in R squared in Multiple Regression
Answered By: answerguru-ga on 15 Nov 2003 13:26 PST
Rated: 3 out of 5 stars
 
Hi marsbrook-ga,

As requested, I am posting the highlights of our discussion as an official answer:

By adding an independent variable to a multiple regression model, two
things can potentially occur as a result:
 
1. Change the overall R-square value 
2. Change the contribution of the other independent variables 
 
Therefore, our initial suspicions were incorrect - you cannot directly
compare two multiple regressions this way unless you are using the
same independent variables. Adding a new independent variable
typically improves the R-square (at least a little), but it is an
optimization of sorts that may mean reducing the contribution of
another variable.
 
The "change in R-sqaure value" that you are seeing in your software is
the same as your actual R-square value because it's not being compared
to anything at that point (this may be why you are being misled by the
program)! In other words, its taking the r-square from the current
model and subtracting zero from it because that is the default value
when no comparison is being made.
 
This makes sense and confirms that you cannot compare two models
unless they have the same independant variables. The "change in
R-sqaure value" will only yield a useful value if you are comparing
apples to apples (so to speak). This is more for cases when you are
trying to compare to R-square values for models with:

1. Perfectly corresponding variables 
2. Different data for those variables 
(these two requirements are key) 
 
You simply cannot compare models with different degrees of freedom.
You can compare models with the same degrees of freedom but different
meanings for each of the variables, but that would not be useful.
 
In conclusion, such a mathematical restriction cannot be overcome, and
thus we cannot directly compare two multiple regression models unless
they meet the above stated conditions.

Cheers! 
answerguru-ga
marsbrook-ga rated this answer: 3 out of 5 stars
I believe the researcher did a very good job in answering my question,
given the amount I offered for the answer. However, given that I still
have to find a solution to my original problem, I will remain with the
amount originally posted. If the researcher had gone on to provide me
with a specific solution to the problem, I would have considered that
an exceptional answer and increased the amount significantly.

Comments  
Subject: I think I have your solution, since I just did this in class the other day
From: rexdog979-ga on 20 Nov 2003 14:33 PST
 
Hey friend,
Your question seems very clear.  I assume you have a regression model
with p variables.  And then you add a few variables to this to have k
variables.  Obviously k > p.  Moreover, we will call the first model
the reduced model, since it has fewer terms.  The second model will be
the complete model, since it has all the variables.
For each model you should have an R^2.  For the reduced model we'll
call it Rr^2.  For the complete model we'll call it Rc^2.  n = the
sample size.  With this information we can plug it into a simple
equation, solving for F.

F= [(Rc^2-Rr^2)/(k-p)] / [(1-Rc^2)/(n-(k+1))]

We then compare that F value to the tabulated critical value
F(k-p, n-(k+1)).


I'm not sure if you're familiar with F-tests, but the tables are in
the back of most statistics textbooks.  If you're looking at the
chart, there is a v1 that you read across and a v2 that you read down.
Here v1 = k-p and v2 = n-(k+1).
So now you have the F that you solved for and the F you found in the
textbook.  If the F that you solved for is greater than the F in the
textbook, then you can be confident that at least one of the newly
introduced variables made an impact.

If you need any more help, particularly with how to read an F-table in
a book, just give k,p,n and I can do it in a few seconds.
Also if you need any more clarification on your answer, just ask.
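
Rather than reading v1 and v2 off a printed table, the same comparison
can be done in Python with scipy, along the lines of this sketch (the
k, p, and n values below are placeholders, not the questioner's actual
data):

from scipy import stats

def f_test_nested(r2_complete, r2_reduced, k, p, n, alpha=0.05):
    # F statistic from the formula above
    f = ((r2_complete - r2_reduced) / (k - p)) / ((1 - r2_complete) / (n - (k + 1)))
    crit = stats.f.ppf(1 - alpha, k - p, n - (k + 1))  # tabulated critical value
    pval = stats.f.sf(f, k - p, n - (k + 1))           # p-value, if preferred
    return f, crit, pval

# e.g. reduced model with p = 5 terms, complete model with k = 6, n = 100:
f, crit, pval = f_test_nested(0.144, 0.123, k=6, p=5, n=100)
print(f, crit, pval)   # an added variable matters when f > crit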
