Google Answers: Calculating the change in R squared in Multiple Regression


Q: Calculating the change in R squared in Multiple Regression (Answered, 3 out of 5 stars)

Question

Subject: Calculating the change in R squared in Multiple Regression
Category: Business and Money
Asked by: marsbrook-ga
List Price: $50.00

Posted: 12 Nov 2003 10:05 PST
Expires: 12 Dec 2003 10:05 PST
Question ID: 275110

I am trying to determine the change in R squared across a series of multiple regression models, in order to show the impact, if any, of introducing a new independent variable in a succeeding model after the first model has been built with control variables. I am using SPSS 11.

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:10 PST
Hi there, I'm not quite sure what you are looking for here - R-square values are just numbers between 0 and 1 (where 1 is a perfect fit). You typically have one dependent variable in your equation and multiple independent variables. It seems that all you need to do is introduce the new independent variable into the equation and then compare the values from your previous model to the new one. Perhaps I'm missing something here :)
answerguru-ga

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:12 PST
Perhaps the ambiguity is in the control models... can you elaborate on this?
answerguru-ga

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:13 PST
Err... typo - that should be control variables.

Clarification of Question by marsbrook-ga on 12 Nov 2003 14:45 PST
What you are indicating is what I considered to be the logical approach to the problem. I entered the information for five independent/control variables and one dependent variable. I got an R squared and a change in R squared for this initial model, which we shall call model 1. I next entered an additional independent variable and got a revised R squared, with a change in R squared identical to it, in the second model, which we shall call model 2. My question is twofold:
1) Shouldn't the change in R squared in model 2 reflect the difference between R squared in model 2 and R squared in model 1?
2) If not, can I just subtract the value of R squared in model 1 from R squared in model 2 and consider that the change in R squared between the two models? This is what you seem to be implying and what I would conclude logically. However, I was told that that was too simplistic a way to view the results of the equation.
As a corollary, why do I get the same value for R squared and the change in R squared each time, i.e. in both model 1 and model 2?

Request for Question Clarification by answerguru-ga on 14 Nov 2003 09:47 PST
I looked into this a little deeper and I can now say that if you have added another independent variable it can have two possible effects:
1. Change the overall R-square value
2. Change the contribution of the other independent variables
Therefore, our initial suspicions were incorrect - you cannot directly compare two multiple regressions this way unless you are using the same independent variables. Adding a new independent variable typically improves the R-square (at least a little), but it is an optimization of sorts that may mean reducing the contribution of another variable. If this satisfies your question, please let me know and I'll post it as an answer. Otherwise let me know if there is any other information you were after in regards to this problem.
Thanks, answerguru-ga

Clarification of Question by marsbrook-ga on 15 Nov 2003 09:06 PST
O.K. We are getting there, but we are not quite home yet. What you have still not told me is how to calculate the change in R squared from one model to the next. Let me see if I can explain further. I create model 1 by entering a series of 5 independent variables against one dependent variable. I then look at the output in SPSS and under Model Summary I get a figure for R Square of, say, .123. I go across in that same data set and under the subheading 'Change Statistics' I get a figure for R Square change which is the same .123. I next introduce another independent variable, run the regression again and get a new Model Summary for this new model, which we may term Model 2. Once again I get an R Square; say this time it is .144. But the R Square change in the Change Statistics area is once again the same, viz., .144.
Question: Do I calculate a change in R Square between model 2 and model 1 of .144 minus .123, which would be .021, or is there some other more sophisticated manner of calculating the change from one model to the next?
By the way, I understand your point that apart from the change in R Square the introduction of a new independent variable would have some effect on the other independent variables in the first model. That is not my point. I would have thought that the Change Statistic would reflect the change in R Square. For example, you get an F Change and a Degrees of Freedom Change. Why no change in R Square in the model? Obviously, I am posing this question as a non-statistician. Am I making any sense to you yet?
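
The arithmetic in the clarification above can be sketched in a few lines of Python. This is only an illustration under a stated assumption: that the change statistic is the current model's R Square minus the R Square of the model that preceded it in the same sequence, with 0 as the baseline when there is no preceding model. The function name r_square_changes is hypothetical; the values .123 and .144 are the ones quoted above.

```python
def r_square_changes(r_squares):
    """Return the R-square change for each model in an ordered sequence.

    Assumption: the baseline for the first model is 0, because there is
    no earlier model to compare it with.
    """
    changes = []
    previous = 0.0
    for r2 in r_squares:
        changes.append(r2 - previous)
        previous = r2
    return changes


# Each model treated on its own: the change equals the R square itself.
print(r_square_changes([0.123]))
print(r_square_changes([0.144]))

# Both models treated as one ordered sequence: the second change is the
# increment attributable to the added variable (about .021).
print(r_square_changes([0.123, 0.144]))
```

Under that assumption, a model run by itself would always report a change equal to its own R Square, which matches the output described above.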

Request for Question Clarification by answerguru-ga on 15 Nov 2003 10:19 PST
You're making perfect sense - my last point still holds, however. The "change in R-square value" is the same as your actual R-square value because it's not being compared to anything at that point (this may be why you are being misled by the program)! In other words, it's taking the R-square from the current model and subtracting zero from it because that is the default value when no comparison is being made. This makes sense and confirms my point that you cannot compare two models unless they have the same independent variables. The "change in R-square value" will only yield a useful value if you are comparing apples to apples (so to speak). This is more for cases when you are trying to compare two R-square values for models with:
1. Perfectly corresponding variables
2. Different data for those variables
(these two requirements are key)
You simply cannot compare models with different degrees of freedom. You can compare models with the same degrees of freedom but different meanings for each of the variables, but that would not be useful. Do you see the mathematical restriction that exists on what you are trying to do?
answerguru-ga

Clarification of Question by marsbrook-ga on 15 Nov 2003 11:53 PST
O.K. Thanks. I think I see what you mean. In any event you have done your part of the job. Perhaps I may need to pose another question regarding how to get the actual result I need, which I suspect would entail holding all previously entered independent variables at their means. However, it is only fair that at this point you post what you have communicated to me as your answer and claim the funds. Meanwhile, I'll try to figure out how to achieve the result I need. Thanks once again for responding so promptly.

Answer

Subject: Re: Calculating the change in R squared in Multiple Regression
Answered By: answerguru-ga on 15 Nov 2003 13:26 PST
Rated: 3 out of 5 stars

Hi marsbrook-ga,

As requested, I am posting the highlights of our discussion as an official answer:

Adding an independent variable to a multiple regression model can
have two effects:
 
1. Change the overall R-square value 
2. Change the contribution of the other independent variables 
 
Therefore, our initial suspicions were incorrect - you cannot directly
compare two multiple regressions this way unless you are using the
same independent variables. Adding a new independent variable
typically improves the R-square (at least a little), but it is an
optimization of sorts that may mean reducing the contribution of
another variable.
 
The "change in R-square value" that you are seeing in your software is
the same as your actual R-square value because it's not being compared
to anything at that point (this may be why you are being misled by the
program)! In other words, it's taking the R-square from the current
model and subtracting zero from it because that is the default value
when no comparison is being made.
 
This makes sense and confirms that you cannot compare two models
unless they have the same independent variables. The "change in
R-square value" will only yield a useful value if you are comparing
apples to apples (so to speak). This is more for cases when you are
trying to compare two R-square values for models with:

1. Perfectly corresponding variables 
2. Different data for those variables 
(these two requirements are key) 
 
You simply cannot compare models with different degrees of freedom.
You can compare models with the same degrees of freedom but different
meanings for each of the variables, but that would not be useful.
 
In conclusion, such a mathematical restriction cannot be overcome, and
thus we cannot directly compare two multiple regression models unless
they meet the above stated conditions.

Cheers! 
answerguru-ga

marsbrook-ga rated this answer: 3 out of 5 stars

I believe the researcher did a very good job in answering my question,
given the amount I offered for the answer. However, given that I still
have to find a solution to my original problem, I will remain with the
amount originally posted. If the researcher had gone on to provide me
with a specific solution to the problem, I would have considered that
an exceptional answer and increased the amount significantly.

Comments

Subject: I think I have your solution, since I just did this in class the other day
From: rexdog979-ga on 20 Nov 2003 14:33 PST

Hey friend,
Your question seems very clear.  I assume you have a regression model
with p variables.  And then you add a few variables to this to have k
variables.  Obviously k>p.  Moreover we will call the first model the
reduced model since it has fewer terms.  The second model will be the
complete model, since it has all the variables.
For each model you should have an R^2.  For the reduced model we'll
call it Rr^2.  For the complete model we'll call it Rc^2.  n=the
sample size.  With this information we can plug it into a simple
equation, solving for F.

F= [(Rc^2-Rr^2)/(k-p)] / [(1-Rc^2)/(n-(k+1))]

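We then compare that F value to the F-table value with k-p numerator
degrees of freedom and n-(k+1) denominator degrees of freedom:

F(k-p, n-(k+1))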


I'm not sure if you're familiar with F-tests, but F-tables are in the
back of most statistics textbooks.  If you're looking at the chart,
there is a v1 that you read across and a v2 that you read down.
v1= k-p and v2= n-(k+1).
So now you have the F that you solved and the F you found in the
textbook.  If the F that you solved is greater than the F in the
textbook, then you can be confident, at the significance level of the
table you used, that at least one of the newly introduced variables
made an impact.
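
The calculation described above can be sketched in Python. This is only an illustration, not a definitive implementation: the function name f_change is hypothetical, the R^2 values (.123 and .144) and variable counts (5 and 6) are taken from the question, and the sample size n = 200 is an assumed placeholder, since the question does not state it.

```python
def f_change(r2_reduced, r2_complete, p, k, n):
    """F statistic for the increase in R^2 from a reduced to a complete model.

    p: number of independent variables in the reduced model
    k: number of independent variables in the complete model (k > p)
    n: sample size
    Returns (F, v1, v2), where v1 and v2 are the degrees of freedom
    used to look up the critical value in an F-table.
    """
    if k <= p:
        raise ValueError("the complete model must have more variables (k > p)")
    v1 = k - p
    v2 = n - (k + 1)
    if v2 <= 0:
        raise ValueError("sample size too small: need n > k + 1")
    f = ((r2_complete - r2_reduced) / v1) / ((1.0 - r2_complete) / v2)
    return f, v1, v2


# R^2 values and variable counts from the question; n = 200 is an assumption.
f, v1, v2 = f_change(0.123, 0.144, p=5, k=6, n=200)
print(f"F = {f:.3f} with v1 = {v1} and v2 = {v2}")
```

With the assumed n = 200 this gives an F of roughly 4.73 on 1 and 193 degrees of freedom, which is the number you would then compare against the table value. A different sample size changes both F and v2, so substitute the real n before drawing any conclusion.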

If you need any more help, particularly with how to read an F-table in
a book, just give k,p,n and I can do it in a few seconds.
Also if you need any more clarification on your answer, just ask.

