Q: Calculating the change in R squared in Multiple Regression (Answered, 1 Comment)
 Question
 Subject: Calculating the change in R squared in Multiple Regression
 Category: Business and Money
 Asked by: marsbrook-ga
 List Price: \$50.00
 Posted: 12 Nov 2003 10:05 PST
 Expires: 12 Dec 2003 10:05 PST
 Question ID: 275110
 ```I am trying to determine the change in R squared across a series of multiple regression models, in order to show the impact, if any, of introducing a new independent variable in a succeeding model after having created the first model with control variables. I am using SPSS 11.```

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:10 PST
```Hi there, I'm not quite sure what you are looking for here - R-square values are typically just numbers between 0 and 1 (where 1 indicates a perfect fit). You typically have one dependent variable in your equation and multiple independent variables. It seems that all you need to do is introduce the new independent variable into the equation and then compare the values from your previous model to the new one. Perhaps I'm missing something here :) answerguru-ga```

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:12 PST
```Perhaps the ambiguity is in the control models....can you elaborate on this? answerguru-ga```

Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:13 PST
`Err...typo - that should be control variables.`

Clarification of Question by marsbrook-ga on 12 Nov 2003 14:45 PST
```What you are indicating is what I considered to be the logical approach to the problem. I entered the information for five independent/control variables and also one dependent variable. I got an R squared and a change in R squared for this initial model, which we shall call model 1. I next entered an additional independent variable and got a revised R squared, with an identical change in R squared in the second model, which we shall call model 2. My question is twofold: 1) Shouldn't the change in R squared in model 2 reflect the difference between R squared in model 2 and R squared in model 1? 2) If not, can I just subtract the value of R squared in model 1 from R squared in model 2 and consider that the change in R squared between the two models? This is what you seem to be implying and what I would conclude logically. However, I was told that that was too simplistic a way to view the results of the equation. As a corollary, why do I get the same value for R squared and the change in R squared each time, i.e. in both model 1 and model 2?```

Request for Question Clarification by answerguru-ga on 14 Nov 2003 09:47 PST
```I looked into this a little deeper and I can now say that if you have added another independent variable it can have two possible effects: 1. Change the overall R-square value 2. Change the contribution of the other independent variables. Therefore, our initial suspicions were incorrect - you cannot directly compare two multiple regressions this way unless you are using the same independent variables. Adding a new independent variable typically improves the R-square (at least a little), but it is an optimization of sorts that may mean reducing the contribution of another variable. If this satisfies your question, please let me know and I'll post it as an answer. Otherwise let me know if there is any other information you were after in regards to this problem. Thanks, answerguru-ga```

Clarification of Question by marsbrook-ga on 15 Nov 2003 09:06 PST
```O.K. We are getting there, but we are not quite home yet. What you have still not told me is how to calculate the change in R squared from one model to the next. Let me see if I can explain further. I create model 1 by entering a series of 5 independent variables against one dependent variable. I then look at the output in SPSS and under Model Summary, I get a figure for R Square of, say, .123. I go across in that same data set and under the subheading 'Change Statistics' I get a figure for R Square Change, which is the same .123. I next introduce another independent variable, run the regression again and get a new Model Summary for this new model, which we may term model 2. Once again I get an R Square; say this time it is .144. But the R Square Change in the Change Statistics area is once again the same, viz., .144. Question: do I calculate a change in R Square between model 2 and model 1 of .144 minus .123, which would be .021, or is there some other more sophisticated manner of calculating the change from one model to the next? By the way, I understand your point that apart from the change in R Square, the introduction of a new independent variable would have some effect on the other independent variables in the first model. That is not my point. I would have thought that the Change Statistics would reflect the change in R Square. For example, you get an F Change and a Degrees of Freedom Change. Why no change in R Square in the model? Obviously, I am posing this question as a non-statistician. Am I making any sense to you yet?```

Request for Question Clarification by answerguru-ga on 15 Nov 2003 10:19 PST
```You're making perfect sense - my last point still holds, however. The "change in R-square value" is the same as your actual R-square value because it's not being compared to anything at that point (this may be why you are being misled by the program)! In other words, it's taking the R-square from the current model and subtracting zero from it, because that is the default value when no comparison is being made. This makes sense and confirms my point that you cannot compare two models unless they have the same independent variables. The "change in R-square value" will only yield a useful value if you are comparing apples to apples (so to speak). This is more for cases when you are trying to compare two R-square values for models with: 1. Perfectly corresponding variables 2. Different data for those variables (these two requirements are key). You simply cannot compare models with different degrees of freedom. You can compare models with the same degrees of freedom but different meanings for each of the variables, but that would not be useful. Do you see the mathematical restriction that exists on what you are trying to do? answerguru-ga```

Clarification of Question by marsbrook-ga on 15 Nov 2003 11:53 PST
```O.K. Thanks. I think I see what you mean. In any event you have done your part of the job. Perhaps I may need to pose another question regarding how to get the actual result I need, which I suspect would entail holding all previously entered independent variables at their means. However, it is only fair that at this point you post what you have communicated to me as your answer and claim the funds. Meanwhile, I'll try to figure out how to achieve the result I need. Thanks once again for responding so promptly.```
 ```Hi marsbrook-ga, As requested, I am posting the highlights of our discussion as an official answer: By adding an independent variable to a multiple regression model, two things can potentially occur as a result: 1. Change the overall R-square value 2. Change the contribution of the other independent variables. Therefore, our initial suspicions were incorrect - you cannot directly compare two multiple regressions this way unless you are using the same independent variables. Adding a new independent variable typically improves the R-square (at least a little), but it is an optimization of sorts that may mean reducing the contribution of another variable. The "change in R-square value" that you are seeing in your software is the same as your actual R-square value because it's not being compared to anything at that point (this may be why you are being misled by the program)! In other words, it's taking the R-square from the current model and subtracting zero from it, because that is the default value when no comparison is being made. This makes sense and confirms that you cannot compare two models unless they have the same independent variables. The "change in R-square value" will only yield a useful value if you are comparing apples to apples (so to speak). This is more for cases when you are trying to compare two R-square values for models with: 1. Perfectly corresponding variables 2. Different data for those variables (these two requirements are key). You simply cannot compare models with different degrees of freedom. You can compare models with the same degrees of freedom but different meanings for each of the variables, but that would not be useful. In conclusion, such a mathematical restriction cannot be overcome, and thus we cannot directly compare two multiple regression models unless they meet the above stated conditions. Cheers! answerguru-ga```
 marsbrook-ga rated this answer: ```I believe the researcher did a very good job in answering my question, given the amount I offered for the answer. However, given that I still have to find a solution to my original problem, I will remain with the amount originally posted. If the researcher had gone on to provide me with a specific solution to the problem, I would have considered that an exceptional answer and increased the amount significantly.```
 ```Hey friend, Your question seems very clear. I assume you have a regression model with p variables, and then you add a few variables to it to have k variables. Obviously k > p. We will call the first model the reduced model, since it has fewer terms. The second model will be the complete model, since it has all the variables. For each model you should have an R^2. For the reduced model we'll call it Rr^2. For the complete model we'll call it Rc^2. n = the sample size. With this information we can plug into a simple equation, solving for F: F = [(Rc^2 - Rr^2)/(k - p)] / [(1 - Rc^2)/(n - (k + 1))]. We then compare that F value to the critical value of the F distribution with k - p and n - (k + 1) degrees of freedom. I'm not sure if you're familiar with F-tests, but the tables are in the back of most statistics textbooks. If you're looking at the table, there is a v1 that you read across and a v2 that you read down: v1 = k - p and v2 = n - (k + 1). So now you have the F that you solved for and the F you found in the textbook. If the F that you solved for is greater than the F in the textbook, then you can be confident, at the significance level of the table, that at least one of the newly introduced variables made an impact. If you need any more help, particularly with how to read an F-table in a book, just give k, p, n and I can do it in a few seconds. Also, if you need any more clarification on your answer, just ask.```
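For anyone who wants to try the formula from the comment above, here is a minimal Python sketch of the arithmetic. The function name `f_change` is made up for illustration, and the sample size n = 200 is purely hypothetical, since the asker never stated it. The R^2 values (.123 and .144) and the variable counts (5, then 6) are the ones quoted in the question. It computes only the F statistic and its degrees of freedom; the critical value still has to come from an F-table or a statistics package.

```python
def f_change(r2_reduced, r2_complete, p, k, n):
    """Return (R^2 change, F, v1, v2) for two nested regression models.

    r2_reduced  -- R^2 of the reduced model (p independent variables)
    r2_complete -- R^2 of the complete model (k independent variables)
    n           -- sample size
    """
    if not k > p >= 0:
        raise ValueError("the complete model must have more variables (k > p)")
    v1 = k - p          # numerator degrees of freedom
    v2 = n - (k + 1)    # denominator degrees of freedom
    if v2 <= 0:
        raise ValueError("sample size too small: need n > k + 1")
    r2_change = r2_complete - r2_reduced
    f = (r2_change / v1) / ((1.0 - r2_complete) / v2)
    return r2_change, f, v1, v2


# R^2 values and variable counts from the question; n = 200 is an assumption.
r2_change, f_stat, v1, v2 = f_change(0.123, 0.144, p=5, k=6, n=200)
print(f"R^2 change = {r2_change:.3f}, F({v1}, {v2}) = {f_stat:.3f}")
```

The computed F is then compared with the tabulated F for v1 and v2 degrees of freedom, exactly as described in the comment. With a different sample size the same R^2 change can give a very different F, so the real n has to be substituted before drawing any conclusion.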