I am trying to determine the change in R Squared across a series of
multiple regression models, in order to show the impact, if any, of
introducing a new independent variable in a succeeding model after
having created the first model with control variables. I am using
SPSS 11.
|
Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:10 PST
Hi there,
I'm not quite sure what you are looking for here - R-square values are
typically just numbers between 0 and 1 (where 1 is a perfect
fit). You typically have one dependent variable in your
equation and multiple independent variables.
It seems that all you need to do is introduce the new independent
variable into the equation and then compare the values from your
previous model to the new one. Perhaps I'm missing something here :)
answerguru-ga
|
Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:12 PST
Perhaps the ambiguity is in the control models... can you elaborate on this?
answerguru-ga
|
Request for Question Clarification by answerguru-ga on 12 Nov 2003 11:13 PST
Err...typo - that should be control variables.
|
Clarification of Question by marsbrook-ga on 12 Nov 2003 14:45 PST
What you are indicating is what I considered to be the logical
approach to the problem. I entered the information for five
independent/control variables and also one dependent variable. I got
an R squared and a change in R squared for this initial model, which we
shall call model 1. I next entered an additional independent variable
and got a revised R squared, again with an identical change in R squared,
in the second model, which we shall call model 2. My question is twofold:
1) Shouldn't the change in R squared in model 2 reflect the difference
between R squared in model 2 and R squared in model 1? 2) If not, can
I just subtract the value of R squared in model 1 from R squared in
model 2 and consider that the change in R squared between the two
models? This is what you seem to be implying and what I would conclude
logically. However, I was told that that was too simplistic a way to
view the results of the equation.
As a corollary, why do I get the same value for R Squared and the
Change in R squared each time, i.e. in both model 1 and model 2?
|
Request for Question Clarification by answerguru-ga on 14 Nov 2003 09:47 PST
I looked into this a little deeper and I can now say that if you have
added another independent variable it can have two possible effects:
1. Change the overall R-square value
2. Change the contribution of the other independent variables
Therefore, our initial suspicions were incorrect - you cannot directly
compare two multiple regressions this way unless you are using the
same independent variables. Adding a new independent variable
typically improves the R-square (at least a little), but it is an
optimization of sorts that may mean reducing the contribution of
another variable.
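The two effects listed above can be illustrated with a minimal sketch. It is not SPSS output: the data are synthetic, and the helper `fit_ols` and the variables `x1`, `x2` and `y` are hypothetical names chosen for the illustration. It assumes ordinary least squares with an intercept, fitted on the same cases in both models.

```python
import numpy as np

def fit_ols(X, y):
    """Return (coefficients, R-squared) for an OLS fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)          # deliberately correlated with x1
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)

# Model 1 uses x1 only; model 2 adds x2.
coef1, r2_1 = fit_ols(x1.reshape(-1, 1), y)
coef2, r2_2 = fit_ols(np.column_stack([x1, x2]), y)

print("R-square, model 1:", round(r2_1, 3))
print("R-square, model 2:", round(r2_2, 3))
print("x1 coefficient, model 1 vs model 2:", round(coef1[1], 3), round(coef2[1], 3))
```

In a run like this the R-square does not go down when `x2` is added, and the coefficient on `x1` shifts because part of what it explained in model 1 is now attributed to `x2`.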
If this answers your question, please let me know and I'll post it
as an answer. Otherwise, let me know if there is any other information
you were after with regard to this problem.
Thanks,
answerguru-ga
|
Clarification of Question by marsbrook-ga on 15 Nov 2003 09:06 PST
O.K. We are getting there, but we are not quite home yet. What you
have still not told me is how to calculate the change in R squared
from one model to the next. Let me see if I can explain further. I
create model 1 by entering a series of 5 independent variables against
one dependent variable. I then look at the output in SPSS and under
Model Summary, I get a figure for R Square of, say, .123. I go across
in that same data set and under the sub-heading 'Change Statistics' I
get a figure for R Square change which is the same .123.
I next introduce another independent variable, run the regression
again and get a new Model Summary for this new model, which we may term
Model 2. Once again I get an R Square; say this time it is .144. But
the R Square change in the Change Statistics area is once again the
same, viz., .144. My question: do I calculate a change in R Square between
model 2 and model 1 of .144 minus .123, which would be .021, or is
there some other more sophisticated manner of calculating the change
from one model to the next?
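The subtraction described above, together with the usual F test for a change in R Square, can be sketched as follows. This is only an illustration under stated assumptions: model 1 is nested inside model 2 and both are fitted on the same cases. The function names are hypothetical, and the sample size of 200 is invented because the thread does not give one.

```python
def r2_change(r2_reduced, r2_full):
    """Change in R-square when moving from the smaller to the larger model."""
    return r2_full - r2_reduced

def f_change(r2_reduced, r2_full, n, k_full, q):
    """F statistic for the R-square change between two nested models.

    n      -- number of cases (same cases in both models)
    k_full -- number of predictors in the larger model
    q      -- number of predictors added
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator

r2_model1 = 0.123   # five control variables (figure from the post)
r2_model2 = 0.144   # controls plus one new variable (figure from the post)
n_cases = 200       # hypothetical: not stated anywhere in the thread

delta = r2_change(r2_model1, r2_model2)
f_stat = f_change(r2_model1, r2_model2, n=n_cases, k_full=6, q=1)

print("R-square change:", round(delta, 3))
print("F change (1 and", n_cases - 6 - 1, "degrees of freedom):", round(f_stat, 2))
```

The F statistic depends on the sample size, so the value printed here says nothing about the actual data; only the .021 difference comes from the figures quoted in the post.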
By the way, I understand your point that apart from the change in R
Square the introduction of a new independent variable would have some
effect on the other independent variables in the first model. That is
not my point. I would have thought that the Change Statistic would
reflect the change in R Square. For example, you get an F Change and
Degrees of Freedom Change. Why no change in R Square in the model?
Obviously, I am posing this question as a non-statistician. Am I
making any sense to you yet?
|
Request for Question Clarification by answerguru-ga on 15 Nov 2003 10:19 PST
You're making perfect sense - my last point still holds, however. The
"change in R-square value" is the same as your actual R-square value
because it's not being compared to anything at that point (this may be
why you are being misled by the program)! In other words, it's taking
the R-square from the current model and subtracting zero from it
because that is the default value when no comparison is being made.
This makes sense and confirms my point that you cannot compare two
models unless they have the same independent variables. The "change in
R-square value" will only yield a useful value if you are comparing
apples to apples (so to speak). This is more for cases when you are
trying to compare two R-square values for models with:
1. Perfectly corresponding variables
2. Different data for those variables
(these two requirements are key)
You simply cannot compare models with different degrees of freedom.
You can compare models with the same degrees of freedom but different
meanings for each of the variables, but that would not be useful.
Do you see the mathematical restriction that exists on what you are trying to do?
answerguru-ga
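The "subtracting zero" behaviour described in this reply can be sketched as below. This is a hypothetical illustration, not SPSS code: `block_changes` is a made-up helper that reports each block's R-square minus the R-square of the block before it, starting from a baseline of zero. The last call assumes, purely for illustration, that both sets of variables were entered as successive blocks of a single run.

```python
def block_changes(r2_by_block):
    """R-square change for each block, measured against the previous block.

    The first block has nothing before it, so it is compared against 0.0.
    """
    changes = []
    previous = 0.0
    for r2 in r2_by_block:
        changes.append(r2 - previous)
        previous = r2
    return changes

# Two separate runs: each model is the first (and only) block of its run,
# so the reported change equals the R-square itself.
separate_runs = [block_changes([0.123]), block_changes([0.144])]

# Hypothetical single run with two successive blocks.
one_run_two_blocks = block_changes([0.123, 0.144])

print(separate_runs)
print([round(c, 3) for c in one_run_two_blocks])
```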
|
Clarification of Question by marsbrook-ga on 15 Nov 2003 11:53 PST
O.K. Thanks. I think I see what you mean. In any event you have done
your part of the job. Perhaps I may need to pose another question
regarding how to get the actual result I need, which I suspect would
entail holding all previously entered independent variables at their
means. However, it is only fair that at this point you post what you
have communicated to me as your answer and claim the funds. Meanwhile,
I'll try to figure out how to achieve the result I need. Thanks once
again for responding so promptly.
|