Hi Whosher-ga:
What you ask is fairly straightforward statistics.
Some web sites that have good information on these types of problems
include:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm
http://www.ruf.rice.edu/~bioslabs/tools/stats/chisquare.html
http://www.sjsu.edu/faculty/gerstman/EpiInfo/cat-one.htm
http://osf1.gmu.edu/~alaemmer/biostat/goodness/goodness.html
However, let's look at your particular data and your three-part
question.
First, let's look at the data that we are provided with.
Sample size = 252 students
alpha (significance level) = 0.05 (or 5%)
[I'm also assuming that you made a typo in your observed frequencies -
that where you said "21-24", you really meant "20-24".]
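a) To set up a table to compare the expected and observed frequencies
for the 4 given groups, multiply each estimated percentage by the
sample size (252) and round to the nearest whole student to get the
expected frequency: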
Age Group   Estimated %   Expected Freq.   Observed Freq.
< 18             2.7              7                6
18 - 19         29.9             75              118
20 - 24         53.4            135              102
> 24            14.0             35               26
Totals         100.0            252              252
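As a quick check on the "Expected Freq." column, here is a minimal Python sketch. The variable names are my own, and it assumes the expected counts were obtained by multiplying each estimated percentage by the sample size and rounding to the nearest whole number.

```python
# Expected count = estimated share x sample size (assumed rounding rule:
# nearest whole student, which matches the table above).
sample_size = 252
estimated_pct = {"< 18": 2.7, "18 - 19": 29.9, "20 - 24": 53.4, "> 24": 14.0}

# Unrounded expected counts, e.g. 252 * 2.7 / 100 for the "< 18" group.
exact = {group: sample_size * pct / 100 for group, pct in estimated_pct.items()}

# Rounded expected counts, as used in the table.
rounded = {group: round(value) for group, value in exact.items()}

for group in estimated_pct:
    print(f"{group:8s} exact={exact[group]:8.3f} rounded={rounded[group]:4d}")
print("Total of rounded counts:", sum(rounded.values()))
```

The rounded counts still add up to the sample size, so the table's totals row is consistent.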
b) In this case the Chi-squared goodness of fit hypothesis is:
H[0]: The data do follow the estimated distribution. (This is the
"null hypothesis".)
H[a]: The data do not follow the estimated distribution. (This is the
"alternate hypothesis".)
c) To perform the goodness of fit test, we must first calculate the
chi-squared value. The formula for that value is:
http://www.itl.nist.gov/div898/handbook/eda/section3/eqns/chisqgf.gif
That is, chi-squared = the sum over all groups i of (O[i] - E[i])^2 / E[i],
where,
O[i] is the observed frequency for group i and
E[i] is the expected frequency for group i.
So, for this data, the chi-squared statistic would be:
(6-7)^2/7 + (118-75)^2/75 + (102-135)^2/135 + (26-35)^2/35 =
0.143 + 24.653 + 8.067 + 2.314 =
35.177
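If you would like to verify the arithmetic yourself, the calculation above can be sketched in a few lines of Python. The list names are my own; the second calculation, using unrounded expected counts, is an extra check I am adding and is not part of the hand calculation.

```python
observed = [6, 118, 102, 26]
expected = [7, 75, 135, 35]  # rounded expected counts from the table

# One (O - E)^2 / E term per age group, then the sum of the terms.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(terms)

print([round(t, 3) for t in terms])
print(round(chi_squared, 3))

# Same statistic with unrounded expected counts (percentage x 252).
estimated_pct = [2.7, 29.9, 53.4, 14.0]
expected_exact = [252 * p / 100 for p in estimated_pct]
chi_squared_exact = sum((o - e) ** 2 / e for o, e in zip(observed, expected_exact))
print(round(chi_squared_exact, 2))
```

Using the unrounded expected counts gives a slightly smaller statistic (about 34.56), so the rounding in the table does not change the outcome of the test.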
Since we have 4 groups, our degrees of freedom equal 3 (4-1).
If we look up in the chi-squared table at
http://www.uvm.edu/~golivett/introbio/lab_reports/chi.html
with three degrees of freedom and a 5% level of significance, we see
that the tabled value is:
7.81
Since the chi-squared statistic (35.177) far exceeds the tabled value
(7.81), you can safely reject the null hypothesis at the 5% level of
significance and accept the alternate hypothesis.
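If you do not have a chi-squared table handy, the lookup and the comparison can be reproduced with the Python standard library alone. This is only a sketch: the function names are my own, and it relies on the closed-form distribution function that holds specifically for 3 degrees of freedom, so it would not work as written for other group counts.

```python
import math

def chi2_cdf_df3(x):
    """CDF of the chi-squared distribution with 3 degrees of freedom."""
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_sf_df3(x):
    """Upper-tail probability (1 - CDF), written with erfc for accuracy."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def critical_value_df3(alpha, lo=0.0, hi=100.0):
    """Find x with CDF(x) = 1 - alpha by bisection on [lo, hi]."""
    target = 1 - alpha
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf_df3(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.05
chi_squared = 35.177  # statistic computed above

critical = critical_value_df3(alpha)
p_value = chi2_sf_df3(chi_squared)
reject_null = chi_squared > critical

print("critical value:", round(critical, 2))
print("p-value:", p_value)
print("reject H[0]:", reject_null)
```

The p-value is an alternative way of expressing the same decision: it is far below 0.05, which agrees with the table-based comparison.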
Therefore, in this case, the data do not follow the estimated
distribution. It is likely that the administration's distribution
estimates need to be re-examined.
I hope this answers your question. Please ask for clarification if
necessary before rating this answer.
Thanks.
websearcher-ga