You can actually use Excel to do everything required here.
A. The quickest way is to have Excel run your descriptive statistics
by entering the data, then doing the following:
- Click on Data Analysis, then the Descriptive Statistics option
- Click on Summary Statistics; enter the Confidence Interval Level (in
decimals, not percentages). Click OK and you'll get an array of
statistics, as shown at the bottom of this spreadsheet linked below:
The 99% confidence level for both age and duration of employment uses
an "alpha" or significance level of 1% (0.01 in Excel's statistical
B. As you can see, the confidence interval for AGE is 37.75 years +/-
8.76, so you're 99% confident that employees being hired are between
28.99 and 46.51 years.
Why are the results so bad? Five of our new hires, or almost
one-third, are under 25 alone, not just 1%.
First, we have a small sample size, which is more prone to errors.
But Excel has adjusted for this by using a t-distribution, meant for
small sample sizes of under 30.
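As a rough sketch of what that adjustment amounts to: the margin of error is a critical value times the standard error, s / sqrt(n). The Python below is illustrative only. The ages are made-up numbers, not the Riverside data, and the critical value of 2.947 (two-sided 99%, 15 degrees of freedom) is read from a standard t-table on the assumption of a 16-person sample.

```python
import math
import statistics


def margin_of_error(sample, critical_value):
    """Half-width of a confidence interval for the mean."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return critical_value * s / math.sqrt(len(sample))


# Hypothetical ages for illustration only; NOT the Riverside data.
ages = [22, 23, 23, 24, 24, 27, 31, 35, 38, 41, 42, 43, 44, 45, 52, 58]

T_CRIT_99_DF15 = 2.947  # from a t-table, assuming n = 16
Z_CRIT_99 = statistics.NormalDist().inv_cdf(0.995)  # normal-curve value

t_margin = margin_of_error(ages, T_CRIT_99_DF15)
z_margin = margin_of_error(ages, Z_CRIT_99)

print(f"mean = {statistics.mean(ages):.2f}")
print(f"99% margin using t-distribution: +/- {t_margin:.2f}")
print(f"99% margin using normal curve:   +/- {z_margin:.2f}")
```

The t-based margin comes out wider than the normal-curve one, which is the extra caution a small sample calls for.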
The more fundamental problem is that these statistical tests are
designed for "normal" or bell-shaped distributions. The histogram
that I've included for "Age" on this page shows an unusual
number of new hires clumped at the bottom and another mode in the
41-45 age range.
We don't know why the distribution is skewed -- it may be that the
October 2001 data reflected a large number of recent college
graduates. Or that the population of Riverside is mostly young
people. But what the test tells us is that it's a non-normal
distribution.
"Confidence Intervals" (Waner & Costenoble, September 2000)
C. Because the age distribution is poor, it might have an impact on
the Weeks Employed. Or it might not.
Here, the statistical mean is 18.69 weeks +/- 6.45 weeks or a range
of 12.24 to 25.14 weeks at the 99% confidence level. The state
average of 17 weeks falls within that range -- so we can't reject
the hypothesis that Riverside's mean is no different from the state
data.
Here too it's interesting to look at the histogram of the number of
Weeks Employed. You can see that, while it's not a classic
bell curve, it's at least "closer" to being bell-shaped than the Age
The summary statistics at the bottom of the page provide a wealth of
information. For example, we can see a measure of skew at the bottom
of the page -- and positive skew indicates a distribution with an
asymmetric tail extending toward more positive values. (Negative skew
is a distribution with an asymmetric tail extending toward more
negative values.)
Similarly, kurtosis measures how peaked or flat the distribution is
compared with a normal distribution. A positive number says that it
is more peaked than a normal distribution and a negative number says
that it is flatter.
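For anyone who wants to check these two figures outside Excel, here is a minimal Python sketch of the sample skewness and kurtosis formulas as documented for Excel's SKEW and KURT worksheet functions. The function names and the two data lists are hypothetical, chosen only to show the sign of each statistic; they are not the Riverside data.

```python
import math


def _mean_and_stdev(data):
    """Mean and sample standard deviation (n - 1 divisor)."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return mean, s


def excel_style_skew(data):
    """Sample skewness; needs at least 3 values."""
    n = len(data)
    mean, s = _mean_and_stdev(data)
    cubes = sum(((x - mean) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * cubes


def excel_style_kurt(data):
    """Sample excess kurtosis; needs at least 4 values."""
    n = len(data)
    mean, s = _mean_and_stdev(data)
    fourths = sum(((x - mean) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourths - correction


# Hypothetical data for illustration only.
evenly_spread = [1, 2, 3, 4, 5]          # symmetric and flat
long_right_tail = [20, 21, 22, 23, 45]   # one large value pulls the tail right

print("skew, evenly spread:  ", round(excel_style_skew(evenly_spread), 4))
print("skew, long right tail:", round(excel_style_skew(long_right_tail), 4))
print("kurt, evenly spread:  ", round(excel_style_kurt(evenly_spread), 4))
```

The evenly spread list has no skew and a negative kurtosis (flatter than a bell curve), while the list with one large value has positive skew.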
Google search strategy:
"confidence interval" + "small sample" + test
Clarification of Answer by
31 Jul 2005 07:15 PDT
C. Our statistical mean is 18.69 weeks. If we want to know what
range falls within a 99% confidence interval, it's plus or minus 6.45
weeks, as we can see from the 6.448740214 statistic for alpha = .01.
That makes it a range of 12.24 to 25.14 weeks at the 99% confidence level.
In statistical terms, we'd be asking ourselves, should we be accepting
the hypothesis H0 -- that there's no difference in the means of the
state and Riverside's mean employment period? Or should we accept H1
-- the assumption that they're different?
They're just too close to assume that Riverside's measure is
different from the state's. So Riverside can't reasonably expect to
find a cause for length of employment being 1.69 weeks longer in the
city -- it just may be the small sample size. And, from a practical
standpoint, it may be wise to predict that future employees will stay
17 weeks, instead of using the 18.69 from its small sample. Those are
just some conclusions a manager would reach.
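The check behind that reasoning is simple arithmetic, and can be sketched in a few lines of Python. The numbers are the ones quoted above (mean 18.69 weeks, margin 6.45 weeks, state average 17 weeks); the function and variable names are just illustrative.

```python
def confidence_interval(mean, margin):
    """Return the (low, high) bounds of mean +/- margin."""
    return mean - margin, mean + margin


riverside_mean = 18.69   # sample mean, weeks employed
margin_99 = 6.45         # 99% margin reported by Excel (alpha = .01)
state_average = 17.0     # state-wide figure being compared

low, high = confidence_interval(riverside_mean, margin_99)
state_inside = low <= state_average <= high

print(f"99% interval: {low:.2f} to {high:.2f} weeks")
print("State average inside the interval:", state_inside)
# prints:
# 99% interval: 12.24 to 25.14 weeks
# State average inside the interval: True
```

Because the state average sits inside the interval, the sample gives no grounds for rejecting H0 at the 99% level.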
On the other hand, a manager might well want to look at why
Riverside's hires in October 2001 had so many young people. For
example, in an employment discrimination case due to age, this would
appear "outside the norm".
I know these last 2 paragraphs are nowhere in the scope of your
question: I'm simply trying to show the usefulness of the statistics.