Google Answers: 4 statistics questions for $40.00

Question

Subject: 4 statistics questions for $40.00
Category: Reference, Education and News > Homework Help
Asked by: probing-ga
List Price: $40.00

Posted: 27 Feb 2005 16:03 PST
Expires: 29 Mar 2005 16:03 PST
Question ID: 481997

1.  A study of 141 in-home caregivers revealed the average number of
hours worked per week to be 106.9 with a sample standard deviation of
68.2 hours.  Form a confidence interval at the 90% level for the
population mean number of caregiver hours.

2.  According to a survey, the mean consumption of beer per person in
the US is 22.0 gallons per year.  A random sample of 300 Washington DC
residents yielded a sample mean of 27.52 gallons of beer consumed
annually, with a standard deviation of 19.426 gallons.  At the 1%
level, is this sufficient evidence to conclude that the mean annual
consumption of beer per person in the nation's capital exceeds the
national mean?

3.  According to a recent study, a random sample of 15 accountants
with no certification had a mean salary of $49,827.  A sample of 12
accountants with CPA certification had a mean salary of $61,936. 
Assume the standard deviation for both samples is $500 and the
populations are normally distributed.  Test the hypothesis at the 0.05
level that accountants with CPAs have higher salaries.

4.  A claim is made that GM-powered racecars are faster than those
using Ford engines.  The qualifying times for a sample of 11 Ford
powered cars at a raceway were 119.02 seconds with a standard
deviation of 1.76 seconds.  For a similar sample of 11 GM powered
racecars, the mean qualifying time was 118.50 seconds with a standard
deviation of 1.24 seconds.  Test the hypothesis at the 0.05 level of
significance that GM-powered cars are faster (that is, they take less
time to complete the qualifying laps).

Answer

Subject: Re: 4 statistics questions for $40.00
Answered By: elmarto-ga on 28 Feb 2005 11:55 PST
Rated: 3 out of 5 stars

Hi probing!
Here are the answers to your questions.

Question 1

In order to answer this and some of the following questions, we'll
need to make use of the Central Limit Theorem:

"The central limit theorem states that given a distribution with a
mean m and variance s^2, the sampling distribution of the mean
approaches a normal distribution with a mean (m) and a variance s^2/N
as N, the sample size, increases."
 
Central Limit Theorem
http://davidmlane.com/hyperstat/A14043.html

Sample sizes larger than 30 can usually be considered large enough for
the sample mean to have an almost normal distribution. So the sample
size of 141 in this question is more than enough to use this theorem.
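
As a rough illustration of the theorem (not part of the original question), the following Python sketch simulates repeated samples of size 141 from a skewed distribution and checks that the sample means have a variance close to s^2/N. The exponential distribution, the seed, and the number of repetitions are arbitrary choices made for this example.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

N = 141      # sample size, as in Question 1
REPS = 2000  # number of simulated samples (arbitrary)

# Exponential distribution with rate 1: mean 1 and variance 1. This is an
# assumption for illustration; any distribution with finite variance would do.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(REPS)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)

print(f"mean of sample means:     {mean_of_means:.4f} (theory: 1)")
print(f"variance of sample means: {var_of_means:.5f} (theory: {1 / N:.5f})")
```

A histogram of sample_means would look close to a bell curve even though the underlying exponential distribution is strongly skewed.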

Let's call the sample mean X, the population mean m, and the sample
variance s^2 (so the sample standard deviation is s). Let's also call n
the sample size. In order to build a 90% confidence interval around
the sample mean, we need to find a number 'a' such that:

Prob(m < X-a) = 0.05
and
Prob(m > X+a) = 0.05

The intuition behind this is that we want to find 'a' such that the
probability that m is outside the interval [X-a,X+a] is 0.10. Now,
since X follows a normal distribution (approximately), which is
symmetric around its mean, it turns out that you can use either
equation to calculate 'a' and you will get the same result. Let's use
the first one and rearrange it a little bit:

 Prob(m < X-a)
=Prob(X-m>a)
=Prob( (X-m)/(s/sqrt(n)) > a/(s/sqrt(n)) )

where sqrt means "square root". Why write it like this? Because now
the left-hand side of the inequality is known to follow a
t-distribution, for which we have probability tables. Notice that it's
just X minus its mean (m) divided by its estimated standard deviation
(recall from the Central Limit Theorem that the sample mean has
variance s^2/n).

Student's t distribution
http://mathworld.wolfram.com/Studentst-Distribution.html

In particular, it follows a t distribution with n-1 (140) degrees of
freedom. Using that sqrt(140)=11.83, we have to solve the equation:

 Prob( t(140) > a/(s/sqrt(n)) )
=Prob( t(140) > a/(68.2/11.83) ) = 0.05

Looking up in a t distribution table

T-Distribution table
http://www.stat.ucla.edu/~dinov/courses_students.dir/Applets.dir/T-table.html

we find that

Prob( t(140) > 1.645 ) = 0.05
[Please request clarification if you don't understand how to use this table]
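
For readers without a table at hand, large-sample table values such as this one can be reproduced from the standard normal distribution, which the t distribution approaches as the degrees of freedom grow. This is only a sketch using Python's standard library; the helper name is made up for this example, and with a finite number of degrees of freedom (such as 140) the exact t value is slightly larger than the normal one.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def upper_tail_critical(alpha):
    """Return c such that Prob(Z > c) = alpha for a standard normal Z."""
    return std_normal.inv_cdf(1 - alpha)


z_05 = upper_tail_critical(0.05)  # each tail of a 90% two-sided interval
z_01 = upper_tail_critical(0.01)  # one-sided test at the 1% level

print(round(z_05, 3), round(z_01, 3))
```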

So now all we have to do is solve:

a/(68.2/11.83) = 2.576

which yields

a = 14.85

Therefore, a 90% confidence interval for the population mean is the interval:

 [106.9-14.85 , 106.9+14.85]
=[92.05 , 121.75]


Question 2

Let's call 'm' the population mean of annual consumption of beer
per person in Washington. Here, the null and alternative hypotheses
can be written in the following way:

Ho : m=22
Ha : m>22

Call X the sample mean, n the sample size, and s the sample standard
deviation. As in the previous question, we need to
find a value 'a' such that:

Prob( X > 22+a ) = 0.01
Prob( X-22 > a ) = 0.01

Once we have obtained 'a' (we'll see how to do that next), we'll
compare it to the actual difference between Washington's sample mean
and the US mean (which is 27.52-22=5.52). If the actual difference is
greater than 'a', we'll reject the hypothesis that Washington's mean
is 22 in favor of the alternative hypothesis that it's larger than 22.
The intuition is that if 5.52 is larger than 'a', then 5.52 is "too
large" a difference with the US mean to assume that Washington has the
same population mean as the US.

Now, in order to find 'a', we follow the same steps we used in the
previous question. We rewrite:

Prob( X-22 > a ) = 0.01
Prob( (X-22)/(s/sqrt(n)) > a/(s/sqrt(n)) ) = 0.01

Again, the left hand side of this equation has a t distribution, for
the same reasons discussed in the previous question. In this case, it
has a t distribution with 299 degrees of freedom (the sample size here
is 300). So:

Prob( t(299) > a/(19.426/sqrt(300)) ) = 0.01

Again, we use the t table exactly as before, obtaining that:

a/(19.426/sqrt(300)) = 2.326
a = 2.608

Finally, using the reasoning explained above, since 5.52 is greater
than 2.608, we have evidence that the mean annual consumption of beer
per person in Washington is greater than the 22 gallons national
average.
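
The arithmetic for Question 2 can be checked with a few lines of Python. This is a minimal sketch that takes the 2.326 table value from the answer as given; the variable names are chosen for this example only.

```python
import math

national_mean = 22.0    # gallons per year, from the question
sample_mean = 27.52
sample_sd = 19.426
n = 300
critical_value = 2.326  # table value used in the answer (1% level, one-sided)

standard_error = sample_sd / math.sqrt(n)
a = critical_value * standard_error  # critical difference
observed_difference = sample_mean - national_mean
test_statistic = observed_difference / standard_error

reject_null = observed_difference > a
print(f"a = {a:.3f}, observed difference = {observed_difference:.2f}")
print(f"test statistic = {test_statistic:.2f}, reject Ho: {reject_null}")
```

Comparing the observed difference with 'a' is equivalent to comparing the test statistic with the critical value, since both sides are simply divided by the standard error.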


Question 3 and 4

These questions can both be answered using the unpaired t-test for
mean equality. I will explain here the method only for question 3, but
it will be very easily applicable to question 4. Please do request
clarification if you have trouble using the following information for
question 4.

Let's call group A the group of accountants with CPA certification and
group B the group of accountants without it. Calling mA and mB the
population mean salaries of groups A and B respectively, we're
interested in testing the following hypotheses:

Ho : mA = mB
Ha : mA > mB

Thus we'll test the hypothesis that both means are equal versus the
hypothesis that the mean of group A is greater than the mean of group
B.

We solve this just the same as before. Given the 0.05 level of
significance, we want to find a value 'a' such that

Prob ( Xa - Xb > a ) = 0.05

So, if the observed value of (Xa - Xb) (which is 61936-49827=12109)
turns out to be greater than 'a', we'll reject the null hypothesis
(means are equal) in favor of the alternative one (mean of CPA
certified accountants is greater).

Dividing in the above equation by sqrt(SDa^2 + SDb^2), where SDa is
the sample std. dev of group A and SDb is the sample std. dev. of
group B, we get:

Prob( (Xa - Xb)/sqrt(SDa^2 + SDb^2) > a/sqrt(SDa^2 + SDb^2) ) = 0.05

Now, if both means were equal, the left hand side of the equation
would be a random variable with a Student's t distribution with
(Na+Nb-2) degrees of freedom, where Na is the sample size of
group A and Nb is the sample size of group B. Since Na+Nb-2=25, then
it's a t distribution with 25 df. So we have

Prob( t(25) > a/sqrt(SDa^2 + SDb^2) ) = 0.05

We use the table again to get that:

Prob( t(25) > 1.725 ) = 0.05

Therefore,

a/sqrt(SDa^2 + SDb^2) = 1.725
a/sqrt(500^2 + 500^2) = 1.725
a = 1219.75

Since the observed difference (12109) is greater than 'a'
(1219.75) we have evidence to conclude that CPA accountants have
higher salaries.
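
For comparison, the textbook pooled-variance form of the unpaired t statistic also divides by the sample sizes: it uses sqrt(Sp^2 (1/Na + 1/Nb)) as the standard error of the difference, where Sp^2 is the pooled variance, rather than sqrt(SDa^2 + SDb^2). The sketch below applies that form to the Question 3 figures. The function name is made up for this example, and 1.708 is the standard one-sided 5% table entry for 25 degrees of freedom. The statistic comes out far above the critical value, so the conclusion is the same.

```python
import math


def pooled_t_statistic(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for the equal-variance two-sample t-test."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Group A: CPA-certified accountants; group B: no certification.
t_stat, df = pooled_t_statistic(61936, 500, 12, 49827, 500, 15)
critical_value = 1.708  # one-sided 5% table value for 25 df

print(f"t = {t_stat:.2f} with {df} df; reject Ho: {t_stat > critical_value}")
```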


Google search terms
hypothesis testing
://www.google.com/search?hl=en&q=hypothesis+testing
t distribution table
://www.google.com/search?hl=en&lr=&q=t+distribution+table
unpaired t test
://www.google.com/search?hl=es&q=unpaired+t+test&spell=1
mean equality test
://www.google.com/search?sourceid=navclient&q=mean+equality+test


I hope this helps! If you have any questions regarding my answer,
please don't hesitate to request a clarification. Otherwise I await
your rating and final comments.

Best wishes!
elmarto

Request for Answer Clarification by probing-ga on 28 Feb 2005 14:14 PST
Can I have an answer clarification for question 4?

Clarification of Answer by elmarto-ga on 28 Feb 2005 15:36 PST

Hello probing!
Question 4 can be answered using the very same reasoning as question
3, just changing the numbers. Calling A the group of Ford cars and B
the group of GM cars, we want to test:

Ho : mA = mB
Ha : mA > mB

(that is, that Ford cars take longer to complete the laps). Given the
0.05 level of significance, we want to find a value 'a' such that

Prob ( Xa - Xb > a ) = 0.05

So, if the observed value of (Xa - Xb) (which is 119.02-118.5=0.52)
turns out to be greater than 'a', we'll reject the null hypothesis
(means are equal) in favor of the alternative one (Fords take longer).

Dividing in the above equation by sqrt(SDa^2 + SDb^2), where SDa is
the sample std. dev of group A and SDb is the sample std. dev. of
group B, we get:

Prob( (Xa - Xb)/sqrt(SDa^2 + SDb^2) > a/sqrt(SDa^2 + SDb^2) ) = 0.05

Now, if both means were equal, the left hand side of the equation
would be a random variable with a Student's t distribution with
(Na+Nb-2) degrees of freedom, where Na is the sample size of group A
and Nb is the sample size of group B. Since Na+Nb-2=20, then it's a t
distribution with 20 df. So we have

Prob( t(20) > a/sqrt(SDa^2 + SDb^2) ) = 0.05

We use the table again to get that:

Prob( t(20) > 1.725 ) = 0.05

Therefore,

a/sqrt(SDa^2 + SDb^2) = 1.725
a/sqrt(1.76^2 + 1.24^2) = 1.725
a = 3.71

Since the observed difference (0.52) is smaller than 'a'
(3.71), we can't reject the null hypothesis that Ford and GM are
equally fast. So we don't find any evidence that GM cars are faster.
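
As a cross-check, here is a minimal Python sketch of the same comparison using the textbook pooled-variance standard error, sqrt(Sp^2 (1/Na + 1/Nb)), which also accounts for the sample sizes. The variable names are chosen for this example, and 1.725 is the table value quoted above. The threshold 'a' comes out smaller than 3.71 under this form, but the observed difference of 0.52 still falls below it, so the conclusion does not change.

```python
import math

# Group A: Ford; group B: GM (11 cars each), figures from the question.
mean_a, sd_a, n_a = 119.02, 1.76, 11
mean_b, sd_b, n_b = 118.50, 1.24, 11
critical_value = 1.725  # one-sided 5% table value for 20 df

df = n_a + n_b - 2
pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

a = critical_value * standard_error  # critical difference
observed_difference = mean_a - mean_b

print(f"a = {a:.2f}, observed difference = {observed_difference:.2f}")
print(f"reject Ho: {observed_difference > a}")
```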


Incidentally, I found that I made a small mistake in question 3,
which fortunately does not change the final conclusion. I wrote in a
line that:

Prob( t(25) > 1.725 ) = 0.05

but this is wrong. I've just re-checked the t distribution table, and
instead of 1.725, that number should be 1.708. So the value of 'a'
actually turns out to be 1207.72 instead of 1219.75. Of course, the
difference between the sample means is still greater than this
corrected 'a' value, so we still conclude that CPA certified
accountants earn higher salaries.


Best wishes!
elmarto

probing-ga rated this answer: 3 out of 5 stars

Comments

Subject: Re: 4 statistics questions for $40.00
From: pkuanko-ga on 28 Feb 2005 18:24 PST

There's a mistake in elmarto's answer to question 1, for which I have
already given my answer in your separate question. I'll put it here
again for you:
Naturally, you should use the t distribution to solve this question.
However as the sample size is 141, there is very little difference if
you were to use the z distribution, which I am going to use here.
A 90% confidence interval for the population mean is 
= 106.9 +/- z(0.05) x s /[sq root (141)]
= 106.9 +/- 1.645   x 68.2 /[sq root (141)]
= (97.45, 116.35)
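
The interval above can be verified with a short Python sketch. It assumes the z value of 1.645 quoted in this comment and the sample figures from the question; the variable names are chosen for this example.

```python
import math

sample_mean = 106.9
sample_sd = 68.2
n = 141
z = 1.645  # upper 5% point of the standard normal, as quoted above

margin = z * sample_sd / math.sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
```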

Elmarto's mistake is here:
Prob( t(140) > 1.645 ) = 0.05
[Please request clarification if you don't understand how to use this table]

So now all we have to do is solve:

a/(68.2/11.83) = 2.576 <--- ORIGINAL MISTAKE

which yields

a = 14.85 <---FOLLOW ON MISTAKE
Therefore, a 90% confidence interval for the population mean is the interval:

 [106.9-14.85 , 106.9+14.85] <----FOLLOW ON MISTAKE
=[92.05 , 121.75] <---- FOLLOW ON MISTAKE

Good luck with your homework!

Subject: Re: 4 statistics questions for $40.00
From: elmarto-ga on 07 Mar 2005 05:59 PST

Hi pkuanko,
You're absolutely right, I mistakenly copied 2.576 instead of 1.645 in
the next step. Of course, the reasoning is just the same. Thanks for
pointing it out.

Regards,
elmarto
