Q: Business Statistics - 3 Solutions ( Answered 5 out of 5 stars,   1 Comment )
Question  
Subject: Business Statistics - 3 Solutions
Category: Reference, Education and News > Homework Help
Asked by: sunshine2-ga
List Price: $40.00
Posted: 04 Jun 2003 17:51 PDT
Expires: 04 Jul 2003 17:51 PDT
Question ID: 213177
1.  Sampling Distribution of the Mean

Keynote systems estimates the time it takes to complete a purchase
transaction on various e-commerce websites.  In June 2000, Keynote
reported the average transaction time to be 23.0 seconds on the Target
website and 16.3 seconds on the Sears website.  Suppose that the
standard deviation for transaction times is 5 seconds for both
websites.

A. If random samples of 30 transaction times on the Target website are
selected, what proportion of the sample means will be between 22 and
24 seconds?

B. If random samples of 30 transaction times on the Target website are
selected, what proportion of the sample means will be between 21 and
25 seconds?

C. If random samples of 30 transaction times on the Target website are
selected, what proportion of the sample means will be greater than 25
seconds?

D. If random samples of 30 transaction times on the Sears website are
selected, what proportion of the sample means will be between 15 and
18 seconds?

E. If random samples of 30 transaction times on the Sears website are
selected, what proportion of the sample means will be between 14 and
19 seconds?

F. If random samples of 30 transaction times on the Sears website are
selected, what proportion of the sample means will be greater than 25
seconds?

G. What assumptions, if any, do you have to make about the probability
distribution of the population of transaction times to complete a-f?

2.  Hypothesis Testing

The FDA is responsible for approving new drugs.  Some consumers feel
that the approval process is too lenient while lobbyists push for
faster approvals.  Consider a null hypothesis that a new, unapproved
drug is unsafe and an alternative hypothesis that a new, unapproved
drug is safe.

A. Explain the risks of committing Type I or Type II error.
B. Which type of error are the consumer groups trying to avoid? 
Explain.
C. Which type of error are the lobbyists trying to avoid? Explain.
D. How would it be possible to lower the chance of both Type I and
Type II errors?

3. Z Test Categorical Data

Assume that n1=100; X1=50; n2=100; and X2=30

A. At the 0.05 level of significance, is there evidence of a
significant difference between the proportion of successes in group 1
and group 2?

B.  Set up a 95% confidence interval estimate of the difference
between the two proportions.

Request for Question Clarification by answerguru-ga on 04 Jun 2003 23:55 PDT
Hi sunshine2-ga,

If no other researcher answers this question, I will be able to answer
it by Thursday afternoon.

answerguru-ga

Clarification of Question by sunshine2-ga on 05 Jun 2003 09:08 PDT
Thank you.  That sounds great.
Answer  
Subject: Re: Business Statistics - 3 Solutions
Answered By: elmarto-ga on 05 Jun 2003 13:49 PDT
Rated:5 out of 5 stars
 
Hello sunshine2!

A. In order to answer these questions, we have to invoke the Central
Limit Theorem:

"The central limit theorem states that given a distribution with a
mean m and variance s2, the sampling distribution of the mean
approaches a normal distribution with a mean (m) and a variance s2/N
as N, the sample size, increases."

http://davidmlane.com/hyperstat/A14043.html

A sample size N=30, like the one stated in the question, is usually
considered "large enough" for the distribution of the sample mean to
be approximately normal. Thus, we have here that the distribution of
sample means in this question follows a normal distribution with mean
23.0 and variance 25/30 = 5/6 (the 25 comes from the standard
deviation being 5 seconds, so the variance is 5^2=25). The standard
deviation is then the square root (sqrt) of 5/6, which is 0.9128...

Since we have the probability distribution of the sample mean, we can
estimate the probability that the sample mean will fall between 22 and
24 seconds. Let's call X the observed sample mean. We want to
calculate:

Prob( 22 < X < 24 )
=Prob(X<24) - Prob(X<22)

In order to compute the probabilities, the usual method is to use
standard normal distribution tables. A standard normal distribution is
a normal distribution with mean=0 and variance=1. Also, we have that
if X has a normal distribution with mean m and variance s2, then
(X-m)/sqrt(s2) has a standard normal distribution.

Let's see how we can use this to compute the probabilities we need. We
have to compute Prob(X<24) where X has a normal distribution with mean
23.0 and variance 5/6:

Prob(X<24)
=Prob(X-23 < 24-23)
=Prob( (X-23)/0.9128 < (24-23)/0.9128)
=Prob( Z < 1/0.9128)
=Prob( Z < 1.095)

where Z is a standard normal distribution. We now have to check a
table to find the value we're looking for. This table can be found in
most Statistics books. It's also available here:

Standard Normal Probability Table
http://www.stat.psu.edu/~herbison/stat200/stat200_model_demo/supplements/NormalTable.html

Looking up the value 1.09 in this table (they only have 2 decimals) we
find that

Prob( Z < 1.09) = 0.8621

(If you don't understand how to use the table, please look at the
bottom of this answer)

Now we need to find Prob(X<22). Using the same method as before
(subtract the mean and divide by the standard deviation), we find that:

Prob(X<22)
=Prob(Z < -1.09) = 0.1379

Finally, we find the probability we were looking for by subtracting
0.1379 from 0.8621:

Prob(22<X<24)
=Prob(X<24) - Prob(X<22)
=0.8621 - 0.1379
=0.7242

The number 0.7242 is the one we were looking for. It means that 72.42%
of the sample means will be between 22 and 24 seconds.
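
The table lookup in part A can be cross-checked numerically. What
follows is only a minimal sketch using Python's standard library
(statistics.NormalDist); the variable names are illustrative and are
not part of the original answer. Because it uses the unrounded z value
(about 1.095) instead of the table entry for 1.09, the result comes
out slightly above 0.7242.

```python
from math import sqrt
from statistics import NormalDist

mu = 23.0      # Target mean transaction time (seconds)
sigma = 5.0    # population standard deviation (seconds)
n = 30         # sample size

se = sigma / sqrt(n)          # standard deviation of the sample mean, about 0.9129
z_hi = (24 - mu) / se         # standardized upper limit
z_lo = (22 - mu) / se         # standardized lower limit

std_normal = NormalDist()     # mean 0, standard deviation 1
p_a = std_normal.cdf(z_hi) - std_normal.cdf(z_lo)
print(round(z_hi, 3), round(p_a, 4))
```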


B. The procedure here is exactly the same as before. We have to
compute:
Prob(21 < X < 25) where X (the sample mean) is normally distributed
with mean 23.0 and variance 5/6. So,

Prob(21 < X < 25)
=Prob(X<25) - Prob(X<21)
=Prob(Z < 2.19) - Prob(Z < -2.19)
=0.9857 - 0.0143
=0.9714

Thus, 97.14% of the sample means will be between 21 and 25 seconds.


C. In this question we are asked for the probability of X being
GREATER than some number, rather than *smaller*, as in questions A and
B. Since the table in the link I provided above gives the probability
of Z being smaller than a given number, we need to find a way to
"flip" the inequality. This is easily done. We want to find here
Prob(X>25). But:

Prob(X > 25)
=1-Prob(X < 25)  (notice the change in the direction of the "<")
=1-Prob(Z < 2.19)
=1-0.9857
=0.0143

Thus, only 1.43% of the sample means are greater than 25 seconds.
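
Parts B and C can be checked the same way without standardizing by
hand. This sketch assumes the sample mean is modelled directly as a
normal variable with mean 23.0 and standard deviation 5/sqrt(30); the
names xbar, p_b and p_c are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Distribution of the sample mean for Target: mean 23.0, std dev 5/sqrt(30)
xbar = NormalDist(mu=23.0, sigma=5.0 / sqrt(30))

p_b = xbar.cdf(25) - xbar.cdf(21)   # part B: between 21 and 25 seconds
p_c = 1.0 - xbar.cdf(25)            # part C: complement of Prob(X < 25)
print(round(p_b, 4), round(p_c, 4))
```

The two results should agree with the table-based 0.9714 and 0.0143
up to rounding of the z value.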

D. For Sears website, we have that X (recall X is the observed sample
mean) is normally distributed with mean 16.3 and variance 5/6
(variance is the same for both websites). Apart from this difference,
the procedure for solving these questions is exactly the same. We have
to find here:

Prob(15<X<18)
=Prob(X<18)-Prob(X<15)

Again, in order to find Prob(X<18) we need to transform X into a
standard normal distribution, so:

Prob(X<18)
=Prob( (X-16.3)/0.9128 < (18-16.3)/0.9128 )
=Prob( Z < 1.7/0.9128)
=Prob( Z < 1.86 )
=0.9686 (looking in the table)

The same procedure gives that Prob(X<15)=0.0778, so:

Prob(15<X<18)
=Prob(X<18)-Prob(X<15)
=0.9686 - 0.0778
=0.8908

Thus, the proportion of Sears sample means that will fall between 15
and 18 seconds is 89.08%

E. At this point, it should be clear how to compute this:

Prob(14<X<19)
=Prob(X<19) - Prob(X<14)
=Prob(Z < 2.95) - Prob(Z < -2.51)
=0.9984 - 0.0060
=0.9924

So, 99.24% of the sample means will fall between 14 and 19 seconds.
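
Since parts D and E repeat the same steps with different limits, the
calculation can be wrapped in a small helper. This is a sketch, not
part of the original answer; prob_mean_between is a hypothetical name.
It uses unrounded z values, so the results differ from the table-based
figures in the third or fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo < sample mean < hi) under the normal approximation."""
    xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))
    return xbar.cdf(hi) - xbar.cdf(lo)

# Sears: mean 16.3 seconds, standard deviation 5 seconds, samples of 30
p_d = prob_mean_between(15, 18, mu=16.3, sigma=5.0, n=30)
p_e = prob_mean_between(14, 19, mu=16.3, sigma=5.0, n=30)
print(round(p_d, 4), round(p_e, 4))
```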

F. Just the same as before:

Prob(X>25)
=1-Prob(X<25)
=1-Prob(Z<9.53)
=1-1
=0

Although the number 9.53 is not in the table, the probability of Z
being smaller than numbers above 4 is assumed to be extremely close to
1.

The answer to this question is then that, for all practical purposes,
you will never see a Sears sample mean of 25 seconds or greater.
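
The probability in part F is not exactly zero, only far too small to
appear in any table. As a sketch, the upper tail can be computed with
the complementary error function math.erfc, which stays accurate for
very small tails where 1 minus the cumulative probability would round
to 0 in floating point. The identity used is
Prob(Z > z) = 0.5 * erfc(z / sqrt(2)).

```python
from math import erfc, sqrt

mu, sigma, n = 16.3, 5.0, 30
z = (25 - mu) / (sigma / sqrt(n))      # about 9.53

# Upper tail of the standard normal distribution
p_f = 0.5 * erfc(z / sqrt(2))
print(round(z, 2), p_f)
```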

G. I'll quote the second paragraph from the link I provided in
question A about the Central Limit Theorem:

"The amazing and counter- intuitive thing about the central limit
theorem is that no matter what the shape of the original distribution,
the sampling distribution of the mean approaches a normal
distribution"

This means that we don't have to assume any particular shape for the
probability distribution of the transaction times. We only need to
know their mean and variance (which must be finite), and the sample
mean will be approximately normal for a sample of this size regardless
of the distribution of the transaction times. It can be Uniform,
Gamma, Normal, Chi-Square or whatever: in any case the distribution of
the sample means will be approximately normal.
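
A small simulation can make this concrete. The sketch below assumes a
deliberately skewed, purely hypothetical population (18 seconds plus
an exponential delay with mean 5, so the mean is 23 and the standard
deviation is 5) and checks what share of sample means of size 30 falls
between 22 and 24 seconds. The seed and number of repetitions are
arbitrary choices.

```python
import random
from statistics import fmean

random.seed(0)                 # fixed seed so the run is repeatable
n, reps = 30, 10_000

def skewed_time():
    # Hypothetical skewed population: 18 + exponential with mean 5,
    # giving mean 23 and standard deviation 5.
    return 18.0 + random.expovariate(1 / 5.0)

means = [fmean(skewed_time() for _ in range(n)) for _ in range(reps)]
share = sum(22 < m < 24 for m in means) / reps
print(round(fmean(means), 2), round(share, 3))
```

If the normal approximation is reasonable, the share should land close
to the value found in part A, even though the individual transaction
times are far from normal.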

==========================================

Question 2.

A. First of all, let's define what Type I and Type II errors are.
You can find the definition in the following page:

Type I and II errors
http://davidmlane.com/hyperstat/A18652.html

A Type I error occurs when we reject the Null Hypothesis, but the Null
Hypothesis is true. A Type II error occurs when we don't reject the
Null Hypothesis, but the Null Hypothesis is false.

In the case of the safety of a new drug, let's see what the risks of
committing either type of error are.

Type I error:
In this case, the null hypothesis is true (i.e., the drug actually is
unsafe) but we reject it (i.e., we say "it's not unsafe"). As you can
see, the risks associated with committing a Type I error in the testing
of a new drug are enormous. By committing a Type I error, the FDA
would approve the drug, allowing the laboratory that manufactures it
to commercialize it, while the truth is that the drug is not safe to
take. If we understand that a drug is "unsafe" when it almost surely
kills or causes permanent damage to the person taking it, it's clear
that the consequences of committing a Type I error, and thus approving
the drug, can be terrible.

Type II error
In this case the null hypothesis is false (i.e., the drug is safe) but
we don't reject the null hypothesis (i.e., we say "this drug is
unsafe"). This type of error appears to be less "risky" in this case.
However, it is not without ill consequences. By committing a Type II
error, the FDA would not allow the commercialization of this drug,
while the truth is that the drug is harmless. This means that if a
laboratory found a truly safe way to cure a disease, the Type II error
of the FDA would keep this drug out of the hands of the people who
need it. Some time would have to pass until a new drug is tested
and approved, and in the meantime, many people would still be
suffering from the disease when a safe drug exists to cure it.

B. According to the statement of this question, the consumer groups
feel that the FDA approves drugs too easily. They want the FDA to be
stricter regarding the approval of drugs. Thus, they want to avoid
a Type I error. They don't want the FDA to say that a drug is safe
while the truth is that it can be harmful. Since consumers are the
ones who ultimately buy and use the drugs, it's understandable that
they require that no unsafe drugs reach the market. However, their
demand may cause drugs that are truly safe to be rejected by the FDA.

C. The lobbyists, on the other hand, want the FDA to approve more
drugs. They don't want the FDA to commit a Type II error: that is,
they don't want the FDA to reject a drug that is truly safe. Since the
lobbyists are the ones that benefit from the sale of these drugs, they
don't want the FDA to reject safe drugs, because it would clearly mean
lower profits for them. However, the lobbyists' demand may cause
unsafe drugs to be approved by the FDA, with the terrible consequences
discussed above.

D. I'll quote a page here:
"There is a tradeoff between Type I and Type II errors. The more an
experimenter protects him or herself against Type I errors by choosing
a low level, the greater the chance of a Type II error. Requiring very
strong evidence to reject the null hypothesis makes it very unlikely
that a true null hypothesis will be rejected. However, it increases
the chance that a false null hypothesis will not be rejected..."

Type I and II errors
http://davidmlane.com/hyperstat/A2917.html

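Thus, for a fixed amount of evidence, it's not possible to lower the
chance of both errors. If we reduce the chance of one of them, we are
necessarily increasing the chance of the other one. The usual way to
lower both at once is to gather more evidence, that is, to increase
the sample size of the test.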
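
The tradeoff can be illustrated numerically. The sketch below is a
generic one-sided z-test, not a model of an actual drug trial: the
effect size of 0.3 standard deviations, the sample sizes and the
function name type2_error are all hypothetical choices made for
illustration. It prints the Type II error rate (beta) for several
Type I error rates (alpha) and for two sample sizes.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution

def type2_error(alpha, n, effect=0.3, sigma=1.0):
    """Beta for a one-sided z-test of H0: mean = 0 vs H1: mean = effect."""
    z_crit = Z.inv_cdf(1 - alpha)                   # rejection cutoff under H0
    return Z.cdf(z_crit - effect * sqrt(n) / sigma) # chance of not rejecting under H1

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(type2_error(alpha, n=30), 3), round(type2_error(alpha, n=120), 3))
```

Reading down a column shows beta rising as alpha is lowered at a fixed
sample size; reading across a row shows beta falling when the sample
size grows while alpha stays the same.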

===========================================

Question 3.

Since the question is stated in terms of "successes", I'll assume X1
and X2 are the number of successes in each group, and that X1 and X2
have a binomial distribution.

In order to answer these questions we first need to build a
t-statistic. You can find how to build it in the following page

Two Sample Test for Equal Means
http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm

The test is:
            Y1 - Y2
T =  -------------------
      sqrt(s1/n1 + s2/n2)

where Y1 and Y2 are the sample means of group 1 and 2 (in your case,
Y1=50/100=0.5, Y2=30/100=0.3 - these are the observed proportions of
success). The terms "s1" and "s2" are the sample variances of groups 1
and 2. The sample variance of the proportion in a binomial
distribution is calculated as follows: if p is the sample mean, then
p*(1-p) is the sample variance.

We now have to de

Request for Answer Clarification by sunshine2-ga on 05 Jun 2003 14:37 PDT
Hello,

The final part to question 3 seems to be missing.  Could you please
verify if the solution is complete.

Thanks so much for answering my question...and so fast :-)

Clarification of Answer by elmarto-ga on 05 Jun 2003 15:25 PDT
I'm very sorry, I messed up the answer. Inadvertently, I didn't paste
the last part of question 3. Here it is.

We now have to determine the degrees of freedom of the T statistic
described above. It turns out that it is quite difficult to find the
exact distribution of this T statistic in a case like yours (where the
population variance of both groups is unknown). However, an
approximation can be done through the formula that is stated in the
page provided above. Let's call v the degrees of freedom of the
t-distribution, then:

                 [(s1/n1)+(s2/n2)]^2
v=   ---------------------------------------------
       ((s1/n1)^2)/(n1-1) + ((s2/n2)^2)/(n2-1) 

Now we have all the elements we need in order to answer the question.

First we calculate s1 and s2

s1=0.5*(1-0.5)=0.5*0.5=0.25
s2=0.3*(1-0.3)=0.3*0.7=0.21

Now, we calculate v. Plugging the s1 and s2 we obtained here and
n1=n2=100 into the formula for v, we obtain that v=196. This will make
things easier, because a t-distribution with more than 30 degrees of
freedom can be approximated with a standard normal distribution. Here
we have that it has 196 degrees of freedom, so we can approximate it.
Thus, the T statistic described before is approximately standard
normally distributed.

Recall that:

            Y1 - Y2 
T =  ------------------- 
      sqrt(s1/n1 + s2/n2) 

Now, plugging here s1=0.25, s2=0.21, n1=n2=100, Y1=0.5 and Y2=0.3, we
obtain T=2.9488.
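
The arithmetic for s1, s2, v and T can be reproduced in a few lines.
This is only a sketch that follows the formulas quoted above (with s1
and s2 used as variances); the variable names are illustrative.

```python
from math import sqrt

n1, x1 = 100, 50
n2, x2 = 100, 30

y1, y2 = x1 / n1, x2 / n2               # observed proportions 0.5 and 0.3
s1, s2 = y1 * (1 - y1), y2 * (1 - y2)   # variances 0.25 and 0.21

se_diff = sqrt(s1 / n1 + s2 / n2)       # denominator of T
t_stat = (y1 - y2) / se_diff

# Approximate degrees of freedom from the formula for v
v = (s1 / n1 + s2 / n2) ** 2 / (
    (s1 / n1) ** 2 / (n1 - 1) + (s2 / n2) ** 2 / (n2 - 1)
)
print(round(t_stat, 4), round(v, 1))
```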

Finally, what does this T number mean? If the two population means
were equal, we would expect Y1 and Y2 to be very similar, and thus T
should be close to zero. We obtained T=2.9488. Is this far enough
away from zero to conclude that the population means are in fact not
equal?

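In order to answer this, we need to construct the 95% confidence
interval for T under the hypothesis of equal means (strictly speaking,
this is the acceptance region of the test rather than a confidence
interval for the difference itself). We know that the T statistic is
approximately standard normally distributed. We want to find a number
"a" such that: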

Prob(-a<T<a)=0.95

The interval (-a,a) will be the confidence interval. If 2.9488 falls
between -a and a, then we will conclude that it is not far enough
away from zero, and thus we cannot conclude that the population means
are different. On the other hand, if 2.9488>a or 2.9488<-a (the
latter is obviously not the case, as 2.9488 is positive and -a is
negative) we will conclude that it is statistically different from
zero, and thus the population means are statistically different.

So, how do we compute this interval? We want the confidence level to
be 0.95. So we need to find a such that Prob(T>a)=0.025. Since a
standard normal distribution is symmetric around 0, the number a that
solves Prob(T>a)=0.025 will also solve Prob(T<-a)=0.025. So, the
probability of a standard normally distributed variable to fall
"outside" of (-a,a) is 0.025+0.025=0.05. Thus, the interval (-a,a)
defines a 95% confidence interval. Now, to compute a:

Prob(T<-a)=0.025

Since T is standard normally distributed, we can use the table we used
before. Looking up 0.025 in the table, we find that the value that
corresponds to this probability is -1.96. So we build the confidence
interval for the T statistic with this number. The confidence interval
is then (-1.96,1.96).

Comparing this interval with 2.9488 we find that the T statistic we
observed is outside the confidence interval, and thus we REJECT the
hypothesis that both means are equal at the 0.05 level. So, yes, there
is a significant difference between the two observed means.
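
The decision can be checked numerically, and the same quantities give
an interval for the difference itself, which is what part B of the
question asks for. The sketch below is an addition, not part of the
original answer: it assumes the same unpooled standard error used for
T and the normal approximation, and the names a, p_value and ci are
illustrative.

```python
from math import sqrt
from statistics import NormalDist

y1, y2, n1, n2 = 0.5, 0.3, 100, 100
se_diff = sqrt(y1 * (1 - y1) / n1 + y2 * (1 - y2) / n2)
t_stat = (y1 - y2) / se_diff

Z = NormalDist()
a = Z.inv_cdf(0.975)                      # two-sided 5% cutoff, about 1.96
reject = abs(t_stat) > a                  # True means "reject equal proportions"
p_value = 2 * (1 - Z.cdf(abs(t_stat)))    # two-sided p-value

# 95% interval for the difference in proportions (part B), built as
# observed difference plus or minus a times the standard error.
ci = (y1 - y2 - a * se_diff, y1 - y2 + a * se_diff)
print(reject, round(p_value, 4), tuple(round(c, 4) for c in ci))
```

Under these assumptions the interval runs from roughly 0.07 to 0.33.
It does not contain zero, which is consistent with rejecting equality
at the 0.05 level.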


I hope the answers were clear enough. If you have any further
questions regarding my answers, please don't hesitate to request a
clarification. Otherwise, I await your rating and final comments.

Best wishes!
elmarto



http://www.tufts.edu/~gdallal/p.htm
sunshine2-ga rated this answer:5 out of 5 stars and gave an additional tip of: $2.00
Very Fast Solution with detailed step-by-step explanation.

Comments  
Subject: Re: Business Statistics - 3 Solutions
From: elmarto-ga on 06 Jun 2003 11:05 PDT
 
Thanks for the rating and tip! I'll be eagerly waiting for your future questions.

Best of luck with your future research!
