Q: Statistical sample size - small population? ( No Answer,   2 Comments )
Question  
Subject: Statistical sample size - small population?
Category: Science > Math
Asked by: wonderingperson-ga
List Price: $20.00
Posted: 24 Jun 2004 20:54 PDT
Expires: 25 Jun 2004 20:33 PDT
Question ID: 365987
What is the proper statistical sample size to estimate the mean time a
person takes to complete a task? The full population is only 300 people.
 I'd like to be 90% confident of the answer.  I don't have an estimate
of either the mean or the standard deviation--the purpose of the
study is to get a baseline measurement.  Would appreciate a general
explanation including assumptions (if needed).

Request for Question Clarification by mathtalk-ga on 25 Jun 2004 10:38 PDT
Hi, wonderingperson-ga:

fj-ga has given pretty good advice here, from a practical point of
view.  If sampling 30 people is not prohibitively expensive in time or
money, then you could go with that.

As mentioned in that Comment, the size of the confidence interval, say
at the 90% level you are looking for, will depend on the sample size,
on the results of the sampling, and on what assumptions you are
willing to make about the underlying population distribution.  In
principle you could develop a 90% confidence interval from (say) 3
observations.  The catch is that such an interval would generally be
wider (and hence less useful) than a confidence interval from 30
observations.

If you like, I can explain the methods and theory and give illustrative
examples (in part by referring to links), but if you are happy with
the response provided by the Comments, feel free to Close (expire) the
Question.

regards, mathtalk-ga

Clarification of Question by wonderingperson-ga on 25 Jun 2004 20:29 PDT
Thanks, mathtalk-ga, I think this is enough.  It's been years since I
took statistics and I don't believe we addressed situations like this in
the class.  I appreciate your response and that of fj-ga.  The
responses give me enough information to be comfortable with the
answer.  Thanks for the help!
Answer  
There is no answer at this time.

Comments  
Subject: Re: Statistical sample size - small population?
From: fj-ga on 25 Jun 2004 07:16 PDT
 
I believe that numerous studies have shown that for continuous type
data (which includes time), a sample size of 30 will provide a good
estimate (within 10% of actual). This should hold true for all types
of distribution, i.e. normal or non-normal. For attribute type data
you need 125 samples. Sorry I can't point you to the theory behind
this, but it does work! Try it yourself by getting Excel to generate
300 random times, then randomly select 30 out of the population and
compare the mean of the 30 to the mean of the whole.
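
The experiment suggested above can also be run outside Excel. The
following is a minimal Python sketch, not a definitive test: the
lognormal shape of the task times, its parameters, the seed, and the
number of repetitions are all assumptions made for illustration. It
draws one sample of 30 from a simulated population of 300, then repeats
the draw many times to see how often the sample mean lands within 10%
of the population mean.

```python
import random

random.seed(2004)  # fixed seed so the run is repeatable

POPULATION_SIZE = 300
SAMPLE_SIZE = 30
TRIALS = 5000

# Hypothetical task times in minutes. The lognormal shape and its
# parameters are assumptions, not data from the question.
population = [random.lognormvariate(3.0, 0.5) for _ in range(POPULATION_SIZE)]
population_mean = sum(population) / POPULATION_SIZE

# One sample of 30, drawn without replacement.
sample = random.sample(population, SAMPLE_SIZE)
sample_mean = sum(sample) / SAMPLE_SIZE
relative_error = abs(sample_mean - population_mean) / population_mean

# Repeat the draw to see how often the sample mean is within 10%.
hits = 0
for _ in range(TRIALS):
    trial_mean = sum(random.sample(population, SAMPLE_SIZE)) / SAMPLE_SIZE
    if abs(trial_mean - population_mean) <= 0.10 * population_mean:
        hits += 1
fraction_within = hits / TRIALS

print(f"population mean: {population_mean:.2f}")
print(f"one sample mean: {sample_mean:.2f} (relative error {relative_error:.1%})")
print(f"fraction of samples within 10%: {fraction_within:.1%}")
```

How often the sample mean falls within 10% depends on how spread out
the population is, so the result for one simulated population says
nothing certain about another.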

http://www.isixsigma.com/offsite.asp?A=Fr&Url=http://www.mathwizz.com/statistics/help/help4.htm

also http://www.isixsigma.com/library/content/c030506a.asp

hope this helps.
Subject: Re: Statistical sample size - small population?
From: mathtalk-ga on 25 Jun 2004 15:53 PDT
 
Mathematically speaking we cannot make a guarantee of "within 10% of
actual" by using a sample size less than the whole population.  Let me
give an extreme example to illustrate.

Suppose we have 299 people who can perform a task instantly and 1 who
takes 300 minutes.  On average the task is performed in 1 minute then.

Now a sample of 30 people will either include the one "outlier" or
not, so the sample mean will either be 10 minutes (with probability
10%) or 0 minutes (with probability 90%).  It will never happen (given
these admittedly contrived circumstances) that the sample mean is
within 10% of the actual mean.
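
The arithmetic in this example can be checked exactly. The short Python
sketch below uses exact fractions; the variable and function names are
illustrative only.

```python
from fractions import Fraction
from math import comb

N, n = 300, 30
population = [0] * 299 + [300]  # 299 instant performers, one 300-minute outlier
population_mean = Fraction(sum(population), N)

# Probability that a sample of n drawn without replacement contains the
# outlier: C(N-1, n-1) / C(N, n), which simplifies to n/N.
p_outlier_in_sample = Fraction(comb(N - 1, n - 1), comb(N, n))

# The only two sample means that can occur.
mean_with_outlier = Fraction(300, n)
mean_without_outlier = Fraction(0, n)

def within_10_percent(sample_mean):
    """True if sample_mean is within 10% of the population mean."""
    return abs(sample_mean - population_mean) <= population_mean / 10

print("population mean:", population_mean)
print("P(outlier in sample):", p_outlier_in_sample)
print("sample mean with outlier:", mean_with_outlier,
      "within 10%?", within_10_percent(mean_with_outlier))
print("sample mean without outlier:", mean_without_outlier,
      "within 10%?", within_10_percent(mean_without_outlier))
```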

So, we should try instead to work out a range of values (based on the
sample taken) which has a 90% likelihood of containing the actual
population mean.  If the actual population were "normal" or roughly
so, then an estimation of the population's variance (taken from the
sample variance with adjustment) can be used to construct just such an
interval (symmetric about the sample mean).
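
As a rough sketch of such an interval in Python, under several
assumptions that are not spelled out above: the "adjustment" is taken
to be the usual n-1 divisor in the sample variance, a finite population
correction sqrt(1 - n/N) is applied because the sample is drawn without
replacement from only 300 people, and the critical value 1.699 is the
standard t-table value for a two-sided 90% interval with 29 degrees of
freedom. The function name and the simulated task times are
hypothetical.

```python
import math
import random
import statistics

T_90_DF29 = 1.699  # t critical value: two-sided 90%, 29 degrees of freedom

def mean_confidence_interval(sample, population_size, critical_value):
    """Symmetric interval about the sample mean, assuming a roughly
    normal population and sampling without replacement."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)      # uses the n-1 divisor
    fpc = math.sqrt(1 - n / population_size)  # finite population correction
    margin = critical_value * sample_sd / math.sqrt(n) * fpc
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample of 30 task times in minutes (simulated, not real data).
random.seed(1)
times = [random.gauss(20.0, 5.0) for _ in range(30)]

low, high = mean_confidence_interval(times, 300, T_90_DF29)
print(f"sample mean: {statistics.mean(times):.2f}")
print(f"90% confidence interval: ({low:.2f}, {high:.2f})")
```

A different sample size needs a different critical value, taken from a
t table with n-1 degrees of freedom.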

A more careful claim about this approach would be that the interval it
produces will contain the population mean for 90% of the samples that
could be taken.  It can't guarantee anything about the estimated mean
from any one particular sample, and the 90% is referred to as a
"confidence level" to remind us that it isn't really the same as
asserting a 90% probability from the observed sample.

Other methods, often called "robust" or "distribution-free", can be
used when we have reason to suspect the population distribution isn't
close to normal.  My "binomial distribution" example shows some of the
characteristics at play.  Unlike the normal distribution, which is
symmetric about its mean, that discrete distribution was heavily
"skewed", with all but one individual below average.
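
Both the "90% of the samples" claim and the effect of skewness can be
checked by simulation. The Python sketch below is illustrative only and
rests on assumptions: a t-based interval with the n-1 variance divisor
and a finite population correction, the t-table value 1.699 for 29
degrees of freedom, a simulated near-normal population with a made-up
mean and spread, and an arbitrary seed and trial count. It counts how
often the interval contains the true mean, first for the near-normal
population and then for the contrived 299-and-1 population from the
earlier example.

```python
import math
import random
import statistics

T_90_DF29 = 1.699  # t critical value: two-sided 90%, 29 degrees of freedom

def interval(sample, population_size, critical_value=T_90_DF29):
    """t-based interval about the sample mean with a finite
    population correction (assumed, see the text above)."""
    n = len(sample)
    center = statistics.mean(sample)
    margin = (critical_value * statistics.stdev(sample) / math.sqrt(n)
              * math.sqrt(1 - n / population_size))
    return center - margin, center + margin

def coverage(population, sample_size=30, trials=2000):
    """Fraction of random samples whose interval contains the true mean."""
    true_mean = statistics.mean(population)
    hits = 0
    for _ in range(trials):
        sample = random.sample(population, sample_size)  # without replacement
        low, high = interval(sample, len(population))
        if low <= true_mean <= high:
            hits += 1
    return hits / trials

random.seed(365987)  # arbitrary fixed seed for repeatability

# Hypothetical near-normal task times (minutes) and the skewed example.
near_normal = [random.gauss(20.0, 5.0) for _ in range(300)]
skewed = [0.0] * 299 + [300.0]

normal_coverage = coverage(near_normal)
skewed_coverage = coverage(skewed)
print(f"coverage, near-normal population: {normal_coverage:.1%}")
print(f"coverage, skewed population:      {skewed_coverage:.1%}")
```

For the near-normal population the coverage should come out close to
the nominal 90%. For the skewed population the interval contains the
true mean only when the sample happens to include the outlier, which
occurs about 10% of the time.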

regards, mathtalk-ga
