Subject:
error margin and percent confidence, statistics and probability
Category: Science > Math Asked by: llamaflaca-ga List Price: $40.00 |
Posted:
21 Aug 2005 21:04 PDT
Expires: 20 Sep 2005 21:04 PDT Question ID: 558542 |
Greetings: I need some help doing statistical analysis for an experiment. Basically, I am looking to estimate the number of false events within an x% confidence level. I say x% because I may not have had enough samples to reach a 95% confidence level to begin with.

To consider the question fully answered I need the following:

1. An explanation of why a particular model was chosen (i.e. binomial, Poisson, or something else).
2. A reference, either a link to a table or some other means, so that I can replicate the researcher's results given the detail in item 1 above and the items below.
3. Formulas used.
4. Calculations for all groups, providing the error margin and confidence.

Here is the problem. I have changed some of the details, but the concept is generally this: I have 4 baskets, and each basket has an unknown number of, let's call them, apples. The apples may number in the 1000's or even more per basket. The baskets are separate, that is to say, each is independent of the others.

Now, I sampled 500 apples from each basket. Of those 500 apples, I would like to estimate the number of non-apples (could we call them false apples?) present both in the 500 and in the larger unknown population. So I took a close look at 75 apples from each bunch of 500. From this closer look I obtained the following results:

Group A: out of 75 apples, 1 non-apple
Group B: out of 75 apples, 9 non-apples
Group C: out of 75 apples, 39 non-apples
Group D: out of 75 apples, 2 non-apples

Here are the issues I see with this. Since I sampled fewer than 1000 in a population of size X, I don't think I can use the standard error and confidence formulas, since I believe a minimum of 1000 must be sampled for 95% confidence. Additionally, I fail the second rule, which is at least 10 false and 10 true events: Groups A, B, and D do not satisfy it. I considered the binomial model but, again, too few hits. I considered rare events and the Poisson model as another option, but here is where I get lost.

I will say the following conditions apply to my samplings:

1. When sampling 75 out of 500, the draws are Bernoulli in nature: sampling 1 or 75 does not change the odds of each one, and prior results do not affect the result of the next draw.
2. I do not know the odds or true probability of getting a non-apple out of 75 or 500 or millions. I only see from my limited sampling that for Group A it was 1 / 75, and so on.
3. I only sampled one instance of 75 out of 500.

The ultimate answer I am looking for is this: given Groups A to D, if for example I have 9 non-apples out of 75 as in Group B, what is the probability of a non-apple in this and the other groups? With what percentage of confidence? What is my margin of error, meaning could it be between, say, 2 and 10 non-apples with 95% confidence? And how can I extend this to the population of 500 and/or a larger unknown total population? For example, what is the probability of getting a non-apple in this type of basket for, say, a population of millions of apples (again within this basket; think of types)?
There is no answer at this time. |
Subject:
Re: error margin and percent confidence, statistics and probability
From: raokramer-ga on 18 Sep 2005 22:02 PDT |
My understanding is that we have one basket with m+n=500 apples, of which n are special, and we draw N=75 apples at random and get x=i special apples in the sample. As explained at http://mathworld.wolfram.com/HypergeometricDistribution.html, where you can find the formulas for your report, x has a hypergeometric distribution with n as the parameter of interest and x as the observed random value. I'll follow the notation of the link above. For the most part, however, we will need not the formulas but an understanding of what confidence intervals are. The files with tables and a SAS program for your reference are posted at http://www.geocities.com/raokramer/index.html

PLAN: Assuming that you prefer a conventional statistical method, let's compute a maximum likelihood (ML) estimate nHat of n (the number of special apples in the basket) for each i=1, ... ,75. Then, to obtain confidence intervals (CI), we will apply the standard procedure of inverting Likelihood Ratio Test acceptance regions (LRT; see for example "Statistical Inference" by G. Casella and R. Berger, ch. 'Interval Estimation').

SOLUTION:

1. The easy part is to compute the ML estimate nHat for the number n of special apples in the basket. Since the low counts won't allow us to apply a normal approximation to x, we should compute an exact estimate by maximizing the likelihood function over the set of all possible values of the estimated parameter n. File "d.txt" contains the probabilities (=likelihoods) of observing each value of x given different n's. For a fixed x, nHat is the n with the largest likelihood. The results are in file "nHats.txt". E.g. for an observed value x=i=1, the ML estimate for n is nHat=6. The field estP is the maximum of the likelihood function; it is also used in the second part as the denominator of the LRT statistic.

2. To obtain the CIs we consider the Likelihood Ratio Test.
For a fixed value of n in the null hypothesis "H0: the number of special apples in the basket is n", the upper and lower bounds of the LRT acceptance region at the 95% confidence level (UCL(n) and LCL(n)) can be computed as the bounds of the set of nHat values corresponding to the highest LRT statistics for which the probabilities sum to 95% (over nHats for fixed n). File "LRTstats.txt" contains the LRT statistics (LRTstat = p / estP) in descending order for each group of n. To obtain the 95% test acceptance bounds, we choose the nHat's that correspond to the first several largest LRT statistics, so that the cumulative probability (p) equals or barely exceeds 0.95. File "regionBounds95.txt" contains those test acceptance bounds for each tested n, with CRp as the confidence coefficient for the test confidence region.

Now, to obtain the CIs, we invert the test regions. Each test region in effect maps (one-to-many) an n to a set of nHat's. All we need is to invert this mapping in the functional sense, so that we obtain a map from each nHat to a subset of n's. We should also attribute a probability of coverage to each of those mappings, which is the minimum confidence coefficient over all n's that correspond to a specific nHat. The file "map.txt" contains this mapping. Note that not all nHat's between the limits of the test region are possible, because nHat is a function (incidentally a 1-to-1 function) of the observed x=i as defined in "nHats.txt". Inverting that map to obtain confidence limits for n based on nHat is a simple matter of one SQL statement, with the result in "invertedMap.txt".
Another SQL statement maps nHats to the observed i's, which brings us to the final solution in file "solution.txt", where

i    - observed number of special apples in the sample
nHat - ML estimate for the number n of special apples in the basket
nLCL - lower confidence limit for n
nUCL - upper confidence limit for n
CLp  - confidence coefficient (probability of coverage of n by the confidence interval)

ANSWER: The requested estimates are:

i    nHat  nLCL  nUCL  CLp
------------------------------
001  006   001   030   0.95181
002  013   003   040   0.95181
009  060   030   101   0.95007
039  260   207   314   0.95396

Thanks -- RK
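The procedure described in this comment can be sketched in Python for readers without SAS. This is only an illustrative reconstruction, not the posted SAS program: the function names (likelihood_weight, pmf, ml_estimate, acceptance_region, confidence_interval) are made up for this sketch, and the handling of ties and of the 0.95 cut-off is an assumption, so the confidence limits it prints need not agree digit for digit with "solution.txt". The ML estimates, however, follow directly from maximizing the hypergeometric likelihood with 500 apples in the basket and 75 inspected.

```python
from functools import lru_cache
from math import comb

M = 500  # apples in the basket (from the question)
N = 75   # apples inspected per basket (from the question)


def likelihood_weight(n, x):
    """Unnormalised hypergeometric likelihood C(n, x) * C(M - n, N - x).

    math.comb returns 0 when the lower argument exceeds the upper one,
    so impossible (n, x) pairs get weight 0 automatically.
    """
    return comb(n, x) * comb(M - n, N - x)


def pmf(n, x):
    """P(x special apples in the sample | n special apples in the basket)."""
    return likelihood_weight(n, x) / comb(M, N)


def ml_estimate(x):
    """Value of n in 0..M that maximises the likelihood of observing x.

    Exact integer weights are compared, so there is no rounding error.
    """
    return max(range(M + 1), key=lambda n: likelihood_weight(n, x))


# Maximum of the likelihood for every possible observation x (the "estP" role).
MAX_LIK = [pmf(ml_estimate(x), x) for x in range(N + 1)]


@lru_cache(maxsize=None)
def acceptance_region(n, level=0.95):
    """Observations x kept by the LRT of H0: 'basket holds n special apples'.

    Assumption of this sketch: x values are ranked by p / estP and added
    until their cumulative probability reaches `level`.
    """
    probs = {x: pmf(n, x) for x in range(N + 1)}
    ranked = sorted((x for x in probs if probs[x] > 0),
                    key=lambda x: probs[x] / MAX_LIK[x], reverse=True)
    region, total = [], 0.0
    for x in ranked:
        region.append(x)
        total += probs[x]
        if total >= level:
            break
    return frozenset(region)


def confidence_interval(x, level=0.95):
    """Invert the test: all n whose acceptance region contains x."""
    covered = [n for n in range(M + 1) if x in acceptance_region(n, level)]
    return min(covered), max(covered)


if __name__ == "__main__":
    for x in (1, 2, 9, 39):
        print(x, ml_estimate(x), confidence_interval(x))
```

Comparing exact integer weights in ml_estimate avoids floating-point ties, and caching acceptance_region means the 501 test regions are built only once however many observed counts are inverted.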
Subject:
Re: error margin and percent confidence, statistics and probability
From: llamaflaca-ga on 19 Sep 2005 15:40 PDT |
Thank you for this. Since you posted this as a comment, how may I compensate you for your trouble? I will be checking the numbers and backup data over the next couple of days.
Subject:
Re: error margin and percent confidence, statistics and probability
From: raokramer-ga on 19 Sep 2005 20:05 PDT |
No trouble at all. It looks like you did a nice job formulating the problem; don't hesitate to ask again. There are also forums that many qualified statisticians attend, e.g. http://www.listserv.uga.edu/cgi-bin/wa?A0=sas-l. Two things to mention:

1. When you subscribe, somebody will review your request and grant you access.
2. If your question sounds like homework or a midterm project, I'd bet it will be ignored at a minimum.

I hope the apples are behaving well :)
Subject:
Re: error margin and percent confidence, statistics and probability
From: raokramer-ga on 20 Sep 2005 05:29 PDT |
I just had to re-upload LRTstats.txt because I noticed I had uploaded the wrong file, which resulted from a typo; everything else is correct. -- RK