Subject: Probability and degree of reliability (accuracy in predicting future results)
Category: Science > Math
Asked by: respree-ga
List Price: $25.00
Posted: 23 Jun 2006 10:39 PDT
Expires: 23 Jul 2006 10:39 PDT
Question ID: 740543
If you threw 1,000 pennies up into the air, about half of them would land heads, with the other half landing tails: an approximate 50% probability of one or the other. We know this because a coin has two sides. However, if you only had two pennies and used only a single toss, the likelihood that one would land heads with the other one landing tails would be somewhat less than in the first example. Repeat the single toss a thousand times, and you'd likely wind up with similar results as if you used 1,000 pennies.

My question has to do with the reliability of probability outcomes based on a small sample size. Let's take a hypothetical situation. Let's say a clinical study was performed involving 40 patients to see if an experimental drug worked or not. There are only two possible outcomes: it either worked or it didn't. On 4 of the patients, the drug did not work, and on the remaining 36, it did work. One might deduce from this data that the drug fails approximately 10% of the time. But would this be a fair conclusion? Would the reliability of a future predictive assumption be compromised because the sample size is too small?

Obviously, if one were looking at a failure rate of 10% based on a test on 400,000 patients (40,000 patients had no results), it would lead any reasonable person to conclude, with a fair degree of probability, that 10% is in fact the actual failure rate.

It seems to me that there should be a mathematical correlation between sample size and predictive reliability, reaching a point where the predictive probability is no longer reliable. What is that point?

I'd appreciate any comments or thoughts from someone who has a strong math background, or from a researcher who can find websites explaining the correlation of sample size to reliability.

The bottom-line question: would a reasonable person conclude in my hypothetical 40-patient study that approximately 10% is a "fair" basis for predicting future failure rates?
There is no answer at this time.
Subject: Re: Probability and degree of reliability (accuracy in predicting future results
From: myoarin-ga on 24 Jun 2006 02:58 PDT
Just a free comment: I once read a delightful and very interesting book about common misunderstandings of statistics. It touched on this very subject in one or two chapters: the true statistical meaning of medical testing, and how the raw numbers are sometimes misinterpreted by medical researchers.
Subject: Re: Probability and degree of reliability (accuracy in predicting future results
From: rracecarr-ga on 26 Jun 2006 13:14 PDT
The standard deviation (amount of spread) of the number of failures you'll get is roughly equal to the square root of the average number of failures. So, given that you got 4 failures, there's a reasonably good chance (better than half) that the mean number of failures you'd get in a bunch of tests with sample size 40 is 4 +/- sqrt(4), i.e. between 2 and 6. So a good estimate based on this single test is that the failure rate is likely to be between 5% and 15%. Similarly, in the other example, with 40,000 failures out of 400,000 patients, the average number of failures is likely to be between 39,800 and 40,200 (40,000 +/- sqrt(40,000)). So in that case, the failure rate is likely to be between 9.95% and 10.05%.
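The arithmetic in this comment can be reproduced with a short Python sketch. It assumes only the rule of thumb stated above (spread of the failure count is roughly the square root of the count); the function name `sqrt_rule_range` is made up for illustration and is not from any library.

```python
import math

def sqrt_rule_range(failures, sample_size):
    """Rough failure-rate range from failures +/- sqrt(failures).

    Hypothetical helper: applies the square-root rule of thumb
    from the comment, then divides by the sample size.
    """
    spread = math.sqrt(failures)
    low = (failures - spread) / sample_size
    high = (failures + spread) / sample_size
    return low, high

# The 40-patient study with 4 failures
small = sqrt_rule_range(4, 40)
# The 400,000-patient study with 40,000 failures
large = sqrt_rule_range(40_000, 400_000)

print(f"40 patients:      {small[0]:.2%} to {small[1]:.2%}")
print(f"400,000 patients: {large[0]:.2%} to {large[1]:.2%}")
```

Both studies have the same observed 10% failure rate, but the range around it shrinks from ten percentage points wide to a tenth of a percentage point wide as the sample grows.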
Subject: Re: Probability and degree of reliability (accuracy in predicting future results)
From: respree-ga on 27 Jun 2006 07:39 PDT
Thank you both for your comments. Can anybody else confirm rracecarr-ga's comment on standard deviation? Sorry if this seems so basic to the mathematicians out there, but I'm afraid I'm no math whiz and am just looking for people to agree that this is the correct way of approaching the answer to my question. Thanks again. =)
Subject: Re: Probability and degree of reliability (accuracy in predicting future results
From: neurogeek-ga on 28 Jun 2006 11:42 PDT
respree, I also thought immediately of standard deviation when I read your question. I think there is more to it than that, though. Often when an average and standard deviation are reported, the probability that the actual average is outside the predicted range is also reported. Are you still interested in a full answer? I think I could come up with more than is already contained in the comments, with some good links. --neurogeek
Subject: Re: Probability and degree of reliability
From: ga_cal-ga on 07 Jul 2006 01:21 PDT
It's widely known that a proportion estimate, say q (i.e. the fraction of occurrences of a specific event in a large sample of N individuals), follows a normal law centered on the theoretical proportion, say p, with a variance of:
V(p,N) = p*(1-p)/N
Usually we take q as an estimate of p, so the empirical estimate q is centered on p with variance:
V(q,N) = q*(1-q)/N
So a confidence interval on q with a confidence level of 95% is:
[q-1.96*sqrt(q*(1-q)/N); q+1.96*sqrt(q*(1-q)/N)]
(1.96 is related to 95% through a Gaussian distribution. Use for instance http://graphpad.com/quickcalcs/probability1.cfm: in the last section, GAUSSIAN, use mean=0 and STD=1, and on the next page you will read in the last column: 5.49%->1.92 and 4.77%->1.98.)
For your example with 40 patients, the confidence interval is:
[.10-1.96*sqrt(.10*(1-.10)/40), .10+1.96*sqrt(.10*(1-.10)/40)]
i.e. [0.7%, 19.3%]
i.e. "the probability that the real proportion is in [0.7%, 19.3%] is 95%".
And with 40,000 patients: [9.71%, 10.29%]
See for instance http://davidmlane.com/hyperstat/B9168.html or any statistics book: http://books.google.com/books?q=proportion+estimate+confidence&lr=&sa=N&start=20
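For anyone who wants to check these numbers, here is a minimal Python sketch of the normal-approximation interval written out above. The function name `normal_approx_interval` is an assumption made for illustration. The second call uses N = 40,000, the sample size this comment uses for its second interval.

```python
import math

def normal_approx_interval(q, n, z=1.96):
    """Normal-approximation interval q +/- z*sqrt(q*(1-q)/n).

    Hypothetical helper: q is the observed proportion, n the
    sample size, and z = 1.96 corresponds to a 95% level.
    """
    half_width = z * math.sqrt(q * (1 - q) / n)
    return q - half_width, q + half_width

low_40, high_40 = normal_approx_interval(0.10, 40)
low_40k, high_40k = normal_approx_interval(0.10, 40_000)

print(f"N = 40:     {low_40:.1%} to {high_40:.1%}")
print(f"N = 40,000: {low_40k:.2%} to {high_40k:.2%}")
```

The half-width scales as 1/sqrt(N), so a sample 1,000 times larger gives an interval roughly 32 times narrower.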
Subject: Re: Probability and degree of reliability (accuracy in predicting future results
From: rracecarr-ga on 07 Jul 2006 17:37 PDT
The previous comment is not right. The binomial distribution will only be approximately normal for very large N. For example, you certainly cannot have a negative number of failures. The stated 95% confidence interval of [0.7%, 19.3%] is silly. If the failure rate were really 0.7%, the probability of getting 4 or more failures in 40 trials would be only 0.018%. That's less than one chance in 5,000.
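The 0.018% figure can be checked by summing the exact binomial probabilities directly. This is a small Python sketch; the helper name `binomial_tail` is made up for illustration, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term.

    Hypothetical helper: n trials, each failing independently
    with probability p; returns the chance of k or more failures.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Chance of 4 or more failures in 40 trials if the true rate were 0.7%
tail = binomial_tail(40, 4, 0.007)

print(f"P(4 or more failures) = {tail:.3%}")
print(f"Less than 1 in 5,000: {tail < 1 / 5000}")
```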