I'm not sure what (s)he did wrong, but it is clear from a
back-of-the-envelope calculation that Padmapani's answer is not
correct.
For a variable that follows a normal distribution with mean <x> and
standard deviation s, there is a 0.683 probability that the variable
will have a value within 1 standard deviation of the mean, and
conversely, a (1.0 - 0.683) = 0.317 probability that it will not. That
remainder splits evenly between the two tails: there is a 0.159
probability that the value will be greater than (<x> + s), and a 0.159
probability that it will be less than (<x> - s). Similarly, there is a
0.954 probability that the value will lie within +/- 2*s of the mean.
(A confidence level of 95%, or 0.95, corresponds to +/- 1.96*s.)
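
The figures above can be checked with a few lines of Python. This is only a sketch using the standard library's statistics.NormalDist (Python 3.8 or later); the helper name prob_within is mine, not part of any library.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_1 = prob_within(1.0)                 # the "0.683" figure
upper_tail_1 = 1.0 - std_normal.cdf(1.0)    # the "0.159" figure (one tail)
within_2 = prob_within(2.0)                 # the "+/- 2*s" figure
within_196 = prob_within(1.96)              # the 95% confidence level

for label, value in [("within 1 s", within_1),
                     ("above <x> + s", upper_tail_1),
                     ("within 2 s", within_2),
                     ("within 1.96 s", within_196)]:
    print(f"{label}: {value:.4f}")
```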
For the case you present, <x> = 0.52, with an error of 0.05 at a
confidence level of 0.95. For simplicity, let's ignore the difference
between 1.96 and 2, and say the standard deviation is equal to
0.05/2 = 0.025. That means that there is a 0.841 (1.0 - 0.159)
probability that the actual value is greater than
<x> - s = 0.52 - 0.025 = 0.495. The probability that the actual value
is greater than 0.5 (as opposed to 0.495) will be somewhat less than
0.841, but nowhere near as low as the value of 0.6517 given in the
previous comment.
To calculate the actual probability that more than 50% of the
population prefers a candidate if, in a 2-person poll with no
undecideds, the poll results were 52% for the candidate with a 5%
margin of error at the 95% confidence level, we need to calculate the
integral from 0.5 (i.e., 50%) to infinity of the normal distribution
that has a mean of 0.52 and a standard deviation of 0.05/1.96 = 0.02551.
One can either look this up in a table, or use one of many programs
that can calculate an integral of the normal distribution (also known
as the cumulative normal distribution).
Excel has a built-in function, NORMDIST, that calculates the integral
of the normal distribution from minus infinity to a specified value.
This yields the probability that the random variable has a value LESS
than the specified value. The probability that the variable has a
value GREATER than the specified value is simply 1 minus the value
returned by the NORMDIST function.
For <x> = 0.52, s = 0.02551, NORMDIST(0.5, 0.52, 0.02551, TRUE) gives
the probability that *less* than 50% of the population prefers the
candidate: 0.2165. That means the probability that *more* than 50% of
the population prefers him/her is 1.0 - 0.2165 = 0.7835.
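
For anyone without Excel, the same calculation can be sketched in Python. It rests on the same assumptions as above (normally distributed error, a 0.05 margin at the 95% level taken to be 1.96 standard deviations); the variable names are just illustrative.

```python
from statistics import NormalDist

poll_share = 0.52        # poll result for the candidate
margin_of_error = 0.05   # reported margin at the 95% confidence level
z_95 = 1.96              # standard deviations corresponding to 95%

# Convert the margin of error into a standard deviation.
sigma = margin_of_error / z_95

poll = NormalDist(mu=poll_share, sigma=sigma)

# cdf(0.5) is the integral from minus infinity to 0.5,
# i.e. the same quantity the NORMDIST call returns.
p_below_half = poll.cdf(0.5)
p_above_half = 1.0 - p_below_half

print(f"sigma            = {sigma:.5f}")
print(f"P(actual < 0.5)  = {p_below_half:.4f}")
print(f"P(actual > 0.5)  = {p_above_half:.4f}")
```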
The actual value, 0.7835, as expected, is somewhat less than the value
of ~0.84 we obtained from the back-of-the-envelope calculation we
started with.
Note that all this assumes that the polling error is entirely due to
the finite size of the sample. It does not take into account any
biases or systematic problems with the polling methodology.
Retrospective studies of the accuracy of political polls over the last
few decades indicate that the reported polling errors underestimate
the actual errors by a factor of ~1.5 to 2.0 (i.e., the margins of
error cited by the pollsters are too small by a factor of 1.5 to 2).
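
To get a feel for how much that matters, here is a rough sketch that simply multiplies the standard deviation by 1.5 and by 2 and repeats the calculation. Treating the extra error as a plain widening of the same normal distribution is my simplifying assumption, not something the retrospective studies establish; a systematic bias would shift the mean instead.

```python
from statistics import NormalDist

poll_share = 0.52
reported_sigma = 0.05 / 1.96   # standard deviation implied by the reported margin

def prob_above_half(sigma):
    """Probability that the actual share exceeds 0.5, for a given standard deviation."""
    return 1.0 - NormalDist(mu=poll_share, sigma=sigma).cdf(0.5)

# Inflation factors: 1.0 is the reported error, 1.5 and 2.0 are the
# assumed range by which reported errors understate the actual ones.
results = {factor: prob_above_half(reported_sigma * factor)
           for factor in (1.0, 1.5, 2.0)}

for factor, prob in results.items():
    print(f"error x {factor}: P(actual > 0.5) = {prob:.3f}")
```

With the wider errors the probability drops from about 0.78 to roughly 0.70 and 0.65 respectively.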