Google Answers: Election poll statistics

Q: Election poll statistics ( No Answer, 3 Comments )

Question

Subject: Election poll statistics
Category: Science > Math
Asked by: popquiz-ga
List Price: $10.00

Posted: 27 Oct 2004 20:20 PDT
Expires: 26 Nov 2004 19:20 PST
Question ID: 421036

If a poll has one candidate at 52% and the other at 48%, with a 5%
margin of error and a 95% confidence interval, how do you calculate
the probability that the leading candidate is actually supported by at
least 50.0% of the entire population?  It seems that the talking heads
on TV report such events as "statistical dead heats" and are missing
the fact that there is a fairly high chance that the leading candidate
is really ahead.

Answer

There is no answer at this time.

Comments

Subject: Re: Election poll statistics
From: padmapani-ga on 28 Oct 2004 12:02 PDT

Due to the Central Limit Theorem, we can reasonably assume that the
sample proportion is approximately "normally" distributed in this
case. So we can calculate the z-score as follows:

z = (x - m) / s

The issue here is that we are dealing with a sample and not a
"population", and hence we substitute the sample proportion for the
mean x, and the std. deviation would be sqrt(p*(1-p)/n). Here n is 100
as we are implicitly using percentages.

Therefore our z-score looks like

z = (ps - p) / sqrt(p*(1-p)/n)

z = (0.52 - 0.50) / sqrt((0.5 * 0.52)/100) = 0.02/0.051 ~ 0.39

The area under the bell curve from z = 0 to z = 0.39 is 0.1517, which
means 15.17% of the distribution lies between 50% and the observed
52%.

To find the probability of obtaining a sample proportion above 50%, we
simply add 0.1517 to 0.5, which yields 0.6517, or 65.17%.

That is to say, the chance that the leading candidate really has more
than 50% of the vote is 65.17%.

I am not sure of the calculation, and you might want to get a Google
researcher's opinion on this. I am just commenting on this piece.
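
The arithmetic in this comment can be reproduced with a short sketch.
It keeps the commenter's assumption of n = 100, which is not stated in
the question, and uses p*(1-p) with p = 0.5 for the standard error, so
z comes out near 0.40 rather than 0.39. The function and variable
names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def z_score(sample_p, null_p, n):
    """z-score of a sample proportion against a hypothesized proportion."""
    standard_error = sqrt(null_p * (1 - null_p) / n)
    return (sample_p - null_p) / standard_error


# n = 100 is the commenter's assumption, not a figure from the poll.
z = z_score(sample_p=0.52, null_p=0.50, n=100)

# Area under the standard normal curve between 0 and z.
area_0_to_z = NormalDist().cdf(z) - 0.5

# The comment then adds 0.5 to that area.
total = 0.5 + area_0_to_z

print(f"z = {z:.2f}, area 0..z = {area_0_to_z:.4f}, 0.5 + area = {total:.4f}")
```

The result depends entirely on the assumed sample size; a poll with a
5% margin of error at 95% confidence implies a different n.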

Subject: Re: Election poll statistics
From: hfshaw-ga on 29 Oct 2004 00:31 PDT

I'm not sure what (s)he did wrong, but it is clear from a
back-of-the-envelope calculation that Padmapani's answer is not
correct.

For a variable that follows a normal distribution with mean <x> and
standard deviation s, there is a 0.683 probability that the variable
will have a value within 1 standard deviation of the mean, and
conversely, (1.0-0.683).  There is a 0.159 probability that it will
be greater than (<x> + s), and a 0.159 probability that it will be
less than (<x> - s).  Similarly, there is a 0.955 probability that
the value will lie within +/- 2*s of the mean.  (A confidence level of
95%, or 0.95, corresponds to +/- 1.96*s.)

For the case you present, <x> = 0.52, with an error of 0.05 at a
confidence level of 0.95.  For simplicity, let's ignore the difference
between 1.96 and 2, and say the standard deviation is equal to 0.025.
That means that there is a 0.841 (1.0-0.159) probability that the
actual value is greater than <x> - s = 0.52 - 0.025 = 0.495.  The
probability that the actual value is greater than 0.5 (as opposed to
0.495) will be somewhat less than 0.841, but nowhere near the value of
0.6517 given in the previous comment.
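
The back-of-the-envelope estimate above can be checked with a minimal
sketch. It assumes, as the paragraph does, that the margin of error is
two standard deviations; the variable names are illustrative.

```python
from statistics import NormalDist

poll_share = 0.52   # leading candidate's share in the poll
margin = 0.05       # reported margin of error at ~95% confidence

# Rough approximation: treat the margin as 2 standard deviations.
s = margin / 2

# One standard deviation below the poll result.
threshold = poll_share - s

# Probability that the true share exceeds that threshold.
p_above = 1 - NormalDist(mu=poll_share, sigma=s).cdf(threshold)

print(f"s = {s}, threshold = {threshold:.3f}, P(above) = {p_above:.3f}")
```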

To calculate the actual probability that more than 50% of the
population prefers a candidate if, in a 2-person poll with no
undecideds, the poll results were 52% for the candidate with a 5%
margin of error at the 95% confidence level, we need to calculate the
integral from 0.5 (i.e., 50%) to infinity of the normal distribution
with a mean of 0.52 and a standard deviation of 0.05/1.96 = 0.02551.
One can either look this up in a table, or use one of many programs
that can calculate an integral of the normal distribution (also known
as the cumulative normal distribution).

Excel has a built-in function, NORMDIST, that calculates the integral
of the normal distribution from minus infinity to a specified value. 
This yields the probability that the random variable has a value LESS
than the specified value.  The probability that the variable has a
value GREATER than the specified value is simply 1 minus the value
returned by the NORMDIST function.

For <x> = 0.52, s = 0.02551, the probability that *less* than 50% of
the population prefers the candidate is 0.2165.  That means the
probability that *more* than 50% of the population prefers him/her is
1.0 - 0.2165 = 0.7835.
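
For anyone without Excel, the same calculation can be sketched in
Python using the standard library's NormalDist, whose cdf method plays
the role of the cumulative NORMDIST call here. This assumes, as the
comment does, a normal distribution centered on the poll result; the
variable names are illustrative.

```python
from statistics import NormalDist

poll_share = 0.52        # leading candidate's share in the poll
margin = 0.05            # margin of error at the 95% confidence level
z_95 = 1.96              # two-sided 95% critical value

# Standard deviation implied by the margin of error.
s = margin / z_95

dist = NormalDist(mu=poll_share, sigma=s)

# cdf(0.5) is the integral from minus infinity up to 0.5,
# i.e. the probability that true support is below 50%.
p_below = dist.cdf(0.5)
p_above = 1.0 - p_below

print(f"s = {s:.5f}, P(< 50%) = {p_below:.4f}, P(> 50%) = {p_above:.4f}")
```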

The actual value, 0.7835, as expected, is somewhat less than the value
of ~0.84 we obtained from the back-of-the-envelope calculation we
started with.

Note that all this assumes that the polling error is entirely due to
the finite size of the sample.  It does not take into account any
biases or systematic problems with the polling methodology. 
Retrospective studies of the accuracy of political polls over the last
few decades indicate that the reported polling errors underestimate
the actual errors by a factor of ~1.5 to 2.0  (i.e., the margins of
error cited by the pollsters are too small by a factor of 1.5 to 2).
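
To get a feel for how much this matters, here is a rough sketch that
widens the reported margin by those factors and recomputes the
probability. Treating the underestimate as a simple rescaling of the
standard deviation of a normal distribution is an assumption made for
illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

POLL_SHARE = 0.52        # leading candidate's share in the poll
REPORTED_MARGIN = 0.05   # margin of error as reported by the pollster
Z_95 = 1.96              # two-sided 95% critical value


def prob_above_half(inflation):
    """P(true share > 50%) when the margin is widened by `inflation`."""
    sd = inflation * REPORTED_MARGIN / Z_95
    return 1 - NormalDist(mu=POLL_SHARE, sigma=sd).cdf(0.5)


# 1.0 reproduces the reported-margin case; 1.5 and 2.0 are the
# correction factors mentioned in the comment.
results = {k: prob_above_half(k) for k in (1.0, 1.5, 2.0)}

for k, p in results.items():
    print(f"margin x{k}: P(share > 50%) = {p:.3f}")
```

Under these assumptions the probability drops from about 0.78 to
roughly 0.70 and 0.65 for factors of 1.5 and 2.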

Subject: Re: Election poll statistics
From: hfshaw-ga on 29 Oct 2004 00:33 PDT

The first sentence of the second paragraph in my comment should end at
the second comma:

(For a variable that follows a normal distribution with mean <x> and
standard deviation s, there is a 0.683 probability that the variable
will have a value within 1 standard deviation of the mean.)
