Stats madness

This article explains what that margin of error thing is that journalists tack on to survey data. I thought this was pretty interesting because I used to know a fair bit about stats back in my grad school days. I taught intro to behavioral stats to undergrads, and that experience drove me from academia into the real world.

Anyway, according to the article, the margin of error is really the half-width of the 95% confidence interval. Adding and subtracting that number from the sample mean gives you a range, and, loosely speaking, there is a 95% probability that the population mean (of which the sample mean is an estimate) falls into that range. For example, say I put out a survey to a sample of likely voters and I find that 45% of them report they will vote for Kerry. I do the math to find the margin of error, which comes out to 5 percentage points. This tells me that there is a 95% probability that, in the population of likely voters, the percentage who report they will vote for Kerry is between 40% and 50%.
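For the curious, here is a rough sketch of "the math" in Python. It assumes a simple random sample and the usual normal approximation for a proportion. The function name and the sample size of 380 are my own picks for illustration (380 happens to give about 5 points at 45%), not numbers from the article.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the confidence interval for a sample proportion.

    p: sample proportion (0.45 means 45%)
    n: sample size (assumed to be a simple random sample)
    z: critical value; 1.96 corresponds to 95% confidence
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical survey: 45% for Kerry out of 380 likely voters.
p = 0.45
moe = margin_of_error(p, 380)
low, high = p - moe, p + moe
print(f"margin of error: {moe:.3f}, interval: {low:.3f} to {high:.3f}")
```

Because the sample size sits under a square root, cutting the margin in half takes roughly four times as many respondents.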

This then seems like a good way to estimate who could win the election, because if the intervals do not overlap then there is at least a 95% probability that the guy with the bigger number will win. But if they do overlap then the probabilities become murky and strange. So this article re-frames the question to say that instead of comparing the two numbers and making probabilistic estimates about each, we should make estimates based on the difference between the numbers. Because if the difference is greater than 0, then the guy with the bigger number wins. This handy chart was computed:
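I don't know exactly how the article computed its chart, so take this as my own guess at the idea rather than their method. The sketch assumes both percentages come from the same simple random sample and uses a normal approximation for the difference between them. The function name and the poll numbers are made up for illustration.

```python
import math

def prob_of_lead(p_a, p_b, n):
    """Approximate probability that candidate A is truly ahead of B.

    p_a, p_b: sample proportions for A and B from the same poll
    n: sample size (assumed to be a simple random sample)
    """
    diff = p_a - p_b
    # Standard error of the difference between two shares of one sample.
    se = math.sqrt((p_a + p_b - diff ** 2) / n)
    # Standard normal CDF evaluated at diff / se.
    return 0.5 * (1 + math.erf(diff / (se * math.sqrt(2))))

# Hypothetical poll: A at 48%, B at 45%, 1000 respondents.
print(f"chance A really leads: {prob_of_lead(0.48, 0.45, 1000):.2f}")
```

A dead heat in the sample comes out to exactly 50%, and the number climbs toward 100% as the gap or the sample size grows.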

Margin of error chart

The only drawback to this chart is that it is not cut and dried. When is an estimate good and when is it bad? I like the math behind this, but it would be some stinky cheese to have to listen to the media argue about whether an 85% chance is worth believing.