
Thursday, July 31, 2008

Exciting news about parameters

Possibly the most important parameters in my methodology are the assumed volatility of public opinion and the formula for assumed correlations between states. For most of this year, I've been assuming that one standard deviation of one-day national public opinion change is 0.2% and that correlations between states are 95% or higher. I chose these numbers because they looked roughly right from the data I saw, and because they gave results that made sense to me. However, ever since I saw the graph on this post from 538, I've worried that the volatility assumption was somewhat too low. So, over the weekend, I came up with a reasonable way to assess parameters from available 2008 data. More on that below the fold.

The main point is, now that I have a better grasp of the parameters I'm using, I have much greater confidence about my analysis than I ever have before. Additionally, I adjusted my estimates of voter turnout by scaling the 2004 turnout up by the increase in a state's voting-eligible population.
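The turnout adjustment amounts to a one-line scaling. Here is a minimal sketch of it; the function name and the numbers are hypothetical placeholders, not real state data.

```python
def projected_turnout(turnout_2004, vep_2004, vep_2008):
    """Scale a state's 2004 turnout by the growth in its voting-eligible population."""
    return turnout_2004 * (vep_2008 / vep_2004)

# Hypothetical numbers for illustration only, not real state data.
example = projected_turnout(turnout_2004=1_000_000,
                            vep_2004=1_600_000,
                            vep_2008=1_680_000)
print(f"projected 2008 turnout: {example:,.0f}")
```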

(By the way, I now believe that the graph I linked to above is somewhat misleading. I was only 10 in 1992, but as I recall Ross Perot's share of the vote plummeted after he bizarrely left and re-entered the race. I believe this is what produces the high-error dots early in the 1992 campaign.)

The key insight for setting parameters is this. Let's say I run the algorithm to generate estimates for each day during the campaign. If the algorithm's parameters are set correctly, then the polling data should tend to equal the estimate plus/minus a predictable error range. If the polling data are skewed to one side of the estimates, if the polling data match the estimates more closely than expected, or if the polling data match the estimates less closely than expected, then the parameters are bad.

For each date during the campaign and for each geographical unit, my algorithm produces (among other data) an estimate of the Obama-McCain difference in public opinion and the uncertainty of that estimate. Now suppose A is the uncertainty of my estimate on the polling date (i.e., one standard deviation of the difference between my estimate and the platonic truth), B is the poll's sample error, and C is the poll's non-sample error (which I assume is 2%). These are the three components of the difference between my estimate of the Obama-McCain difference (E) and the poll's estimate of the Obama-McCain difference (P), and it's reasonable to assume that these sources are independent. Then I expect the poll's estimate of the Obama-McCain difference to be normally distributed with an expected value of my estimate of the Obama-McCain difference and a standard deviation of sqrt(A^2+B^2+C^2).

In other words, if I list out the statistic (P-E)/sqrt(A^2+B^2+C^2) for each poll, I expect the list to have a standard normal distribution. To measure how close it is to a standard normal distribution, I first measure the mean (m) and standard deviation (s) of this statistic. Then, I evaluate the integral of [pdf(0,1)-pdf(m,s)]^2, where pdf(x,y) is the probability density function of the normal distribution with mean x and standard deviation y. Clearly, lower values of this integral signal a better distribution of polls around my estimates, which signals better parameters.
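The scoring step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not necessarily how the numbers here were actually computed: the function names and the (P, E, A, B) tuple layout are made up for the example, the standard deviation is the population version, and the integral is evaluated with the standard closed form for the product of two normal densities rather than numerically.

```python
import math
from statistics import mean, pstdev

NON_SAMPLE_ERROR = 2.0  # C, in percentage points (the 2% assumed in the text)

def z_scores(polls):
    """polls: iterable of (P, E, A, B) tuples, all in percentage points.

    P = poll's Obama-McCain difference, E = model estimate,
    A = std. dev. of the estimate, B = poll's sampling std. dev.
    """
    c = NON_SAMPLE_ERROR
    return [(p - e) / math.sqrt(a * a + b * b + c * c) for p, e, a, b in polls]

def normal_mismatch(m, s):
    """Integral over x of [pdf(0,1)(x) - pdf(m,s)(x)]^2, in closed form."""
    self_std = 1.0 / (2.0 * math.sqrt(math.pi))      # integral of pdf(0,1)^2
    self_fit = 1.0 / (2.0 * s * math.sqrt(math.pi))  # integral of pdf(m,s)^2
    var_sum = 1.0 + s * s
    # Integral of pdf(0,1) * pdf(m,s) is a normal density in m with variance 1 + s^2.
    cross = math.exp(-m * m / (2.0 * var_sum)) / math.sqrt(2.0 * math.pi * var_sum)
    return self_std + self_fit - 2.0 * cross

def score(polls):
    """Lower is better: zero means mean 0 and std. dev. 1 exactly."""
    z = z_scores(polls)
    return normal_mismatch(mean(z), pstdev(z))
```

Note that the score only compares the first two moments of the list against the standard normal, so it needs more than a handful of polls to be meaningful, and it is undefined when every poll yields the same statistic (s = 0).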

So far, the best parameters I've found are a daily national volatility of 0.245% and a correlation between each pair of states of 92%.
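For a sense of scale, here is a small sketch comparing the old and new volatility figures over the rest of the campaign. It assumes that daily changes are independent, so that uncertainty grows with the square root of the number of days. That random-walk assumption and the `drift_sd` helper are illustrative and are not stated above. Election Day 2008 was November 4.

```python
import math
from datetime import date

def drift_sd(daily_sd, days):
    """Std. dev. of cumulative change after `days` independent daily steps."""
    return daily_sd * math.sqrt(days)

# Days from this post (July 31, 2008) to Election Day (November 4, 2008).
days_left = (date(2008, 11, 4) - date(2008, 7, 31)).days

for daily_sd in (0.2, 0.245):
    total = drift_sd(daily_sd, days_left)
    print(f"daily {daily_sd}% -> about {total:.1f}% of national drift by Election Day")
```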


Monday, July 28, 2008

No evidence of a bounce

Chris Bowers is orders of magnitude more insightful than David Gergen, so I won't mock him. However, he's wrong to see significant evidence of an Obama bounce from his foreign trip. My chart of Obama's lead doesn't show any bounce, and I don't see one in Pollster.com's chart either, which is the one Bowers cites. I just don't see Pollster's average ever dipping down to 2.1% during Obama's trip. I suspect that at some point in time, Pollster had the race at 2.1% because of some randomness in which polls were released to the public first; however, if we take their current chart as their best view of the race through time, it's pretty clear that there's little evidence of a significant bounce.

Of course, Bowers is right to criticize Adam Nagourney's dumb concern that Obama fails to crack 50%. When 10% of voters are undecided (and thus likely to split somewhere between 60-40 and 40-60), a candidate with 46-50% support is clearly winning.
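The arithmetic behind that claim is easy to check. The sketch below assumes a two-way race in which the opponent holds everything that is neither the candidate's nor undecided (so no third-party vote); the function name is made up for the example.

```python
def final_shares(candidate, opponent, undecided, candidate_split):
    """Allocate undecided voters; candidate_split is the fraction going to the candidate."""
    return (candidate + undecided * candidate_split,
            opponent + undecided * (1.0 - candidate_split))

UNDECIDED = 10.0
for support in (46.0, 48.0, 50.0):
    opponent = 100.0 - UNDECIDED - support
    for split in (0.4, 0.5, 0.6):
        c, o = final_shares(support, opponent, UNDECIDED, split)
        print(f"{support:.0f}% now, undecideds break {split:.0%}: {c:.1f} to {o:.1f}")
```

Under these assumptions, the worst case in the range (46% support with undecideds breaking 40-60 against) comes out level, and every other combination leaves the candidate ahead.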

Just to be clear, I don't have a view yet about whether there has been a bounce. I look forward to the next week of polling.


Great moments in punditry, David Gergen edition

Every time I watch CNN, I remember how dumb pundits can be, and why I avoid watching CNN. Consider conservative commentator David Gergen. Earlier this evening, he was confused about whether Obama or McCain is currently winning. After all, one polling agency, Gallup, had released two polls with conflicting results.

Maybe if David Gergen were paid money to be a political expert, he could have found five minutes of spare time to look at all the polls taken this month, available at Pollster.com:

  • Obama ahead: Gallup 7/27, Rasmussen 7/25, Economist/YouGov 7/24, Gallup 7/24, Democracy Corps 7/24, FOX 7/23, Rasmussen 7/22, Gallup 7/21, NBC/WSJ 7/21, Rasmussen 7/19, Gallup 7/18, Economist/YouGov 7/17, Rasmussen 7/16, Gallup 7/15, CBS/Times 7/14, Rasmussen 7/13, ABC/Post 7/13, Zogby/Reuters 7/13, Quinnipiac 7/13, Gallup 7/12, IBD/TIPP 7/11, Newsweek 7/10, Rasmussen 7/10, Gallup 7/9, Economist/YouGov 7/9, Rasmussen 7/7, Gallup 7/6, Economist/YouGov 7/2, Gallup 7/2, Rasmussen 7/1
  • Tie: Rasmussen 7/16
  • McCain ahead: USA Today/Gallup 7/27

Ah, but of course David Gergen does get plenty of money from CNN, which thrives on excited reporting on close races. And, to quote Paul Krugman quoting Upton Sinclair today, it's hard to get a man to understand something when his salary depends on his not understanding it. I'd expect CNN to continue its pattern of selecting its own facts in order to report on a close race from now until election day.

[Edited to add...] Adam C of Redstate is clearly experiencing some wishful thinking, but is still smart enough to understand that poll averaging is an easy and useful thing to do. Why oh why can't CNN have better conservative pundits?


Monday, July 14, 2008

Obama vote by state

Here's a list of states, ordered by Obama vote, with 95% confidence bands for the election-day result. (DC is off the chart.)

At the risk of repeating myself, I believe that the changes in different states are highly correlated, which implies that the order of the states here is roughly fixed. Some states may swap places during a campaign, and some (FL and MT) might move several spots from one campaign to the next, but I would be surprised by major changes to this list during the next four months. Therefore, the states whose results are close to the US popular vote (NM, OH, MI, PA) are far more important than the states that happen to have close races by virtue of nationwide fluctuations (FL, the Dakotas, MT, and so on).

From looking at this list, I have a hypothesis that this race will be determined by the middle four states: NM, OH, MI, and PA. Specifically, if Obama wins three of these, then he wins the electoral college, while if McCain wins two of these, then he wins the electoral college. Maybe this hypothesis sounds obvious, but it failed in 2000 (Gore won NM/MI/PA) and nearly failed in 2004 (Kerry nearly won NM). I think that Obama's solidification of IA and his gains in CO have changed the map in a way that's far more material than his more publicized gains in MT/ND/SD.


Sunday, July 13, 2008

Obama up by 5.0%; 91% chance of victory

Prediction for Election Day

  • Probability of victory: 91% electoral, 90% popular. This includes a <0.5% chance of tie.
  • Expected value of vote: 328 electoral votes, popular win by 5.0%.
  • 95% range of electoral votes: 230 to 426.
  • Must-win states (pseudo-Banzhaf):
    • OH .026
    • MI .021
    • NM .019
    • PA .018
  • Bellwether states (correlation):
    • 70%+ OH, MI
    • 60%+ PA
    • 50%+ NM, CO, NH
    • 40%+ NV
    • 30%+ MO, WI, VA
  • Confidence map (states sized by electoral vote, darker red means higher confidence that McCain will win, darker blue for Obama): [Map showing my confidence in who the leader will be.]

Prediction if the election were today

  • Probability of victory: >99.5% electoral, >99.5% popular.
  • Expected value of vote: 325 electoral votes, popular win by 5.0%.
  • 95% range of electoral votes: 293 to 367.
  • Must-win states (pseudo-Banzhaf):
    • [none]
  • Bellwether states (correlation):
    • [insufficient sample]
  • Confidence map (states sized by electoral vote, darker red means higher confidence that McCain will win, darker blue for Obama): [Map showing my confidence in who the leader is.]

Popular vote estimate

(Darker red means more votes for McCain, darker blue for Obama.) [Map showing my estimate of the popular vote on 07-13-08.]


Thursday, July 03, 2008

Not news: MT is close. News: SD is close.

You heard it here first. For at least a couple weeks, I've considered MT to be a very close race. Because of the lack of frequent polls in MT, this assessment has mostly come from polls in other states, combined with my key belief that changes in opinion in one state are highly correlated with changes in other states.

This was not a widely held opinion, but I feel rather vindicated now. Even before today's Rasmussen MT poll, I believed that McCain led MT by 0.7%, with a 5.4% margin-of-error as a snapshot of current opinion, or a 9.7% margin-of-error as a prediction of November results. So, when a poll comes out saying that Obama leads by 5% in MT with a 9% margin-of-error, I'm not surprised.
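To put a number on "not surprised", the sketch below measures the gap between the poll and the estimate in standard deviations. It makes three assumptions that are not stated in this post: both margins of error are 95% intervals (1.96 standard deviations), the poll's 9% applies to the lead rather than to each candidate's share, and polls carry an extra 2% of non-sample error.

```python
import math

def sd_from_moe(moe, z_crit=1.96):
    """Convert a 95% margin of error to one standard deviation (assumed convention)."""
    return moe / z_crit

estimate = -0.7           # McCain +0.7 in MT, written as Obama minus McCain
poll = 5.0                # Rasmussen: Obama +5
a = sd_from_moe(5.4)      # uncertainty of the snapshot estimate
b = sd_from_moe(9.0)      # poll's sample error
c = 2.0                   # assumed non-sample error, in percentage points

z = (poll - estimate) / math.sqrt(a**2 + b**2 + c**2)
print(f"poll is {z:.2f} standard deviations from the estimate")
```

Under those assumptions the poll sits roughly one standard deviation from the estimate, which is an unremarkable result.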

The next thing you should not be surprised about is that SD is probably fairly close too. As in MT, Bush beat Kerry by 21 points in SD, and I see no reason why Obama would fail to improve by as much in the state of Tim Johnson and Stephanie Herseth Sandlin as he would in the state of Brian Schweitzer and Jon Tester. Also, so many of Obama's advisers are former Daschle people that I'd be surprised if Obama doesn't put some (small) amount of resources into SD.
