Frontloading HQ: presidential election forecasts


Wednesday, November 18, 2009

Obama v. Palin in 2012? One Forecast is Already In

A month and a half ago, FHQ posted a link to and discussed a presidential election forecasting model built on candidate biographical information. The benefit of this model -- and it performs quite well stacked up against other forecasting models -- is that the biographical data exists now. In other words, you don't have to wait until the second quarter economic numbers are released, or wait on polling data from a particular period of time in the election year, to put an accurate forecast together. [But hey, if you want to continue to come here and watch FHQ wade through the quadrennial polling data on the presidential race, we won't fault you. We here at FHQ may go so far as to encourage it.] I left off in that post urging folks to start scouring the biographical data on the prospective 2012 Republicans.

But why do that? Well, if you're patient, you'll be pleasantly surprised by an email from the authors of the original research. And lo and behold, one of those co-authors, Andreas Graefe (the other is J. Scott Armstrong), emailed me this morning to inform me that -- yes, that's right -- they've already looked at the Obama v. Palin numbers. How does Palin fare against the President?

[Click to Enlarge and here for the full description of the 2012 update at PollyVote.]

That nine point difference between the two candidates' biographical indicators translates to Obama carrying a 59.6% share of the two-party vote in 2012 if this were the matchup. (For some context, Obama received 52.9% of the vote in 2008, or 53.4% of the two-party vote.) That's Reagan-Mondale territory and would likely make for quite the electoral college sweep for Obama.

But didn't you say that this model wasn't particularly adept at picking elections involving incumbents? (Ah, you followed the link and read the previous post, didn't you? Thanks.) That's right. In all three elections the model missed, an incumbent was running and the candidate with the biographical score advantage lost (to Truman in '48, Carter in '76 and Clinton in '92). It has been done, then, but let's look a little more closely at those three elections. Carter and Truman had deficits of five points on the biographical index while Clinton trailed Bush by just three points. Palin's nine point disadvantage against Obama, though, is over twice the average deficit across those three incorrectly predicted elections.
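The "over twice the average deficit" comparison is easy to check. Here is a minimal sketch that uses only the deficits quoted above; the variable names are illustrative and not taken from the Armstrong and Graefe paper.

```python
# Index-point deficits of the eventual winners in the three elections
# the biographical model missed, as quoted in the post.
missed_deficits = {"Truman 1948": 5, "Carter 1976": 5, "Clinton 1992": 3}

# Palin's deficit against Obama in the 2012 update.
palin_deficit = 9

# Average deficit that has been overcome, and how Palin's compares to it.
average_deficit = sum(missed_deficits.values()) / len(missed_deficits)
ratio = palin_deficit / average_deficit

print(f"average deficit overcome: {average_deficit:.2f} points")
print(f"Palin's deficit relative to that average: {ratio:.2f}x")
```

The average works out to a little over four points, so a nine point gap is just past the two-to-one mark.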

That's a real hole to be in even before you start considering running for president. But back to my question from the last post: Who among the 2012ers does the best?

A special thanks to Andreas Graefe for drawing our attention to the updated 2012 outlook.

Recent Posts:
St. Cloud St. Poll: Obama leads Pawlenty in 2012 Horserace in MN

Twenty Ten or Two Thousand Ten?

A Follow Up on Palin and Winner-Take-All Presidential Primaries

Wednesday, October 7, 2009

Predicting Presidential Elections from Biographical Information

Why crunch a bunch of numbers via regression to forecast a presidential election, when the candidates' biographical data seemingly gets you closer to the actual results? I don't know. This won't put number crunchers out of business (Good, I didn't waste 2008 after all!), but the findings from a study by Armstrong and Graefe do shed light on an interesting new avenue by which election outcomes can be predicted. Here's how they constructed their model:

"We created a list of 49 cues from biographical information about candidates that were expected to have an influence on the election outcome. Then, we estimated whether a cue has a positive or negative influence on the election outcome. ... We distinguished two types of cues: (1) Yes / no cues record whether a candidate shows a certain characteristic or not. (2) More / less cues are more complex as they also incorporate information about the relative value of the cue for the candidates that run against each other in a particular election. In general, the candidate who achieved a more favorable value on a cue was assigned a value of 1 and 0 otherwise. For more information on the coding see Appendix 1. Finally, the sum of cue values for each candidate in a particular election determined his PollyBio index score (PB)."
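The index construction in that passage can be sketched in a few lines. This is only an illustration under stated assumptions: the cue names and values below are made up (the real 49 cues and their coding are in the paper's Appendix 1), every cue is assumed to be coded so that 1 is the favorable value, and ties on a more / less cue are assumed to give both candidates 0.

```python
def more_less_cue(value_a, value_b):
    """Score a relative cue: the candidate with the more favorable
    (here, larger) value gets 1 and the other gets 0.
    Assumption: a tie gives both candidates 0."""
    if value_a > value_b:
        return 1, 0
    if value_b > value_a:
        return 0, 1
    return 0, 0


def index_score(yes_no_cues, more_less_cues):
    """Index score = sum of all 0/1 cue values for one candidate."""
    return sum(yes_no_cues.values()) + sum(more_less_cues.values())


# Hypothetical yes / no cues for two candidates, A and B (not real data).
a_yes_no = {"college_degree": 1, "held_elected_office": 1}
b_yes_no = {"college_degree": 1, "held_elected_office": 0}

# Hypothetical raw values for a more / less cue: (A's value, B's value).
relative_raw = {"height_cm": (185.0, 178.0)}

a_relative, b_relative = {}, {}
for cue, (value_a, value_b) in relative_raw.items():
    a_relative[cue], b_relative[cue] = more_less_cue(value_a, value_b)

score_a = index_score(a_yes_no, a_relative)
score_b = index_score(b_yes_no, b_relative)

# The candidate with the higher index score is the predicted winner.
if score_a > score_b:
    predicted_winner = "A"
elif score_b > score_a:
    predicted_winner = "B"
else:
    predicted_winner = "tie"

print(score_a, score_b, predicted_winner)
```

Nothing is estimated here: the index is a simple count, which is why it can be computed as soon as the candidates are known.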

And what did that yield? Out of the 28 elections between 1900 and 2008, the candidate with the higher PB index score won 25 times (see below).

Source: Armstrong and Graefe (2009). "Predicting Elections from Biographical Information about Candidates"

My first thought was, "I'll bet they missed the close ones." Well, those are the types of elections most of the forecasting models have the hardest time predicting. But that wasn't necessarily the case here. The Armstrong and Graefe model missed 1948 (Truman), 1976 (Carter) and 1992 (Clinton), and on the first two had company from other noted forecasting models. That leaves Clinton's election in 1992 as the only miss that stands out.

"PollyBio failed in predicting the correct winner for the three elections in 1948, 1976, and 1992, in each of which an incumbent president was running. A look at the data helps to explain the failure for these three elections. Gerald Ford in 1976 and George Bush in 1992, who were both wrongly predicted to win, had particularly strong biographies. For our set of ‘yes / no’ cues, which did not include relative measures between candidates (like height, intelligence, or attractiveness), Ford and Bush achieved the highest score of all 56 candidates in our sample (together with Theodore Roosevelt in 1904 and William McKinley in 1900). By comparison, Harry Truman, who PollyBio failed in predicting to win the 1948 election, scored particularly low on the same set of cues. Being the only U.S. president after 1897 who did not earn a college degree, Truman achieved the lowest score of all incumbents in the sample. Among all candidates, only three achieved a lower score."

What was the common theme? A switch in power from one party to the other? They are all Democrats -- Southern Democrats at that (Fine, Missouri's a border state.). No, those weren't it. All three elections involved incumbents. The model seems to do better in open seat races than in those where incumbents are involved.

So why wait for election day in 2012? Start comparing the bios of the prospective Republican candidates against Obama now. Who stacks up best? (My guess is Romney or Gingrich.) Hey, it is a race that involves an incumbent.

Hat tip to Political Wire for the link.

Recent Posts:
The 2012 Presidential Candidates: Pawlenty and Petraeus

State of the Race: New Jersey (10/6/09)

Here's what things would have looked like in New Jersey had the Rasmussen poll been released tomorrow.

Thursday, October 30, 2008

National and State-Level Factors in US Presidential Election Outcomes: An Electoral College Forecast Model

The following is an electoral college forecasting model that grew out of a paper Paul-Henri Gurian and Damon Cann first presented at the Western Political Science Association meeting in San Diego, CA this past March. The value of that paper lay in explaining the variation in two-party vote shares over the last 15 presidential elections (1948-2004) as a combination of national and state-level factors, the latter of which were separated into long- and short-term influences. It is a natural extension, then, to utilize the data from those 15 elections to project the 2008 electoral college outcome.

What follows is a brief summary of the model and a discussion of some of the issues both Paul and Damon see in it. As Paul said, "The forecasts in the paper are really preliminary. However, if we wait a few months, till we've re-specified the model, it won't be a forecast anymore." Questions, comments and concerns can be left in the comments section. I will forward them to Paul and Damon.

There is no shortage of presidential election forecasting models, academic or otherwise. In 2008, there are at least 15 political science forecasts, the average of which shows Obama winning approximately 52% of the two-party vote. Most rely on some combination of economic factors, presidential approval and/or incumbency to explain vote shares in presidential elections. Those factors are completely national in scope and what is lost in the process are many of the relevant state-level variables that could play a role in determining the electoral outcome. To be sure, there are also forecasting models that include state-level factors, but what Paul Gurian and Damon Cann have done is to draw a distinction between the long- and short-term, state-level influences. [You can view their forecasting paper here.] In much the same way that the past polls in FHQ's weighted averages serve as an anchor to the short-term fluctuations in state polling, the long-term factors included in this forecasting model allow for historical, state-level factors to serve as a baseline of sorts for their forecast.

Those same national factors, then, are included, but are buttressed by short-term, state-level impacts (state primary divisiveness, home state, home region, etc.) as well as some of the more historical, state-level influences (state partisanship and ideology, etc.) that play a role in explaining the variation in the shares of the two-party vote. [A more thorough description of the state-level factors can be found on pp. 6-7 in the paper linked above.]

The beauty of this is that you get 51 different forecasts, not just one on the national level. And that is certainly more suitable to the electoral college system. Based on the included variables over the last fifteen presidential elections, a projection of the two-party vote in each state can be made. The results can be found on pp. 10-11, but a map of those results is included below. [No, I can't help myself. I have to include a map.]

[Click Map to Enlarge]

The result is a rather close outcome between John McCain and Barack Obama. The line between a solid and a toss up state is whether a state's division of the two-party vote is within the margin of error. You'll no doubt notice that there are several states that are on opposite sides of where they may be expected given other forecasts and projections. Iowa, New Hampshire and New Mexico, for example, are shaded in red while Arkansas, North Dakota and West Virginia appear in Obama's column.
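The solid versus toss up rule can be sketched as follows. This is one reading of "within the margin of error", namely that a state is a toss up when its projected two-party share is within the margin of error of 50%. The state labels, shares and margins below are placeholders, not figures from the Gurian and Cann paper.

```python
def classify(dem_share, margin_of_error):
    """Classify one state forecast.

    dem_share: projected Democratic share of the two-party vote, in percent.
    margin_of_error: margin of error for that projection, in points.
    Assumption: a state is a toss up when 50% falls within the margin.
    """
    if abs(dem_share - 50.0) <= margin_of_error:
        return "toss up"
    return "solid Obama" if dem_share > 50.0 else "solid McCain"


# Placeholder forecasts: state label -> (projected share, margin of error).
forecasts = {
    "State A": (54.0, 3.0),
    "State B": (51.0, 3.0),
    "State C": (45.5, 3.0),
}

categories = {
    state: classify(share, moe) for state, (share, moe) in forecasts.items()
}

for state, category in categories.items():
    print(f"{state}: {category}")
```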

Here are some caveats that Damon adds:

A few thoughts on the states:

NV: I think the "home region" variable swings the prediction for NV
toward McCain more than has actually happened in this instance.
Without that, McCain would still be in the margin of error.

AR and ND both had strong Democratic showings in House and Senate
races in 2006/2004, probably stronger than past history would suggest
for those states. Plus AR has the Democratic history from the "old
south" and our fixed effects may be picking that up a bit with the
1948-1970s elections.

I think NH is just a matter of history--while they went for Clinton in
'92 and '96, prior to that they only went to a Democrat once, Johnson
in '64. While NH has been battleground recently, our fixed effect
(based on all elections in the sample) moves NH just outside the
margin of error.

FL is probably similar. Like NH, most of the variables for 2008
suggest it ought to be perhaps R leaning but still battleground.
However, the fixed-effect for FL slides it about 2 points closer to
McCain.

I re-ran the model dropping the fixed effects, but that decreases the
general predictive power of the model by about 10%, seemingly
generating more error than it would eliminate.

Also, thinking about this statistically, since our margin of error is
based in the 95% level of confidence and we're making 50 forecasts, we
should actually expect to see 2.5 (OK, let's call that 2-3) of our
predictions that are significantly different from 50% by sampling
error alone. But since these errors are random, they should cancel
each other out in the EC tally (as long as it's not CA that is one
error and WY as the other).

I finally re-ran the model using national fatalities per 100,000
rather than state-level fatalities. The coefficient still comes out
insignificant statistically.
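The 2.5 figure in Damon's note is just the number of forecasts times the chance of a false alarm at the 95% level. A small sketch of that arithmetic follows; the second calculation additionally assumes the state-level errors are independent, which is an assumption made here for illustration and not a claim from the paper.

```python
n_forecasts = 50   # number of state forecasts, per the note
alpha = 0.05       # 1 minus the 95% level of confidence

# Expected number of forecasts flagged as different from 50% by chance alone.
expected_by_chance = n_forecasts * alpha

# Probability that at least one forecast is flagged by chance,
# ASSUMING the forecast errors are independent across states.
p_at_least_one = 1 - (1 - alpha) ** n_forecasts

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"P(at least one by chance): {p_at_least_one:.2f}")
```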

I want to thank both Paul and Damon for sharing this and I hope that we can get a good discussion going that will generate some helpful feedback.

Recent Posts:
The Electoral College Map (10/30/08)

Liveblog: The Obama Infomercial

Update(s): The Electoral College from a Different Angle

Saturday, August 2, 2008

So, Who's Going to Win This Race? The Forecasts are Starting to Come In

With second quarter economic data now available, many of the typical political science presidential election forecasts are beginning to emerge. And this week seemed to be the time for their unveiling. Thomas Edsall, who is writing for The Huffington Post now, had a great rundown of some of them earlier in the week, and since then Seth Masket, and Robert Erikson and Christopher Wlezien, have released their numbers. Those add to the back and forth between Abramowitz, et al. and Campbell that I discussed in the electoral college post last Sunday.

Let's look at the numbers here and some of the impressions we can draw from them. [As a side note, I should add that I tried to track down some of the sources from Edsall's piece (Some forecasts have links while others do not.) and stumbled upon a special issue of the International Journal of Forecasting* focused on presidential election forecasts. Now, while the issue does not contain the actual forecasts, it does provide more than adequate insight into the models and debates within the forecasting area of the literature.]

Abramowitz, Mann and Sabato: Comfortable win for Obama (a Democratic environment, Democrats' party ID advantage, and recent state and national polls)

Campbell: Close election (Bush approval does not translate to McCain necessarily, McCain was the best-positioned of the Republican candidates for the general election, open seat elections are close)

Erikson and Wlezien: Obama = 53% of the popular vote (based on state trial-heat polls and leading economic indicators)

Geer: Close election (electorate's comfort zone with candidates' foreign policy stances post-9/11, McCain is a good candidate for the GOP, the last two elections have been close)

Lewis-Beck and Tien: Obama = 50.6% of the popular vote (based on jobs growth and growth in GNP and including a correction for the race factor in the contest)

Masket: McCain = 47.7% of the popular vote (based on 2nd quarter real per capita disposable income)

Norpoth: Obama = 50.1% of the popular vote (based on support of major party candidates in the primaries among other factors)

I withheld Sandy Maisel's prediction because it doesn't neatly fall into either category, landslide or toss up. He sees an Obama win unless the Illinois senator does something to lose it. With that said, we have seven forecasts: four close calls (two unambiguously finding Obama winning) and three -- I hesitate to call them landslides -- more comfortable victories for Obama. Five of these forecasts see Obama wins, but find different margins based on the underlying factors included in their models.

I should also mention that there were two forecasting models discussed at this past spring's Western Political Science Association conference: De Sart and Holbrook's, and Gurian and Cann's. De Sart's forecasting web page hasn't yet been updated for the 2008 election, but the forecast is based on both state and national polls. (While we're citing, I should go ahead and include Thomas Holbrook's blog (again) as well.) The paper that Paul Gurian -- an occasional FHQ contributor -- presented at WPSA wasn't intended as a forecast, but he and Damon do have a forecasting component to it. They are still waiting on another couple of factors to add in to complete the forecast portion, though.

H/T to The Monkey Cage for the heads up on the Edsall article and Erikson and Wlezien's latest forecast.

----
* The pdf files of those articles are gated (for purchase), but the abstracts should be available to those who, at the very least, want to check those out. Simply click on the title of the article to get the abstract.

Recent Posts:
VP Announcement Timing

5% of Democrats Say They'll Vote for McCain

The Electoral College Map (7/30/08)