Anatomy of an Error – Why forecasts missed the 2016 result
On election day 2016 the 538 forecast gave Hillary Clinton a 71.4% chance of winning. Of course, she did not. Forecasts now give Biden an even bigger advantage, but coverage of the race is haunted by the miss in 2016. It shouldn’t be. We know what went wrong in 2016, and we can see that Biden’s advantage is more resilient to the same problem.
The Midwest Mistake
It is sometimes said that the polls were wrong in 2016, but it wasn’t that simple. Nationally, Clinton’s poll lead was only slightly overstated: 538 had her 3.6% ahead and she won the popular vote by 2.1%. But she didn’t fall 1.5% short of her expected margin everywhere; the error was concentrated in certain states.
As is well known, the error was much larger in the Midwest, where Trump overperformed against the final forecast by just over 6% in Wisconsin and over 4% in Michigan and Pennsylvania. This tipped those states, and the electoral college.
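The arithmetic behind that national figure can be sketched in a few lines. This is only an illustration using the two numbers quoted above; the function name `forecast_miss` is a made-up helper, not anything from 538.

```python
def forecast_miss(forecast_margin, actual_margin):
    """How far the forecast overstated a candidate's margin, in points.

    Hypothetical helper for illustration only. Rounded to one decimal
    place to avoid floating-point noise.
    """
    return round(forecast_margin - actual_margin, 1)


# National figures quoted in the article: forecast lead 3.6, actual lead 2.1.
national_miss = forecast_miss(3.6, 2.1)
print(national_miss)
```

A positive value means the forecast was too kind to Clinton. The same subtraction, done state by state, gives the much larger misses discussed for the Midwest.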
He did even better in some other states, overperforming by double digits in states like Wyoming, Kentucky, and North Dakota (though some of that was probably down to sparse polling of those safe states). By contrast, Clinton beat the forecast in over a dozen states (and crushed it by 16.3% in DC).
The pattern of this error was not random, because the cause was not random. 538 and others missed Trump’s victory because the polls were undersampling white voters without a college education.
The Demographic Determination
Back in 2016 most polls ensured they had a balanced sample in terms of gender, age, and racial background. But many didn’t balance by education level, especially within racial groups. Trump’s strength among white non-college voters was underestimated as a result, and the error stems directly from that. Below you can see this relationship in chart form. The correlation is strong, if not perfect.
Nationally, 44% of registered voters in 2016 were white non-college voters. At the top right we have West Virginia, a state with 68% white non-college voters, where Trump beat his forecast by 15.6%. At the bottom left DC and Hawaii, with 5% and 17% white non-college respectively, were Clinton’s biggest overperformances.
Trump beat the forecast, but far from evenly. In states with 50% or more white non-college voters he beat the forecast by an average of 7.7%. But in states with under 50%, Clinton actually outperformed by 0.7%.
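For readers who want to reproduce this kind of split, here is a rough sketch of the calculation. It assumes a simple unweighted average across states, which may not be exactly how the figures above were produced. The field names and the function `average_miss_by_group` are invented for illustration, and the sample data is just the two states whose exact numbers appear in this article, so it will not reproduce the 7.7% and 0.7% figures.

```python
from statistics import mean


def average_miss_by_group(states, threshold=50.0):
    """Average Trump's overperformance for states at/above and below a
    white non-college share threshold (unweighted; an assumption)."""
    above = [s["trump_beat_forecast"] for s in states
             if s["white_non_college_pct"] >= threshold]
    below = [s["trump_beat_forecast"] for s in states
             if s["white_non_college_pct"] < threshold]
    return {
        "at_or_above": mean(above) if above else None,
        "below": mean(below) if below else None,
    }


# Only the two states with exact figures in the article. A negative value
# means Clinton beat the forecast. The full 51-row dataset is not shown here.
states = [
    {"name": "West Virginia", "white_non_college_pct": 68.0, "trump_beat_forecast": 15.6},
    {"name": "DC", "white_non_college_pct": 5.0, "trump_beat_forecast": -16.3},
]

result = average_miss_by_group(states)
print(result)
```

With a full table of states in the same shape, the two averages would show whether the miss was concentrated above the 50% line.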
Biden’s Twin Paths to Victory
This is where the breadth of Biden’s lead over Trump makes it resilient. He has extended the swing states from the Midwest to the Sun Belt, a series of ‘soft-south’ states with very different demographics. For Trump to win again due to forecast error he doesn’t need one polling miss; he needs two, in two different demographic groups.
As well as consistent leads in the Midwest states that Clinton lost, all of which have over 50% non-college whites, Biden also has leads of over 3% in Florida and Arizona, and a slightly smaller lead in North Carolina. These states have barely 40% non-college whites. This may also be why Democrats are looking strong in Georgia, a state only 37% non-college white in 2016. Demographics have also shifted against Trump; the US becomes about 2% less non-college white every four years.
In 2016, Trump barely overperformed in these states; the forecast miss was well under half of what it was in the Midwest. Now he needs to come from behind in both the Midwest and the Sun Belt, beating expectations with two very different sets of voters.
Demographics are not destiny
There are limits to this analysis, of course. Oregon and Vermont both have white non-college majorities, yet Clinton not only won them easily but beat the forecast. There are other factors at play in how people vote as individuals and how states swing. But while demographics aren’t everything, they are the headwinds or tailwinds which a candidate faces in each state.
Consequently, Biden’s multiple paths to the Presidency make him unusually resilient against forecast error.
Trump has paths to victory, but he needs to be genuinely more popular across the board than the polling suggests. In 2016 he got the right voters in the right places, but repeating that trick is an awful lot harder this time. People looking for ways Trump can win should focus more on unexpected events which could change minds than unexpected errors missing minds already changed.
Pip Moss
Pip Moss posts on Political Betting as Quincel. His bets on the 2020 election are mostly on Biden winning. You can follow him on Twitter at @PipsFunFacts.
All numbers from the 2016 forecast use the Polls Only Forecast. All numbers from the 2020 forecast are from 13 October. All demographic numbers are from the ‘States of Change’ 2019 report by the Center for American Progress.