Political polling faces a crisis of confidence. Major news outlets repeatedly ask “What’s the matter with polling?” after high-profile misses like Bernie Sanders’s primary upset in Michigan, where he beat Hillary Clinton 50–48 even though reputable polls had shown her leading by up to 20 points. There is, however, hope for a polling renaissance thanks to online questionnaires and advanced statistical methods, some of which we are employing during the 2016 cycle. These approaches can take snapshots of the electorate and produce a balanced and accurate election prediction. Web companies may soon use these methods to offer a plausible and accurate alternative to traditional phone polling.
So, what truly is the matter with current polling (if anything)?
In a nutshell, fewer and fewer people are answering their phones when pollsters call, and young voters are doing so even less often than their parents. Currently, most pollsters try to contact a group of people who look demographically similar to the population as a whole, and with a few adjustments, this sample can offer an accurate picture of how the country or state intends to vote.
To gather this representative sample, many pollsters have relied on calling random landline numbers, under the assumption that if you dial enough numbers the people you call will not have any demographic biases, like being older or whiter or poorer than the true population. Others, like our firm, use list-based sampling methods to ensure quality.
However, fewer and fewer Americans have landlines, and even fewer under-35 voters still use them, so this strategy now struggles to produce accurate snapshots of the general population. Calling cell phones can offer a somewhat more representative sample, but connection rates are low and the cost of manually calling large numbers of cell phones is much higher. As a result, ensuring accurate results in polling has become more and more expensive.
Internet surveys offer a promising possible solution to these problems.
New research from Professor Andrew Gelman and coauthors out of Columbia University and Microsoft shows that cheap, non-representative internet polling with rich demographic information can produce results that are as accurate as the most expensive traditional polling. In their experiment, the authors conducted surveys via Xbox in the lead-up to the 2012 election. Obviously, people who answer surveys via Xbox do not look like the general population any more than people who still have landlines do. The distinct advantage is that these cheaper online surveys can collect many more responses than traditional phone polls. As a result, the researchers were able to ask the requisite demographic questions and still have enough respondents to accurately translate these results to the wider population.
The ability to project these polls onto the wider electorate is the key advantage of online surveys. In the Xbox poll, the authors asked for the usual demographic information like race and gender, but also attitudinal questions like whether the respondent identified as liberal or conservative and who they voted for in the 2008 election. This allowed the researchers to chop up the Xbox sample into remarkably small groups based on these characteristics and use a statistical method called multilevel regression and poststratification (MRP) to figure out how each of these small groups felt about the 2012 election. Even though there were few older women in the sample, the authors still managed to estimate the preferences of this group to within one percent of national exit poll estimates. The authors then used 2008 exit polls to figure out how many people from each group actually resided in the general population, and combined these exit polls with the Xbox survey data to predict the overall election results.
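To make the two steps concrete, here is a minimal sketch in Python. It is not the researchers' model: the shrinkage formula is a crude stand-in for the multilevel regression step, and the cell labels, responses, population shares, and the prior_strength parameter are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def cell_estimates(responses, prior_strength=5.0):
    """Estimate support within each demographic cell.

    responses: iterable of (cell, vote) pairs, where vote is 1 or 0.
    Small cells are pulled toward the overall sample mean. This simple
    shrinkage is an ASSUMED stand-in for a real multilevel regression.
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [supporters, respondents]
    for cell, vote in responses:
        totals[cell][0] += vote
        totals[cell][1] += 1
    n_all = sum(n for _, n in totals.values())
    overall = sum(s for s, _ in totals.values()) / n_all
    return {
        cell: (s + prior_strength * overall) / (n + prior_strength)
        for cell, (s, n) in totals.items()
    }


def poststratify(estimates, population_shares):
    """Weight each cell's estimate by that cell's share of the electorate."""
    total_share = sum(population_shares.values())
    weighted = sum(estimates[cell] * share
                   for cell, share in population_shares.items())
    return weighted / total_share


# Hypothetical, heavily skewed sample: 90 young men, 10 older women.
responses = (
    [(("18-29", "M"), 1)] * 27 + [(("18-29", "M"), 0)] * 63
    + [(("65+", "F"), 1)] * 7 + [(("65+", "F"), 0)] * 3
)

# Hypothetical electorate in which the two cells are equally common.
population_shares = {("18-29", "M"): 0.5, ("65+", "F"): 0.5}

raw = sum(vote for _, vote in responses) / len(responses)
adjusted = poststratify(cell_estimates(responses), population_shares)
print(f"raw sample mean: {raw:.3f}, poststratified: {adjusted:.3f}")
```

The raw mean mostly reflects the over-sampled young men, while the poststratified figure gives each group the weight it has in the assumed electorate. A real MRP model would also produce estimates for cells with no respondents at all, which this sketch does not attempt.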
This approach compared favorably to leading traditional polls and the actual election results. As the following figure shows, the Xbox survey roughly aligned with an average of leading presidential polls in the 45 days before the 2012 election, and in the final run-up to Election Day the Xbox survey was closer to the final result than traditional polling.
This approach produced accurate, low-cost results even though the Xbox poll was likely less representative of the general population than traditional polling. While the poll did capture more young voters than traditional methods, this was not the key to its success. Rather, the rich demographic data allowed the authors to thoroughly correct for the differences between their sample and the general population.
These results are more than academic abstractions: they offer a viable path forward for groups who wish to test an alternative to the rising cost of traditional polling methods. The chase for representative samples is getting harder and harder since there are so few technologies that the whole country uses in equal amounts, but these sorts of demographically rich non-representative samples offer a viable alternative. If there is sufficient enthusiasm and understanding from political campaigns and corporations, internet giants such as Google and its Google Consumer Surveys division could soon offer the type of cheap, Xbox-style polling data that statisticians can turn into accurate, representative results. If a web company manages to fully embrace the potential of this method, accurate polling may soon be a much cheaper and more accessible option for everyone from local campaigns to major national brands.