Why were the polls wrong? When a Conservative majority became apparent in the early hours of 8th May 2015, most polls had shown a much closer race between the new governing party and Labour.
NatCen is a social research institute, led by the political scientist and psephologist Professor John Curtice of the University of Strathclyde, which conducts the annual British Social Attitudes survey. Their report, The Benefits of Random Sampling, supports two explanations for why polling companies showed a close contest between the Conservatives and Labour: differential non-response and a failure to identify non-voters. Differential non-response means that those who did not respond to the survey would have answered its questions, such as voting intention, substantially differently from those who did respond.
The real result in Great Britain was that the Conservatives achieved 37.8% of the vote, with Labour on 31.2%.
It is to do with how polls are performed. Polls are not conducted in a uniform manner. The most common method during the 2015 General Election campaign was to conduct surveys via the internet. The respondents would be drawn from a panel of people who had indicated their wish to take part in surveys. The companies would invite a sample from this panel to complete their latest poll. For some polling companies, the panel is maintained in-house; for others, it is controlled by a third party.
The other common method is the telephone poll. Companies ring landline and mobile phone numbers at random, often with targets for respondents of particular demographic characteristics.
As the NatCen report highlights, these methods are distinct, yet both diverge from the statistical theory that underpins survey research. That theory requires selection through random probability sampling, which assigns each potential participant in the sampled population a known, non-zero probability of being selected.
This is what a margin of error expresses: a 95% confidence interval means that, if the survey were repeated many times, in 95 out of every 100 repetitions the true figure for the whole population would lie within a specified, symmetric interval around the survey estimate. That interval is the margin of error.
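As a rough illustration, the margin of error for a single poll question can be sketched as below. This calculation assumes a simple random sample, which is precisely the assumption the NatCen report says the polls' methods violate; the figures are illustrative, not taken from the report.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p,
    from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# A hypothetical poll of 1,000 respondents with a party on 38%:
moe = margin_of_error(0.38, 1000)
print(f"+/- {moe * 100:.1f} points")  # roughly +/- 3 points
```

This is why the routinely quoted "plus or minus 3 points" figure only holds if respondents were genuinely selected at random; with a self-selected online panel, the formula's premise fails.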
Online panels are not necessarily representative of the whole population, and would seem to inherently overestimate the proportion of people interested in politics. The brief time period that most polls are taken over, to provide that snapshot of public opinion, also means that polling companies cannot linger when seeking to contact their respondents. Instead of trying multiple times to contact people, telephone polls may simply keep ringing random numbers until they have fulfilled their targets.
The British Social Attitudes survey, conducted by NatCen, has a very different methodology and purpose from those of polling companies. Addresses of potential respondents are drawn at random from the Postcode Address File, which contains nearly every residential address in the UK, with a few geographical limitations. At each selected address, the interviewer compiles a list of adults, and the one who should be interviewed is determined via a grid of random numbers. No other person can be interviewed as a substitute, and every attempt is made to secure a successful, face-to-face interview with the selected respondent. This process does not take days, like most polls, but months.
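The within-household selection step can be sketched as follows. This is a simplified stand-in for the random-number grid the interviewers actually use, with illustrative names; the essential property is that each adult's chance of selection is known.

```python
import random

def select_adult(adults, rng=random):
    """Pick exactly one adult from the household list at random.
    Each adult has an equal, known selection probability of
    1 / len(adults), and no substitution is allowed if the
    selected person cannot be interviewed."""
    return rng.choice(adults)

household = ["Alice", "Bob", "Carol"]
print(select_adult(household, random.Random(42)))

# Because each adult's selection probability is 1 / household size,
# a design weight equal to the household size can later correct for
# under-representing adults living in larger households.
```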
The polling methods meant that the surveys overstated the number of people interested in politics, as shown by very high turnout estimates. In the final election polls, ICM pointed to an 87% turnout, and ComRes estimated 90%. The official turnout was 66.4%.
This can have particular effects when polls weight their results according to demographic segments. Polls often have difficulty reaching certain groups with their polling methods, and so use weighting to make the result more representative of the overall population. As the report states:
Thus, any poll that overestimated the propensity of younger voters to participate was at particular risk in 2015 of overestimating Labour support.
Whilst the polls anticipated that their younger participants would be less likely to vote than older people, this adjustment was too small, so the claimed Labour support was too high.
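The demographic weighting the polls rely on can be sketched with a simple post-stratification example. The numbers below are invented for illustration, not taken from the report: a sample in which Labour-leaning under-35s make up half the respondents, when they are only 30% of the population.

```python
def weighted_share(responses, population_shares):
    """Post-stratify: weight each respondent by how over- or
    under-represented their demographic group is in the sample,
    then return the weighted share intending to vote Labour.

    responses: list of (age_group, votes_labour) tuples.
    population_shares: dict of age_group -> share of the population.
    """
    n = len(responses)
    sample_counts = {}
    for group, _ in responses:
        sample_counts[group] = sample_counts.get(group, 0) + 1

    total_weight = 0.0
    labour_weight = 0.0
    for group, votes_labour in responses:
        # weight = population share / sample share for this group
        weight = population_shares[group] / (sample_counts[group] / n)
        total_weight += weight
        if votes_labour:
            labour_weight += weight
    return labour_weight / total_weight

responses = [("under35", True)] * 40 + [("under35", False)] * 10 \
          + [("over35", True)] * 20 + [("over35", False)] * 30
shares = {"under35": 0.3, "over35": 0.7}
raw = sum(v for _, v in responses) / len(responses)
print(f"Raw Labour share:      {raw:.0%}")                              # 60%
print(f"Weighted Labour share: {weighted_share(responses, shares):.0%}")  # 52%
```

Weighting can only correct for who is in the sample, not for who will actually vote: if the turnout assumption for a weighted-up group is too generous, the error survives the weighting, which is the risk the report identifies for 2015.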
The other consideration was differential non-response: that Conservative voters were harder to contact than Labour voters. The report states this is not a consequence of the social character of the respondents. Whilst only one in eight respondents to the British Social Attitudes survey answered at the first time of calling, the largest group was those who answered after three to six attempts. In this segment, the Conservatives had an 11-point lead over Labour.
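A toy simulation shows how this mechanism biases a one-shot poll. The party shares match the real 2015 result, but the response rates are assumed purely for illustration; if Conservative voters answer the phone less readily, a poll that never calls back understates their lead.

```python
import random

def quick_poll(electorate, response_rates, n_calls, rng):
    """Simulate a one-shot telephone poll: dial n_calls voters at
    random, recording each only if they answer first time, with a
    response probability that depends on their party."""
    answered = []
    for _ in range(n_calls):
        voter = rng.choice(electorate)
        if rng.random() < response_rates[voter]:
            answered.append(voter)
    con = answered.count("Con") / len(answered)
    lab = answered.count("Lab") / len(answered)
    return con - lab

# Electorate matching the real 2015 shares: 37.8% Con, 31.2% Lab.
electorate = ["Con"] * 378 + ["Lab"] * 312 + ["Other"] * 310
# Assumed (not measured) response rates: Conservatives harder to reach.
rates = {"Con": 0.2, "Lab": 0.3, "Other": 0.25}
print(f"True Con lead:   {0.378 - 0.312:+.1%}")
print(f"Polled Con lead: {quick_poll(electorate, rates, 20000, random.Random(1)):+.1%}")
```

With these assumed rates, the simulated poll shows Labour ahead despite a genuine Conservative lead. Repeated call-backs, as the British Social Attitudes survey makes, shrink this gap because the hard-to-reach respondents are eventually included.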
NatCen suggests it was not an issue of Conservatives being untruthful to polling companies, or otherwise 'shy' about their beliefs. If that were the case, neither the British Social Attitudes survey nor the British Election Study would have been able to replicate the Conservative lead. It was an issue of polling companies, through their clients' desire for quick and cheap polls, being unable to properly sample the population.
Whilst we should wait for the British Polling Council’s full report, the NatCen report suggests polling companies should, when seeking voting intentions, take a longer period to gather their data.
BBC, 2016. 'Not enough Tories' in general election opinion polls. Available from: http://www.bbc.co.uk/news/uk-politics-35308129 [Accessed: 16th January 2016]
NatCen, 2016. The Benefits of Random Sampling. Available from: http://www.bsa.natcen.ac.uk/media/39018/random-sampling.pdf [Accessed: 16th January 2016]
Blom, A. G., 2009. Nonresponse Bias Adjustments: What Can Process Data Contribute? ISER. Available from: https://www.iser.essex.ac.uk/files/iser_working_papers/2009-21.pdf [Accessed: 16th January 2016]
"You're irritatingly eloquent." - Dick Puddlecote
"That's brave to put in writing." - Berin Midmer
"Well-referenced." - Tom Ash