In Defence of Liberty

Driven by data; ridden with liberty.

Statistics and Lampposts III: Extrapolation Limitations

In statistics, extrapolation is the process of estimating, beyond the range of the original data set, the value of one variable from the value of another. The quality and accuracy of an extrapolation depend on the validity of the assumptions underlying the method. A linear extrapolation, for instance, remains valid only if the relationship between the two variables stays broadly linear outside the observed range.

This 2012 GCSE Statistics paper contained a good question elucidating the problems with linear extrapolation. (Photo: AQA)

For example, a scatter graph of length of service at a company against the number of sick days taken may be linearly extrapolated. If the fitted line slopes downwards, the extrapolation becomes meaningless at long service lengths, where the predicted number of sick days is negative. Extended in the other direction, to zero service, the same line suggests that people who don’t work at this company take the most days off ill.
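The sick-days example can be sketched in a few lines of Python. The figures below are invented purely for illustration (they are not taken from the exam paper), and `fit_line` and `predict` are hypothetical helper names; the point is only to show how a line that fits the observed range well gives absurd answers outside it.

```python
# Hypothetical data, invented for illustration only.
service_years = [1, 2, 3, 4, 5, 6, 7, 8]
sick_days = [14, 12, 11, 10, 8, 7, 5, 4]


def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(service_years, sick_days)


def predict(years):
    """Predicted sick days from the fitted line, valid or not."""
    return intercept + slope * years


# 5 years lies inside the observed range; 0 and 20 years are extrapolations.
for years in (0, 5, 20):
    print(f"{years:>2} years of service -> {predict(years):.1f} sick days")
```

With these made-up numbers the fitted line slopes downwards, so the prediction for 20 years of service is negative, while the prediction for zero years is higher than any value actually observed.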

Population projections are a form of extrapolation, and they usually excite doom-laden headlines. The Migration Observatory at the University of Oxford found that “populations of post-war immigrant origin will comprise between 20-40% of national population totals by the middle of the 21st century if recent migration trends persist”. The Daily Express headline screamed: UK ‘40% ethnic minorities by 2050’. This headline mistakes an upper bound for the central estimate, although the article itself contained numerous caveats:

Prof Coleman said this assumption does not factor in the impact of current or future government attempts to reduce net migration.

Demographic Factors

Vincent Cooper at The Commentator claimed “by the year 2050, Britain will be a majority Muslim nation”. This is supposedly based on “demographic facts”, but demographic factors can change: birth rates, immigration and religious conversion are not fixed. As Channel 4’s FactCheck highlights, a 2007 demographic study by Westhoff and Frejka concludes “with the passage of time Muslim fertility moves closer to the fertility of the majority of the population in the respective countries”. Such projections assume that present trends will persist indefinitely, and that assumption is faulty.

Incomplete data can also be extrapolated into estimates for the whole population. Labour conducted a survey based on the 112 English councils that responded to Freedom of Information requests. The survey found that 156,563 people had been issued with court summonses for council tax arrears, a figure that was then extrapolated to the whole of England. This is problematic because the responding councils were not a random sample, so they may not be representative of the councils that did not respond. Official data, meanwhile, shows that council tax collection rates had risen over the past five years.
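A minimal sketch of why scaling up from a non-random sample can mislead is given below. Every number in it is hypothetical, including the number of councils and the assumption that only the largest ones replied; the direction and size of any bias in the real survey are not known.

```python
# Hypothetical summons counts for ten councils; not real figures.
summonses = [200, 300, 400, 500, 600, 1500, 1800, 2000, 2200, 2500]

# Assumption for illustration: only the five councils with the most
# summonses responded to the survey.
respondents = summonses[5:]

# Naive extrapolation: average over the respondents, then multiply by
# the total number of councils.
naive_estimate = sum(respondents) / len(respondents) * len(summonses)

# The figure the extrapolation is trying to recover.
true_total = sum(summonses)

print(f"Naive estimate: {naive_estimate:.0f}")
print(f"True total:     {true_total}")
```

Here the naive estimate overshoots because the respondents differ systematically from the non-respondents. Had the smallest councils been the ones to reply, the same method would have undershot instead.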

Extrapolation: a fun hobby. (Photo: XKCD)

Extrapolations can be useful, but they are often uncertain and, when their underlying assumptions fail, produce meaningless results.



This entry was posted on October 14, 2013 in Statistics.