In Defence of Liberty

Driven by data; ridden with liberty.

Statistics and Lampposts XII: Context

 

(Photo: infomatique)


The purpose of a statistical graph is to present data clearly, precisely and efficiently, so that the reader can draw sound conclusions from the underlying data. For a time series, one method of distortion is to strip the data of its proper context, by choosing start and end points that create a misleading impression.

As Full Fact have highlighted, the following graph is a particular favourite of the Department for Work and Pensions, and Minister of State for Employment Esther McVey MP:

This time series of UK employment levels begins in 2008. (Photo: DWP)


There is no issue over the graph’s accuracy: the axes are clearly labelled, so the reader can see that the employment levels begin at 28.5m. However, the chosen start point is 2008, which contained the peak of employment prior to the financial crisis. This tends to exaggerate the increases in employment levels that have occurred under this government. Taking a wider view, the number of people in work normally increases over time, simply because the population is also rising.

To account for population rises, the Office for National Statistics calculates the employment rate: the number of employed people aged between 16 and 64 divided by the population in the same age range. The June 2014 edition of Labour Market Statistics shows that the employment rate for February-April 2014 was 72.9%, which is approaching its historic high of 73.1%. A chart of this rate would be preferable, since it provides historical context for the latest rises and is not driven by population increases.
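The difference between a level and a rate can be sketched in a few lines of Python. The figures below are hypothetical, chosen only for illustration, and are not ONS data; the function name `employment_rate` is likewise an assumption. The sketch shows how the number of people in work can rise while the employment rate falls, if the population grows faster.

```python
def employment_rate(employed, population):
    """Employed people aged 16-64 as a percentage of the 16-64 population."""
    return employed / population * 100

# Hypothetical figures in millions (NOT ONS data), for illustration only.
earlier = {"employed": 29.0, "population": 40.0}
later = {"employed": 30.0, "population": 41.4}

# The level rises by one million...
level_change = later["employed"] - earlier["employed"]

# ...but the rate does not rise, because the population grew faster.
rate_earlier = employment_rate(**earlier)
rate_later = employment_rate(**later)

print(f"Change in employment level: {level_change:+.1f}m")
print(f"Employment rate, earlier period: {rate_earlier:.1f}%")
print(f"Employment rate, later period: {rate_later:.1f}%")
```

A graph of the level alone would show only the first of these numbers moving upwards.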

This time series of the UK employment rate shows the historic context for the recent rises. (Photo: ONS)


Whilst it is rarer, another variation of this method is to cut the most recent data from the graph, so the reader assumes that the last point represents the latest figures. This can be seen from Labour supporter Dr Eoin Clarke’s graph for Evidence UK, which depicts statutory levels of homelessness in England from 2003 to 2012. The linked dataset is helpful, and defines statutory homelessness in the following manner: “Households found to be eligible for assistance, unintentionally homeless and falling within a priority need group, and consequently owed a main homelessness duty by a local housing authority.” This is not equivalent to rough sleeping.

This time series begins in 2003 and ends in 2012, even though the original dataset began in 1998 and ended in 2013. (Photo: Evidence UK)


The reader is led to believe that homelessness consistently fell under Labour, whilst it has inexorably risen under the Conservatives. The article’s calculations are incorrect: statutory homelessness decreased from 104,630 in 1998 to 42,390 in 2010, which is a cut of 59.5%. Also, the dataset includes the year 2013, which does not appear on the graph. This shows that statutory homelessness fell slightly, from 53,480 in 2012 to 53,160 in 2013. Seeing the full dataset dramatically affects the consequent analysis: statutory homelessness has seemingly found a plateau.
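The percentages above can be checked with a short calculation. This is a minimal sketch using only the four figures quoted in this post; the helper name `percent_change` is an assumption, and the intermediate years in the dataset are left out.

```python
def percent_change(start, end):
    """Percentage change from start to end; a negative value is a fall."""
    return (end - start) / start * 100

# Statutory homelessness in England, using only the figures quoted above.
homelessness = {1998: 104_630, 2010: 42_390, 2012: 53_480, 2013: 53_160}

fall_1998_2010 = percent_change(homelessness[1998], homelessness[2010])
change_2012_2013 = percent_change(homelessness[2012], homelessness[2013])

print(f"1998 to 2010: {fall_1998_2010:.1f}%")
print(f"2012 to 2013: {change_2012_2013:.1f}%")
```

The first figure is the 59.5% cut over the full period; the second is a fall of about 0.6%, which is why the latest data point looks more like a plateau than a continuing rise.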

This graph shows the same dataset over its full original range, from 1998 to 2013.


As Edward Tufte wrote in his book The Visual Display of Quantitative Information:

Of course false graphics are still with us. Deception must always be confronted and demolished, even if lie detection is no longer at the forefront of research. Graphical excellence begins with telling the truth about the data.


This entry was posted on July 4, 2014 in Statistics.