Confessions of a recovering statistician
I have a confession to make. I am, by academic training at least, a statistician. That said, it’s a very long time since I worked as one. But even though I no longer work in statistics, I gained a few useful perspectives from my academic training and my early career in the UK government statistical service which I wanted to share. Here are a few thoughts:
- Data, and the ability to analyze and understand it, are very powerful. They offer the possibility of amazing insight into the world – and sometimes into the world’s problems, how they work and, occasionally, how to address them.
- Despite this, numeracy is often underrated in many educational systems (it certainly was in the UK when I was growing up), even among the political elite. Too many opinion leaders and policy makers shamelessly admit that they don’t understand official statistics or graphs and charts – yet the same people would be much less willing to admit that they had never really mastered reading or writing.
- Basic statistical literacy is not actually that hard. It’s not difficult to learn to understand fractions and ratios, to read data tables, graphs and charts, and to use them effectively (and honestly) to communicate statistical data – if more effort were placed on teaching these skills and they were more valued. Even this basic understanding could help avoid many incorrect interpretations of data and the faulty decisions that flow from them.
- BUT – some aspects of statistics ARE highly specialized and require experts. Examples include designing sampling schemes for surveys, developing experimental designs that allow you to test hypotheses, and econometric modelling. Even then, it’s not uncommon to see errors and disagreements in either the design or the interpretation of the results – so it’s good to use experts for expert work and to have some mechanism to peer-review it (better still, publish both your methods and your datasets so that anyone with the right skill set can check them). See this notable illustration of the need to check your methods on the cost effectiveness of deworming.
- Users of statistics, such as politicians and journalists, often forget that a statistic is usually an estimate of the situation in the real world – not the literal truth – whether because it is based on a sample or because it is measured indirectly or incompletely. It is our best guess at the real situation, not reality itself, and as such it is subject to error and only as good as the approach taken and the quality of the data used. Ideally any estimate should be accompanied by a standard error that gives you an idea of how accurate it really is – but this is rarely given or used.
- Some figures are even worse – they are based on models. Maternal mortality figures are an example. Bill Easterly recently commented on the use of “inception statistics” – a model within a model within a model – when looking at stillbirths. Often this may be the only way to estimate something, but we need to be wary in interpreting and explaining the results, and aware of the implications of the (sometimes heroic) assumptions made and the sensitivity of the indicators to them.
- You treasure what you measure – we often seek to identify measurable, quantifiable indicators to help monitor progress, whether development goals or process indicators for our projects. But it’s important to remember that when we put these numbers into our frameworks and set up means of collecting them, we risk focusing on improving the numbers themselves rather than on the underlying issues we are seeking to address. This becomes all the more true when rewards, personal or institutional, are based on hitting the numbers.
- Understanding statistical data requires not only statistical expertise but also contextual knowledge to interpret it. It’s often tempting to start going beyond what the numbers themselves say to suggest explanations for what they mean – but unless you have a good understanding of the specific context (culture, politics, biology etc.) then “common sense” assumptions and explanations might well be inaccurate or just plain wrong.
- Interpretation is not free of conscious or unconscious biases. People often look to data to find confirmation of their existing beliefs, rather than impartially considering all possible explanations – the famous “confirmation bias”.
- Data can be a great persuader – but you have to be good at writing and talking about it too, both to explain it accurately and in plain English, and to use it persuasively. Good data with a poor explanation – especially for audiences who are not data literate – is a poor persuader.
- Sometimes you have to take a position, and act, even when you don’t have all the facts. Those who understand and produce statistics are often rightly cautious about how they are explained and interpreted, for some of the reasons above. But taken to extremes, this caution can lead to paralysis. To use data to take action you need to strike a balance between seeking the most complete and reliable information and taking timely, politically pragmatic action with incomplete data.
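On the point above about standard errors: the idea can be made concrete in a few lines. This is an illustrative sketch with made-up numbers, not real survey data – the sample values and the 1.96 multiplier (a rough 95% interval, assuming approximate normality) are assumptions for demonstration only.

```python
import math

def mean_and_se(sample):
    """Sample mean and its standard error: s / sqrt(n)."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with Bessel's correction (divide by n - 1)
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    se = math.sqrt(var / n)
    return mean, se

# Hypothetical survey: household incomes (in $1000s) from a sample of 10
incomes = [22, 35, 28, 41, 30, 25, 38, 27, 33, 29]
mean, se = mean_and_se(incomes)
# A rough 95% interval: mean +/- 1.96 * se (assumes approximate normality)
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"estimate: {mean:.1f}, standard error: {se:.2f}")
print(f"approx 95% interval: ({low:.1f}, {high:.1f})")
```

The point of reporting the interval alongside the estimate is that “30.8” and “somewhere between roughly 27 and 35” are very different claims – and the second is the honest one for a sample this small.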