Thursday, 26 February 2015

The Statistics of Climate Change

From left to right: Norman Fenton, Hannah Fry, David Spiegelhalter. Link to the Programme's BBC website
(This is a cross-posting of the article here)

I had the pleasure of being one of the three presenters of the BBC documentary “Climate Change by Numbers”, first screened on BBC4 on 2 March 2015.

The motivation for the programme was to take a new look at the climate change debate by focusing on three key numbers that all come from the most recent IPCC report. The numbers were:
  • 0.85 degrees - the amount of warming the planet has undergone since 1880
  • 95% - the degree of certainty climate scientists have that at least half the warming in the last 60 years is man-made
  • one trillion tonnes - the cumulative amount of carbon that can be burnt, ever, if the planet is to stay below ‘dangerous levels’ of climate change
The idea was to get mathematicians/statisticians who had not been involved in the climate change debate to explain in lay terms how and why climate scientists had arrived at these three numbers. The other two presenters were Dr Hannah Fry (UCL) and Prof Sir David Spiegelhalter (Cambridge) and we were each assigned approximately 25 minutes on one of the numbers. My number was 95%.

Being neither a climate scientist nor a classical statistician (my research uses Bayesian probability rather than classical statistics to reason about uncertainty) I have to say that I found the complexity of the climate models and their underlying assumptions to be daunting. The relevant sections in the IPCC report are extremely difficult to understand and they use assumptions and techniques that are very different to the Bayesian approach I am used to. In our Bayesian approach we build causal models that combine prior expert knowledge with data. 

In attempting to understand and explain how the climate scientists had arrived at their 95% figure I used a football analogy, both because of my lifetime interest in football and because, together with my colleagues Anthony Constantinou and Martin Neil, I have worked extensively on models for football prediction. The climate scientists had performed what is called an “attribution study” to understand the extent to which different factors, such as human CO2 emissions, contributed to changing temperatures. The football analogy was to understand the extent to which different factors contributed to the changing success of Premiership football teams, as measured by the total number of points they achieved season by season. In contrast to our normal Bayesian approach, but consistent with what the climate scientists did, we used data and classical statistical methods to generate a model of success in terms of the various factors. Unlike the climate models, which involve thousands of variables, we had to restrict ourselves to a very small number (due to a combination of time limitations and lack of data). Specifically, for each team and each year we considered:
  • Wages (this was the single financial figure we used)
  • Total days of player injuries
  • Manager experience
  • Squad experience
  • Number of new players
The statistical model generated from these factors produced, for most teams, a good fit to their success over the years for which we had data. Our ‘attribution study’ showed that ‘Wages’ was by far the major influence. When Wages was removed from the study, the resulting statistical model was no longer a good fit. This was analogous to what the climate scientists’ models showed when the human CO2 emissions factor was removed: the previously good fit to temperature was no longer evident. And, analogous to the climate scientists’ 95% figure derived from their models, we were able to conclude that there was a 95% chance that an increase in wages of 10% would result in at least one extra Premiership point.
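
For readers curious about the mechanics, here is a minimal sketch in Python of the kind of regression-based attribution exercise described above. Everything in it is hypothetical: the data are randomly generated, the coefficients are invented, and a simple linear regression stands in for the fuller model we actually built. It is only meant to show how a question like "what is the chance that a 10% increase in wages is worth at least one extra point?" can be answered from a fitted model and the uncertainty in its estimates.

```python
# A minimal, illustrative 'attribution study' for Premiership points.
# All of the data below are randomly generated for illustration only;
# they are NOT the data used in the programme, and the simple linear
# regression here stands in for the fuller model we actually built.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 60  # hypothetical team-seasons

# Columns: wages (in £m), total days of player injuries, manager experience
# (years), squad experience (average seasons), number of new players
X = np.column_stack([
    rng.uniform(20, 200, n),
    rng.uniform(200, 900, n),
    rng.uniform(0, 20, n),
    rng.uniform(2, 8, n),
    rng.integers(2, 12, n),
])
# Invented 'true' relationship in which wages dominate, plus noise
points = 30 + 0.25 * X[:, 0] - 0.01 * X[:, 1] + rng.normal(0, 5, n)

# Ordinary least squares fit of points against all five factors
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, points, rcond=None)
resid = points - A @ beta
sigma2 = resid @ resid / (n - A.shape[1])
cov = sigma2 * np.linalg.inv(A.T @ A)   # covariance of the estimates
se = np.sqrt(np.diag(cov))
b_wages, se_wages = beta[1], se[1]

# Chance that a 10% wage rise (for a team on £100m) is worth at least one
# extra point, treating the estimated coefficient and its standard error as
# an approximate (normal) distribution for the size of the effect.
delta_wages = 0.10 * 100  # £10m
effect_mean = b_wages * delta_wages
effect_se = se_wages * delta_wages
p_extra_point = 1 - norm.cdf(1, loc=effect_mean, scale=effect_se)
print(f"wages coefficient: {b_wages:.3f} points per £m")
print(f"P(10% wage rise worth >= 1 point) = {p_extra_point:.2f}")
# (With these made-up data the printed probability will not, of course,
#  be the 95% figure quoted in the programme.)
```

In very loose terms, the climate attribution studies follow the same logic, but with vastly more complex physical models, many more variables, and far more data.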

Obviously there was no time in the programme to explain either the details or the limitations of my hastily put-together football attribution study and I will no doubt receive criticism for it (I am preparing a detailed analysis).  But the programme also did not have the time or scope to address the complexity of some of the broader statistical issues involved in the climate debate (including issues that lead some climate scientists to claim the 95% figure is underestimated and others to believe it is overestimated). In particular, the issues that were not covered were:
  • The real probabilistic meaning of the 95% figure. In fact it comes from a classical hypothesis test in which observed data is used to test the credibility of the ‘null hypothesis’. The null hypothesis is the ‘opposite’ statement to the one believed to be true, i.e. ‘Less than half the warming in the last 60 years is man-made’. If, as in this case, there is only a 5% probability of observing the data if the null hypothesis is true, statisticians equate this figure (called a p-value) to a 95% confidence that we can reject the null hypothesis. But the probability here is a statement about the data given the hypothesis. It is not generally the same as the probability of the hypothesis given the data (equating the two is often referred to as the ‘prosecutor’s fallacy’, since it is an error often made by lawyers when interpreting statistical evidence; a small numerical illustration of the distinction follows this list). See here and here for more on the limitations of p-values and confidence intervals.
  • Any real details of the underlying statistical methods and assumptions. For example, there has been controversy about the way a method called principal component analysis was used to create the famous hockey stick graph that appeared in previous IPCC reports. Although the problems with that method were recognised it is not obvious how or if they have been avoided in the most recent analyses.
  • Assumptions about the accuracy of historical temperatures. Much of the climate debate (such as that concerning how exceptional the recent rate of temperature increase is) depends on assumptions about historical temperatures dating back thousands of years. There has been some debate about whether sufficiently wide uncertainty ranges were used for these historical estimates.
  • Variety and choice of models. There are many common assumptions in all of the climate models used by the IPCC and it has been argued that there are alternative models not considered by the IPCC which provide an equally good fit to climate data, but which do not support the same conclusions.
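
To make the first of these points concrete, here is a toy numerical sketch. The numbers are invented and have nothing to do with the climate analysis; it simply shows that a p-value of 5%, i.e. P(data | null hypothesis) = 0.05, can coexist with a posterior probability for the null hypothesis that is nowhere near 5%, depending on the prior and on how likely the data are under the alternative.

```python
# A toy illustration (nothing to do with the actual climate analysis) of why
# a p-value, P(data | H0), is not the same thing as P(H0 | data).

def posterior_H0(p_data_given_H0, p_data_given_H1, prior_H0):
    """Bayes' theorem for two competing hypotheses H0 and H1."""
    prior_H1 = 1 - prior_H0
    numerator = p_data_given_H0 * prior_H0
    return numerator / (numerator + p_data_given_H1 * prior_H1)

# Suppose the observed data would arise 5% of the time if H0 were true
# (the classic p = 0.05) and 50% of the time if the alternative H1 were true.
for prior_H0 in (0.5, 0.9, 0.99):
    post = posterior_H0(0.05, 0.50, prior_H0)
    print(f"prior P(H0) = {prior_H0:.2f} -> posterior P(H0 | data) = {post:.3f}")

# With a sceptical prior P(H0) = 0.99 the posterior is still about 0.91,
# nowhere near the 5% that a naive reading of the p-value might suggest.
```
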
Although I obviously have a bias, my enduring impression from working on the programme is that the scientific discussion about the statistics of climate change would benefit from a more extensive Bayesian approach. Recently some researchers have started to do this, but it is an area where I feel causal Bayesian network models could shed further light and this is something that I would strongly recommend.

Acknowledgements: I would like to thank the BBC team (especially Jonathan Renouf, Alex Freeman, Eileen Inkson, and Gwenan Edwards) for their professionalism, support, encouragement, and training; and my colleagues Martin Neil and Anthony Constantinou for their technical support and advice. 

My fee for presenting the programme has been donated to the charity Magen David Adom.

Watching the programme as it is screened

Wednesday, 18 February 2015

Climate Change Statistics

From left to right: Norman Fenton, Hannah Fry, David Spiegelhalter. Link to the Programme's BBC website
Please see Update here.

I am presenting a documentary called "Climate Change by Numbers" - to be screened at 9.00pm on BBC4 on Monday 2 March 2015. The trailer is here. This is the BBC Press Release about the programme:
BBC Four explores the science behind three key climate change statistics

In a special film for BBC Four, three mathematicians will explore three key statistics linked to climate change.

In Climate Change by Numbers, Dr Hannah Fry, Prof Norman Fenton and Prof David Spiegelhalter home in on three numbers that lie at the heart of science’s current struggle to get a handle on the precise processes and impact of global climate change.

Prof Norman Fenton said: “My work on this programme has revealed the massive complexity of climate models and the novel challenges this poses for making statistical predictions from them.”

The three numbers are:

-       0.85 degrees - the amount of warming the planet has undergone since 1880
-       95% - the degree of certainty climate scientists have that at least half the recent warming is man-made
-       one trillion tonnes - the cumulative amount of carbon that can be burnt, ever, if the planet is to stay below ‘dangerous levels’ of climate change

All three numbers come from the most recent set of reports from the Intergovernmental Panel on Climate Change.

Prof David Spiegelhalter said: “It's been eye-opening to find out what these important numbers are actually based on.”
In this programme, the three scientists unpack the history of these three numbers: where did they come from? How have they been measured? How confident can we be in their accuracy? In their journeys they drill into the very heart of how science itself works, from data collection through testing theories and making predictions, giving us a unique perspective on the past, present and future of our changing climate.

Cassian Harrison, Channel Editor BBC Four, said: “This 75 minute special takes a whole new perspective on the issue of climate change. It puts aside the politics to concentrate on the science. It offers no definitive answers, but it does show the extraordinary achievements and the challenges still facing scientists who are attempting to get a definitive answer to what are perhaps the biggest scientific questions currently facing mankind.”

Executive Producer Jonathan Renouf said: “Who would have thought there’d be a link between the navigation system used to put men on the moon, and the way scientists work out how much the planet is warming up? It’s been great fun to come at climate change from a fresh angle, and discover stories that I don’t think anyone will have heard before.”
 



Saturday, 15 November 2014

Ben Geen: another possible case of miscarriage of justice and misunderstanding statistics?


Imagine if you asked people to roll eight dice to see if they can 'hit the jackpot' by rolling 8 out of 8 sixes.  The chances are less than 1 in 1.5 million. So if you saw somebody - let's call him Fred - who has a history of 'trouble with authority' getting a jackpot then you might be convinced that Fred is somehow cheating or the dice are loaded.  It would be easy to make a convincing case against Fred just on the basis of the unlikeliness of him getting the jackpot by chance and his problematic history.

But now imagine Fred was just one of the 60 million people in the UK who all had a go at rolling the dice. It would actually be extremely unlikely if fewer than 25 of them hit the jackpot with fair dice (and without cheating) - the expected number is about 35. In any set of 25 people it is also extremely unlikely that there will not be at least one person who has a history of 'trouble with authority'. In fact you are likely to find something worse, since about 10 million people in the UK have criminal convictions, meaning that in a random set of 25 people there are likely to be about 4 with some criminal conviction.
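
The arithmetic in the last two paragraphs is easy to check. Here is a short Python sketch that reproduces it, using the round figures quoted above (60 million dice rollers, 10 million people with criminal convictions).

```python
# Checking the dice arithmetic, using the round figures quoted above.
from scipy.stats import poisson

p_jackpot = (1 / 6) ** 8                          # 8 sixes from 8 fair dice
print(f"P(jackpot) = 1 in {1 / p_jackpot:,.0f}")  # about 1 in 1.68 million

n_players = 60_000_000
expected_jackpots = n_players * p_jackpot         # about 35.7
print(f"expected jackpots = {expected_jackpots:.1f}")

# Chance that fewer than 25 people hit the jackpot (Poisson approximation):
# only a few percent, i.e. 25 or more jackpots is what we should expect.
print(f"P(fewer than 25 jackpots) = {poisson.cdf(24, expected_jackpots):.3f}")

# Among 25 random people, with roughly 10 million of 60 million having a
# criminal conviction, the expected number with a conviction is about 4.
print(f"expected convictions among 25 = {25 * 10 / 60:.1f}")
```
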

So the fact that you find a character like Fred rolling 8 out of 8 sixes purely by chance is actually almost inevitable. There is nothing to see here and nothing to investigate. As we showed in Section 4.6.3 of our book (or in the examples here) many events which people think of as 'almost impossible'/'unbelievable' are in fact routine and inevitable.

Now, instead of thinking about 'clusters' of sixes rolled from dice, think about clusters of patient deaths in hospitals. Just as Fred got his cluster of sixes, if you look hard enough it is inevitable you will find some nurses associated with abnormally high numbers of patient deaths. In Holland a nurse called Lucia de Berk was wrongly convicted of multiple murders as a result of people initially reading too much into such statistics (and then also getting the relevant probability calculations wrong). There have been other similar cases, and as my colleague Richard Gill explains so well, it seems that Ben Geen may also have been the victim of such misunderstandings.

See also: Justice for Ben Geen

Update: See Richard Gill's excellent comments below

Update 16 Feb 2015: Guardian article talks about my statement made to the Criminal Cases Review Commission.

How to measure anything


Douglas Hubbard (left) and Norman Fenton in London 15 Nov 2014.
If you want to know how to use measurement to reduce risk and uncertainty in a wide range of business applications, then there is no better book than Douglas Hubbard's "How to Measure Anything: Finding the Value of Intangibles in Business" (now in its 3rd edition). Douglas is also the author of the excellent "The Failure of Risk Management: Why It's Broken and How to Fix It".

Anyone who has read our Bayesian Networks book or the latest (3rd edition) of my Software Metrics book (the one I gave Douglas in the above picture!) will know how much his work has influenced us recently.

Although we have previously communicated about technical issues by email, today I had the pleasure of meeting Douglas in person for the first time, over lunch in London. We discussed numerous topics of mutual interest, including the problems with classical hypothesis testing (and how Bayes provides a better alternative) and evolving work on the 'value of information', which enables you to identify where to focus your measurement to optimise your decision-making.

Friday, 14 November 2014

An even more blatant case of plagiarism of our Bayesian Networks work


Spot the difference in our article plagiarised by Daniel and Etuk
Last year I reported the blatant case of plagiarism whereby one of our papers was 'rewritten' by Milan Tuba and Dusan Bulatovic and published in the Journal WSEAS Transactions on Computers. At least Tuba and Bulatovic made an attempt to cover up the plagiarism by inserting our work into a paper with a very different title and with some additional material.

In the latest case (discovered thanks to a tip-off by Emilia Mendes) the 'authors' Matthias Daniel and Ette Harrison Etuk of the Department of Mathematics and Computer Science, Rivers State University of Science and Technology, Nigeria, have made no such attempt to cover up their plagiarism, except to rearrange the words in the title and in a small number of other places. So, for example, "reliability and defects" has been replaced by "defects and reliability" at the end of the abstract. The only other difference is that our Acknowledgements have been removed.
Here is the full pdf of our published original paper whose full title and reference is:
Fenton, N.E., Neil, M., and Marquez, D., "Using Bayesian Networks to Predict Software Defects and Reliability". Proceedings of the Institution of Mechanical Engineers, Part O, Journal of Risk and Reliability, 2008. 222(O4): p. 701-712, 10.1243/1748006XJRR161.

Here is the full pdf of the plagiarised paper whose full title and reference is: Matthias Daniel, Ette Harrison Etuk, "Predicting Software Reliability and Defects Using Bayesian Networks", European Journal of Computer Science and Information Technology Vol.2, No.1, pp.30-44, March 2014
What is really worrying is that Emilia came across the 'new' paper when doing some google searches on using BNs for defect prediction; it was one of the top ones listed!!!
We are waiting for a response from the European Journal of Computer Science and Information Technology.

Incidentally, the new third edition of my book Software Metrics: A Rigorous and Practical Approach has just been published and (for the first time) it covers the use of Bayesian networks for software reliability and defect prediction.

Sunday, 3 August 2014

Who put Bella in the Wych-elm? A Bayesian analysis of a 70 year-old mystery

Our work on using Bayesian reasoning to help in legal cases features in the new series of the popular BBC Radio 4 programme Punt PI, in which comedian Steve Punt turns private investigator, examining little mysteries that perplex, amuse and beguile. Full details of what we did, with links, are here.

The full programme including my interview near the end is available on BBC iPlayer here.


Tuesday, 17 June 2014

Proving referee bias with Bayesian networks

An article in today's Huffington Post by Raj Persaud and Adrian Furnham talks about the scientific evidence that supports the idea of referee bias in football. One of the studies they describe is the recent work I did with Anthony Constantinou and Liam Pollock** where we developed a causal Bayesian network model to determine referee bias and applied it to the data from all matches played in the 2011-12 Premier League season. Here is what they say about our study:
Another recent study might just have scientifically confirmed this possible 'Ferguson Factor', entitled, 'Bayesian networks for unbiased assessment of referee bias in Association Football'. The term 'Bayesian networks', refers to a particular statistical technique deployed in this research, which mathematically analysed referee bias with respect to fouls and penalty kicks awarded during the 2011-12 English Premier League season.
The authors of the study, Anthony Constantinou, Norman Fenton and Liam Pollock found fairly strong referee bias, based on penalty kicks awarded, in favour of certain teams when playing at home.
Specifically, the two teams (Manchester City and Manchester United) who finished first and second in the league, appear to have benefited from bias that cannot be explained by other factors. For example a team may be awarded more penalties simply because it's more attacking, not just because referees are biased in its favour.

The authors, from Queen Mary University of London, argue that if the home team is more in control of the ball then it is bound to be awarded more penalties, and fewer yellow and red cards, than its opponents. Greater possession leads to a team being on the receiving end of more tackles, and a higher proportion of these tackles are bound to be committed nearer to the opponent's goal, as greater possession also usually results in territorial advantage.
However, this study, published in the academic journal 'Psychology of Sport and Exercise', found, even allowing for these other possible factors, Manchester United with 9 penalties awarded during that season, was ranked 1st in positive referee bias, while Manchester City with 8 penalties awarded is ranked 2nd. In other words it looks like certain teams (most specifically Manchester United) benefited from referee bias in their favour during Home games, which cannot be explained by any other possible element of 'Home Advantage'. 
What makes this result particularly interesting, the authors argue, is that for most of the season, these were the only two teams fighting for the English Premier League title. Were referees influenced by this, and it impacted on their decision-making?  Conversely the study found Arsenal, a team of similar popularity and wealth, and who finished third, benefited least of all 20 teams from referee bias at home, with respect to penalty kicks awarded. With the second largest average attendance as well as the second largest average crowd density, Arsenal were still ranked last in terms of referee bias favouring them for penalties awarded. In other words, Arsenal didn't seem to benefit much at all from the kind of referee bias that other teams were gaining from 'Home Advantage'. Psychologists might argue that temperament-wise, Sir Alex Ferguson and Arsene Wenger appear at opposite poles of the spectrum.
**  Constantinou, A. C., Fenton, N. E., & Pollock, L. (2014). "Bayesian networks for unbiased assessment of referee bias in Association Football". To appear in Psychology of Sport & Exercise. A pre-publication draft can be found here.
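
To illustrate why the kind of adjustment described in the quoted passage matters, here is a small, purely hypothetical Python simulation; it has nothing to do with the data or the model in the paper. It simulates a referee who is completely unbiased but whose penalty awards depend on how much possession a team has. Raw penalty counts then differ noticeably between high- and low-possession teams, which is why any claim of bias has to condition on possession and the other causal factors.

```python
# A toy simulation (not the model from the paper) of why possession has to be
# accounted for before attributing penalty counts to referee bias.
import numpy as np

rng = np.random.default_rng(1)
n_matches = 10_000

# Hypothetical home teams with different possession levels in each match
possession = rng.uniform(0.35, 0.65, n_matches)

# Penalties awarded depend only on possession: no referee bias is simulated
penalties = rng.poisson(0.6 * possession)

# A naive comparison nevertheless makes high-possession teams look favoured...
high = possession > 0.55
print(f"penalties per match, possession > 55%:  {penalties[high].mean():.3f}")
print(f"penalties per match, possession <= 55%: {penalties[~high].mean():.3f}")

# ...even though, by construction, the referee treats every team identically.
# A fair assessment of bias must therefore compare penalties *given* possession
# (and the other causal factors), which is the kind of adjustment a causal
# Bayesian network model is designed to make.
```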

Our related work on using Bayesian networks to predict football results is discussed here.