Saturday, 15 November 2014

Ben Geen: another possible case of miscarriage of justice and misunderstanding statistics?


Imagine if you asked people to roll eight dice to see if they can 'hit the jackpot' by rolling 8 out of 8 sixes.  The chances are less than 1 in 1.5 million. So if you saw somebody - let's call him Fred - who has a history of 'trouble with authority' getting a jackpot then you might be convinced that Fred is somehow cheating or the dice are loaded.  It would be easy to make a convincing case against Fred just on the basis of the unlikeliness of him getting the jackpot by chance and his problematic history.

But now imagine Fred was just one of the 60 million people in the UK who all had a go at rolling the dice. It would actually be very unlikely for fewer than 25 of them to hit the jackpot with fair dice (and without cheating) - the expected number is about 36. In any set of 25 people it is also extremely unlikely that there will not be at least one person who has a history of 'trouble with authority'. In fact you are likely to find something worse, since about 10 million people in the UK have criminal convictions, meaning that in a random set of 25 people there are likely to be about 4 with some criminal conviction.
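For readers who want to check the arithmetic, here is a minimal sketch in Python. It assumes the round figures quoted above (60 million people, 10 million with convictions) and treats the number of jackpots as approximately Poisson distributed; the variable and function names are purely illustrative.

```python
import math

# Probability of rolling 8 sixes with 8 fair dice
p_jackpot = (1 / 6) ** 8

# Round population figure taken from the text (an assumption, not census data)
population = 60_000_000
expected_jackpots = population * p_jackpot


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total


# Chance that fewer than 25 people hit the jackpot (Poisson approximation)
p_fewer_than_25 = poisson_cdf(24, expected_jackpots)

# Convictions among a random set of 25 people, using the text's 10 million figure
convicted_fraction = 10_000_000 / population
expected_convicted_in_25 = 25 * convicted_fraction
p_none_convicted_in_25 = (1 - convicted_fraction) ** 25

print(f"Jackpot odds: 1 in {1 / p_jackpot:,.0f}")
print(f"Expected jackpots: {expected_jackpots:.1f}")
print(f"P(fewer than 25 jackpots): {p_fewer_than_25:.3f}")
print(f"Expected convicted in 25: {expected_convicted_in_25:.1f}")
print(f"P(nobody convicted in 25): {p_none_convicted_in_25:.3f}")
```

The Poisson approximation is reasonable here because the number of trials is huge and the success probability is tiny.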

So the fact that you find a character like Fred rolling 8 out of 8 sixes purely by chance is actually almost inevitable. There is nothing to see here and nothing to investigate. As we showed in Section 4.6.3 of our book (or in the examples here) many events which people think of as 'almost impossible'/'unbelievable' are in fact routine and inevitable.

Now, instead of thinking about 'clusters' of sixes rolled from dice, think about clusters of patient deaths in hospitals. Just as Fred got his cluster of sixes, if you look hard enough it is inevitable you will find some nurses associated with abnormally high numbers of patient deaths. In Holland a nurse called Lucia de Berk was wrongly convicted of multiple murders as a result of initially reading too much into such statistics (and then getting the relevant probability calculations wrong also). There have been other similar cases and, as my colleague Richard Gill explains so well, it seems that Ben Geen may also have been the victim of such misunderstandings.

See also: Justice for Ben Geen

Update: See Richard Gill's excellent comments below

Update 16 Feb 2015: Guardian article talks about my statement made to the Criminal Cases Review Commission.

How to measure anything


Douglas Hubbard (left) and Norman Fenton in London 15 Nov 2014.
If you want to know how to use measurement to reduce risk and uncertainty in a wide range of business applications, then there is no better book than Douglas Hubbard's "How to Measure Anything: Finding the Value of Intangibles in Business" (now in its 3rd edition). Douglas is also the author of the excellent "The Failure of Risk Management: Why It's Broken and How to Fix It".

Anyone who has read our Bayesian Networks book or the latest (3rd edition) of my Software Metrics book (the one I gave Douglas in the above picture!) will know how much his work has influenced us recently.

Although we have previously communicated about technical issues by email, today I had the pleasure of meeting Douglas for the first time, over lunch in London. We discussed numerous topics of mutual interest (including the problems with classical hypothesis testing - and how Bayes provides a better alternative - and evolving work on the 'value of information', which enables you to identify where to focus your measurement to optimise your decision-making).

Friday, 14 November 2014

An even more blatant case of plagiarism of our Bayesian Networks work


Spot the difference in our article plagiarised by Daniel and Etuk
Last year I reported the blatant case of plagiarism whereby one of our papers was 'rewritten' by Milan Tuba and Dusan Bulatovic and published in the Journal WSEAS Transactions on Computers. At least Tuba and Bulatovic made an attempt to cover up the plagiarism by inserting our work into a paper with a very different title and with some additional material.

In the latest case (discovered thanks to a tip-off by Emilia Mendes) the 'authors' Matthias Daniel and Ette Harrison Etuk  of Dept. Mathematics and Computer Science, Rivers State University of Science and Technology, Nigeria  have made no such attempt to cover up their plagiarism except to rearrange the words in the title and in a small number of other places. So, for example "reliability and defects" has been replaced by "defects and reliability" at the end of the abstract. The only other difference is that our Acknowledgements have been removed.
Here is the full pdf of our published original paper whose full title and reference is:
Fenton, N.E., Neil, M., and Marquez, D., "Using Bayesian Networks to Predict Software Defects and Reliability". Proceedings of the Institution of Mechanical Engineers, Part O, Journal of Risk and Reliability, 2008. 222(O4): p. 701-712, doi: 10.1243/1748006XJRR161.

Here is the full pdf of the plagiarised paper whose full title and reference is: Matthias Daniel, Ette Harrison Etuk, "Predicting Software Reliability and Defects Using Bayesian Networks", European Journal of Computer Science and Information Technology Vol.2, No.1, pp.30-44, March 2014
What is really worrying is that Emilia came across the 'new' paper when doing some Google searches on using BNs for defect prediction; it was one of the top results listed!!!
We are waiting for a response from the European Journal of Computer Science and Information Technology.

Incidentally, the new third edition of my book Software Metrics: A Rigorous and Practical Approach has just been published and (for the first time) it covers the use of Bayesian networks for software reliability and defect prediction.

Sunday, 3 August 2014

Who put Bella in the Wych-elm? A Bayesian analysis of a 70 year-old mystery

Our work on using Bayesian reasoning to help in legal cases features in the new series of the popular BBC Radio 4 programme Punt PI, in which comedian Steve Punt turns private investigator, examining little mysteries that perplex, amuse and beguile. Full details of what we did, with links, are here.

The full programme including my interview near the end is available on BBC iPlayer here.


Tuesday, 17 June 2014

Proving referee bias with Bayesian networks

An article in today's Huffington Post by Raj Persaud and Adrian Furnham talks about the scientific evidence that supports the idea of referee bias in football. One of the studies they describe is the recent work I did with Anthony Constantinou and Liam Pollock** where we developed a causal Bayesian network model to determine referee bias and applied it to the data from all matches played in the 2011-12 Premier League season. Here is what they say about our study:
Another recent study might just have scientifically confirmed this possible 'Ferguson Factor', entitled, 'Bayesian networks for unbiased assessment of referee bias in Association Football'. The term 'Bayesian networks', refers to a particular statistical technique deployed in this research, which mathematically analysed referee bias with respect to fouls and penalty kicks awarded during the 2011-12 English Premier League season.
The authors of the study, Anthony Constantinou, Norman Fenton and Liam Pollock found fairly strong referee bias, based on penalty kicks awarded, in favour of certain teams when playing at home.
Specifically, the two teams (Manchester City and Manchester United) who finished first and second in the league, appear to have benefited from bias that cannot be explained by other factors. For example a team may be awarded more penalties simply because it's more attacking, not just because referees are biased in its favour.

The authors from Queen Mary University of London, argue that if the home team is more in control of the ball, then, compared to opponents, it's bound to be awarded more penalties, with less yellow and red cards, compared to opponents. Greater possession leads any team being on the receiving end of more tackles. A higher proportion of these tackles are bound to be committed nearer to the opponent's goal, as greater possession also usually results in territorial advantage.
However, this study, published in the academic journal 'Psychology of Sport and Exercise', found, even allowing for these other possible factors, Manchester United with 9 penalties awarded during that season, was ranked 1st in positive referee bias, while Manchester City with 8 penalties awarded is ranked 2nd. In other words it looks like certain teams (most specifically Manchester United) benefited from referee bias in their favour during Home games, which cannot be explained by any other possible element of 'Home Advantage'. 
What makes this result particularly interesting, the authors argue, is that for most of the season, these were the only two teams fighting for the English Premiere League title. Were referees influenced by this, and it impacted on their decision-making?  Conversely the study found Arsenal, a team of similar popularity and wealth, and who finished third, benefited least of all 20 teams from referee bias at home, with respect to penalty kicks awarded. With the second largest average attendance as well as the second largest average crowd density, Arsenal were still ranked last in terms of referee bias favouring them for penalties awarded. In other words, Arsenal didn't seem to benefit much at all from the kind of referee bias that other teams were gaining from 'Home Advantage'. Psychologists might argue that temperament-wise, Sir Alex Ferguson and Arsene Wenger appear at opposite poles of the spectrum.
**  Constantinou, A. C., Fenton, N. E., & Pollock, L. (2014). "Bayesian networks for unbiased assessment of referee bias in Association Football". To appear in Psychology of Sport & Exercise. A pre-publication draft can be found here.

Our related work on using Bayesian networks to predict football results is discussed here.

Saturday, 14 June 2014

Daniel Kahneman at the Hebrew University Jerusalem

I have just returned from the workshop on "Behavioral Legal Studies - Cognition, Motivation, and Moral Judgment" at the Hebrew University in Jerusalem, Israel. I was especially interested in seeing Daniel Kahneman open the workshop with "Reflections on Psychology, Economics, and Law". Kahneman won the 2002 Nobel Prize in economics and also received the Presidential Medal of Freedom from President Obama in 2013.

Kahneman (centre) interviewed by Prof Zamir (left) and Prof Ritov (right)
Kahneman is, of course, very well known for his pioneering work with Amos Tversky and Paul Slovic (who also spoke at the workshop) on cognitive bias (which has greatly influenced our own work on probabilistic reasoning in the law) and also prospect theory (for which he won the Nobel prize). Kahneman's 2011 book "Thinking, Fast and Slow" which summarises much of his work, has sold over one and a half million copies. The book is based on the idea that, when it comes to assessment and decision-making, people are either system 1 (fast) thinkers or system 2 (slow) thinkers. The former act on instinct and often get things wrong while the latter are more likely to get things right because they think through all aspects of a problem carefully. While I think Kahneman's book is a very good read, I personally do not find the fast/slow classification of decision-makers to be especially helpful. Nevertheless, a lot of the speakers at the workshop used it to inform their own work.

Kahneman's presentation took the form of an interview, with Prof. Eyal Zamir and Prof. Ilana Ritov (both of the Law Faculty at the Hebrew University) asking the questions. Kahneman nicely summarised the main results and achievements of his career and was humble enough both to give credit to his co-researchers and also to admit that some of his theories (such as on gambling choices) had subsequently been proven to be false.

Audience at Kahneman interview
Kahneman touched on one of the key points in his book that I find problematic, namely his rejection of what he calls 'complex algorithms'; his argument is that any assessment/decision problem that involves expert judgment should not involve many variables because you can always get just as good a result with a simple model involving no more than three variables. While I agree that any problem solution should be kept as simple as possible, a crude limit to the number of variables directly contradicts our Bayesian network approach, where models often necessarily involve multiple variables and relationships derived from both data and expert judgment. The important point is that the 'complex algorithms' we use are just Bayesian inference - of course if you had to do this 'by hand' then it would be disastrous, but the fact that there are widely available tools means the algorithmic complexity is completely hidden. Crucially, we have shown many times (see for example this work on evaluating forensic evidence) that the Bayesian network solution provides greater accuracy and insights than the commonly used simplistic 'solutions'.
 
Much of what Kahneman spoke about (and which was also a key theme of the workshop generally) concerned 'moral judgment' - he cited the radically different legal responses to murder and attempted murder as an example of irrational (and possibly immoral) decision-making. The problem with 'moral judgment' - and the continually repeated notion of 'what is good for society' - is that most academics have a particular view about these that they assume is both 'correct' and universally held. Hence, much of what I heard during the workshop was politicized and biased. This was also evident in Kahneman's answers to audience questions following the interview. I actually asked Kahneman what his rationale was for concluding that President Obama was a system 2 thinker. Bearing in mind that system 2 thinkers are supposed to be 'good' decision makers compared with system 1 thinkers, his response was clearly popular with many in the audience, but actually surprised me because it seemed to be purely political; he basically said something like "you only have to compare him with the previous guy (Bush) to know the difference".

Kahneman also gave his views on how conflicts (like that of Israel and its enemies) could be solved, which I found naive and possibly contradictory to his own work in psychology. His theory is that both 'sides' in a conflict are rational, but each believes it is responding to the actions of the other side - so all you need to do is to make both sides aware of this.

There was a very nice reception for invited workshop participants after Kahneman's interview, but Kahneman himself had to rush off to another meeting and he took no further part in the workshop. 

Dave Lagnado (UCL) - who we have worked with on Bayesian networks and the law - giving an excellent talk on "Spreading the blame" (he presented a framework for intuitive judgements and blame)
My trip was partially funded under ERC Grant number: 339182 (BAYES-KNOWLEDGE) and I gratefully acknowledge the ERC contribution. 

Wednesday, 30 April 2014

Statistics of Poverty

I was one of two plenary speakers at the Winchester Conference on Trust, Risk, Information and the Law yesterday (slides of my talk: "Improving Probability and Risk Assessment in the Law" are here).

The other plenary speaker was Matthew Reed (Chief Executive of the Children's Society) who spoke about "The role of trust and information in assessing risk and protecting the vulnerable". In his talk he made the very dramatic statement that
"one in every four children in the UK today lives in poverty"
He further said that the proportion had increased significantly over the last 25 years and showed no signs of improvement.

When questioned about the definition of child poverty he said he was using the Child Poverty Act 2010 definition, which defines a child as living in poverty if they live in a household whose income (which includes benefits) is less than 60% of the national median (see here).

Matthew Reed has a genuine and deep concern for the welfare of children. However, the definition is purely political and is as good an example of poor measurement and misuse of statistics as you can find. Imagine if every household had its income immediately multiplied by ten - this would mean the very poorest households with, say, a single unemployed parent and 2 children going from £18,000 to a fabulously wealthy £180,000 per year. Despite this, one in every four children would still be 'living in poverty' because the number of households whose income is less than 60% of the median has not changed. If the median before was £35,000, then it is now £350,000 and everybody earning below £210,000 is, by definition, 'living in poverty'.

At the other extreme, if you could ensure that every household in the UK earns a similar amount, such as in Cuba where almost everybody earns $20 per month, then the number of children 'living in poverty' is officially zero (since the median is $240 per year and nobody earns less than $144).
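The two extremes above are easy to reproduce. The following is a minimal sketch in Python; the list of household incomes is entirely hypothetical (chosen only so that the median is £35,000 and one household in four falls below 60% of it), and the function name is my own for illustration.

```python
from statistics import median


def relative_poverty_rate(incomes, threshold=0.6):
    """Fraction of households whose income is below threshold * median."""
    cutoff = threshold * median(incomes)
    return sum(1 for income in incomes if income < cutoff) / len(incomes)


# Hypothetical household incomes in pounds per year (illustrative only)
incomes = [18_000, 20_000, 28_000, 35_000, 35_000, 42_000, 60_000, 90_000]

# Every household's income multiplied by ten
tenfold = [10 * income for income in incomes]

# Every household earning the same amount ($240 per year, as in the Cuba example)
equal = [240] * len(incomes)

base_rate = relative_poverty_rate(incomes)
tenfold_rate = relative_poverty_rate(tenfold)
equal_rate = relative_poverty_rate(equal)

print(f"Original incomes:  {base_rate:.0%} 'in poverty'")
print(f"Tenfold incomes:   {tenfold_rate:.0%} 'in poverty'")
print(f"Identical incomes: {equal_rate:.0%} 'in poverty'")
```

Because the cutoff is defined relative to the median, multiplying every income by the same factor leaves the rate unchanged, while equalising incomes drives it to zero regardless of how low those incomes are.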

In fact, in any wealthy free-market economy whichever way you look at the definition it is loaded not only to exaggerate the number of people living in poverty but also to ensure (unless there is massive wealth redistribution to ensure every household income is close to the median level) there will always be a 'poverty' problem:
  • Households with children are much more likely to have one, rather than two, wage earners, so by definition households with children will dominate those below the median income level.
  • Over the last 20 years people have been having fewer children and having them later in life, which again means that an increasing proportion of the country's children inevitably live in households whose income is below the median (hence the 'significant increase in the proportion of children living in poverty over the last 25 years').
  • Families with large numbers of children (> 3) increasingly are in the immigrant community (Asia/Africa) whose households are disproportionately below the median income. 
Unless the plan is to stop households on below median income from having children (also known as eugenics), the only way to achieve the stated objective of 'making child poverty history' (according to this definition) is to redistribute wealth so that no household income is less than 60% of the median (also known as communism). Judging by some of the people who have been pushing the 'poverty' definition and agenda it would seem the latter is indeed their real objective.


Friday, 17 January 2014

More on birthday coincidences

My daughter's birthday was last week (12 January), so I had a personal interest in today's Telegraph article about a family with 4 children all having the same birthday - 12 January.

Family with 4 children - all born on 12 January
Anybody who has read our book or seen our Probability Puzzles page will be familiar with the problem of 'coincidences' being routinely exaggerated (by which I mean probabilities of apparently very unlikely events are not as low as people assume). There is the classic birthdays problem that fits into this category (in a class of 23 children the probability that at least two will share the same birthday is actually better than 50%); but of more concern is that national newspapers routinely print ludicrously exaggerated figures for 'incredible events'*. 

So when I saw the story in today's Telegraph I did what I always do in such cases - work out how wrong the stated odds are. Fortunately, in this case the Telegraph gets it spot on: for a family with 4 children, two of whom are twins, the probability that all 4 have the same birthday is approximately 1 in 133,225. Why? Because it is simply the probability that the twins (who we can assume must be born on the same day) have the same birthday as the first child times the probability that the youngest child has the same birthday as the first child. That is 1/365 times 1/365 which is 1/133,225. It is the same, of course, as the chance of a family of three children (none of whom are twins or triplets) each having the same birthday. The Telegraph also did not make the common mistake of stating/suggesting that the 1 in 133,225 figure was the probability of this happening in the whole of the UK. In fact, since there are about 800,000 families in the UK with 4 children and since about one in every 100 births are twins, we can assume there are about 8,000 families in the UK with 4 children including a pair of twins. The chances of at least one such family having all children with the same birthday are about 1 in 17.
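The figures above can be checked with a few lines of Python. This is a minimal sketch that assumes 365 equally likely birthdays and independence between families, and uses the rough estimate of 8,000 qualifying families from the text; the names are illustrative only.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1 - p_all_distinct


# Classic birthdays problem: a class of 23 children
p_class_of_23 = p_shared_birthday(23)

# One family: the twins match the first child, and so does the youngest
p_family = (1 / 365) ** 2

# Rough number of UK families with 4 children including twins (from the text)
families = 8_000
p_at_least_one_family = 1 - (1 - p_family) ** families

print(f"Class of 23 shares a birthday: {p_class_of_23:.3f}")
print(f"Single family: 1 in {1 / p_family:,.0f}")
print(f"At least one UK family: 1 in {1 / p_at_least_one_family:.1f}")
```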



*Our book gives many examples and also explains why the newspapers routinely make the same types of errors in their calculations. For example (Chapter 4) the Sun published a story in which a mother had just given birth to her 8th child - all of whom were boys; it claimed the chance of this happening was 'less than 1 in a billion'. In fact, in any family of 8 children there is a 1 in 256 probability that all 8 will be boys. So, assuming that approximately 1000 women in the UK every year give birth to their 8th child, it follows that there is about a 98% chance that in any given year in the UK at least one mother would give birth to an 8th child with all eight being boys.
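As a quick check of the 98% figure, here is a minimal Python sketch. It assumes boys and girls are equally likely, births are independent, and uses the approximate figure of 1000 eighth births per year from the text.

```python
# Probability that all 8 children in a family are boys
p_all_boys = 0.5 ** 8

# Approximate number of UK women giving birth to an 8th child each year (from the text)
mothers_per_year = 1000

# Probability that at least one of those families has 8 boys
p_at_least_one = 1 - (1 - p_all_boys) ** mothers_per_year

print(f"All 8 boys in one family: 1 in {1 / p_all_boys:.0f}")
print(f"At least one such family in a year: {p_at_least_one:.1%}")
```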

Wednesday, 15 January 2014

Sally Clark revisited: another key statistical oversight

The Sally Clark case was notorious for the prosecution’s misuse of statistics in respect of Sudden Infant Death Syndrome (SIDS). In particular, the claim made by Roy Meadow at the original trial – that there was “only a 1 in 73 million chance of both children being SIDS victims” – has been thoroughly, and rightly, discredited.

However, as made clear by probability experts who analysed the case, the key statistical error made was to consider the (prior) probability of SIDS without comparing it to the (prior) probability of murder of a child by a parent. The experts correctly focused on the critical need for this comparison. However, there is an oversight in the way the experts built their arguments. Specifically, the prior probability of the ‘double SIDS’ hypothesis (which we can think of as the ‘defence’ hypothesis) has been compared with the prior probability of the ‘double murder’ hypothesis (which we can think of as the ‘prosecution’ hypothesis). But, since it would have been sufficient for the prosecution to establish just one murder, the correct hypothesis to compare to ‘double SIDS’ is not ‘double murder’ but rather ‘at least one murder’. The difference can be very important. For example, based on the same assumptions used by one of the probability experts who examined the case, the prior odds in favour of the defence hypothesis over the prosecution hypothesis are not 30 to 1 but rather more like 5 to 2. After medical and other evidence is taken into account this difference can be critical. The case demonstrates that, in order to use probabilities in legal arguments effectively, it is crucial to identify appropriate hypotheses.

My published article on this is here.