Saturday 15 November 2014

Ben Geen: another possible case of miscarriage of justice and misunderstanding statistics?

Imagine if you asked people to roll eight dice to see if they can 'hit the jackpot' by rolling 8 out of 8 sixes.  The chances are less than 1 in 1.5 million. So if you saw somebody - let's call him Fred - who has a history of 'trouble with authority' getting a jackpot then you might be convinced that Fred is somehow cheating or the dice are loaded.  It would be easy to make a convincing case against Fred just on the basis of the unlikeliness of him getting the jackpot by chance and his problematic history.

But now imagine Fred was just one of the 60 million people in the UK who all had a go at rolling the dice. It would actually be very unlikely for fewer than 25 of them to hit the jackpot with fair dice (and without cheating) - the expected number is about 35. In any set of 25 people it is also extremely unlikely that there will not be at least one person who has a history of 'trouble with authority'. In fact you are likely to find something worse, since about 10 million people in the UK have criminal convictions, meaning that in a random set of 25 people there are likely to be about 4 with some criminal conviction.
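The arithmetic behind these figures can be checked with a short script. This is only a rough sketch using the round numbers quoted above (60 million players, 10 million people with convictions). It assumes every roll is independent, approximates the number of jackpots with a Poisson distribution, and treats the 25 people as a simple random sample; the function and variable names are my own.

```python
import math


def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Poisson(lam), summing the terms directly."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        total += term
    return total


# Chance that one person rolls eight sixes with eight fair dice.
P_JACKPOT = (1 / 6) ** 8

# Round figures taken from the text (assumptions, not precise statistics).
POPULATION = 60_000_000
WITH_CONVICTIONS = 10_000_000

# Expected number of jackpots if everyone in the UK has one go.
expected_jackpots = POPULATION * P_JACKPOT

# Probability of seeing fewer than 25 jackpots (Poisson approximation).
p_fewer_than_25 = poisson_cdf(24, expected_jackpots)

# Among 25 randomly chosen people: expected number with a conviction,
# and the probability that none of them has one.
conviction_rate = WITH_CONVICTIONS / POPULATION
expected_convicted_in_25 = 25 * conviction_rate
p_none_convicted = (1 - conviction_rate) ** 25

print(f"Jackpot odds: 1 in {1 / P_JACKPOT:,.0f}")
print(f"Expected jackpots among {POPULATION:,} people: {expected_jackpots:.1f}")
print(f"P(fewer than 25 jackpots): {p_fewer_than_25:.4f}")
print(f"Expected people with convictions in 25: {expected_convicted_in_25:.2f}")
print(f"P(none of the 25 has a conviction): {p_none_convicted:.4f}")
```

The Poisson approximation is a convenience here: with 60 million trials and such a tiny success probability it is practically indistinguishable from the exact binomial calculation.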

So the fact that you find a character like Fred rolling 8 out of 8 sixes purely by chance is actually almost inevitable. There is nothing to see here and nothing to investigate. As we showed in Section 4.6.3 of our book (or in the examples here), many events which people think of as 'almost impossible' or 'unbelievable' are in fact routine and inevitable.

Now, instead of thinking about 'clusters' of sixes rolled from dice, think about clusters of patient deaths in hospitals. Just as Fred got his cluster of sixes, if you look hard enough it is inevitable you will find some nurses associated with abnormally high numbers of patient deaths. In Holland a nurse called Lucia de Berk was wrongly convicted of multiple murders as a result of people initially reading too much into such statistics (and then also getting the relevant probability calculations wrong). There have been other similar cases and, as my colleague Richard Gill explains so well, it seems that Ben Geen may also have been the victim of such misunderstandings.

See also: Justice for Ben Geen

Update: See Richard Gill's excellent comments below

Update 16 Feb 2015: Guardian article talks about my statement made to the Criminal Cases Review Commission.

How to measure anything

Douglas Hubbard (left) and Norman Fenton in London 15 Nov 2014.
If you want to know how to use measurement to reduce risk and uncertainty in a wide range of business applications, then there is no better book than Douglas Hubbard's "How to Measure Anything: Finding the Value of Intangibles in Business" (now in its 3rd edition). Douglas is also the author of the excellent "The Failure of Risk Management: Why It's Broken and How to Fix It".

Anyone who has read our Bayesian Networks book or the latest (3rd) edition of my Software Metrics book (the one I gave Douglas in the above picture!) will know how much his work has influenced us recently.

Although we have previously communicated about technical issues by email, today I had the pleasure of meeting Douglas in person for the first time, over lunch in London. We discussed numerous topics of mutual interest, including the problems with classical hypothesis testing (and how Bayes provides a better alternative) and evolving work on the 'value of information', which enables you to identify where to focus your measurement to optimise your decision-making.

Friday 14 November 2014

An even more blatant case of plagiarism of our Bayesian Networks work

Spot the difference in our article plagiarised by Daniel and Etuk
Last year I reported the blatant case of plagiarism whereby one of our papers was 'rewritten' by Milan Tuba and Dusan Bulatovic and published in the journal WSEAS Transactions on Computers. At least Tuba and Bulatovic made an attempt to cover up the plagiarism by inserting our work into a paper with a very different title and with some additional material.

In the latest case (discovered thanks to a tip-off by Emilia Mendes) the 'authors' Matthias Daniel and Ette Harrison Etuk of the Dept. of Mathematics and Computer Science, Rivers State University of Science and Technology, Nigeria, have made no such attempt to cover up their plagiarism, except to rearrange the words in the title and in a small number of other places. So, for example, "reliability and defects" has been replaced by "defects and reliability" at the end of the abstract. The only other difference is that our Acknowledgements have been removed.
Here is the full pdf of our published original paper whose full title and reference is:
Fenton, N.E., Neil, M., and Marquez, D., "Using Bayesian Networks to Predict Software Defects and Reliability". Proceedings of the Institution of Mechanical Engineers, Part O, Journal of Risk and Reliability, 2008. 222(O4): p. 701-712, 10.1243/1748006XJRR161.

Here is the full pdf of the plagiarised paper whose full title and reference is: Matthias Daniel, Ette Harrison Etuk, "Predicting Software Reliability and Defects Using Bayesian Networks", European Journal of Computer Science and Information Technology Vol.2, No.1, pp.30-44, March 2014
What is really worrying is that Emilia came across the 'new' paper when doing some Google searches on using BNs for defect prediction; it was one of the top results listed!!!
We are waiting for a response from the European Journal of Computer Science and Information Technology.

Incidentally, the new third edition of my book Software Metrics: A Rigorous and Practical Approach has just been published and (for the first time) it covers the use of Bayesian networks for software reliability and defect prediction.