Tuesday, 26 April 2016

Hillsborough Inquest - my input

With today's verdict (fans unlawfully killed) coming after more than two years I can now speak about my own involvement in the Inquest.

Because of the years that have passed, few people are aware that there was a 'near-miss' disaster at Hillsborough eight years before the actual disaster. The circumstances were essentially identical - an FA Cup Semi Final with far too many supporters let in to the Leppings Lane stand, leading to a massive crush. Because of the quick thinking of a steward who was able to open a gate onto the pitch, nobody died on that occasion (although there were many injuries). I know this because I was present at that earlier near disaster and I was, in fact, Secretary of the Sheffield Spurs Supporters Club. At the time I wrote to the FA and South Yorkshire police as I felt mistakes had been made, and indeed the incident was sufficiently serious that Hillsborough (which had been used every year as one of the two semi-final venues) was avoided until 1988 (the year before the disaster). Immediately after the disaster in 1989 I wrote to the FA and Lord Taylor (who led the original inquiry) to inform them of the events of 1981. Although I was interviewed at that time by the Police investigators, my evidence was never used.

In 2014 - out of the blue - I was asked to attend the new Hillsborough Inquest as it had been decided that the 1981 incident was an important piece of the story.  Here are a couple of links to media reports about my appearance:
Norman Fenton, 26 April 2016

Friday, 25 March 2016

Statistics of coincidences: Ben Geen case revisited (ABC)

In November 2014 I reported on the case of nurse Ben Geen, who was convicted in 2006 of murdering two patients and seriously harming 15 others. I had been asked to produce an expert report on the 'statistical coincidences' in the case for the Criminal Cases Review Commission.

Now a 30-minute documentary on the case presented by Joel Werner is to be aired on Australia's national radio station ABC on 28 March. In the programme (which you can listen to in full from the links at the top of the ABC page) I present a lay summary of the statistical argument (from minutes 16:30 to 21:34).

Norman Fenton

Saturday, 19 March 2016

Turning poorly structured data into intelligent Bayesian Network models for medical decision support

Medical data is very often badly structured, incomplete and inconsistent. This limits our ability to  generate useful models for prediction and decision support if we rely purely on machine learning techniques. That means we need to exploit expert knowledge at various model development stages. This problem - which is common in many application domains - is tackled in a paper** published in the latest issue of Artificial Intelligence in Medicine.

The paper describes a rigorous and repeatable method for building effective Bayesian Network (BN) models from complex data - much of which comes in unstructured and incomplete responses by patients from questionnaires and interviews. Such data inevitably contains repetitive, redundant and contradictory responses; without expert knowledge learning a BN model from the data alone is especially problematic where we are interested in simulating causal interventions for risk management. The novelty of this work is that it provides a rigorous consolidated and generalised framework that addresses the whole life-cycle of BN model development. The method is validated using data from forensic psychiatry. The resulting BN models demonstrate competitive to superior predictive performance against the data-driven state-of-the-art models. More importantly, the resulting BN models go beyond improving predictive accuracy and into usefulness for risk management through intervention, and enhanced decision support in terms of answering complex clinical questions that are based on unobserved evidence.

The method is applicable to any application domain involving large-scale decision analysis based on such complex and unstructured information. It challenges decision scientists to reason about building models based on what information is really required for inference, rather than based on what data is available. Hence, it forces decision scientists to use available data in a much smarter way.

**The full reference for the paper is:
Constantinou, A. C., Fenton, N., Marsh, W., & Radlinski, L. (2016). "From complex questionnaire and interviewing data to intelligent Bayesian Network models for medical decision support". Artificial Intelligence in Medicine, Vol 67, pages 75-93. DOI http://dx.doi.org/10.1016/j.artmed.2016.01.002

For those who do not have access to the journal a pre-publication draft can be downloaded: http://constantinou.info/downloads/papers/complexBN.pdf 

Thursday, 10 March 2016

A Bayesian network to determine optimal strategy for Spurs' success

As a committed Spurs fan I have spent the last few months salivating at the club's sudden and unexpected rise and the prospect of them winning their first league title since 1961. By mid-February they were clear favourites to win the Premier League title. However, in my view, the challenge was compromised by the team becoming overstretched by playing too many matches in a short space of time. In particular, I felt that their involvement in the Europa League was an unnecessary distraction and burden. When I expressed these views on a Spurs online forum (backed up with some data showing consistent under-performance during periods when they were involved in the Europa League) I got heavily criticised by other fans who said it was important to try to win every competition.

Having simultaneously been involved in research discussions about the use of decisions in Bayesian networks, I decided to build a small model in AgenaRisk to resolve the dilemma once and for all. I have written up the results of the analysis here. The model can be downloaded from here.

In summary, there were 4 strategic options available to Spurs' manager Mauricio Pochettino at the time I started to do the analysis:
  1. Focus on Premier League 
  2. Focus on Premier League and FA Cup 
  3. Focus on Premier League and Europa League 
  4. Focus on all three competitions  
My BN model shows that the optimal decision (based on my subjective utility values of the different outcomes) was to go for 1 with 2 a close second. Unfortunately  (I believe) Pochettino opted for 3 which, as the model shows, suggests his personal utility value for winning the Europa League was actually higher than winning the Premier League.


See also: The problem with predicting football results - you cannot rely on the data

Thursday, 4 February 2016

Problems with the Likelihood Ratio method for determining probative value of evidence: the need for exhaustive hypotheses

Norman Fenton, 4 Feb 2016

I have written several times before about the likelihood ratio (LR) method that is recommended for use by forensic scientists when presenting evidence (such as the fact that DNA collected at a crime scene is found to have a profile that matches the DNA profile of a defendant in a case). In general the LR is a very good and simple method for communicating the impact of evidence (in this case on the hypothesis that the defendant was at the crime scene), but its correct use is based on strict assumptions that have been routinely ignored by forensic experts and statisticians, leading to the very kind of confusion and misunderstanding (when presented to lawyers and juries) that it was supposed to help avoid. The papers [1] and [2] provide an in-depth analysis of the problems. In this short article I will highlight just one of these problems, which invalidates the LR. Subsequent articles will focus on the other problems and issues.

To recap: The LR is the probability of finding the evidence E if the prosecution hypothesis Hp is true (formally we write this as 'Probability of E given Hp') divided by the probability of finding the evidence E if the defence hypothesis Hd is true (formally we write this as  'probability of E given Hd').
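Written as a formula, using the conventional notation P(E | H) for 'probability of E given H', the definition above reads:

```latex
\[
  \mathrm{LR} \;=\; \frac{P(E \mid H_p)}{P(E \mid H_d)}
\]
```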

So, to compute the LR, the forensic expert is forced to consider the probability of finding the evidence under both the prosecution and defence hypotheses.  This is a very good thing to do because it helps to avoid common errors of communication that can mislead lawyers and juries (notably the prosecutor's fallacy). Even more importantly, the LR is a measure of the probative value of the evidence because:
  • when the LR is greater than one the evidence supports the prosecution hypothesis (increasingly for larger values); 
  • when the LR is less than one it supports the defence hypothesis (increasingly as the LR gets closer to zero); 
  • when the LR is equal to one then the evidence supports neither hypothesis and so is 'neutral'. In such cases, since the evidence has no probative value lawyers and forensic experts believe it should not be admissible.
However, as explained in [1] and [2] (because of Bayes' theorem), for the LR to 'work' with respect to being a measure of probative value, the two hypotheses considered must be 'mutually exclusive and exhaustive'. This means that the defence hypothesis Hd must simply be the negation of the prosecution hypothesis Hp. So, for example, if Hp is "Defendant was at the crime scene" then Hd must be "Defendant was not at the crime scene". Now, while there is more or less unanimity within the statistics and forensics field that the hypotheses must be mutually exclusive in order for the LR to be used, there is no such unanimity about the hypotheses being exhaustive. Indeed, the Royal Statistical Society Practitioner Guide to Case Assessment and Interpretation of Expert Evidence Guidelines [3] (page 32) specifies that the LR requires two mutually exclusive but not necessarily exhaustive hypotheses (which, interestingly, contradicts what is stated in the earlier Guidelines by the same group [4], page 96). To see why incorrect conclusions may be drawn when the hypotheses are not exhaustive we consider a very simple example:

Fred is the defendant for a crime. The main evidence against Fred is that his DNA profile is found to be a match of a DNA sample found at the scene of the crime (for simplicity we ignore the possibility of errors in the DNA match). The DNA profile is of a type that is found in only 1 in 10,000 people. However, Fred has an identical twin brother Joe. Using the following:
  • Prosecution hypothesis Hp:  "Fred is the source of the DNA"
  • Defence hypothesis Hd: "Joe is the source of the DNA"
  • Evidence E: "the DNA found matches Fred's profile"
The defence reasons - correctly using the likelihood ratio approach - that the evidence E has no probative value with respect to the above two hypotheses, because the twins have the same DNA profile, i.e.
P(E given Hp) = P(E given Hd) = 1.
Hence, the defence demands the evidence is withdrawn because it is 'neutral'.

The problem here is that, even if we assume the hypotheses are mutually exclusive (i.e. we exclude the possibility that both the twins committed the crime) they are certainly NOT exhaustive. The correct defence hypothesis in this case should be "Fred is NOT the source of the DNA". This is made up of two cases:
  • Hd: "Joe is the source of the DNA"
  • Ho: "Another person (not Fred or  Joe) is the source of the DNA"
If we assume - before any evidence is known - that Hp, Hd and Ho are equally likely then the impact of observing the evidence is certainly NOT neutral - it is probative in favour of the prosecution hypothesis as can be shown from running the calculations in a Bayesian network tool:

The probability of Hp increases from 33% to just under 50%.
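The same update can be reproduced without a Bayesian network tool. The following is a minimal sketch, not the model used for the post: it assumes that an unrelated person (Ho) matches with probability 1 in 10,000 (the profile frequency quoted above), and the function name `posterior` is purely illustrative.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' theorem over mutually exclusive and exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


# Equal prior probability for Fred (Hp), Joe (Hd) and anybody else (Ho).
priors = {"Hp": Fraction(1, 3), "Hd": Fraction(1, 3), "Ho": Fraction(1, 3)}

# P(E | H): the twins share a profile, so a match is certain for either twin.
# Assumption: an unrelated person matches with the 1-in-10,000 profile frequency.
likelihoods = {"Hp": Fraction(1), "Hd": Fraction(1), "Ho": Fraction(1, 10_000)}

post = posterior(priors, likelihoods)
for h, p in post.items():
    print(f"P({h} | E) = {float(p):.6f}")
```

Under these assumptions the posterior for Hp is 10000/20001, which is just under one half, even though the LR of Hp against Hd alone equals one.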

But the supposedly 'neutral' evidence can have an even more dramatic impact in practice. Suppose, for example, that Joe has an alibi that is considered pretty reliable. Then this might reduce our prior belief that he is the source of the DNA to 2%. In this case the before and after probabilities are:

The belief in the prosecution hypothesis in this case has shifted to above 95% - possibly sufficient for a jury to be convinced it is the truth.
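The alibi case can also be checked using the odds form of Bayes' theorem, with "not Hp" (Hd or Ho) as the exhaustive alternative. This is a sketch under assumptions that are not stated in the post: Joe's prior probability of being the source is 2%, the remaining 98% is split equally between Hp and Ho, and an unrelated person matches with probability 1 in 10,000. The function name is illustrative.

```python
def lr_and_posterior(prior_hp, prior_hd, prior_ho, match_freq=1e-4):
    """Return (LR of Hp versus not-Hp, posterior probability of Hp)."""
    prior_not_hp = prior_hd + prior_ho
    # P(E | not Hp) is a prior-weighted average over the two ways Hp can be false:
    # Joe is the source (match certain) or somebody else is (match_freq).
    p_e_given_not_hp = (prior_hd * 1.0 + prior_ho * match_freq) / prior_not_hp
    lr = 1.0 / p_e_given_not_hp          # P(E | Hp) = 1
    prior_odds = prior_hp / prior_not_hp
    posterior_odds = lr * prior_odds
    return lr, posterior_odds / (1.0 + posterior_odds)


# Assumed priors: Hp 49%, Hd 2% (Joe's alibi), Ho 49%.
lr, post_hp = lr_and_posterior(0.49, 0.02, 0.49)
print(f"LR (Hp vs not Hp) = {lr:.2f}")
print(f"P(Hp | E)         = {post_hp:.4f}")
```

With an exhaustive alternative the LR is well above one, so the evidence is clearly not neutral, and the posterior for Hp comes out above 95% under these assumed priors.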

If the DNA evidence in the above example was a non-match then the LR approach using the original hypotheses is even more obviously flawed because in this case:
        P(E given Hp) = P(E given Hd) = 0
But the evidence is certainly anything but 'neutral' because, after observing the evidence, the prosecution hypothesis Hp must be false (as must Hd).
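A short sketch of the non-match case, again assuming equal priors over Hp, Hd and Ho and assuming an unrelated person fails to match with probability 0.9999 (one minus the 1-in-10,000 profile frequency):

```python
# Assumption: equal priors over Fred (Hp), Joe (Hd) and anybody else (Ho).
priors = {"Hp": 1 / 3, "Hd": 1 / 3, "Ho": 1 / 3}

# P(non-match | H): impossible for either twin, near certain for anyone else.
likelihoods = {"Hp": 0.0, "Hd": 0.0, "Ho": 1 - 1e-4}

# The LR of Hp against Hd is 0/0, so it cannot even be computed.
try:
    lr = likelihoods["Hp"] / likelihoods["Hd"]
except ZeroDivisionError:
    lr = None

joint = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(joint.values())
post = {h: joint[h] / total for h in joint}
print("LR (Hp vs Hd):", lr)
print("Posterior:", post)
```

Both Hp and Hd drop to zero and all of the probability moves to Ho, which is as far from 'neutral' as evidence can be.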

While the example above is obviously simplistic and contrived, more realistic examples are provided in [1], which also highlights this very problem in the case of Barry George (convicted and subsequently acquitted of the murder of TV celebrity Jill Dando, after an appeal ruled that the gunpowder residue evidence presented in the original trial was inadmissible in a re-trial on the basis that it had a LR equal to one and so had 'no probative value').

See also:

  1. Fenton, N. E., D. Berger, D. Lagnado, M. Neil and A. Hsu, (2013). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)", Science and Justice, http://dx.doi.org/10.1016/j.scijus.2013.07.002.  A pre-publication draft of the article can be found here.
  2. Fenton N.E, Neil M, Berger D, “Bayes and the Law”, Annual Review of Statistics and Its Application, Volume 3, 2016 to appear. Pre-publication version here
  3. Jackson, G., Aitken, C., & Roberts, P. (2015). PRACTITIONER GUIDE NO 4: Case Assessment and Interpretation of Expert Evidence. Royal Statistical Society.  Available here.
  4. Aitken, C, Roberts, P, Jackson, G, (2010) PRACTITIONER GUIDE NO 1:"Fundamentals of Probability and Statistical Evidence in Criminal Proceedings: Guidance for Judges, Lawyers, Forensic Scientists and Expert Witnesses. Royal Statistical Society. Available here.

Thursday, 28 January 2016

Misleading DNA evidence and the current damaged winning lottery ticket story

Norman Fenton, 28 January 2016

This post is primarily about how DNA match evidence is often presented in a way that is highly misleading (it is an important issue in an ongoing case I'm involved with). But in order to illustrate the point it turns out that we can use a simple analogy based loosely on the current lottery story that is getting a lot of media attention in the UK. This concerns an unverified £33 million winning ticket from a recent draw. About 200 people are claiming to have bought the (single) winning ticket but, until today*, none had actually provided proof of possessing such a ticket. The claim of one - Miss Susan Hinte - is the one that has grabbed media attention because she has produced a ticket in which key identifying information cannot be read because, she claims, the ticket was put through a washing machine.

But first let's look at the DNA issue, which is concerned with the following generic problem:
  • The prosecution claims that defendant Joe was at the crime scene. This hypothesis is denoted as Hp.
  • A tiny trace of DNA from the crime scene has been analysed and found to match the profile of Joe. This evidence (of the match) is denoted E
Typically the defence will argue that Joe was not at the crime scene and that any DNA matching Joe - especially as it was a tiny trace - got there through secondary transfer or other means. So the defence hypothesis Hd is simply the negation of Hp.

The DNA experts have correctly recognised that, in determining the probative value of the evidence E,  they have to use the ‘likelihood ratio’ approach [1]. This means they have to consider both of the following probabilities:
  1. The probability that E is the result of the prosecution hypothesis Hp being true  - formally we write this as P(E given Hp)
  2. The probability that E is the result of the defence hypothesis Hd being true  -  formally we write this as P(E given Hd) 
If probability 1 is greater than probability 2 then the evidence E supports Hp over Hd and vice versa. The likelihood ratio is simply probability 1 divided by probability 2 and provides a simple and compelling measure of probative value of evidence. If the ratio is greater than one the evidence E supports Hp, with higher values indicating stronger support. If the ratio is less than one the evidence E supports Hd, with smaller values indicating stronger support. However, for reasons explained in [1], this whole notion of probative value is not meaningful if the defence hypothesis Hd is not the negation of the prosecution hypothesis Hp. One of the common errors made by DNA experts is to replace Hd with a different hypothesis, namely Hd': "DNA from Joe got there by secondary transfer". In this case Hd' excludes other possibilities of observing E even though Joe was not at the crime scene (such as errors or contamination during the DNA testing, or the DNA belonging to a different person with the same profile, etc.) and is not even mutually exclusive with Hp, since Joe may have been at the crime scene even though the trace sample was there through secondary transfer. But, while this common error is serious, it is not the real concern I wish to raise here. In fact, let's suppose that no such error is made and that the expert considers the correct Hd.

The real concern is how a jury member reacts when the DNA expert now makes the following assertions:
  1. “The findings are what I would have expected if Hp were true.” i.e. P(E given Hp) is very high
  2. “The probability of the findings are considerably more likely to have been the result of Hp rather than Hd”  i.e. P(E given Hp) is much higher than P(E given Hd)
Notwithstanding the unnecessary redundancy of statement 1, these assertions sound very important and suggest very strong support for the prosecution hypothesis, especially as most people would already have assumed (wrongly) that the DNA 'match' means the trace certainly belongs to Joe.

But to demonstrate how misleading they are I will return now to the lottery example. For simplicity I will assume the old 6-ball lottery with 49 numbers. Suppose the winning numbers were:
1, 7, 21, 28, 40, 46

Mrs Smith has a damaged ticket that she claims has the winning numbers. The evidence E is that the first number (which is the only number clearly visible) is 1.

Our hypotheses are:
  • Hp: “Mrs Smith's ticket is the winning ticket”
  • Hd: “Mrs Smith's ticket is not the winning ticket”
In this case we know the following:
  • P(E given Hp) = 1  (it is certain that the first number on the ticket would be 1 if it was the winning ticket)
  • P(E given Hd) is 0.122 (this is the proportion of non-winning tickets that have 1 as the first number) 
So we could certainly make exactly the same assertions in this case as the DNA experts above:
  1. “The findings are what I would have expected if Hp were true.” (since the probability of E given Hp is 1)
  2. “The probability of the findings are considerably more likely to have been the result of Hp rather than Hd” (since 1 is considerably greater than 0.122).
However, despite these (correct) assertions it is almost certain that Hd rather than Hp is true - Mrs Smith's ticket is not the winning ticket. In fact, the probability of Hp being true is less than one in 1.7 million (because there are over 1.7 million non-winning combinations in which the first number is 1).
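These figures can be checked with a few lines of arithmetic. The sketch below assumes that 'first number' means the smallest of the six numbers on the ticket, and that, before looking at the ticket, every one of the possible combinations is equally likely to be the one Mrs Smith holds.

```python
from math import comb

total = comb(49, 6)           # all possible 6-from-49 combinations
first_is_1 = comb(48, 5)      # combinations whose smallest number is 1

# Exclude the single winning combination (which also starts with 1).
non_winning = total - 1
non_winning_first_is_1 = first_is_1 - 1

p_e_given_hp = 1.0
p_e_given_hd = non_winning_first_is_1 / non_winning
lr = p_e_given_hp / p_e_given_hd

# Assumption: uniform prior over all combinations.
prior_odds = 1 / non_winning
posterior_odds = lr * prior_odds
post_hp = posterior_odds / (1 + posterior_odds)

print(f"P(E | Hd)  = {p_e_given_hd:.3f}")
print(f"LR         = {lr:.2f}")
print(f"P(Hp | E)  = 1 in {1 / post_hp:,.0f}")
```

The LR is about 8 in favour of Hp, yet the posterior probability of Hp is still only about one in 1.7 million, because the prior odds against Hp were so large.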

So what is the moral of this story? The likelihood ratio of the evidence might often suggest the evidence is highly probative in favour of one of the hypotheses, but if the prior probability of the alternative hypothesis was much higher to start with then the evidence will not ‘overturn’ the prior belief in favour of the alternative.
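This is the odds form of Bayes' theorem. Stated for the case where Hd is the negation of Hp:

```latex
\[
  \underbrace{\frac{P(H_p \mid E)}{P(H_d \mid E)}}_{\text{posterior odds}}
  \;=\;
  \underbrace{\frac{P(E \mid H_p)}{P(E \mid H_d)}}_{\text{likelihood ratio}}
  \;\times\;
  \underbrace{\frac{P(H_p)}{P(H_d)}}_{\text{prior odds}}
\]
```

A likelihood ratio well above one can still leave the posterior odds far below one when the prior odds are small enough.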

Lay people ignore this in connection to DNA evidence. Because the random match probability associated with a DNA match is typically less than one in a billion, the very fact that the evidence E is a "DNA match" already puts into their mind the notion that this 'must tie the defendant to the crime scene'. But the random match probability is almost irrelevant in this case - it only accounts for a tiny proportion of P(E given Hd). Lay people can also easily be tricked into believing that the (redundant) assertion 1 “The findings are what I would have expected if Hp were true” provides additional weight to assertion 2.

Unfortunately, this type of evidence is increasingly prejudicing juries and, I believe, leading to serious miscarriages of justice.

*The real winner has now been found, and since their ticket was not damaged it cannot have been Miss Hinte.

[1] Fenton, N. E., D. Berger, D. Lagnado,  M. Neil and A. Hsu, (2014). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)",  Science and Justice, 54(4), 274-287 http://dx.doi.org/10.1016/j.scijus.2013.07.002. (pre-publication draft here)

Tuesday, 8 December 2015

Norman Fenton at Maths in Action Day (Warwick University)

Today Norman Fenton was one of the five presenters at the Mathematics in Action Day at Warwick University - the others included writer and broadcaster Simon Singh and BBC presenter Steve Mould (who is also part of the amazing trio Festival of the Spoken Nerd which features Queen Mary's Matt Parker). The Maths in Action day is specifically targeted at A-Level Maths students and their teachers.

Norman says:
This was probably the biggest live event I have spoken at - an audience of 550 in the massive Butterworth Hall (which has recently hosted Paul Weller and the Style Council, Jools Holland) - so it was quite intimidating. My talk was on "Fallacies of Probability and Risk" (the powerpoint slides are here). I hope to get some photos of the event uploaded shortly.
Butterworth Hall (hopefully some real photos from the event to come)