Thursday, 26 May 2016

Using Bayesian networks to assess new forensic evidence in an appeal case


If new forensic evidence becomes available after a conviction, how do lawyers determine whether it raises sufficient questions about the verdict to launch an appeal? It turns out that there is no systematic framework to help lawyers do this. But a paper published today by Nadine Smit and colleagues in Crime Science presents such a framework, driven by a recent case in which a defendant was convicted primarily on the basis of audio evidence, but where subsequent analysis of the recording revealed additional sounds that were not considered during the trial.

From the case documentation, we know the following:
  • A baby was injured during an incident on the top floor of a house
  • Blood from the baby was found on the wall in one of the rooms upstairs
  • On an audio recording of the emergency telephone call made by the suspect, a scraping sound (allegedly indicating scraping blood off a wall) can be heard
  • The suspect was charged with attempted murder 
The audio evidence played a significant role in the trial. But, during the appeal preparation process, the call was re-analysed by an audio expert on behalf of the defence, and four other sounds were identified on the same recording that, according to the expert, showed similarities to the original sound. One of these sounds was of particular interest because background noise, presumed to be the television, could be heard simultaneously on the recording. The television was located downstairs, not in the upstairs room where the prosecution argued the scraping of the blood took place. A police statement records that the suspect was frequently rubbing his face in their presence. The defence proposed that the incriminating sound in the recording was not blood scraping after all, but simply the defendant rubbing his face.

The framework described in Smit's paper is intended to overcome the gap between what is generally known from scientific analyses and what is hypothesized in a legal setting. It is based on Bayesian networks (BNs) which are a structured and understandable way to evaluate the evidence in the specific case context and present it in a clear manner in court. However, BN methods are often criticised for not being sufficiently transparent for legal professionals. To address this concern the paper shows the extent to which the reasoning and decisions of the particular case can be made explicit and transparent. The BN approach enables us to clearly define the relevant propositions and evidence, and uses sensitivity analysis to assess the impact of the evidence under different prior assumptions. The results show that such a framework is suitable to identify information that is currently missing, and clearly crucial for a valid and complete reasoning process. Furthermore, a method is provided whereby BNs can serve as a guide to not only reason with incomplete evidence in forensic cases, but also identify very specific research questions that should be addressed to extend the evidence base to solve similar issues in the future.

Full citation:
Smit, N. M., Lagnado, D. A., Morgan, R. M., & Fenton, N. E. (2016). "An investigation of the application of Bayesian networks to case assessment in an appeal case". Crime Science, 2016, 5: 9, DOI 10.1186/s40163-016-0057-6 (open access). Published version pdf.
The research was funded by the Engineering and Physical Sciences Research Council of the UK through the Security Science Doctoral Research Training Centre (UCL SECReT) based at University College London (EP/G037264/1), and the European Research Council (ERC-2013-AdG339182-BAYES_KNOWLEDGE). 

The BN model (which is fully specified in the paper) was built and run using the free version of AgenaRisk.

Tuesday, 26 April 2016

Hillsborough Inquest - my input


With today's verdict (fans unlawfully killed) coming after more than two years I can now speak about my own involvement in the Inquest.

Because of the years that have passed few people are aware that there was a 'near-miss'  disaster at Hillsborough eight years before the actual disaster. The circumstances were essentially identical -  an FA Cup Semi Final with far too many supporters let in to the Leppings Lane stand leading to a massive crush. Because of the quick thinking of a steward who was able to open a gate onto the pitch nobody died on that occasion (although there were many injuries).  I know this because I was present at that earlier near disaster and I was, in fact, Secretary of the Sheffield Spurs Supporters Club. At the time I wrote to the FA and South Yorkshire police as I felt mistakes had been made, and indeed the incident was sufficiently serious that Hillsborough (which had been used every year as one of the two semi-final venues) was avoided until 1988 (the year before the disaster). Immediately after the disaster in 1989 I wrote to the FA and Lord Taylor (who led the original enquiry) to inform them of the events of 1981. Although I was interviewed at that time by the Police investigators, my evidence was never used.

In 2014 - out of the blue - I was asked to attend the new Hillsborough Inquest as it had been decided that the 1981 incident was an important piece of the story.  Here are a couple of links to media reports about my appearance:
Norman Fenton, 26 April 2016



Friday, 25 March 2016

Statistics of coincidences: Ben Geen case revisited (ABC)


In November 2014 I reported on the case of nurse Ben Geen who was convicted in 2006 for murdering 2 patients and seriously harming 15 others. I had been asked to produce an expert report on the 'statistical coincidences' in the case for the Criminal Cases Review Commission.

Now a 30-minute documentary on the case presented by Joel Werner is to be aired on Australia's national radio station ABC on 28 March. In the programme (which you can listen to in full from the links at the top of the ABC page) I present a lay summary of the statistical argument (from minutes 16:30 to 21:34).

Norman Fenton

Saturday, 19 March 2016

Turning poorly structured data into intelligent Bayesian Network models for medical decision support



Medical data is very often badly structured, incomplete and inconsistent. This limits our ability to  generate useful models for prediction and decision support if we rely purely on machine learning techniques. That means we need to exploit expert knowledge at various model development stages. This problem - which is common in many application domains - is tackled in a paper** published in the latest issue of Artificial Intelligence in Medicine.

The paper describes a rigorous and repeatable method for building effective Bayesian Network (BN) models from complex data - much of which comes in unstructured and incomplete responses by patients from questionnaires and interviews. Such data inevitably contains repetitive, redundant and contradictory responses; without expert knowledge learning a BN model from the data alone is especially problematic where we are interested in simulating causal interventions for risk management. The novelty of this work is that it provides a rigorous consolidated and generalised framework that addresses the whole life-cycle of BN model development. The method is validated using data from forensic psychiatry. The resulting BN models demonstrate competitive to superior predictive performance against the data-driven state-of-the-art models. More importantly, the resulting BN models go beyond improving predictive accuracy and into usefulness for risk management through intervention, and enhanced decision support in terms of answering complex clinical questions that are based on unobserved evidence.

The method is applicable to any application domain involving large-scale decision analysis based on such complex and unstructured information. It challenges decision scientists to reason about building models based on what information is really required for inference, rather than based on what data is available. Hence, it forces decision scientists to use available data in a much smarter way.

**The full reference for the paper is:
Constantinou, A. C., Fenton, N., Marsh, W., & Radlinski, L. (2016). "From complex questionnaire and interviewing data to intelligent Bayesian Network models for medical decision support". Artificial Intelligence in Medicine, Vol 67 pages 75-93. DOI http://dx.doi.org/10.1016/j.artmed.2016.01.002

For those who do not have access to the journal a pre-publication draft can be downloaded: http://constantinou.info/downloads/papers/complexBN.pdf 

Thursday, 10 March 2016

A Bayesian network to determine optimal strategy for Spurs' success


As a committed Spurs fan I have spent the last few months salivating at the club's sudden and unexpected rise and the prospect of them winning their first league title since 1961. By mid-February they were clear favourites to win the Premier League title. However, in my view, the challenge was compromised by the team becoming overstretched by playing too many matches in a short space of time. In particular, I felt that their involvement in the Europa League was an unnecessary distraction and burden. When I expressed these views on a Spurs online forum (backed up with some data showing consistent under-performance during periods when they were involved in the Europa League) I got heavily criticised by other fans who said it was important to try to win every competition.

Having simultaneously been involved in research discussions about the use of decisions in Bayesian networks, I decided to build a small model in AgenaRisk to resolve the dilemma once and for all. I have written up the results of the analysis here. The model can be downloaded from here.

In summary, there were 4 strategic options available to Spurs' manager Mauricio Pochettino at the time I started to do the analysis:
  1. Focus on Premier League 
  2. Focus on Premier League and FA Cup 
  3. Focus on Premier League and Europa League 
  4. Focus on all three competitions  
My BN model shows that the optimal decision (based on my subjective utility values of the different outcomes) was to go for 1 with 2 a close second. Unfortunately  (I believe) Pochettino opted for 3 which, as the model shows, suggests his personal utility value for winning the Europa League was actually higher than winning the Premier League.

See also: The problem with predicting football results - you cannot rely on the data

Thursday, 4 February 2016

Problems with the Likelihood Ratio method for determining probative value of evidence: the need for exhaustive hypotheses

Norman Fenton, 4 Feb 2016

I have written several times before about the likelihood ratio (LR) method that is recommended for use by forensic scientists when presenting evidence (such as the fact that DNA collected at a crime scene is found to have a profile that matches the DNA profile of a defendant in a case). In general the LR is a very good and simple method for communicating the impact of evidence (in this case on the hypothesis that the defendant was at the crime scene), but its correct use is based on strict assumptions that have been routinely ignored by forensic experts and statisticians, leading to the very kind of confusion and misunderstanding (when presented to lawyers and juries) that it was supposed to help avoid. The papers [1] and [2] provide an in-depth analysis of the problems. In this short article I will highlight just one of these problems which invalidate the LR. Subsequent articles will focus on the other problems and issues.

To recap: the LR is the probability of finding the evidence E if the prosecution hypothesis Hp is true (formally we write this as 'probability of E given Hp') divided by the probability of finding the evidence E if the defence hypothesis Hd is true (formally we write this as 'probability of E given Hd').

So, to compute the LR, the forensic expert is forced to consider the probability of finding the evidence under both the prosecution and defence hypotheses.  This is a very good thing to do because it helps to avoid common errors of communication that can mislead lawyers and juries (notably the prosecutor's fallacy). Even more importantly, the LR is a measure of the probative value of the evidence because:
  • when the LR is greater than one the evidence supports the prosecution hypothesis (increasingly for larger values); 
  • when the LR is less than one it supports the defence hypothesis (increasingly as the LR gets closer to zero); 
  • when the LR is equal to one then the evidence supports neither hypothesis and so is 'neutral'. In such cases, since the evidence has no probative value lawyers and forensic experts believe it should not be admissible.
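In symbols, the definition and the three cases above can be sketched as follows. This is the standard odds form of Bayes' theorem rather than a formula taken from the papers cited here; the subscripted symbols simply stand for Hp and Hd as used in this article.

```latex
\[
\mathrm{LR} = \frac{P(E \mid H_p)}{P(E \mid H_d)},
\qquad
\underbrace{\frac{P(H_p \mid E)}{P(H_d \mid E)}}_{\text{posterior odds}}
= \mathrm{LR} \times
\underbrace{\frac{P(H_p)}{P(H_d)}}_{\text{prior odds}}
\]
```

An LR above one multiplies the odds in favour of Hp, an LR below one shrinks them, and an LR of exactly one leaves them unchanged.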
However, as explained in [1] and [2] (because of Bayes' theorem), for the LR to 'work' with respect to being a measure of probative value, the two hypotheses considered must be 'mutually exclusive and exhaustive'. This means that the defence hypothesis Hd must simply be the negation of the prosecution hypothesis Hp. So, for example, if Hp is "Defendant was at the crime scene" then Hd must be "Defendant was not at the crime scene". Now, while there is more or less unanimity within the statistics and forensics field that the hypotheses must be mutually exclusive in order for the LR to be used, there is no such unanimity about the hypotheses being exhaustive. Indeed, the Royal Statistical Society Practitioner Guide to Case Assessment and Interpretation of Expert Evidence Guidelines [3] (page 32) specifies that the LR requires two mutually exclusive but not necessarily exhaustive hypotheses (which, interestingly, contradicts what is stated in the earlier Guidelines by the same group [4], page 96). To see why incorrect conclusions may be drawn when the hypotheses are not exhaustive we consider a very simple example:

Fred is the defendant for a crime. The main evidence against Fred is that his DNA profile is found to be a match of a DNA sample found at the scene of the crime (for simplicity we ignore the possibility of errors in the DNA match). The DNA profile is of a type that is found in only 1 in 10,000 people. However, Fred has an identical twin brother Joe. Using the following:
  • Prosecution hypothesis Hp:  "Fred is the source of the DNA"
  • Defence hypothesis Hd: "Joe is the source of the DNA"
and
  • Evidence E: "the DNA found matches Fred's profile"
The defence reasons - correctly using the likelihood ratio approach - that the evidence E has no probative value with respect to the above two hypotheses, because the twins have the same DNA profile, i.e.
P(E given Hp) = P(E given Hd) = 1.
Hence, the defence demands the evidence is withdrawn because it is 'neutral'.

The problem here is that, even if we assume the hypotheses are mutually exclusive (i.e. we exclude the possibility that both the twins committed the crime) they are certainly NOT exhaustive. The correct defence hypothesis in this case should be "Fred is NOT the source of the DNA". This is made up of two cases:
  • Hd: "Joe is the source of the DNA"
  • Ho: "Another person (not Fred or  Joe) is the source of the DNA"
If we assume - before any evidence is known - that Hp, Hd and Ho are equally likely then the impact of observing the evidence is certainly NOT neutral - it is probative in favour of the prosecution hypothesis as can be shown from running the calculations in a Bayesian network tool:


The probability of Hp increases from 33% to just under 50%.
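
The same figure can be checked without a BN tool. The following is a minimal Python sketch, assuming (as in the example) that the twins match with certainty and that an unrelated person matches with probability 1 in 10,000. The function name `posterior` and the dictionary layout are purely illustrative and are not taken from the AgenaRisk model.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' theorem over mutually exclusive and exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # P(E)
    return {h: joint[h] / total for h in joint}


# Equal priors over Fred (Hp), Joe (Hd) and any other person (Ho).
priors = {"Hp": Fraction(1, 3), "Hd": Fraction(1, 3), "Ho": Fraction(1, 3)}

# P(E | hypothesis): identical twins always match; an unrelated person is
# assumed to match with the profile frequency of 1 in 10,000.
likelihoods = {"Hp": Fraction(1), "Hd": Fraction(1), "Ho": Fraction(1, 10_000)}

post = posterior(priors, likelihoods)
for h in priors:
    print(f"{h}: prior {float(priors[h]):.4f} -> posterior {float(post[h]):.6f}")
```

Exact fractions are used so that the result for Hp, 10000/20001, is visibly just below one half.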

But the supposedly 'neutral' evidence can have an even more dramatic impact in practice. Suppose, for example, that Joe has an alibi that is considered pretty reliable. Then this might reduce our prior belief that he is the source of the DNA (Hd) to 2%. In this case the before and after probabilities are:

The belief in the prosecution hypothesis in this case has shifted to above 95% - possibly sufficient for a jury to be convinced it is the truth.
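
The exact priors behind this figure are not stated in the text, so the sketch below makes an explicit assumption: whatever prior probability is not given to Joe (Hd) is split equally between Fred (Hp) and another person (Ho). Under that assumption it shows how the posterior for Hp responds as Joe's prior falls. The function `posterior_hp` is a hypothetical helper, not part of the original model.

```python
def posterior_hp(prior_hd, match_freq=1 / 10_000):
    """Posterior P(Hp | match) for a given prior on Joe being the source.

    Assumption (not from the original model): the remaining prior mass is
    shared equally between Hp (Fred) and Ho (someone else).
    """
    prior_hp = prior_ho = (1 - prior_hd) / 2
    # The twins match with certainty; anyone else matches with match_freq.
    evidence = prior_hp * 1.0 + prior_hd * 1.0 + prior_ho * match_freq
    return prior_hp * 1.0 / evidence


# Simple sensitivity analysis over Joe's prior.
for prior_hd in (1 / 3, 0.10, 0.02, 0.001):
    print(f"P(Hd) = {prior_hd:.3f} -> P(Hp | E) = {posterior_hp(prior_hd):.4f}")
```

With a 2% prior for Joe the posterior for Hp comes out at about 96%, consistent with the 'above 95%' figure quoted above.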

If the DNA evidence in the above example was a non-match then the LR approach using the original hypotheses is even more obviously flawed because in this case:
        P(E given Hp) = P(E given Hd) = 0
But the evidence is certainly anything but 'neutral' because, after observing the evidence, the prosecution hypothesis Hp must be false (as must Hd).
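
A minimal Python sketch of the non-match case, again assuming equal priors over the three exhaustive hypotheses and ignoring testing errors. The value used for P(non-match given Ho), 1 - 1/10,000, is my assumption based on the stated profile frequency.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over mutually exclusive and exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


match_freq = 1 / 10_000
priors = {"Hp": 1 / 3, "Hd": 1 / 3, "Ho": 1 / 3}

# P(non-match | hypothesis): impossible for either twin, near certain otherwise.
non_match = {"Hp": 0.0, "Hd": 0.0, "Ho": 1 - match_freq}

# With only Hp and Hd the likelihood ratio is 0/0, which is undefined.
try:
    lr = non_match["Hp"] / non_match["Hd"]
except ZeroDivisionError:
    lr = None

post = posterior(priors, non_match)
print("LR for Hp versus Hd:", lr)
print("Posterior:", post)
```

Once Ho is included, all of the posterior probability moves to Ho, so both Hp and Hd are ruled out.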

While the example above is obviously simplistic and contrived, more realistic examples are provided in [1], which also highlights this very problem in the case of Barry George (convicted and subsequently acquitted of the murder of TV celebrity Jill Dando, after an appeal ruled that the gunpowder residue evidence presented in the original trial was inadmissible in a re-trial on the basis that it had an LR equal to one and so had 'no probative value').

References
  1. Fenton, N. E., D. Berger, D. Lagnado, M. Neil and A. Hsu, (2013). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)", Science and Justice, http://dx.doi.org/10.1016/j.scijus.2013.07.002.  A pre-publication draft of the article can be found here.
  2. Fenton, N. E., M. Neil and D. Berger, (2016). "Bayes and the Law", Annual Review of Statistics and Its Application, Volume 3, to appear. Pre-publication version here.
  3. Jackson, G., Aitken, C., & Roberts, P. (2015). PRACTITIONER GUIDE NO 4: "Case Assessment and Interpretation of Expert Evidence". Royal Statistical Society. Available here.
  4. Aitken, C., Roberts, P., & Jackson, G. (2010). PRACTITIONER GUIDE NO 1: "Fundamentals of Probability and Statistical Evidence in Criminal Proceedings: Guidance for Judges, Lawyers, Forensic Scientists and Expert Witnesses". Royal Statistical Society. Available here.

Thursday, 28 January 2016

Misleading DNA evidence and the current damaged winning lottery ticket story


Norman Fenton, 28 January 2016

This post is primarily about how DNA match evidence is often presented in a way that is highly misleading (it is an important issue in an ongoing case I'm involved with). But in order to illustrate the point it turns out that we can use a simple analogy based loosely on the current lottery story that is getting a lot of media attention in the UK. This concerns an unverified £33 million winning ticket from a recent draw. About 200 people are claiming to have bought the (single) winning ticket but, until today*, none had actually provided proof of possessing such a ticket. The claim of one - Miss Susan Hinte - is the one that has grabbed media attention because she has produced a ticket in which key identifying information cannot be read because, she claims, the ticket was put through a washing machine.

But first let's look at the DNA issue, which is concerned with the following generic problem:
  • The prosecution claims that defendant Joe was at the crime scene. This hypothesis is denoted as Hp.
  • A tiny trace of DNA from the crime scene has been analysed and found to match the profile of Joe. This evidence (of the match) is denoted E
Typically the defence will argue that Joe was not at the crime scene and that any DNA matching Joe - especially as it was a tiny trace - got there through secondary transfer or other means. So the defence hypothesis Hd is simply the negation of Hp.

The DNA experts have correctly recognised that, in determining the probative value of the evidence E,  they have to use the ‘likelihood ratio’ approach [1]. This means they have to consider both of the following probabilities:
  1. The probability that E is the result of the prosecution hypothesis Hp being true  - formally we write this as P(E given Hp)
  2. The probability that E is the result of the defence hypothesis Hd being true  -  formally we write this as P(E given Hd) 
If probability 1 is greater than probability 2 then the evidence E supports Hp over Hd, and vice versa. The likelihood ratio is simply probability 1 divided by probability 2, and it provides a simple and compelling measure of the probative value of evidence. If the ratio is greater than one the evidence E supports Hp, with higher values indicating stronger support. If the ratio is less than one the evidence E supports Hd, with smaller values indicating stronger support. However, for reasons explained in [1], this whole notion of probative value is not meaningful if the defence hypothesis Hd is not the negation of the prosecution hypothesis Hp. One of the common errors made by DNA experts is to replace Hd with a different hypothesis, namely Hd': "DNA from Joe got there by secondary transfer". In this case Hd' excludes other possibilities of observing E even though Joe was not at the crime scene (such as errors or contamination during the DNA testing, or the DNA belonging to a different person with the same profile) and is not even mutually exclusive with Hp, since Joe may have been at the crime scene even though the trace sample was there through secondary transfer. But, while this common error is serious, it is not the real concern I wish to raise here. In fact, let's suppose that no such error is made and that the expert considers the correct Hd.

The real concern is how a jury member reacts when the DNA expert now makes the following assertions:
  1. “The findings are what I would have expected if Hp were true.” i.e. P(E given Hp) is very high
  2. “The probability of the findings are considerably more likely to have been the result of Hp rather than Hd”  i.e. P(E given Hp) is much higher than P(E given Hd)
Notwithstanding the unnecessary redundancy of statement 1, these assertions sound very important and suggest very strong support for the prosecution hypothesis, especially as most people would already have assumed (wrongly) that the DNA 'match' means the trace certainly belongs to Joe.

But to demonstrate how misleading they are I will return now to the lottery example. For simplicity I will assume the old 6-ball lottery with 49 numbers. Suppose the winning numbers were:
1, 7, 21, 28, 40, 46

Mrs Smith has a damaged ticket that she claims has the winning numbers. The evidence E is that the first number (which is the only number clearly visible) is 1.

Our hypotheses are:
  • Hp: “Mrs Smith's ticket is the winning ticket”
  • Hd: “Mrs Smith's ticket is not the winning ticket”
In this case we know the following:
  • P(E given Hp) = 1  (it is certain that the first number on the ticket would be 1 if it was the winning ticket)
  • P(E given Hd) is 0.122 (this is the proportion of non-winning tickets that have 1 as the first number) 
So we could certainly make exactly the same assertions in this case as the DNA experts above:
  1. “The findings are what I would have expected if Hp were true.” (since the probability of E given Hp is 1)
  2. “The probability of the findings are considerably more likely to have been the result of Hp rather than Hd” (since 1 is considerably greater than 0.122).
However, despite these (correct) assertions it is almost certain that Hd rather than Hp is true - Mrs Smith's ticket is not the winning ticket. In fact, the probability of Hp being true is less than one in 1.7 million (because there are over 1.7 million non-winning combinations in which the first number is 1).
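
These figures can be reproduced with a short Python sketch. It rests on two assumptions that I should state explicitly: that ticket numbers are printed in ascending order (so any ticket containing 1 shows it as the first number), and that before seeing the ticket every one of the possible combinations is equally likely to be Mrs Smith's. The variable names are illustrative only.

```python
from math import comb

total_tickets = comb(49, 6)   # all 6-from-49 combinations: 13,983,816
with_one = comb(48, 5)        # combinations that contain the number 1: 1,712,304

# Likelihoods of the evidence E ("first number is 1").
p_e_given_hp = 1.0
p_e_given_hd = (with_one - 1) / (total_tickets - 1)  # non-winning tickets only
lr = p_e_given_hp / p_e_given_hd

# Prior: Mrs Smith's ticket is equally likely to be any combination (assumed).
prior_hp = 1 / total_tickets
posterior_hp = (prior_hp * p_e_given_hp) / (
    prior_hp * p_e_given_hp + (1 - prior_hp) * p_e_given_hd
)

print(f"P(E given Hd) = {p_e_given_hd:.3f}")
print(f"Likelihood ratio = {lr:.2f}")
print(f"P(Hp given E) = 1 in {1 / posterior_hp:,.0f}")
```

The likelihood ratio is a little over 8 in favour of Hp, yet the posterior probability of Hp is still only 1 in 1,712,304, because the prior was 1 in 13,983,816.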

So what is the moral of this story? The likelihood ratio of the evidence might often suggest the evidence is highly probative in favour of one of the hypotheses, but if the prior probability of the alternative hypothesis was much higher to start with then the evidence will not ‘overturn’ the prior belief in favour of the alternative.

Lay people ignore this in connection to DNA evidence. Because the random match probability associated with a DNA match is typically less than one in a billion, the very fact that the evidence E is a "DNA match" already puts into their mind the notion that this 'must tie the defendant to the crime scene'. But the random match probability is almost irrelevant in this case - it only accounts for a tiny proportion of P(E given Hd). Lay people can also easily be tricked into believing that the (redundant) assertion 1 “The findings are what I would have expected if Hp were true” provides additional weight to assertion 2.

Unfortunately, this type of evidence is increasingly prejudicing juries and, I believe, leading to serious miscarriages of justice.

*The real winner has now been found, and since their ticket was not damaged the winner cannot have been Miss Hinte.

[1] Fenton, N. E., D. Berger, D. Lagnado,  M. Neil and A. Hsu, (2014). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)",  Science and Justice, 54(4), 274-287 http://dx.doi.org/10.1016/j.scijus.2013.07.002. (pre-publication draft here)