## Tuesday, 8 November 2016

### Confusion over the Likelihood Ratio

The 'Likelihood Ratio' (LR) has been dominating discussions at the third workshop  in our Isaac Newton Institute Cambridge Programme Probability and Statistics in Forensic Science.
There have been many fine talks on the subject - and these talks will be available here for those not fortunate enough to be attending.

We have written before (see links at bottom) about some concerns with the use of the LR. For example, we feel there is often a desire to produce a single LR even when there are multiple different unknown hypotheses and dependent pieces of evidence (in such cases we feel the problem needs to be modelled as a Bayesian network) - see [1]. Based on the extensive discussions this week, I think it is worth recapping another of these concerns (namely when hypotheses are non-exhaustive).

To recap: The LR is a formula/method that is recommended for use by forensic scientists when presenting evidence - such as the fact that DNA collected at a crime scene is found to have a profile that matches the DNA profile of a defendant in a case. In general, the LR can be a very good and simple method for communicating the impact of evidence (in this case on the hypothesis that the defendant is the source of the DNA found at the crime scene).

To compute the LR, the forensic expert is forced to consider the probability of finding the evidence under both the prosecution and defence hypotheses. So, if the prosecution hypothesis Hp is "Defendant is the source of the DNA found" and the defence hypothesis Hd is "Defendant is not the source of the DNA found" then we compute both the probability of the evidence given Hp - written P(E | Hp) - and the probability of the evidence given Hd - written P(E | Hd). The LR is simply the ratio of these two likelihoods, i.e. P(E | Hp) divided by P(E | Hd).

The very act of considering both likelihood values is a good thing to do because it helps to avoid common errors of communication that can mislead lawyers and juries (notably the prosecutor's fallacy). But, most importantly, the LR is a measure of the probative value of the evidence. However, this notion of probative value is where misunderstandings and confusion sometimes arise. In the case where the defence hypothesis is the negation of the prosecution hypothesis (i.e. Hd is the same as "not Hp" as in our example above) things are clear and very powerful because, by Bayes theorem:
• when the LR is greater than one the evidence supports the prosecution hypothesis (increasingly for larger values) - in fact the posterior odds of the prosecution hypothesis increase by a factor of LR over the prior odds.
• when the LR is less than one it supports the defence hypothesis (increasingly as the LR gets closer to zero) - the posterior odds of the defence hypothesis increase by a factor of 1/LR over the prior odds.
• when the LR is equal to one then the evidence supports neither hypothesis and so is 'neutral' - the posterior odds of both hypotheses are unchanged from their prior odds. In such cases, since the evidence has no probative value lawyers and forensic experts believe it should not be admissible.
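
The three cases above follow from the odds form of Bayes' theorem (posterior odds = LR × prior odds). The following is a minimal sketch in Python, assuming Hd is exactly "not Hp"; the function name, the prior of 1/100 and the LR values are hypothetical, chosen only for illustration and not taken from any case.

```python
from fractions import Fraction

def posterior_probability(prior, lr):
    """Posterior P(Hp | E) from the prior P(Hp) and the likelihood ratio.

    Uses posterior odds = LR * prior odds, which is only valid when the
    defence hypothesis Hd is the negation of Hp.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)

# Hypothetical prior and LR values (assumptions for illustration only).
prior = Fraction(1, 100)
for lr in (Fraction(100), Fraction(1), Fraction(1, 100)):
    post = posterior_probability(prior, lr)
    print(f"LR = {lr}: prior {prior} -> posterior {post}")
```

With LR greater than one the posterior exceeds the prior, with LR equal to one it is unchanged, and with LR less than one it falls below the prior.
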
However, things are by no means as clear and powerful when the hypotheses are not exhaustive (i.e. not the negation of each other) and in most forensic applications this is the case. For example, in the case of DNA evidence, while the prosecution hypothesis Hp is still "defendant is source of the DNA found" in practice the defence hypothesis Hd is often something like "a person unrelated to the defendant is the source of the DNA found".

In such circumstances the LR can only help us to distinguish between which of the two hypotheses is more likely, so, e.g.  when the LR is greater than one the evidence supports the prosecution hypothesis over the defence hypothesis (with larger values leading to increased support). Unlike the case for exhaustive hypotheses the LR tells us nothing about the posterior odds of the prosecution hypothesis. In fact, it is quite possible that the LR can be very large - i.e. strongly supporting the prosecution hypothesis over the defence hypothesis - even though the posterior probability of the prosecution hypothesis goes down.  This rather worrying point is not understood by all forensic scientists (or indeed by all statisticians). Consider the following example (it's a made-up coin tossing example, but has the advantage that the numbers are indisputable):
Fred claims to be able to toss a fair coin in such a way that about 90% of the time it comes up Heads. So the main hypothesis is
H1: Fred has genuine skill
To test the hypothesis, we observe him toss a coin 10 times. It comes out Heads each time. So our evidence E is 10 out of 10 Heads. Our alternative hypothesis is:
H2: Fred is just lucky.

Under binomial assumptions (10 independent tosses), P(E | H1) = 0.9^10, which is about 0.35, while P(E | H2) = 0.5^10, which is about 0.001. So the LR is about 350, strongly in favour of H1 over H2.

However, the problem here is that H1 and H2 are not exhaustive. There could be another hypothesis H3: "Fred is cheating by using a double-headed coin". Now, P(E | H3) = 1.

If we assume that H1, H2 and H3 are the only possible hypotheses* (i.e. they are exhaustive) and that the priors are equally likely, i.e. each is equal to 1/3 then the posteriors after observing the evidence E are:

H1: 0.25907 H2: 0.00074 H3: 0.74019

So, after observing the evidence E, the posterior for H1 has actually decreased (from a prior of 1/3 to about 0.26) despite the very large LR in its favour over H2.
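
The coin-tossing numbers can be reproduced with a few lines of Python. This is a minimal sketch assuming, as in the text, that H1, H2 and H3 are exhaustive with equal priors of 1/3; the helper name `posteriors` is hypothetical. The posteriors quoted above correspond to the rounded likelihoods 0.35 and 0.001; the unrounded likelihoods give very slightly different values but the same conclusion.

```python
def posteriors(priors, likelihoods):
    """Bayes' theorem over a set of mutually exclusive, exhaustive hypotheses."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

priors = [1 / 3, 1 / 3, 1 / 3]       # H1 skill, H2 luck, H3 double-headed coin

# Likelihoods of 10 heads out of 10 under each hypothesis.
exact = [0.9 ** 10, 0.5 ** 10, 1.0]  # unrounded values
rounded = [0.35, 0.001, 1.0]         # rounded values used in the text

lr_h1_vs_h2 = exact[0] / exact[1]    # roughly 350
post_exact = posteriors(priors, exact)
post_rounded = posteriors(priors, rounded)

print(f"LR (H1 vs H2): {lr_h1_vs_h2:.0f}")
for name, pe, pr in zip(("H1", "H2", "H3"), post_exact, post_rounded):
    print(f"{name}: exact {pe:.5f}, rounded {pr:.5f}")
```

Either way, the posterior for H1 ends up below its prior of 1/3 even though the LR of H1 against H2 is in the hundreds.
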
In the above example, a good forensic scientist - if considering only H1 and H2 - would conclude by saying something like
"The evidence shows that hypothesis H1 is 350 times more likely than H2, but tells us nothing about whether we should have greater belief in H1 being true; indeed, it is possible that the evidence may much more strongly support some other hypothesis not considered and even make our belief in H1 decrease".
However, in practice (and I can confirm this from having read numerous DNA reports) no such careful statement is made. In fact, the most common assertion used in such circumstances is:
"The evidence provides strong support for hypothesis H1"
Such an assertion is not only mathematically wrong but highly misleading. Consider, as discussed above, a DNA case where:

Hp is "defendant is source of the DNA found"
Hd is  "a person unrelated to the defendant is the source of the DNA found".

This particular Hd hypothesis is a common convenient choice for the simple reason that P(E | Hd) is relatively easy to compute (it is the 'random match probability'). For single-source, high quality DNA this probability can be extremely small - of the order of one over several billions; since P(E | Hp) is equal to 1 in this case the LR is several billions. But, this does NOT provide overwhelming support for Hp as is often assumed unless we have been able to rule out all relatives of the defendant as suspects. Indeed, for less than perfect DNA samples it is quite possible for the LR  to be in the order of millions but for a close relative to be a more likely source than the defendant.

While confusion and misunderstandings can and do occur as a result of using hypotheses that are not exhaustive, there are many real examples where the choice of such non-exhaustive hypotheses is actually negligent.  The following appalling example is based on a real case (location details changed as an appeal is ongoing):
The suspect is accused of committing a crime in a particular rural location A near his home village in Dorset. The evidence E is soil found on the suspect's car.  The prosecution hypothesis Hp is "the soil comes from A". The suspect lives (and drives) near this location but claims he did not drive to that specific spot. To 'test' the prosecution hypothesis a soil expert compares Hp with the hypothesis Hd: "the soil comes from a different rural location". However, the 'different rural location'  B happens to be 500 miles away in Perth Scotland (simply because it is close to where the soil analyst works and he assumes soil from there is 'typical' of rural soil). To carry out the test the expert considers soil profiles of E and samples from the two sites A and B.

Inevitably the LR strongly favours Hp (i.e. site A) over Hd (i.e. site B); the soil profile on the car - even if it was never at location A - is going to be much closer to the A profile than the B profile. But we can conclude absolutely nothing about the posterior probability of A. The LR is completely useless - it tells us nothing other than the fact that the car was more likely to have been driven in the rural location in Dorset than in a rural location in Perth. Since the suspect had never driven the car outside Dorset this is hardly a surprise. Yet, in the case this soil evidence was considered important since it was wrongly assumed to mean that it "provided support for the prosecution hypothesis".
This example also illustrates, however, why in practice it can be impossible to consider exhaustive hypotheses. For such soil cases, it would require us to consider samples from every possible 'other' location. What an expert like Pat Wiltshire (who is also a participant on the FOS programme) does is to choose alternative sites close to the alleged crime scene and compare the profile of each of those and the crime scene profile with the profile from the suspect. While this does not tell us if the suspect was at the crime scene it can tell us how much more likely the suspect was to have been there rather than sites nearby.

*as pointed out by Joe Gastwirth there could be other hypotheses like "Fred uses the double-headed coin but switches to a regular coin after every 9 tosses"

References
1. Fenton N.E, Neil M, Berger D, “Bayes and the Law”, Annual Review of Statistics and Its Application, Volume 3, 2016 (June), pp 51-77 http://dx.doi.org/10.1146/annurev-statistics-041715-033428. Pre-publication version here, and here is the Supplementary Material. See also blog posting.
2. Fenton, N. E., D. Berger, D. Lagnado, M. Neil and A. Hsu, (2013). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)", Science and Justice, http://dx.doi.org/10.1016/j.scijus.2013.07.002.  A pre-publication version of the article can be found here.

## Friday, 7 October 2016

### Bayesian Networks and Argumentation in Evidence Analysis

 Some of the workshop participants
On 26-29 September 2016 a workshop on "Bayesian Networks and Argumentation in Evidence Analysis" took place at the Isaac Newton Institute Cambridge. This workshop, which was part of the FOS Programme was also the first public workshop of the ERC-funded project Bayes-Knowledge (ERC-2013-AdG339182-BAYES_KNOWLEDGE).

The workshop was a tremendous success, attracting many of the world's leading scholars in the use of Bayesian networks in law and forensics. Most of the presentations were filmed and can now be viewed here.

There was also a pre-workshop meeting on 23-24 September where participants focused on an important Dutch case that recently went to appeal. The participants were divided into two groups - one group developed a BN model of the case and the other developed an argumentation/scenarios-based model of the case. We plan to further develop these and write up the results.

 Some of the participants at the pre-workshop meeting analysing a specific Dutch case

### The Bayesian Networks mutual exclusivity problem

Several years ago when we started serious modelling of legal arguments using Bayesian networks we hit a problem that we felt would be easily solved. We had a set of mutually exclusive events such as "X murdered Y, Z murdered Y, Y was not murdered" that we needed to model as separate variables because they had separate causal pathways and evidence.

It turned out that existing BN modelling techniques cannot capture the correct intuitive reasoning when a set of mutually exclusive events need to be modelled as separate nodes instead of states of a single node. The standard proposed 'solution', which introduces a simple constraint node that enforces mutual exclusivity, fails to preserve the prior probabilities of the events and is therefore flawed.

In 2012 I (with the co-authors listed below) produced an initial novel and simple solution to this problem that works in a reasonable set of circumstances, but it proved to be difficult to get people to understand why the problem was an important one that needed to be solved. After many changes and iterations this work has finally been published and, as a 'gold access paper', it is free for anybody to download in full (see link below).

During the current Programme "Probability and Statistics in Forensic Science" that I am helping to run at the Isaac Newton Institute for Mathematical Sciences, Cambridge, 18 July - 21 Dec 2016, it has become clear that the mutual exclusivity problem is critical in any legal case where there are diverse prosecution and defence narratives. Although our solution does not work in all cases (and indeed we are working on more comprehensive approaches) we feel it is an important start.

Norman Fenton, Martin Neil, David Lagnado, William Marsh, Barbaros Yet, Anthony Constantinou, "How to model mutually exclusive events based on independent causal pathways in Bayesian network models", Knowledge-Based Systems, Available online 17 September 2016
http://dx.doi.org/10.1016/j.knosys.2016.09.012

## Saturday, 17 September 2016

### Bayesian networks: increasingly important in cross-disciplinary work

The growing importance of Bayesian networks was demonstrated this week by the award of a prestigious Leverhulme Trust Research Project Grant of £385,510 to Queen Mary University of London that ultimately will lead to improved design and use of self-monitoring systems such as blood sugar monitors, home energy smart meters, and self-improvement mobile phone apps.

The project, CAUSAL-DYNAMICS ("Improved Understanding of Causal Models in Dynamic Decision-making") is a collaborative project, led by Professor Norman Fenton of the School of Electronic Engineering and Computer Science, with co-investigators Dr Magda Osman (School of Biological and Chemical Sciences), Prof Martin Neil (School of Electronic Engineering and Computer Science) and Prof David Lagnado (Department of Experimental Psychology, University College London).

The project exploits Fenton and Neil's expertise in causal modelling using Bayesian networks and Osman and Lagnado's expertise in cognitive decision making. Previously, psychologists have extensively studied dynamic decision-making without formally modelling causality while statisticians, computer scientists, and AI researchers have extensively studied causality without considering its central role in human dynamic decision making. This new project starts with the hypothesis that we can formally model dynamic decision-making from a causal perspective. This enables us to identify both where sub-optimal decisions are made and to recommend what the optimal decision is. The hypothesis will be tested in real world examples of how people make decisions when interacting with dynamic self-monitoring systems such as blood sugar monitors and energy smart meters and will lead to improved understanding and design of such systems.

The project is for 3 years starting Jan 2017. For further details, see: CAUSAL-DYNAMICS.

WATCH THIS SPACE FOR THE ANNOUNCEMENT VERY SOON OF TWO OTHER MAJOR NEW CROSS-DISCIPLINARY BAYESIAN NETWORK PROJECTS!!

The Leverhulme Trust was established by the Will of William Hesketh Lever, the founder of Lever Brothers. Since 1925 the Trust has provided grants and scholarships for research and education; today it is one of the largest all-subject providers of research funding in the UK, distributing approximately £80 million a year. For more information: www.leverhulme.ac.uk / @LeverhulmeTrust

## Friday, 16 September 2016

### Bayes and the Law: what's been happening in Cambridge and how you can see it

 Programme Organisers (left to right): Richard Gill, David Lagnado, Leila Schneps, David Balding, Norman Fenton
Since 21 July 2016 I have been running the Isaac Newton Institute (INI) Programme on Probability and Statistics in Forensic Science in Cambridge.

For those of you who were not fortunate enough to be at the first formal workshop "The nature of questions arising in court that can be addressed via probability and statistical methods" (30 August to 2 September) you can watch the full videos here of most of the 35 presentations on the INI website. The presentation slides are also available in the INI link.

The workshop attracted many of the world's leading figures from the law, statistics and forensics with a mixture of academics (including mathematicians and legal scholars), forensic practitioners, and practising lawyers (including judges and eminent QCs). It was rated a great success.

The second formal workshop "Bayesian Networks and Argumentation in Evidence Analysis" will take place on 26-29 September. It is also part of the BAYES-KNOWLEDGE project programme of work. For those who wish to attend, but cannot, the workshop will be streamed live.

Norman Fenton, 16 September 2016

## Friday, 1 July 2016

### The likelihood ratio and why its use in forensic analysis is often flawed

 FORREST 2016 (for details see here)

I am giving the opening address at the Forensic Institute 2016 Conference (FORREST 2016) in Glasgow on 5 July 2016. The talk is about the benefits and pitfalls of using the likelihood ratio to help understand the impact of forensic evidence. The powerpoint slide show for my talk is here.

While a lot of the material is based on our recent Bayes and the Law paper, there is a new simple example of the danger of using the likelihood ratio (LR) when the defence hypothesis is not the negation of the prosecution hypothesis. Recall that the LR for some evidence E is the probability of E given the prosecution hypothesis divided by the probability of E given the defence hypothesis. The reason the LR is popular is because it is a measure of the probative value of the evidence E in the sense that:
• LR>1 means E supports the prosecution hypothesis
• LR<1 means  E supports the defence hypothesis
• LR=1 means E has no probative value
This follows from Bayes Theorem but only when the defence hypothesis is the negation of the prosecution hypothesis. The problem is that there are Forensic Science Guidelines* that explicitly state that this requirement is not necessary. But if the requirement is not met then it is possible to have LR<1 even though E actually supports the prosecution hypothesis. Here is the example:

A raffle has 100 tickets numbered 1 to 100

Joe buys 2 tickets and gets numbers 3 and 99

The winning ticket is drawn but is blown away in the wind.

Joe says the ticket drawn was 99 and demands the prize, but the organisers say 99 was not the winning ticket. In this case the prosecution hypothesis H is “Joe won the raffle”.
Suppose we have the following evidence E presented by a totally reliable eye witness:

E: “winning ticket was an odd nineties number (i.e. 91, 93, 95, 97, or 99)”

Does the evidence E support H? Let's do the calculations:
• Probability of E given H = ½
• Probability of E given not H = 4/98
So the LR  is  (1/2)/(4/98) = 12.25

That means the evidence CLEARLY supports H. In fact, the probability of H increases from a prior of 1/50 to a posterior of 1/5, so there is no doubt it is supportive.
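
These figures can be checked by enumerating the 100 tickets. The following is a minimal sketch in Python, assuming every ticket is equally likely to be drawn; the helper `prob` and the variable names are hypothetical.

```python
from fractions import Fraction

tickets = set(range(1, 101))
h = {3, 99}                       # H: Joe won (winning ticket is one of Joe's)
not_h = tickets - h               # the exhaustive alternative "not H"
evidence = {91, 93, 95, 97, 99}   # E: an odd nineties number

def prob(event, given):
    """P(event | given) when every ticket in `given` is equally likely."""
    return Fraction(len(event & given), len(given))

p_e_given_h = prob(evidence, h)          # 1/2
p_e_given_not_h = prob(evidence, not_h)  # 4/98
lr = p_e_given_h / p_e_given_not_h       # 49/4, i.e. 12.25

prior_h = prob(h, tickets)               # 1/50
posterior_h = prob(h, evidence)          # 1/5

print(f"LR = {float(lr)}, prior = {prior_h}, posterior = {posterior_h}")
```

Because "not H" is the true negation of H, the LR and the change from prior to posterior agree: both say E supports H.
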

But suppose the organisers assert that their (defence) hypothesis is:

H’: “Winning ticket was a number between 95 and 97”

Then in this case we have:
• Probability of E given H = ½
• Probability of E given H’ = 2/3
So the LR  is  ( 1/2)/(2/3) = 0.75

That means that in this case the evidence supports H’ over H. The problem is that, while the LR does indeed 'prove' that the evidence is more supportive of H' than H, that is actually irrelevant unless there is other evidence that proves that H' is the only possible alternative to H (i.e. that H' is equivalent to 'not H'). In fact, the 'defence' hypothesis has been cherry picked. The evidence E supports H irrespective of which cherry-picked alternative is considered.
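
The same kind of enumeration shows why the LR against the cherry-picked H' is misleading. This is a minimal sketch in Python, again assuming every ticket is equally likely; the helper `prob` and the variable names are hypothetical.

```python
from fractions import Fraction

tickets = set(range(1, 101))
h = {3, 99}                       # H: Joe won
h_prime = {95, 96, 97}            # H': winning ticket is between 95 and 97
evidence = {91, 93, 95, 97, 99}   # E: an odd nineties number

def prob(event, given):
    """P(event | given) when every ticket in `given` is equally likely."""
    return Fraction(len(event & given), len(given))

# LR of H against the cherry-picked alternative H'.
lr_h_vs_h_prime = prob(evidence, h) / prob(evidence, h_prime)  # (1/2)/(2/3)

# The belief in H itself does not depend on which alternative was chosen.
prior_h = prob(h, tickets)
posterior_h = prob(h, evidence)

# H and H' together cover only a small fraction of the possible outcomes.
coverage = prob(h | h_prime, tickets)

print(f"LR (H vs H') = {lr_h_vs_h_prime}")
print(f"P(H) = {prior_h}, P(H | E) = {posterior_h}")
print(f"P(H or H') = {coverage}")
```

The LR is below one, yet the posterior of H is still ten times its prior; H and H' jointly account for only 5 of the 100 tickets, so they are far from exhaustive.
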
Norman Fenton, 1 July 2016

*Jackson G, Aitken C, Roberts P. 2013. Practitioner guide no. 4. Case assessment and interpretation of expert evidence: guidance for judges, lawyers, forensic scientists and expert witnesses. London: R. Stat. Soc. http://www.maths.ed.ac.uk/~cgga/Guide-4-WEB.pdf Page 29: "The LR is the ratio of two probabilities, conditioned on mutually exclusive (but not necessarily exhaustive) propositions."

## Friday, 17 June 2016

### Bayes and the Law: Cambridge event and new review paper

When we set up the Bayes and the Law network in 2012 we made the following assertion:
Proper use of statistics and probabilistic reasoning has the potential to improve dramatically the efficiency, transparency and fairness of the criminal justice system and the accuracy of its verdicts, by enabling the relevance of evidence – especially forensic evidence - to be meaningfully evaluated and communicated. However, its actual use in practice is minimal, and indeed the most natural way to handle probabilistic evidence (Bayes) has generally been shunned.
The first workshop (30th August to 2nd September 2016) that is part of our 6-month programme "Probability and Statistics in Forensic Science" at the Isaac Newton Institute for Mathematical Sciences, Cambridge directly addresses the above assertion and seeks to understand the scope, limitations, and barriers of using statistics and probability in court. The Workshop brings together many of the world's leading academics and practitioners (including lawyers) in this area. Information on the programme and how to participate can be found here.

A new review paper* "Bayes and the Law" has just been published in Annual Review of Statistics and Its Application.

This paper reviews the potential and actual use of Bayes in the law and explains the main reasons for its lack of impact on legal practice. These include misconceptions by the legal community about Bayes’ theorem, over-reliance on the use of the likelihood ratio and the lack of adoption of modern computational methods. The paper argues that Bayesian Networks (BNs), which automatically produce the necessary Bayesian calculations, provide an opportunity to address most concerns about using Bayes in the law.

*Full citation:
Fenton N.E, Neil M, Berger D, “Bayes and the Law”, Annual Review of Statistics and Its Application, Volume 3, pp51-77, June 2016 http://dx.doi.org/10.1146/annurev-statistics-041715-033428. Pre-publication version is here and the Supplementary Material is here.

## Wednesday, 1 June 2016

### Bayesian networks for Cost, Benefit and Risk Analysis of Agricultural Development Projects

Successful implementation of major projects requires careful management of uncertainty and risk. Yet, uncertainty is rarely effectively calculated when analysing project costs and benefits. In the case of major agricultural and other development projects in Africa this challenge is especially important.

A paper just published* in the journal Expert Systems with Applications presents a Bayesian network (BN) modelling framework to calculate the costs, benefits, and return on investment of a project over a specified time period, allowing for changing circumstances and trade-offs. Marianne Gadeberg and Eike Luedeling have written an overview of the work here.

The framework uses hybrid and dynamic BNs containing both discrete and continuous variables over multiple time stages. The BN framework calculates costs and benefits based on multiple causal factors including the effects of individual risk factors, budget deficits, and time value discounting, taking account of the parameter uncertainty of all continuous variables. The framework can serve as the basis for various project management assessments and is illustrated using a case study of an agricultural development project. The work was a collaboration between the World Agroforestry Centre (ICRAF), Nairobi, Kenya, the Risk Information Management Group at Queen Mary (as part of the BAYES-KNOWLEDGE project) and Agena Ltd.

*The full reference is:
Yet, B., Constantinou, A., Fenton, N., Neil, M., Luedeling, E., & Shepherd, K. (2016). "A Bayesian Network Framework for Project Cost, Benefit and Risk Analysis with an Agricultural Development Case Study" . Expert Systems with Applications, Volume 60, 30 October 2016, Pages 141–155. DOI: 10.1016/j.eswa.2016.05.005
Until July 2016 the full published pdf is available for free.  A permanent pre-publication pdf is available here.

See also: Can we build a better project: assessing complexities in development projects

Acknowledgements: Part of this work was performed under the auspices of EU project ERC-2013-AdG339182-BAYES_KNOWLEDGE and part under ICRAF Contract No SD4/2012/214 issued to Agena. We acknowledge support from the Water, Land and Ecosystems (WLE) program of the Consultative Group on International Agricultural Research (CGIAR).

## Thursday, 26 May 2016

### Using Bayesian networks to assess new forensic evidence in an appeal case

If new forensic evidence becomes available after a conviction how do lawyers determine whether it raises sufficient questions about the verdict in order to launch an appeal? It turns out that there is no systematic framework to help lawyers do this. But a paper published today by Nadine Smit and colleagues in Crime Science presents such a framework driven by a recent case, in which a defendant was convicted primarily on the basis of sound evidence, but where subsequent analysis of the evidence revealed additional sounds that were not considered during the trial.

From the case documentation, we know the following:
• A baby was injured during an incident on the top floor of a house
• Blood from the baby was found on the wall in one of the rooms upstairs
• On an audio recording of the emergency telephone call made by the suspect, a scraping sound (allegedly indicating scraping blood off a wall) can be heard
• The suspect was charged with attempted murder
The audio evidence played a significant role in the trial. But, during the appeal preparation process, the call was re-analysed by an audio expert on behalf of the defence, and four other sounds were identified on the same recording that, according to the expert, showed similarities to the original sound. In particular, one of these sounds was of interest because of background noise that could be heard simultaneously. The background noise was presumed to be the television, which was located in a different room to where the prosecution argued the scraping of the blood took place.  During this second sound, the TV (located downstairs) could be heard simultaneously on the emergency recording. A statement by the police reads that the suspect was frequently rubbing his face in their presence. The defence proposed that the incriminating sound in the recording was not blood scraping after all, but simply the defendant rubbing his face.

The framework described in Smit's paper is intended to overcome the gap between what is generally known from scientific analyses and what is hypothesized in a legal setting. It is based on Bayesian networks (BNs) which are a structured and understandable way to evaluate the evidence in the specific case context and present it in a clear manner in court. However, BN methods are often criticised for not being sufficiently transparent for legal professionals. To address this concern the paper shows the extent to which the reasoning and decisions of the particular case can be made explicit and transparent. The BN approach enables us to clearly define the relevant propositions and evidence, and uses sensitivity analysis to assess the impact of the evidence under different prior assumptions. The results show that such a framework is suitable to identify information that is currently missing, and clearly crucial for a valid and complete reasoning process. Furthermore, a method is provided whereby BNs can serve as a guide to not only reason with incomplete evidence in forensic cases, but also identify very specific research questions that should be addressed to extend the evidence base to solve similar issues in the future.

Full citation:
Smit, N. M., Lagnado, D. A., Morgan, R. M., & Fenton, N. E. (2016). "An investigation of the application of Bayesian networks to case assessment in an appeal case". Crime Science, 2016, 5: 9, DOI 10.1186/s40163-016-0057-6 (open source). Published version pdf.
The research was funded by the Engineering and Physical Sciences Research Council of the UK through the Security Science Doctoral Research Training Centre (UCL SECReT) based at University College London (EP/G037264/1), and the European Research Council (ERC-2013-AdG339182-BAYES_KNOWLEDGE).

The BN model (which is fully specified in the paper) was built and run using the free version of AgenaRisk.

## Tuesday, 26 April 2016

### Hillsborough Inquest - my input

With today's verdict (fans unlawfully killed) coming after more than two years I can now speak about my own involvement in the Inquest.

Because of the years that have passed few people are aware that there was a 'near-miss'  disaster at Hillsborough eight years before the actual disaster. The circumstances were essentially identical -  an FA Cup Semi Final with far too many supporters let in to the Leppings Lane stand leading to a massive crush. Because of the quick thinking of a steward who was able to open a gate onto the pitch nobody died on that occasion (although there were many injuries).  I know this because I was present at that earlier near disaster and I was, in fact, Secretary of the Sheffield Spurs Supporters Club. At the time I wrote to the FA and South Yorkshire police as I felt mistakes had been made, and indeed the incident was sufficiently serious that Hillsborough (which had been used every year as one of the two semi-final venues) was avoided until 1988 (the year before the disaster). Immediately after the disaster in 1989 I wrote to the FA and Lord Taylor (who led the original enquiry) to inform them of the events of 1981. Although I was interviewed at that time by the Police investigators, my evidence was never used.

In 2014 - out of the blue - I was asked to attend the new Hillsborough Inquest as it had been decided that the 1981 incident was an important piece of the story.  Here are a couple of links to media reports about my appearance:
Norman Fenton, 26 April 2016

## Friday, 25 March 2016

### Statistics of coincidences: Ben Geen case revisited (ABC)

In November 2014 I reported on the case of nurse Ben Geen who was convicted in 2006 for murdering 2 patients and seriously harming 15 others. I had been asked to produce an expert report on the 'statistical coincidences' in the case for the Criminal Cases Review Commission.

Now a 30-minute documentary on the case presented by Joel Werner is to be aired on Australia's national radio station ABC on 28 March. In the programme (which you can listen to in full from the links at the top of the ABC page) I present a lay summary of the statistical argument (from minutes 16:30 to 21:34).

Norman Fenton

## Saturday, 19 March 2016

### Turning poorly structured data into intelligent Bayesian Network models for medical decision support

Medical data is very often badly structured, incomplete and inconsistent. This limits our ability to  generate useful models for prediction and decision support if we rely purely on machine learning techniques. That means we need to exploit expert knowledge at various model development stages. This problem - which is common in many application domains - is tackled in a paper** published in the latest issue of Artificial Intelligence in Medicine.

The paper describes a rigorous and repeatable method for building effective Bayesian Network (BN) models from complex data - much of which comes in unstructured and incomplete responses by patients from questionnaires and interviews. Such data inevitably contains repetitive, redundant and contradictory responses; without expert knowledge learning a BN model from the data alone is especially problematic where we are interested in simulating causal interventions for risk management. The novelty of this work is that it provides a rigorous consolidated and generalised framework that addresses the whole life-cycle of BN model development. The method is validated using data from forensic psychiatry. The resulting BN models demonstrate competitive to superior predictive performance against the data-driven state-of-the-art models. More importantly, the resulting BN models go beyond improving predictive accuracy and into usefulness for risk management through intervention, and enhanced decision support in terms of answering complex clinical questions that are based on unobserved evidence.

The method is applicable to any application domain involving large-scale decision analysis based on such complex and unstructured information. It challenges decision scientists to reason about building models based on what information is really required for inference, rather than based on what data is available. Hence, it forces decision scientists to use available data in a much smarter way.

**The full reference for the paper is:
Constantinou, A. C., Fenton, N., Marsh, W., & Radlinski, L. (2016). "From complex questionnaire and interviewing data to intelligent Bayesian Network models for medical decision support". Artificial Intelligence in Medicine, Vol 67, pages 75-93. DOI http://dx.doi.org/10.1016/j.artmed.2016.01.002

## Thursday, 10 March 2016

### A Bayesian network to determine optimal strategy for Spurs' success

As a committed Spurs fan I have spent the last few months salivating at the club's sudden and unexpected rise and the prospect of them winning their first league title since 1961. By mid-February they were clear favourites to win the Premier League title. However, in my view, the challenge was compromised by the team becoming overstretched by playing too many matches in a short space of time. In particular, I felt that their involvement in the Europa League was an unnecessary distraction and burden. When I expressed these views on a Spurs online forum (backed up with some data showing consistent under-performance during periods when they were involved in the Europa League) I got heavily criticised by other fans who said it was important to try to win every competition.

Having simultaneously been involved in research discussions about the use of decisions in Bayesian networks, I decided to build a small model in AgenaRisk to resolve the dilemma once and for all. I have written up the results of the analysis here. The model can be downloaded from here.

In summary, there were 4 strategic options available to Spurs' manager Mauricio Pochettino at the time I started to do the analysis:
1. Focus on Premier League
2. Focus on Premier League and FA Cup
3. Focus on Premier League and Europa League
4. Focus on all three competitions
My BN model shows that the optimal decision (based on my subjective utility values for the different outcomes) was option 1, with option 2 a close second. Unfortunately (I believe), Pochettino opted for option 3. According to the model, this suggests that his personal utility value for winning the Europa League was actually higher than his utility value for winning the Premier League.

See also: The problem with predicting football results - you cannot rely on the data

## Thursday, 4 February 2016

### Problems with the Likelihood Ratio method for determining probative value of evidence: the need for exhaustive hypotheses

Norman Fenton, 4 Feb 2016

I have written several times before about the likelihood ratio (LR) method that is recommended for use by forensic scientists when presenting evidence (such as the fact that DNA collected at a crime scene is found to have a profile that matches the DNA profile of a defendant in a case). In general the LR is a very good and simple method for communicating the impact of evidence (in this case on the hypothesis that the defendant was at the crime scene), but its correct use is based on strict assumptions that have been routinely ignored by forensic experts and statisticians, leading to the very kind of confusion and misunderstanding (when presented to lawyers and juries) that it was supposed to help avoid. The papers [1] and [2] provide an in-depth analysis of the problems. In this short article I will highlight just one of these problems which invalidate the LR. Subsequent articles will focus on the other problems and issues.

To recap: The LR is the probability of finding the evidence E if the prosecution hypothesis Hp is true (formally we write this as 'probability of E given Hp') divided by the probability of finding the evidence E if the defence hypothesis Hd is true (formally we write this as 'probability of E given Hd').

So, to compute the LR, the forensic expert is forced to consider the probability of finding the evidence under both the prosecution and defence hypotheses.  This is a very good thing to do because it helps to avoid common errors of communication that can mislead lawyers and juries (notably the prosecutor's fallacy). Even more importantly, the LR is a measure of the probative value of the evidence because:
• when the LR is greater than one the evidence supports the prosecution hypothesis (increasingly for larger values);
• when the LR is less than one it supports the defence hypothesis (increasingly as the LR gets closer to zero);
• when the LR is equal to one then the evidence supports neither hypothesis and so is 'neutral'. In such cases, since the evidence has no probative value lawyers and forensic experts believe it should not be admissible.
However, as explained in [1] and [2] (because of Bayes Theorem) for the LR to 'work' with respect to being a measure of probative value, the two hypotheses considered must be 'mutually exclusive and exhaustive'. This means that the defence hypothesis Hd must simply be the negation of the prosecution hypothesis Hp. So, for example, if Hp is "Defendant was at the crime scene" then Hd must be "Defendant was not at the crime scene". Now, while there is more or less unanimity within the statistics and forensics field that the hypotheses must be mutually exclusive in order for the LR to be used, there is no such unanimity about the hypotheses being exhaustive. Indeed, the Royal Statistical Society Practitioner Guide to Case Assessment and Interpretation of Expert Evidence [3] (page 32) specifies that the LR requires two mutually exclusive but not necessarily exhaustive hypotheses (which, interestingly, contradicts what is stated in the earlier Guide by the same group [4], page 96). To see why incorrect conclusions may be drawn when the hypotheses are not exhaustive we consider a very simple example:

Fred is the defendant for a crime. The main evidence against Fred is that his DNA profile is found to be a match of a DNA sample found at the scene of the crime (for simplicity we ignore the possibility of errors in the DNA match). The DNA profile is of a type that is found in only 1 in 10,000 people. However, Fred has an identical twin brother Joe. Using the following:
• Prosecution hypothesis Hp:  "Fred is the source of the DNA"
• Defence hypothesis Hd: "Joe is the source of the DNA"
and
• Evidence E: "the DNA found matches Fred's profile"
The defence reasons - correctly using the likelihood ratio approach - that the evidence E has no probative value with respect to the above two hypotheses, because the twins have the same DNA profile, i.e.
P(E given Hp) = P(E given Hd) = 1.
Hence, the defence demands the evidence is withdrawn because it is 'neutral'.

The problem here is that, even if we assume the hypotheses are mutually exclusive (i.e. we exclude the possibility that both the twins committed the crime) they are certainly NOT exhaustive. The correct defence hypothesis in this case should be "Fred is NOT the source of the DNA". This is made up of two cases:
• Hd: "Joe is the source of the DNA"
• Ho: "Another person (not Fred or  Joe) is the source of the DNA"
If we assume - before any evidence is known - that Hp, Hd and Ho are equally likely then the impact of observing the evidence is certainly NOT neutral - it is probative in favour of the prosecution hypothesis as can be shown from running the calculations in a Bayesian network tool:

The probability of Hp increases from 33% to just under 50%.
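The same update can be reproduced without a Bayesian network tool. The following is a minimal sketch, assuming (as in the example) equal priors for Hp, Hd and Ho, a certain match for either twin, and a 1 in 10,000 match probability for any other person; the dictionary and function names are illustrative only.

```python
from fractions import Fraction

# Prior beliefs before the DNA evidence: all three hypotheses equally likely
priors = {"Hp": Fraction(1, 3), "Hd": Fraction(1, 3), "Ho": Fraction(1, 3)}

# Probability of a DNA match under each hypothesis:
# certain for Fred or his identical twin Joe, 1 in 10,000 for anyone else
likelihoods = {"Hp": Fraction(1), "Hd": Fraction(1), "Ho": Fraction(1, 10_000)}


def posterior(priors, likelihoods):
    """Apply Bayes' theorem over a set of exclusive and exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # P(E)
    return {h: joint[h] / total for h in joint}


post = posterior(priors, likelihoods)

# LR for the non-exhaustive pair (Hp versus Hd) used by the defence
lr_twins = likelihoods["Hp"] / likelihoods["Hd"]

print("LR for Hp versus Hd:", lr_twins)
for h in post:
    print(h, "prior:", float(priors[h]), "posterior:", float(post[h]))
```

Although the LR for the pair Hp and Hd is exactly one, the posterior for Hp rises because almost all of the prior mass on Ho is eliminated by the match.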

But the supposedly 'neutral' evidence can have an even more dramatic impact in practice. Suppose, for example, that Joe has an alibi that is considered pretty reliable. Then this might reduce our prior belief that he is the source of the DNA to 2%. In this case the before and after probabilities are:

The belief in the prosecution hypothesis in this case has shifted to above 95% - possibly sufficient for a jury to be convinced it is the truth.
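To see where a figure above 95% can come from, here is a rough sketch in odds form. It reads the 2% figure as the prior probability that Joe is the source, and it assumes (this split is my illustration, not stated in the example) that the remaining 98% is divided equally between Hp and Ho. It computes the LR of Hp against its true negation "Fred is NOT the source", which is a prior-weighted mixture of Hd and Ho.

```python
# Illustrative priors (assumption): Joe's alibi leaves him at 2%,
# and the remaining 98% is split equally between Fred and "another person"
prior_hp, prior_hd, prior_ho = 0.49, 0.02, 0.49

# Probability of a DNA match under each hypothesis (from the example)
p_e_hp, p_e_hd, p_e_ho = 1.0, 1.0, 1e-4

# "Not Hp" is the exhaustive alternative: Hd or Ho
prior_not_hp = prior_hd + prior_ho
p_e_not_hp = (prior_hd * p_e_hd + prior_ho * p_e_ho) / prior_not_hp

# LR against the exhaustive alternative, then Bayes' theorem in odds form
lr = p_e_hp / p_e_not_hp
prior_odds = prior_hp / prior_not_hp
posterior_odds = lr * prior_odds
posterior_hp = posterior_odds / (1 + posterior_odds)

print(f"LR of Hp versus not-Hp: {lr:.2f}")
print(f"Posterior probability of Hp: {posterior_hp:.4f}")
```

Note that the LR against the exhaustive alternative depends on how the prior is divided between Hd and Ho, which is exactly why the LR for Hp versus Hd alone says so little here.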

If the DNA evidence in the above example was a non-match then the LR approach using the original hypotheses is even more obviously flawed because in this case:
P(E given Hp) = P(E given Hd) = 0
But the evidence is certainly anything but 'neutral' because, after observing the evidence, the prosecution hypothesis Hp must be false (as must Hd).
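The non-match case can be checked the same way. This is a minimal sketch under the same assumptions as the example (equal priors, and a 1 in 10,000 chance that another person shares the profile); the names are illustrative.

```python
from fractions import Fraction

# Equal priors over the three exclusive and exhaustive hypotheses
priors = {"Hp": Fraction(1, 3), "Hd": Fraction(1, 3), "Ho": Fraction(1, 3)}

# Probability of a NON-match: impossible if either twin is the source,
# and 9,999 in 10,000 if another person is the source
likelihoods = {"Hp": Fraction(0), "Hd": Fraction(0), "Ho": 1 - Fraction(1, 10_000)}

# The LR for Hp versus Hd is 0/0, so it is undefined rather than 'neutral'
try:
    lr = likelihoods["Hp"] / likelihoods["Hd"]
except ZeroDivisionError:
    lr = None

# The full Bayesian update is still well defined
joint = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(joint.values())
post = {h: joint[h] / total for h in joint}

print("LR for Hp versus Hd:", lr)
print({h: float(p) for h, p in post.items()})
```

The update drives both Hp and Hd to zero and puts all of the belief on Ho, so the evidence is decisive even though no LR can be computed for the original pair of hypotheses.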

While the example above is obviously simplistic and contrived, more realistic examples are provided in [1], which also highlights this very problem in the case of Barry George (convicted and subsequently acquitted of the murder of TV celebrity Jill Dando after an appeal ruled that the gunpowder residue evidence presented in the original trial was inadmissible in a re-trial on the basis that it had an LR equal to one and so had 'no probative value').

References
1. Fenton, N. E., D. Berger, D. Lagnado, M. Neil and A. Hsu, (2013). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)", Science and Justice, http://dx.doi.org/10.1016/j.scijus.2013.07.002.  A pre-publication draft of the article can be found here.
2. Fenton N.E, Neil M, Berger D, “Bayes and the Law”, Annual Review of Statistics and Its Application, Volume 3, 2016 to appear. Pre-publication version here
3. Jackson, G., Aitken, C., & Roberts, P. (2015). PRACTITIONER GUIDE NO 4: Case Assessment and Interpretation of Expert Evidence. Royal Statistical Society.  Available here.
4. Aitken, C., Roberts, P., & Jackson, G. (2010). PRACTITIONER GUIDE NO 1: Fundamentals of Probability and Statistical Evidence in Criminal Proceedings: Guidance for Judges, Lawyers, Forensic Scientists and Expert Witnesses. Royal Statistical Society. Available here.

## Thursday, 28 January 2016

### Misleading DNA evidence and the current damaged winning lottery ticket story

Norman Fenton, 28 January 2016

This post is primarily about how DNA match evidence is often presented in a way that is highly misleading (it is an important issue in an ongoing case I'm involved with). But in order to illustrate the point it turns out that we can use a simple analogy based loosely on the current lottery story that is getting a lot of media attention in the UK. This concerns an unverified £33 million winning ticket from a recent draw. About 200 people are claiming to have bought the (single) winning ticket but, until today*, none had actually provided proof of possessing such a ticket. The claim of one - Miss Susan Hinte - is the one that has grabbed media attention because she has produced a ticket in which key identifying information cannot be read because, she claims, the ticket was put through a washing machine.

But first let's look at the DNA issue, which is concerned with the following generic problem:
• The prosecution claims that defendant Joe was at the crime scene. This hypothesis is denoted as Hp.
• A tiny trace of DNA from the crime scene has been analysed and found to match the profile of Joe. This evidence (of the match) is denoted E
Typically the defence will argue that Joe was not at the crime scene and that any DNA matching Joe - especially as it was a tiny trace - got there through secondary transfer or other means. So the defence hypothesis Hd is simply the negation of Hp.

The DNA experts have correctly recognised that, in determining the probative value of the evidence E,  they have to use the ‘likelihood ratio’ approach [1]. This means they have to consider both of the following probabilities:
1. The probability that E is the result of the prosecution hypothesis Hp being true  - formally we write this as P(E given Hp)
2. The probability that E is the result of the defence hypothesis Hd being true - formally we write this as P(E given Hd)
If probability 1 is greater than probability 2 then the evidence E supports Hp over Hd and vice versa. The likelihood ratio is simply probability 1 divided by probability 2 and provides a simple and compelling measure of probative value of evidence. If the ratio is greater than one the evidence E supports Hp, with higher values indicating stronger support. If the ratio is less than one the evidence E supports Hd, with smaller values indicating stronger support. However, for reasons explained in [1], this whole notion of probative value is not meaningful if the defence hypothesis Hd is not the negation of the prosecution hypothesis Hp. One of the common errors made by DNA experts is to replace Hd with a different hypothesis, namely Hd': "DNA from Joe got there by secondary transfer". In this case Hd' excludes other ways of observing E even though Joe was not at the crime scene (such as errors or contamination during the DNA testing, or the DNA belonging to a different person with the same profile etc). It is also not even mutually exclusive with Hp, since Joe may have been at the crime scene even though the trace sample was there through secondary transfer. But, while this common error is serious, it is not the real concern I wish to raise here. In fact, let's suppose that no such error is made and that the expert considers the correct Hd.

The real concern is how a jury member reacts when the DNA expert now makes the following assertions:
1. “The findings are what I would have expected if Hp were true.” i.e. P(E given Hp) is very high
2. “The probability of the findings are considerably more likely to have been the result of Hp rather than Hd”  i.e. P(E given Hp) is much higher than P(E given Hd)
Notwithstanding the unnecessary redundancy of statement 1, these assertions sound very important and suggest very strong support for the prosecution hypothesis, especially as most people would already have assumed (wrongly) that the DNA 'match' means the trace certainly belongs to Joe.

But to demonstrate how misleading they are I will return now to the lottery example. For simplicity I will assume the old 6-ball lottery with 49 numbers. Suppose the winning numbers were:
1, 7, 21, 28, 40, 46

Mrs Smith has a damaged ticket that she claims has the winning numbers. The evidence E is that the first number (which is the only number clearly visible) is 1.

Our hypotheses are:
• Hp: “Mrs Smith's ticket is the winning ticket”
• Hd: “Mrs Smith's ticket is not the winning ticket”
In this case we know the following:
• P(E given Hp) = 1  (it is certain that the first number on the ticket would be 1 if it was the winning ticket)
• P(E given Hd) is 0.122 (this is the proportion of non-winning tickets that have 1 as the first number)
So we could certainly make exactly the same assertions in this case as the DNA experts above:
1. “The findings are what I would have expected if Hp were true.” (since the probability of E given Hp is 1)
2. “The probability of the findings are considerably more likely to have been the result of Hp rather than Hd” (since 1 is considerably greater than 0.122).
However, despite these (correct) assertions it is almost certain that Hd rather than Hp is true - Mrs Smith's ticket is not the winning ticket. In fact, the probability of Hp being true is less than one in 1.7 million (because there are over 1.7 million non-winning combinations in which the first number is 1).
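These figures can be checked with a few lines of arithmetic. The sketch below assumes that the numbers on a ticket are printed in ascending order (so "first number is 1" means the ticket contains the number 1) and that, before looking at the ticket, every one of the possible combinations is equally likely to be the one Mrs Smith holds; the variable names are illustrative.

```python
from math import comb

# All possible 6-from-49 tickets, and those whose lowest number is 1
total_tickets = comb(49, 6)
tickets_starting_with_1 = comb(48, 5)

# P(E given Hp): the winning ticket certainly starts with 1
p_e_hp = 1.0

# P(E given Hd): proportion of NON-winning tickets that start with 1
p_e_hd = (tickets_starting_with_1 - 1) / (total_tickets - 1)

# Likelihood ratio in favour of Hp
lr = p_e_hp / p_e_hd

# Prior that a randomly chosen ticket is the winning one (assumption)
prior_hp = 1 / total_tickets

# Posterior by Bayes' theorem
posterior_hp = (prior_hp * p_e_hp) / (
    prior_hp * p_e_hp + (1 - prior_hp) * p_e_hd
)

print(f"P(E given Hd) = {p_e_hd:.3f}")
print(f"LR = {lr:.2f}")
print(f"P(Hp given E) = 1 in {1 / posterior_hp:,.0f}")
```

The LR of roughly 8 does favour Hp, but it multiplies prior odds of about 1 in 14 million, so the posterior remains tiny.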

So what is the moral of this story? The likelihood ratio of the evidence might often suggest the evidence is highly probative in favour of one of the hypotheses, but if the prior probability of the alternative hypothesis was much higher to start with then the evidence will not ‘overturn’ the prior belief in favour of the alternative.

Lay people ignore this in connection to DNA evidence. Because the random match probability associated with a DNA match is typically less than one in a billion, the very fact that the evidence E is a "DNA match" already puts into their mind the notion that this 'must tie the defendant to the crime scene'. But the random match probability is almost irrelevant in this case - it only accounts for a tiny proportion of P(E given Hd). Lay people can also easily be tricked into believing that the (redundant) assertion 1 “The findings are what I would have expected if Hp were true” provides additional weight to assertion 2.

Unfortunately, this type of evidence is increasingly prejudicing juries and, I believe, leading to serious miscarriages of justice.

*The real winner has now been found and, since their ticket was not damaged, it cannot have been Miss Hinte.

[1] Fenton, N. E., D. Berger, D. Lagnado,  M. Neil and A. Hsu, (2014). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)",  Science and Justice, 54(4), 274-287 http://dx.doi.org/10.1016/j.scijus.2013.07.002. (pre-publication draft here)