Monday, 11 September 2017

An objective prior probability for guilt?



One of the greatest impediments to the use of probabilistic reasoning in legal arguments is the difficulty in agreeing on an appropriate prior probability that the defendant is guilty. The 'innocent until proven guilty' assumption technically means a prior probability of 0 - a figure that (by Bayesian reasoning) can never be overturned no matter how much evidence follows. Some have suggested the logical equivalent of 1/N where N is the number of people in the world. But this probability is clearly too low as N includes too many who could not physically have committed the crime. On the other hand the often suggested prior 0.5 is too high as it stacks the odds too much against the defendant.

Therefore, even strong supporters of a Bayesian approach seem to think they can, and must, ignore the need to consider a prior probability of guilt (indeed, it is this thinking that explains the prominence of the 'likelihood ratio' approach discussed so often on this blog).

New work - presented at the 2017 International Conference on Artificial Intelligence and the Law (ICAIL 2017) - shows that, in a large class of cases, it is possible to arrive at a realistic prior that is also as consistent as possible with the legal notion of ‘innocent until proven guilty’. The approach is based on first identifying the 'smallest' time window and area around the actual crime scene within which the defendant was definitely present, and then estimating the number of people - other than the suspect - who were also within that time window and area. If there were n such people in total, then before any other evidence is considered each person, including the suspect, has an equal prior probability 1/n of having carried out the crime.

The method applies to cases where we assume a crime has definitely taken place and that it was committed by one person against one other person (e.g. murder, assault, robbery). The work considers both the practical and legal implications of the approach and demonstrates how the prior probability is naturally incorporated into a generic Bayesian network model that allows us to integrate other evidence about the case.
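To see how such a prior feeds into the subsequent reasoning, here is a minimal sketch with made-up numbers (this is not the Bayesian network model from the paper): everybody in the opportunity time window/area gets the same prior 1/n, which is then updated by whatever evidence follows in the usual Bayesian way.

```python
# Minimal illustrative sketch of the 'opportunity prior' idea.
# The numbers are made up and this is NOT the model from the paper:
# everyone present in the relevant time window/area gets the same prior
# 1/n of being the offender, which is then updated by the evidence.

n = 50                       # hypothetical: 50 people had the opportunity
prior = 1 / n                # prior P(defendant is the offender) = 0.02

likelihood_ratio = 1000      # hypothetical evidence that is 1000 times more
                             # probable if the defendant is the offender

prior_odds = prior / (1 - prior)
posterior_odds = prior_odds * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)

print(round(posterior, 3))   # about 0.953 with these illustrative numbers
```

With the same evidence but n = 5000 the posterior would be only about 0.17, which is precisely why the choice of prior matters.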

Full details:
Fenton, N. E., Lagnado, D. A., Dahlman, C., & Neil, M. (2017). "The Opportunity Prior: A Simple and Practical Solution to the Prior Probability Problem for Legal Cases". In International Conference on Artificial Intelligence and the Law (ICAIL 2017). Published by ACM. Pre-publication draft.

Thursday, 7 September 2017

Recommendations for Dealing with Quantitative Evidence in Criminal Law


From July to December 2016 the Isaac Newton Institute Programme on Probability and Statistics in Forensic Science in Cambridge hosted many of the world's leading figures from law, statistics and forensics - a mixture of academics (including mathematicians and legal scholars), forensic practitioners, and practising lawyers (including judges and eminent QCs). Videos of many of the seminars and presentations from the Programme can be seen here.


A key output of the Programme has now been published: a very simple set of twelve guiding principles and recommendations for dealing with quantitative evidence in criminal law, aimed at statisticians, forensic scientists and legal professionals. The layout consists of one principle per page.




Monday, 14 August 2017

The likelihood ratio and its use in the 'grooming gangs' news story


This blog has reported many times previously (see links below) about problems with using the likelihood ratio. Recall that the likelihood ratio is commonly used as a measure of the probative value of some evidence E for a hypothesis H; it is defined as the probability of E given H divided by the probability of E given not H.

There is especially great confusion in its use where we have data for the probability of H given E rather than for the probability of E given H. Consider the somewhat confusing argument, in relation to the offence of 'child grooming', taken directly from the book McLoughlin, P. “Easy Meat: Inside Britain’s Grooming Gang Scandal.” (2016):



Given the sensitive nature of the grooming gangs story in the UK and the increasing number of convictions, it is important to get the maths right. The McLoughlin book is the most thoroughly researched work on the subject.  What the author of the book is attempting to determine is the likelihood ratio of the evidence E with respect to the hypothesis H where:

H: “Offence is committed by a Muslim” (so not H means “Offence is committed by a non-Muslim”)

E: “Offence is child grooming”

In this case, the population data cited by McLoughlin provides our priors P(H)=0.05 and, hence, P(not H)=0.95. But we also have the data on child grooming convictions that gives us P(H | E)=0.9 and, hence, P(not H | E)=0.1.

What we do NOT have here is direct data on either P(E|H) or P(E|not H). However, we can still use Bayes' theorem to calculate the likelihood ratio, since Bayes' theorem gives P(E|H) = P(H|E)P(E)/P(H) and P(E|not H) = P(not H|E)P(E)/P(not H), so the unknown P(E) cancels in the ratio:

LR = P(E|H) / P(E|not H) = [P(H|E)/P(H)] / [P(not H|E)/P(not H)]

So, in the example we get:

LR = (0.9/0.05) / (0.1/0.95) = 18 / (2/19) = 18 × 9.5 = 171

Hence, while the method described in the book is confusing, the conclusion arrived at is (almost) correct. The slight error in the result, namely 170.94 instead of 171, is caused by the author rounding 10/95 (the value of P(not H|E)/P(not H), which is 0.10526...) to 10.53%.
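For anyone who wants to check the arithmetic, here is a minimal Python sketch (not taken from the book) that recovers the likelihood ratio from the prior and posterior probabilities quoted above:

```python
# Recovering the likelihood ratio from prior and posterior probabilities,
# using the figures quoted above. Bayes' theorem: the unknown P(E) cancels
# when we take the ratio of P(E|H) to P(E|not H).

p_H = 0.05                        # prior: P(offender is Muslim)
p_not_H = 1 - p_H                 # prior: P(offender is non-Muslim)

p_H_given_E = 0.9                 # conviction data: P(Muslim | child grooming)
p_not_H_given_E = 1 - p_H_given_E

lr = (p_H_given_E / p_H) / (p_not_H_given_E / p_not_H)
print(lr)                         # approximately 171
```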


Friday, 11 August 2017

Automatically generating Bayesian networks in analysis of linked crimes




Constructing an effective and complete Bayesian network (BN) for individual cases that involve multiple related pieces of evidence and hypotheses requires a major investment of effort. Hence, generic BNs have been developed for common situations that only require adapting the underlying probabilities. These so-called 'idioms' make it practically possible to build and use BNs in casework without spending unacceptable amounts of time constructing the network. However, in some situations both the probability tables and the structure of the network depend on case-specific details.

Examples of such situations are where there are multiple linked crimes. In (deZoete2015) a BN structure was produced for evaluating evidence in cases where a person is suspected of being the offender in multiple, possibly linked, crimes. In (deZoete2017) this work was expanded to cover situations with multiple offenders in possibly linked crimes. Although the papers present a methodology for constructing such BNs, the workload associated with constructing them, together with the possibility of making mistakes in the conditional probability tables, still presents unnecessary difficulties for potential users.

As part of the BAYES KNOWLEDGE project, we have developed online accessible GUIs that allow the user to select the parameters that reflect their crime linkage situation (for both single- and double-offender crime linkage cases). The associated BN is then automatically generated according to the structures described in (deZoete2015) and (deZoete2017). It is presented visually in the GUI and is available for the user to download as a .net file, which can be opened in AgenaRisk or another BN software package. These applications serve both as a tool for those interested in, or working with, crime linkage problems and as a proof of principle of the added value of such GUIs in making BNs accessible by removing the effort of constructing every network from scratch.

The GUIs are available from the `DEMO’ tab on the BAYES KNOWLEDGE website and are based on code written in R, a statistical programming language. This automated workflow can reduce the workload for, in this case, forensic statisticians and increase the mutual understanding between researchers and legal professionals.

Jacob deZoete will be presenting this work at the 10th International Conference on Forensic Inference and Statistics (ICFIS 2017) in Minneapolis, September 2017.



Thursday, 29 June 2017

Queen Mary researchers evaluate impact of new regulations on the Buy-To-Let property market using novel AI methods

In 2015 the British government announced major tax reforms for individual landlords, which are being introduced gradually from April 2017 and will be in full effect in the 2020/21 tax year. The new reforms and regulations have received much media attention, as there has been a widespread belief that they were sufficiently skewed against landlords that they could signal the end of the Buy-To-Let (BTL) investment era in the UK.

Research by Anthony Constantinou and Norman Fenton of Queen Mary University of London has now been published that provides the first comprehensive evaluation of the impact of the reforms on the London BTL property market. The results use a novel model (based on new work in an AI method called Bayesian networks) that captures multiple uncertainties and allows investors to assess the impact on their BTL investment of various factors of interest, such as changes in interest rates, capital growth and rental growth. Additionally, the model allows for portfolio risk management through intervention between time steps, such as the effects of different re-mortgaging scenarios.

The results show that, over a 10-year period, the overall return-on-investment (ROI) will be reduced under the new tax measures, but that the ROI remains good assuming a common BTL London profile. However, there are major differences depending on the investor strategy. For example, for risk-averse investors who choose not to expand their portfolio, the reforms are expected to have only a marginal negative impact, with the overall ROI reducing from 301% under the old regulations to 290% under the new (-3.7%), and this loss comes exclusively from a decrease in net profits from rental income (-32.2%). However, the impact on risk-seeking investors who aim to expand their property portfolio through leveraging is much more significant, since the new tax reforms are projected to decrease ROI from 941% to 590% (-37.3%), over the same 10-year period.

The impact on net profits also poses substantial risks for loss-making returns excluding capital gains, especially in the case of rising interest rates. While this makes it less desirable or even non-viable for some to continue being a landlord, based on the current status of all factors taken into consideration for simulation, investment prospects are still likely to remain good within a reasonable range of interest rate and capital growth rate variations. Further, the results also indicate that the recent trend of property prices in London increasing faster than rents will not continue for much longer; either capital growth rates will have to decrease, rental growth rates will have to increase, or we shall observe a combination of the two events.

The full paper (with open access link):

Constantinou, A. C., & Fenton, N. (2017). The future of the London Buy-To-Let property market: Simulation with temporal Bayesian Networks. PLoS ONE, 12(6): e0179297, https://doi.org/10.1371/journal.pone.0179297 

The research was supported in part by the European Research Council (ERC) through the research project, ERC-2013-AdG339182-BAYES_KNOWLEDGE, while Agena Ltd provided software support. 

Monday, 6 March 2017

Explaining and predicting football team performance over an entire season


When I was presenting the BBC documentary Climate Change by Numbers and had to explain the idea of a statistical 'attribution study', I used the analogy of determining which factors most affected the performance of Premiership football teams year on year. Because it had to be done in a hurry, my colleague Dr Anthony Constantinou and I did a very crude analysis which focused on a very small number of factors and showed, unsurprisingly, that turnover (i.e. mainly spend on transfers and wages) had the most impact of these.

We weren't happy with the quality of the study and decided to undertake a much more comprehensive analysis as part of the BAYES-KNOWLEDGE project. This project is all about improved decision-making and risk assessment using a probabilistic technique called Bayesian networks. In particular, the main objective of the project is to produce useful and accurate predictions and assessments in situations where there is not a lot of data available. In such situations the currently fashionable 'big data' machine learning methods do not work; instead we use 'smart-data' - a method that combines the limited data available with expert causal knowledge and real-world ‘facts’. The idea of predicting Premiership teams' long-term performance and identifying the key factors explaining changes was a perfect opportunity to both develop and validate the BAYES-KNOWLEDGE method, especially as we had previously done extensive work on predicting individual Premiership match results (see links at the bottom).

The results of the study have now been published in one of the premier international AI journals, Knowledge-Based Systems.

The Bayesian Network model in the paper enables us to predict, before a season starts, the total league points a team is expected to accumulate throughout the season (each team plays 38 games in a season, with three points per win and one per draw). The model results compare very favourably against a number of other relevant and different types of models, including some which use far more data. As hoped, the results also provide a novel and comprehensive attribution study of the factors most affecting performance (measured in terms of impact on actual points gained or lost per season). For example, although, unsurprisingly, the largest improvements in performance result from massive increases in spending on new players (a gain of 8.49 points), an even greater decrease (up to 16.52 points) results from involvement in the European competitions (especially the Europa League) for teams that have little previous experience of such competitions. Also, something that was very surprising, and that possibly confounds the bookies - and gives punters good potential to exploit - is that promoted teams generate (on average) a staggering increase in performance of 8.34 points relative to the relegated team they are replacing. The results in the study also partly address and explain the widely accepted 'favourite-longshot bias' observed in bookies' odds.

The full reference citation is:
Constantinou, A. C. and Fenton, N. (2017). Towards Smart-Data: Improving predictive accuracy in long-term football team performance. Knowledge-Based Systems, In Press, 2017, http://dx.doi.org/10.1016/j.knosys.2017.03.005
The pre-print version of the paper (pdf) can be found at http://constantinou.info/downloads/papers/smartDataFootball.pdf

We acknowledge the financial support by the European Research Council (ERC) for funding research project, ERC-2013-AdG339182-BAYES_KNOWLEDGE, and Agena Ltd for software support.


Thursday, 9 February 2017

Helping US Intelligence Analysts using Bayesian networks


Causal Bayesian networks are at the heart of a major new collaborative research project led by Monash University in Australia and funded by the United States' Intelligence Advanced Research Projects Activity (IARPA). The objective is to help intelligence analysts assess the value of their information. IARPA was set up following the failure of the US intelligence agencies to properly assess the threat posed by Al Qaeda in 2001 and Iraq in 2003.

The chief investigator at Monash, Kevin Korb, said in an interview in The Australian:
"..quantitative rather than qualitative methods were crucial in judging the value of intelligence.... more quantitative approaches could have helped contain the ebola epidemic by making authorities appreciate the scale of the problem months earlier. They could also build a better assessment of the likelihood of events like gunfire between vessels in the South China Sea, a substantial devaluation of the Venezuelan currency or a new presidential aspirant in Egypt."
Norman Fenton and Martin Neil (both of Agena and Queen Mary University of London) will be working on the project along with colleagues such as David Lagnado and Ulrike Hahn at UCL.  AgenaRisk will be used throughout the project as the Bayesian network platform.


Wednesday, 8 February 2017

Queen Mary in new £2 million project using Bayesian networks to create intelligent medical decision support systems with real-time monitoring for chronic conditions


UPDATE 9 Feb 2017: Various Research Fellowship and PhD vacancies funded by this project are now advertised. See here.

Queen Mary has been awarded a grant of £1,538,497 (Full economic cost £1,923,122) from the EPSRC towards a major new collaborative project to develop a new generation of intelligent medical decision support systems. The project, called PAMBAYESIAN (Patient Managed Decision-Support using Bayesian Networks) focuses on home-based and wearable real-time monitoring systems for chronic conditions including rheumatoid arthritis, diabetes in pregnancy and atrial fibrillation. It has the potential to improve the well-being of millions of people.

The project team includes researchers from both the School of Electronic Engineering and Computer Science (EECS) and clinical academics from the Barts and the London School of Medicine and Dentistry (SMD). The collaboration is underpinned by extensive research in EECS and SMD, with access to digital health firms that have extensive experience developing patient engagement tools for clinical development (BeMoreDigital, Mediwise, Rescon, SMART Medical, uMotif, IBM UK and Hasiba Medical).

The project is led by Prof Norman Fenton with co-investigators: Dr William Marsh, Prof Paul Curzon, Prof Martin Neil, Dr Akram Alomainy (all EECS) and Dr Dylan Morrissey, Dr David Collier, Professor Graham Hitman, Professor Anita Patel, Dr Frances Humby, Dr Mohammed Huda, Dr Victoria Tzortziou Brown (all SMD). The project will also include four QMUL-funded PhD students.

The three-year project will begin in June 2017.

Background

Patients with chronic diseases must take day-to-day decisions about their care and rely on advice from medical staff to do this. However, regular appointments with doctors or nurses are expensive, inconvenient and not necessarily scheduled when needed. Increasingly, we are seeing the use of low cost and highly portable sensors that can measure a wide range of physiological values. Such 'wearable' sensors could improve the way chronic conditions are managed. Patients could have more control over their own care if they wished; doctors and nurses could monitor their patients without the expense and inconvenience of visits, except when they are needed. Remote monitoring of patients is already in use for some conditions but there are barriers to its wider use: it relies too much on clinical staff to interpret the sensor readings; patients, confused by the information presented, may become more dependent on health professionals; remote sensor use may then lead to an increase in demand for medical assistance, rather than a reduction.

The project seeks to overcome these barriers by addressing two key weaknesses of the current systems:
  1. Their lack of intelligence. Intelligent systems that can help medical staff in making decisions already exist and can be used for diagnosis, prognosis and advice on treatments. One especially important form of these systems uses belief or Bayesian networks, which show how the relevant factors are related and allow beliefs, such as the presence of a medical condition, to be updated from the available evidence. However, these intelligent systems do not yet work easily with data coming from sensors.
  2. Any mismatch between the design of the technical system and the way the people involved - patients and professionals - interact with it.
We will work on these two weaknesses together: patients and medical staff will be involved from the start, enabling us to understand what information is needed by each player and how to use the intelligent reasoning to provide it.

The medical work will be centred on three case studies, looking at the management of rheumatoid arthritis, diabetes in pregnancy and atrial fibrillation (irregular heartbeat). These have been chosen both because they are important chronic diseases and because they are investigated by significant research groups in our Medical School, who are partners in the project. This makes them ideal test beds for the technical developments needed to realise our vision and allow patients more autonomy in practice.

To advance the technology, we will design ways to create belief networks for the different intelligent reasoning tasks, derived from an overall model of medical knowledge relevant to the diseases being managed. Then we will investigate how to run the necessary algorithms on the small computers attached to the sensors that gather the data as well as on the systems used by the healthcare team. Finally, we will use the case studies to learn how the technical systems can integrate smoothly into the interactions between patients and health professionals, ensuring that information presented to patients is understandable, useful and reduces demands on the care system while at the same time providing the clinical team with the information they need to ensure that patients are safe.

Further information: www.eecs.qmul.ac.uk/~norman/projects/PAMBAYESIAN/

This project also complements another Bayesian network-based project - the Leverhulme-funded project "CAUSAL-DYNAMICS (Improved Understanding of Causal Models in Dynamic Decision Making)" - starting January 2017. See CAUSAL-DYNAMICS

Sunday, 1 January 2017

The problem with the likelihood ratio for DNA mixture profiles


We have written many times before (see the links below) about the use of the Likelihood Ratio (LR) in legal and forensic analysis.

To recap: the LR is a very good and simple method for determining the extent to which some evidence (such as DNA found at the crime scene matching the defendant) supports one hypothesis (such as "defendant is the source of the DNA") over an alternative hypothesis (such as "defendant is not the source of the DNA"). The previous articles discussed the various problems and misinterpretations surrounding the use of the LR. Many of these arise when the hypotheses are not mutually exclusive and exhaustive. This problem is especially pertinent in the case of 'DNA mixture' evidence, i.e. when some DNA sample relevant to a case comes from more than one person. With modern DNA testing techniques it is common to find DNA samples with multiple (but an unknown number of) contributors. In such cases there is no obvious 'pair' of hypotheses that are mutually exclusive and exhaustive, since we have individual hypotheses such as:
  • H1: suspect + one unknown
  • H2: suspect + one known other 
  • H3: two unknowns
  • H4: suspect + two unknowns 
  • H5: suspect + one known other + one unknown
  • H6: suspect + two known others
  • H7: three unknowns 
  • H8: one known other + two unknowns
  • H9: two known others + one unknown
  • H10: three known others
  • H11:  suspect + three unknowns 
  • etc.
It is typical in such situations to focus on the 'most likely' number of contributors (say n) and then compare the hypothesis "suspect + (n-1) unknowns" with the hypothesis "n unknowns". For example, if there are likely to be 3 contributors then typically the following hypotheses are compared:
  • H1: suspect + two unknowns
  • H2: three unknowns
Now, to compute the LR we have to compute the likelihood of the particular DNA trace evidence E under each of the hypotheses. Generally both of these likelihoods - the probability values P(E | H1) and P(E | H2) - are extremely small numbers. For example, we might get something like:
  • P(E | H1) = 0.00000000000000000001  (10 to the minus 20)
  • P(E | H2) = 0.00000000000000000000000001  (10 to the minus 26)
For a statistician, the size of these numbers does not matter – we are only interested in the ratio (that is precisely what the LR is) and in the above example the LR is very large (one million) meaning that the evidence is a million times more likely to have been observed if H1 is true compared to H2. This seems to be overwhelming evidence that the suspect was a contributor. Case closed?

Apart from the communication problem in court of getting across what this all means (defence lawyers can and do exploit the very low probability of E given H1) and how it is computed, there is an underlying statistical problem with small likelihoods for non-exhaustive hypotheses, which I will highlight with two scenarios involving a simple urn example. Superficially, the scenarios seem identical. The first scenario causes no problem but the second one does. The concern is that it is not at all obvious that the DNA mixture problem always corresponds more closely to the first scenario than the second.

In both scenarios we assume the following:

There is an urn with 1000 balls – some of which are white. Suppose W is the (unknown) number of white balls. We have 2 hypotheses:
  • H1: W=100
  • H2:  W=90
We can draw a ball as many times as we like, note its colour and replace it (i.e. sample with replacement). We wish to use the evidence of 10,000 such samples.

Scenario 1: We draw 1001 white balls. In this case using standard statistical assumptions we calculate P(E | H1) = 0.013, P(E|H2) = 0.0000036. Both values are small but the LR is large, 3611, strongly favouring H1 over H2.

Scenario 2: We draw 1100 white balls. In this case P(E | H1) = 0.000057 and P(E|H2) < 0.00000001. Again both values are very small but the LR is very large, strongly favouring H1 over H2.

(note: in both cases we could have chosen a much larger sample and got truly tiny likelihoods but these values are sufficient to make the point).
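For readers who want to reproduce calculations of this kind, here is a minimal Python sketch. It assumes a standard binomial model for the 10,000 draws with replacement (the post does not spell out the exact calculation used, so the precise values may differ slightly), but it illustrates the key point: both likelihoods can be tiny while their ratio is large.

```python
# Likelihoods for the urn scenarios, assuming a binomial model for 10,000
# draws with replacement from an urn of 1000 balls (W of which are white).
from scipy.stats import binom

n_draws = 10_000
p_H1 = 100 / 1000   # H1: W = 100, so P(white) = 0.10
p_H2 = 90 / 1000    # H2: W = 90,  so P(white) = 0.09

for label, n_white in [("Scenario 1", 1001), ("Scenario 2", 1100)]:
    like_H1 = binom.pmf(n_white, n_draws, p_H1)
    like_H2 = binom.pmf(n_white, n_draws, p_H2)
    print(f"{label}: P(E|H1)={like_H1:.2e}, P(E|H2)={like_H2:.2e}, "
          f"LR={like_H1 / like_H2:.0f}")
```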

So in what sense are these two scenarios fundamentally different and why is there a problem?

In scenario 1 not only does the conclusion favouring H1 make sense, but the actual number of white balls drawn is very close to the expected number we would get if H1 were true (in fact, W=100 is the 'maximum likelihood estimate' for the number of white balls). So not only does the evidence point to H1 over H2, but also to H1 over any other hypothesis (and there are 1001 different hypotheses: W=0, W=1, W=2, and so on up to W=1000).

In scenario 2 the evidence is actually even much more supportive of H1 over H2 than in scenario 1. But it is essentially meaningless because it is virtually certain that BOTH hypotheses are false.

So, returning to the DNA mixture example, it is certainly not sufficient to compare just two hypotheses. The LR of one million in favour of H1 over H2 may be hiding the fact that neither of these hypotheses is true. It is far better to identify as exhaustive a set of hypotheses as is realistically possible and then determine the individual likelihood value of each hypothesis. We can then identify the hypothesis with the highest likelihood value and consider its LR compared to each of the other hypotheses.
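As a minimal sketch of that recommendation (the likelihood values below are made up purely for illustration; in a real case they would come from the DNA mixture analysis):

```python
# Sketch of the recommended approach: compute the likelihood of the evidence
# under each hypothesis in an (as near as possible) exhaustive set, identify
# the best-supported hypothesis, and report its LR against each alternative.
# The likelihood values here are purely illustrative, not from a real case.

likelihoods = {
    "H1: suspect + two unknowns": 1e-20,
    "H2: three unknowns":         1e-26,
    "H3: suspect + one unknown":  1e-18,
    "H4: two unknowns":           1e-24,
}

best_hyp = max(likelihoods, key=likelihoods.get)
print(f"Best supported: {best_hyp}")
for hyp, like in likelihoods.items():
    if hyp != best_hyp:
        print(f"  LR vs {hyp}: {likelihoods[best_hyp] / like:.1e}")
```

With an illustrative set like this, the headline LR of a million for H1 over H2 is put in context: here H3 is actually better supported than H1, which is exactly the kind of information a two-hypothesis comparison hides.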