Tuesday 8 November 2011

Nonsensical probabilities about asteroid risk

There is an article in today's Evening Standard in which, rather depressingly, someone who should know better (Roger Highfield, Editor of New Scientist) reels off a typically misleading probability about asteroid strike risk. This was in response to today's news that an asteroid passed within just 200,000 miles of Earth.

Quoting a recent book by Ted Nield he says (presumably to comfort readers) that

Our chances of dying as a result of being hit by a space rock are something like one in 600,000. 
There are all kinds of ambiguities about the statement that I won't go into (involving assumptions about random people of 'average' age and 'average' life expectancy), but even ignoring all that, if the statement is supposed to be a great comfort to us then it fails miserably. That is because it can reasonably be interpreted as giving the 'average' probability that a random person living today will eventually die from being hit by a space rock. Assuming a world population of 7 billion, that means about 12,000 of us. And 12,000 actual living people is a pretty large number to die in this way.

But it is about as meaningful as putting Arnold Schwarzenegger in a line-up with a thousand ants and asserting that the average height of those in the line-up is 3 inches. The key issue here is that large asteroid strikes are, like Schwarzenegger in the line-up, low-probability high-impact events. Space rocks will not kill a few hundred people every year, as the original statement implies, just as there are no 3-inch-tall beings in the line-up. Tens of thousands of years pass without a space rock killing more than a handful of people, but eventually one will wipe out most of the world's population.
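The arithmetic behind that 12,000 figure is a simple expected-value calculation, which can be sketched in a couple of lines (using only the numbers quoted above):

```python
# Back-of-envelope check of the '12,000 people' figure: multiply the
# quoted 1-in-600,000 lifetime risk by an assumed 7 billion population
# to get the expected number of eventual victims.
population = 7_000_000_000
lifetime_risk = 1 / 600_000

expected_victims = population * lifetime_risk
print(round(expected_victims))  # 11667, i.e. roughly 12,000
```

Of course, as argued above, this expectation is dominated by rare catastrophic strikes, not by a steady trickle of individual deaths.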

What Ted Nield should have stated (and what we are most interested in knowing) is the probability that a large space rock (one big enough to cause massive loss of life) will strike Earth in the next 50 years.

Indeed, I suspect that (using Nield's own analysis) this probability would be close to the 1 in 600,000 quoted (given that incidents of small space rocks killing a small number of people are very rare). You might argue I am splitting hairs here, but there is an important point of principle. Nield and Highfield avoid stating an explicit probability of a very rare event (such as, in this case, a massive asteroid strike) because there is a natural resistance (especially from non-Bayesians) to doing so. For some reason it is more palatable in their eyes to consider the probability of a random person dying (albeit due to a rare event), presumably because it can more easily be imagined. But, as I have hopefully shown, that only creates more confusion.

Friday 4 November 2011

Bayes and the Law: Nature article

My commentary piece on the role of Bayes in the Law has just appeared in Nature. A pdf of an extended draft on which it was based is here.

Monday 3 October 2011

Bayes and the Law: Guardian article

There is an article in the Guardian today that is the result of an interview I had with the journalist Angela Saini. It is about the issue of Bayes and the Law following the R v T ruling and it includes a reference to the consortium that we are putting together to improve the situation.

Friday 2 September 2011

Another specialist risk assessment company gets bought out

Algorithmics, a company specialising in risk software for financial institutions, has been sold to IBM for $387m.

Agena partnered Algorithmics during the period 2003-2005 when there was a clamour for so-called 'advanced measurement approaches' to operational risk assessment. The Basel II accord specified that banks which used a validated advanced measurement approach to calculate their operational risk exposure could set aside a lower percentage capital allocation. This meant there was a major financial incentive for banks to develop such approaches. Most banks looked at Bayesian networks as a potential solution and, indeed, a number wanted Algorithmics to provide such a solution to integrate with the existing credit and market risk software that Algo provided. Algorithmics had started work on their own Bayesian network platform for OpRisk, but decided that AgenaRisk was superior. Hence we partnered them in projects with some major banks, with Agena providing the underlying Bayesian network technology and modelling skills and Algorithmics providing the reporting infrastructure. Here is a paper we wrote that gives a feel for the BN approach to operational risk that we developed.

In late 2005 Algorithmics was actually taken over by the Fitch Group. Since Fitch already had its own (non-Bayesian) OpRisk solution - in which it had invested massively - the partnership effectively ended then, as, it appears, did Algorithmics' interest in Bayesian networks. This is a great shame, especially when you consider the mess that financial institutions have made using classical statistics.

It is difficult to determine the extent to which banks are using Bayesian networks but, as we described here, there are plenty of financial analysts who are using fundamentally flawed methods in situations when the Bayesian approach would work.

Thursday 21 July 2011

Transforming Legal Reasoning through Effective use of Probability

Recent reports have highlighted the difficulties faced by the criminal justice system in responding adequately to the dramatic increase in the amount and complexity of forensic science, particularly given its (not infrequently) questionable value. There is growing consensus that the role of experts should be limited to making statements about the probability of their findings under competing hypotheses (instead of, for example, making categorical source attributions), and Bayes' theorem encapsulates the proper, normative effect of probabilistic evidence. Despite this, Bayesian reasoning has been largely ignored or misunderstood by criminal justice professionals.

Proper use of probabilistic reasoning has the potential to improve dramatically the efficiency, transparency and fairness of the criminal justice system and the accuracy of its verdicts, by enabling the value of any given piece of evidence to be meaningfully evaluated and communicated. Bayesian reasoning employs the likelihood ratio (the probability of seeing the evidence given the prosecution hypothesis divided by the probability of seeing the evidence given the defence hypothesis) to illustrate the relevance and strength of each piece of evidence. Bayesian reasoning can therefore help the expert formulate accurate and informative opinions; help the court determine the admissibility of evidence; help identify which cases should and should not be pursued; and help lawyers explain, and jurors evaluate, the weight of evidence during a trial. It would also help identify error rates and unjustified assumptions entailed in expert opinions, which would in turn contribute to the transparency and legitimacy of the criminal justice process.
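The mechanics of the likelihood ratio can be sketched in a few lines of code (the probabilities below are purely illustrative and do not come from any real case):

```python
# Sketch of a Bayesian update in odds form, using a likelihood ratio.
# All numbers are illustrative, not taken from any real case.
def posterior_odds(prior_odds, likelihood_ratio):
    """Posterior odds = prior odds x likelihood ratio (Bayes' theorem in odds form)."""
    return prior_odds * likelihood_ratio

# Suppose P(evidence | prosecution hypothesis) = 0.99
# and     P(evidence | defence hypothesis)     = 0.01
lr = 0.99 / 0.01          # likelihood ratio = 99: evidence strongly favours prosecution

# Prior odds of guilt of 1 to 1000 (e.g. 1000 equally plausible suspects)
prior = 1 / 1000
post = posterior_odds(prior, lr)
print(post / (1 + post))  # posterior probability of guilt is only about 0.09
```

The point of the sketch is that even apparently strong evidence (a likelihood ratio of 99) can leave the probability of guilt low when the prior is low - which is exactly why the likelihood ratio must be combined with the prior rather than read as a probability of guilt on its own.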

Unfortunately, there is widespread disagreement about the kind of evidence to which Bayesian reasoning should be applied and the manner in which it should be presented. Much of the disagreement over when it should be applied arises from fundamental misunderstandings about the way Bayes' reasoning works, whereas disagreement over the manner in which it should be presented could be resolved by empirical research. Misunderstandings in the criminal justice system are exacerbated by the fact that in the few areas where Bayesian reasoning has been applied (such as DNA profiling) its application has often been faulty and its ramifications poorly communicated. This has further resulted in widespread recourse to probabilistic fallacies in legal proceedings.

A dramatic and worrying example of this was a recent appeal court decision (R v T [2010] EWCA Crim 2439; see our draft article about this here) which appears to reject the use of Bayesian analysis and likelihood ratios for all but a very narrowly defined class of forensic evidence. Instead of being accepted as a standard tool of the forensic science trade, Bayesian analysis is perceived by much of the legal profession as an exotic, somewhat eccentric method to be wheeled out for occasional specialist appearances, whereupon a judge or lawyer will cast doubts on, and even ridicule, its integrity (hence ensuring it is kept firmly locked in the cupboard for more years to come).

Ultimately this represents a failure by the community of academics, expert witnesses and lawyers who understand the potentially crucial and wide role that can be played by Bayesian analysis and likelihood ratios in legal arguments. This failure must be attributed to our inability to communicate the core ideas effectively. Resorting to the formulas and calculations in court is a dead-end strategy since these will never be understood by most lawyers, judges and juries.

Sunday 17 July 2011

Using Bayes to prove Obama did not write his own book?

There have been many questions about the closeness of President Obama's relationship with Weather Underground terrorist Bill Ayers. A whole new angle on the relationship has been raised in Jack Cashill's book Deconstructing Obama. Using information in this book Andre Lofthus has applied Bayes' theorem to conclude that Bill Ayers actually was the ghost writer for Obama's best-selling book Dreams from My Father.

Lofthus's analysis is based on a) a comparison of Dreams with one of Ayers's own books, Fugitive Days; and b) a comparison of Dreams with a different book, Sucker Punch, based on similar material to that of both Dreams and Fugitive Days.

Specifically, in a) there were 759 similarities, of which 180 were categorized by Cashill as "striking similarities", whereas in b) Cashill claims there were just six definite similarities, with a maximum of sixteen possible or definite similarities. As Lofthus's Bayesian analysis is not complete I have done my own analysis here. My own conclusions are not as definite. The evidence does indeed provide very strong support in favour of the books being written by the same author. However, if you have a strong prior belief that the books were written by different authors (say you are 99.9% sure) then, even after observing the evidence of 180 striking similarities, it turns out that (with what I believe are more reasonable assumptions than those made by Lofthus) there is still a better than 50% chance that the books were written by different authors.
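The interplay between a strong prior and strong evidence can be sketched numerically. The likelihood ratio below is purely illustrative (it is not the figure from my analysis), but it shows how a 99.9% prior can survive apparently overwhelming evidence:

```python
# Illustrative only: a likelihood ratio of 500 in favour of 'same author'
# is assumed here; the actual value depends on the modelling assumptions.
prior_same = 0.001                # 99.9% sure the authors are different
prior_odds = prior_same / (1 - prior_same)

likelihood_ratio = 500            # evidence favours 'same author' 500:1 (assumed)
post_odds = prior_odds * likelihood_ratio
posterior_same = post_odds / (1 + post_odds)

print(round(posterior_same, 2))   # about 0.33: still >50% chance of different authors
```

With these numbers the evidence shifts belief dramatically (from 0.1% to about 33%), yet 'different authors' remains the more probable hypothesis.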

Monday 11 July 2011

An interesting Conference

I just got back from the International Centre for Comparative Criminological Research annual conference (programme is here) held at the Open University, Milton Keynes. There were some outstanding keynote speakers such as Lord Justice Leveson, Prof John Hatchard, Prof Jim Fraser and Dr Itiel Dror, and the panel included Iain McKie (ex-policeman and campaigner on behalf of his daughter Shirley McKie, who was wrongly accused of leaving her fingerprint at a crime scene and lying about it). When I was originally invited to speak at the conference I was going to talk about the latest research we were doing in collaboration with David Lagnado at UCL on using Bayesian networks to help build complex legal arguments (dealing with things like alibi evidence, motive and opportunity). But that was before the R v T ruling and its potentially devastating impact on the use of Bayes in English courts (the analogy would have been like talking about differential equations after being told that you were not allowed to use addition and subtraction). So I ended up doing a presentation based on our draft paper addressing the R v T ruling (my slides are here). This turned out to be a good move because I think it also addressed some of the core recurrent themes of the conference. Full report is here.

Wednesday 6 July 2011

Drug traces on banknotes: problems with the statistics in the legal arguments?

A colleague at our Law School has alerted us to a range of possible problems associated with evidence about drug traces in a number of cases. Most people are unaware of the extent to which drug traces are found on banknotes in circulation. As long ago as 1999 there were reports that 99% of banknotes in circulation in London were tainted with cocaine and that one in 20 of the notes showed levels high enough to indicate they had been handled by dealers or used to snort the drug. Similar results have been reported for Euros in Germany. Nevertheless, in England and Wales (although not, apparently, in Scotland) drug trace evidence on banknotes has been used to help convict suspects of drug-related offences. One company in the UK specialises in analysing drug traces on banknotes and has provided evidence in many cases. Even if we ignore the intriguing question of why high levels of drugs on a person's banknotes suggest that they are a drug dealer, a quick look at some of the case materials and papers suggests that there may be some fundamental statistical flaws in the way this kind of evidence is presented. I have produced a very simplified version of the problem in these slides (which include relevant references at the end).
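A minimal sketch of why the base rate matters here, using only the 1999 survey figure quoted above and a hypothetical bundle of notes (and the simplifying assumption that notes are independent draws from general circulation):

```python
# If 1 in 20 circulating notes carries 'dealer-level' contamination (the
# 1999 London figure quoted above), how likely is an entirely innocent
# person's bundle of notes to contain at least one such note?
# Assumes notes are independent draws from circulation - a simplification.
p_high = 1 / 20          # dealer-level contamination rate in circulation
bundle = 20              # hypothetical number of notes examined

p_at_least_one = 1 - (1 - p_high) ** bundle
print(round(p_at_least_one, 2))  # about 0.64
```

In other words, under these assumptions an innocent person holding 20 ordinary notes has a roughly two-in-three chance of possessing at least one 'dealer-level' note, so finding such notes is far less incriminating than it might sound.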

Friday 1 July 2011

An introduction to using Bayes and causal modelling in decision making, uncertainty and risk

We have produced a new overview paper on Bayesian networks for risk assessment. It was actually an invited paper for a special issue of Upgrade, the journal of CEPIS (Council of European Professional Informatics Societies).

I guess this means we can finally 'retire' our previous overview paper "Managing Risk in the Modern World" that we produced on behalf of the London Mathematical Society and Knowledge Transfer Network for Industrial Mathematics. 

Marco Ramoni: a great Bayesian and a great guy

I only just discovered the terrible news that Marco died last June (2010) aged just 47. Marco was actually the first consultant we ever employed at Agena back in 1998. He was working at the Open University and had developed the Bayesian Knowledge Discovery tool with Paola Sebastiani (with whom we also worked at the time at City University). We used Marco's expertise and tool on a project with a major insurance company. The last time I saw him was around 2000 or 2001, when I took him and Paola Sebastiani to a Turkish restaurant in North London to try to convince both of them to come and work at Queen Mary University of London, where I had just moved with my research group. Although they turned down my offer in favour of a move to the USA I kept in touch with Marco until quite recently. I only found out about his death when I tried to contact him about a new project I wanted to involve him in.

Marco was a brilliant man, and (unlike some academics) was also a pleasure to be around.

Why risk models used by financial analysts are fundamentally flawed

A  letter I sent to the Financial Times, 2 March 2011:

John Kay's analysis of why the models used by financial analysts are fundamentally flawed when it comes to predicting rare events ("Don't blame luck when your models misfire", FT 1 March 2011) is correct but overly pessimistic, as he focuses only on 'traditional' statistical techniques that rely on relevant historical data. These flawed methods cannot accommodate even simple causal explanations that involve new risk factors for which previous data has not been accumulated. It is like trying to predict what happens to the surface area of a balloon as you puff into it, by relying only on data from previous puffs of this balloon. If, after each puff, you measure the surface area and record it, and after, say, the 23rd puff you create a statistical model showing how surface area increases with each puff, you will then have a 'model' to predict what will happen after a further 20, 50 or 100 puffs. None of these predictions will tell you that the surface area will drop to zero as a result of the balloon bursting, because your model does not incorporate the basic causal knowledge.

Fortunately, and in contrast to the article's dire conclusions, there are formal modelling techniques that enable you to incorporate causal, subjective judgements about previously unseen risks, and allow you to predict rare events with some accuracy. We have been using such techniques - causal Bayesian networks - successfully in research and in practice for several years, in real applications ranging from transport accidents through to terrorist threats. We remain stunned by the financial markets' poor take-up of these methods as opposed to those which have consistently proved not to work.
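The balloon analogy can be made concrete with a toy calculation (a simple least-squares line fitted to invented data; not any real financial model):

```python
# Toy illustration of the balloon analogy: a straight line fitted to the
# first 23 puffs extrapolates happily forever, because 'bursting' appears
# nowhere in the data or the model. Data is invented for illustration.
puffs = list(range(1, 24))
areas = [3.0 * p for p in puffs]   # pretend surface area grows 3 units per puff

# Least-squares slope and intercept (exact here, since the data is linear)
n = len(puffs)
mean_x = sum(puffs) / n
mean_y = sum(areas) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(puffs, areas))
         / sum((x - mean_x) ** 2 for x in puffs))
intercept = mean_y - slope * mean_x

# The model confidently predicts a large area at puff 123...
print(intercept + slope * 123)     # 369.0 - but the balloon burst long ago
```

The fitted model is a perfect description of the data it has seen and a hopeless guide to the one event that matters, which is precisely the failure mode of purely data-driven risk models.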