Probability and Risk: July 2020

Sunday, 19 July 2020

A privacy-preserving Bayesian network model for personalised COVID19 risk assessment and contact tracing

Concerns about the practicality and effectiveness of using Contact Tracing Apps (CTA) to reduce the spread of COVID19 have been well documented and, in the UK, led to the abandonment of the NHS CTA shortly after its release in May 2020.

One of the key non-technical obstacles to widespread adoption of CTA has been concern about privacy. In this new paper our group presents a causal probabilistic model (a Bayesian network) that provides the basis for a practical CTA solution that does not compromise privacy. Users of the model can provide as much or as little personal information as they wish about relevant risk factors, symptoms, and recent social interactions. The model then gives them feedback about the likelihood of the presence of asymptomatic, mild or severe COVID19 (past, present and projected). When the model is embedded in a smartphone app, it can be used to detect new outbreaks in a monitored population and identify outbreak locations as early as possible. For this purpose, the only data that needs to be collected centrally is the probability that the user has COVID19 and the GPS location.

The paper contains details of how to download and run the model.

Fenton, N., McLachlan, S., Lucas, P., Dube, K., Hitman, G., Osman, M., Kyrimi, E., Neil, M. (2020). A privacy-preserving Bayesian network model for personalised COVID19 risk assessment and contact tracing. MedRxiv, 2020.07.15.20154286. https://doi.org/10.1101/2020.07.15.20154286

And here is a 7-minute video showing the model in action:

Thursday, 16 July 2020

The need for causal models to understand and explain whether statistics provide evidence of racially biased policing

Even before the recent George Floyd case, there had been much debate about the extent to which claims of systemic racism are supported by statistical evidence. Using the same data, different analyses have reached contradictory conclusions about whether unarmed blacks are more likely than unarmed whites to be shot by police. The problem is that, by relying only on data of ‘police encounters’, there is the possibility that genuine bias can be hidden.

In this short paper we provide a causal Bayesian network model to explain this bias – which is called collider bias or Berkson’s paradox – and show how the different conclusions arise from the same model and data. We also show that causal Bayesian networks provide the ideal formalism for considering alternative hypotheses and explanations of bias.
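To see how restricting attention to police encounters can hide a difference, here is a minimal sketch with invented counts. The groups "A" and "B", the numbers, and the function name `rates` are all hypothetical and are not taken from the paper or from any dataset; the sketch only illustrates the arithmetic, not the paper's Bayesian network.

```python
# Hypothetical counts, invented purely for illustration (not real data).
# Group B is stopped five times as often as group A.
groups = {
    "A": {"population": 100_000, "encounters": 1_000, "shootings": 10},
    "B": {"population": 100_000, "encounters": 5_000, "shootings": 30},
}


def rates(group):
    """Return (shootings per encounter, shootings per 100,000 people)."""
    per_encounter = group["shootings"] / group["encounters"]
    per_100k = group["shootings"] * 100_000 / group["population"]
    return per_encounter, per_100k


for name, group in groups.items():
    per_encounter, per_100k = rates(group)
    print(f"Group {name}: {per_encounter:.1%} per encounter, "
          f"{per_100k:.0f} per 100,000 people")
```

With these made-up counts, group B looks less likely to be shot per encounter (0.6% against 1.0%), yet is shot three times as often per head of population. Which comparison is the right one depends on why the encounter rates differ, which is the kind of question a causal model is meant to make explicit.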

Fenton N.E., Neil M, Frazier S (2020), "The role of collider bias in understanding statistics on racially biased policing", http://arxiv.org/abs/2007.08406

Wednesday, 8 July 2020

UK Covid19 death rates by religion: Jews by far the highest and atheists by far the lowest 'overall' - but what does it mean?

15 July 2020 Update: an updated pdf version of this article is available here.

The most recent UK Office for National Statistics (ONS) report on Covid19 deaths by religion (covering the period 2 March - 15 May) provides the overall number of fatalities for each religious group but, curiously, provides no simple overall fatality rate by religious group. So, I have done it myself.

Using Table 2 of the report (which provides the total deaths per religious group) and Table 1 of the report (which provides the population proportion per religion) and assuming the UK population size is 65 million, we get the following table of deaths per 100,000 by religion:

So, looking only at these population totals, Jews (by far) and then Christians have the highest death rate with atheists (no religion)* by far the lowest.
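The per-100,000 calculation described above can be sketched as follows. The 65 million population figure is the assumption stated in the text; the function name `deaths_per_100k` and the example inputs are placeholders of my own and are not the ONS figures.

```python
UK_POPULATION = 65_000_000  # population size assumed in the text


def deaths_per_100k(deaths, population_share, total_population=UK_POPULATION):
    """Crude death rate per 100,000 for one group.

    deaths           -- total deaths in the group (Table 2 of the report)
    population_share -- the group's share of the population, as a fraction
                        between 0 and 1 (Table 1 of the report)
    """
    group_population = population_share * total_population
    return deaths * 100_000 / group_population


# Placeholder inputs for illustration only; these are NOT the ONS values.
example = deaths_per_100k(deaths=500, population_share=0.01)
print(round(example, 1))
```

Note that this is a crude rate: it takes no account of the age profile of each group.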

Now, while there are many Black Christians who come under the “BAME” (Black And Minority Ethnic) classification, there are very few Black Jews in the UK. So the results here seem to contradict the widely accepted narrative about BAME being ‘by far’ the highest risk group.

The question is whether an obvious confounding factor like age is causing a Simpson's paradox effect here, whereby one class of people has the highest overall rate even though a different class has the highest rate in each age sub-category. For example, as Dana Mackenzie shows for US statistics:

although in every age category (except ages 0-4), whites have a lower case fatality rate than non-whites, when we aggregate all of the ages, whites have a higher fatality rate. The reason is simple: whites are older.
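The effect Mackenzie describes is easy to reproduce with made-up numbers. In the sketch below every count is hypothetical (they are not the US or UK figures), and the group and age-band labels are my own; the only point is that a group can have the lower fatality rate in every age band and still have the higher rate overall, simply because it is older.

```python
# (cases, deaths) per age band; all counts are hypothetical.
data = {
    "older group":   {"under 65": (1_000, 10),  "65 and over": (9_000, 900)},
    "younger group": {"under 65": (9_000, 180), "65 and over": (1_000, 150)},
}


def fatality_rate(cases, deaths):
    """Deaths as a fraction of cases."""
    return deaths / cases


def overall_rate(bands):
    """Fatality rate after pooling all age bands together."""
    total_cases = sum(cases for cases, _ in bands.values())
    total_deaths = sum(deaths for _, deaths in bands.values())
    return total_deaths / total_cases


for group, bands in data.items():
    per_band = {band: f"{fatality_rate(*counts):.1%}"
                for band, counts in bands.items()}
    print(group, per_band, f"overall {overall_rate(bands):.1%}")
```

Here the older group has the lower rate in both bands (1% against 2%, and 10% against 15%) but the higher rate overall (9.1% against 3.3%), because nine tenths of its cases fall in the high-risk band.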

So, is that what we have here also, i.e. is it all explained by the fact that Jews and Christians are older?

Well, according to the statistical analysis in the ONS report it may be to a certain extent. The report uses ‘age standardized mortality rates’ to take account of the age distribution differences and concludes that Muslims, rather than Jews, have the highest fatality risk (something which seems very surprising given the above table).

However, the report does not define how the ‘age standardized mortality rates’ are calculated and it does not provide the raw data to check the results either (just as this Barts study failed to provide the necessary raw data to check if its bold claims about higher risk for BAME people were valid). Another concerning aspect of the report is that a lot of it focuses on the under 65s. Yet the total number of fatalities in the under 65s is dwarfed by the number of fatalities in the over 65s.
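Since the report does not say how its rates were calculated, the following is only a sketch of the textbook "direct" method of age standardization, which may or may not be what the ONS used. Each group's age-specific death rate is weighted by the share of that age band in a fixed standard population. The weights, counts, group names and function names below are all hypothetical.

```python
# Share of each age band in a standard population (hypothetical weights
# chosen for illustration; they must sum to 1).
standard_weights = {"under 65": 0.8, "65 and over": 0.2}

# Hypothetical deaths and population counts per age band for two groups.
older_group = {
    "deaths": {"under 65": 10, "65 and over": 300},
    "population": {"under 65": 50_000, "65 and over": 50_000},
}
younger_group = {
    "deaths": {"under 65": 27, "65 and over": 80},
    "population": {"under 65": 90_000, "65 and over": 10_000},
}


def crude_rate(group):
    """Deaths per 100,000, ignoring age structure."""
    total_deaths = sum(group["deaths"].values())
    total_population = sum(group["population"].values())
    return total_deaths * 100_000 / total_population


def age_standardized_rate(group, weights):
    """Directly standardized deaths per 100,000: a weighted sum of the
    group's age-specific rates, using the standard population's weights."""
    return sum(
        weights[band] * group["deaths"][band] * 100_000
        / group["population"][band]
        for band in weights
    )


for name, group in [("older", older_group), ("younger", younger_group)]:
    print(name, round(crude_rate(group), 1),
          round(age_standardized_rate(group, standard_weights), 1))
```

With these invented counts the older group has the higher crude rate (310 against 107 per 100,000) but the lower standardized rate (136 against 184), which is the kind of reversal that makes it important to know exactly how the standardization was done and to have the raw data to check it.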

Our approach** to this problem is to construct causal (probabilistic) models such as the one below (this is, of course, also the approach recommended by Pearl and Mackenzie in their excellent "Book of Why").

The kind of causal model required to fully understand the impact of religion and ethnicity on Covid19 death risk (dotted nodes represent variables that cannot be directly observed)

Note that there are many factors other than just age that must be incorporated into any analysis of the observed data before making definitive conclusions about risk based on religion/ethnicity. Moreover, if we discount unknown genetic factors, then religion and ethnicity have NO impact at all on a person's Covid19 death risk once we know their age, underlying medical conditions, work/living conditions, and extent of social distancing.

Fenton, N. (2020). A Note on UK Covid19 death rates by religion: which groups are most at risk? http://arxiv.org/abs/2007.07083

Thanks to Georgina Prodhan for alerting us to the ONS report.

*It is fair to assume these are atheists because these are people who declared "no religion" as opposed to those who did not declare any religion (i.e. those who fall into the category "not stated or required")

** Some of our recent work on causal models on Covid19:
