Saturday 26 September 2020

The impact of false positives in Covid testing


(Updated 30 Sept).

There has been much recent debate - and controversy - about the impact of false positives in Covid testing. As I showed in this video: if the current rate of infection is 1 in 200 (i.e. 1 in every 200 people is currently infected with the virus) and if a person is selected at random to be tested then, if that person tests positive, there is actually only about a 1 in 6 chance (less than 17%) the person actually has the virus. This assumes the test has a 2% false positive rate (i.e. 2 out of every 100 people who don't have the virus will wrongly test positive) and a 20% false negative rate (i.e. 20 out of every 100 people with the virus will wrongly test negative).
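The 1 in 6 figure follows directly from Bayes' theorem. As a minimal sketch (the function and variable names here are illustrative, not taken from the video), the calculation can be written as:

```python
def positive_predictive_value(prior, false_positive_rate, false_negative_rate):
    """Probability that a person who tests positive really has the virus."""
    sensitivity = 1.0 - false_negative_rate          # P(positive | infected)
    true_positives = prior * sensitivity             # infected and testing positive
    false_positives = (1.0 - prior) * false_positive_rate  # not infected but testing positive
    return true_positives / (true_positives + false_positives)


# Randomly selected person: 1 in 200 prior, 2% false positives, 20% false negatives
ppv_random = positive_predictive_value(
    prior=1 / 200, false_positive_rate=0.02, false_negative_rate=0.20
)
print(f"{ppv_random:.3f}")  # about 0.167, i.e. roughly 1 in 6
```

Out of every 10,000 people tested under these assumptions, about 40 are true positives and about 199 are false positives, which is why the false positives dominate.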

Obviously the 1 in 200 pre-test (i.e. 'prior') probability assumption is critical. A person who is tested because they have been in contact with somebody confirmed as having the virus will have a much higher pre-test probability of having the virus. If we assumed it was 50% then, if that person tests positive, there is a 97% chance they have the virus.
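To see how strongly the result depends on the prior, here is a small sketch that repeats the same Bayes calculation for several pre-test probabilities, keeping the 2% false positive and 20% false negative rates fixed. The list of priors is chosen only for illustration.

```python
def ppv(prior, fpr=0.02, fnr=0.20):
    """Post-test probability of infection given a positive result."""
    true_pos = prior * (1.0 - fnr)
    false_pos = (1.0 - prior) * fpr
    return true_pos / (true_pos + false_pos)


# Illustrative pre-test probabilities, from a random person up to a close contact
priors = [0.005, 0.01, 0.05, 0.20, 0.50]
table = {p: ppv(p) for p in priors}

for p, v in table.items():
    print(f"pre-test {p:6.1%} -> post-test {v:6.1%}")
```

With a 50% prior the result is about 97.6%, in line with the figure quoted above; with the 0.5% prior it falls below 17%.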

The BMJ have produced an excellent infographic which allows you to adjust all of the key parameters (namely the pre-test probability, false positive rate, and false negative rate). However, there is a severe limitation. The graphic does not allow you to enter pre-test probabilities of less than 1% (as I found out when I tried to enter the value 0.5% that I had used in my video - it automatically rounded it up to 1%). This is a curious limitation, given that the current infection rate is widely believed to be much lower than 1%; if it was 1% this would mean 680,000 people in the UK were infected right now, i.e. not including those who were previously infected (if it was that high this would confirm the belief of many that the virus goes unnoticed in most people).

Moreover, it is also very curious that the default setting in the BMJ infographic has the pre-test probability set at a ludicrously high 80%. Even for a person with symptoms and having been in contact with a person with Covid this is actually too high (see this post and video). With that prior assumption somebody testing positive is, of course, almost certain to have the virus.

By focusing on the notion that people getting tested have a relatively high pre-test probability of having the virus, an article in the Huffington Post uses the BMJ infographic to hammer those claiming that most people testing positive do not have the virus. For example, they suggest a scenario where the pre-test probability is 20% and the false positive rate is 1%. With these assumptions, somebody testing positive has a 94% chance of having the virus. 

In reality there is massive uncertainty about all of the three parameters as explained in this article. Very early on during this crisis we argued (see also the other links below) that a more intelligent approach to data collection and analysis was needed to learn these parameters; in particular, there was a need to consider causal models to explain observed data. A basic causal model showed that it was critical to distinguish between people who had no, mild and severe symptoms, both when recording those being tested and when recording those testing positive. Yet there are no publicly available data on those being tested which make these distinctions (we just have 'total tested' per day) and neither do we have them for those testing positive; all we can do is make some crude inferences based on the number hospitalised, but even then this published daily number includes all patients hospitalized with Covid, not just those hospitalized because of Covid. And, even worse, there are fundamental errors in some of the UK data on hospital admissions.

There is even less truly known about the accuracy of the various tests (*see the statement below by my colleague Dr Scott McLachlan on this issue) because, in the absence of any 'gold standard' test for Covid, there is no way to determine this accuracy. And there have been no independent evaluations of test accuracy.

There is plenty of anecdotal evidence that most people testing positive either really don't have the virus or will be totally unaffected by it. For example, as part of the standard testing of professional footballers earlier this week 18 out of 25 of Leyton Orient's players - and many of their staff - tested positive. None of these people had - or will have - any symptoms at all. The same is true of the many other footballers and staff (including many older managers/trainers) who have tested positive and the hundreds of Scottish University students who tested positive this week.

Update: See important article "Waiting for Zero" by Dr Clare Craig.

That is why I have been posting updates on Twitter of my graph of cases (i.e. positive tests) per 1000 tested, as a contrast to the naive cases-only graph which all the media post (obviously cases have been rising as the number of tests has been rising). This simply takes the (separate) daily cases (i.e. those testing positive) and number tested from the Government website and divides the former by the latter:
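The calculation behind that graph is simple enough to sketch. The function name and the daily figures below are hypothetical, made up purely to illustrate the arithmetic; they are not values from the Government website.

```python
def cases_per_thousand(daily_cases, daily_tested):
    """Positive tests per 1000 people tested, day by day."""
    if len(daily_cases) != len(daily_tested):
        raise ValueError("need one 'tested' figure for every 'cases' figure")
    # Days with no tests recorded have no meaningful rate, so mark them None
    return [
        1000.0 * cases / tested if tested > 0 else None
        for cases, tested in zip(daily_cases, daily_tested)
    ]


# HYPOTHETICAL figures: raw cases quadruple while testing rises fivefold
example_cases = [100, 200, 400]
example_tested = [20_000, 50_000, 100_000]

rates = cases_per_thousand(example_cases, example_tested)
print(rates)  # [5.0, 4.0, 4.0]
```

In this made-up example the raw case count rises fourfold while the rate per 1000 tested actually falls, which is the kind of difference the cases-only graph hides.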

It is curious that the website produces all kinds of plots but not this most obvious and informative one. As we explained here this shows that, contrary to the Government claims made this week, there is no real evidence of exponential growth in the virus.

People have been responding by saying 'ah but in April we were only testing those hospitalized with severe symptoms'; this is generally true (in fact the testing 'strategy' - i.e. who primarily gets tested - has changed several times, which is why we argued long ago that it needs to be factored into a causal model); so the proportion of positives among those tested in April was obviously a lot higher than now. However, it is also the case that the proportion of people now testing positive who will be totally unaffected by the virus (whether they have it or not) is also much higher. That is why we need to distinguish between mild, severe and asymptomatic cases. We really need to see the plot of severe cases and, as I mentioned above, hospitalizations are the best approximation we have for that. However, that is also compromised for reasons explained above and by the fact that we are now entering the normal flu season, when hospital admissions inevitably rise significantly.


*On the issue of what is known about test accuracy, Dr Scott McLachlan says:

During the last five months we have collected every preprint and journal publication that we could locate on COVID-19 testing with rt-PCR and antibodies. The issues of false positives (FP) and false negatives (FN) are more complicated than test product developers, some academic authors, and the mass and social media have presented. 

First, discussion of a single FP or FN rate completely misses the fact that there are multiple tests from different vendors being used at the moment. The NHS alone are using at least five different primary rt-PCR tests. Each has a different manufacturer or seeks a different RNA target and therefore has a different sensitivity and specificity profile. What we do not have as yet is independent laboratory verification of the manufacturers' claimed sensitivity and specificity. As well as those five there is a range of perhaps three or four others, including the DNANudge cartridge test that comes from the UCL company that also markets a DNA NFC identity wristband in malls such as White City Westfield. Their test documentation focuses predominantly on their claims of near zero FNs - because FNs have been the leading subject of much media and academic literature in recent weeks - and brushes over the fact that, by their own numbers, they make around 3% FP.

Second, we have very little in the way of credible independent 3rd party verification for any of the COVID tests. Everything we get at the moment either consists of self-validation by the manufacturer lab (which should not be accepted wholesale without independent verification but sadly has been during COVID times), or of poorly constructed literature from well-meaning medics who, in one example, used an unvalidated PCR test as the standard by which to assess the accuracy of chest CT/CXR for diagnosing COVID (for the record, the CT did far better than the PCR test they used as *cough gold-standard… and they ended up acknowledging the PCR test resulted in FNs and FPs at far higher levels than expected). 

 As best as we have been able to identify from the literature collected that has assessed the rt-PCR and lab-based antibody tests:

  • FNs are occurring at a rate of between 3-30% for rt-PCR COVID-19 tests - depending on the test type, manufacturer, and the lab that ran the tests. 
  • FPs are occurring at a rate of between 0.8% (lowest value accepted in the literature) and 7.9% (in a recent EU-based preprint) for rt-PCR COVID-19 tests. 
  • Sens/Spec for rt-PCR COVID-19 tests ranges from 87%-100% depending on which manufacturer's test and whether they performed their own testing or wrote their own academic report. 
  • The antibody tests have an accuracy of somewhere between 30 and 93%, again depending on whose antibody test you review, whether it was IgG or IgM, and whether they averaged the score of all antibodies the test assayed for, or reported them individually. Antibody tests tended to be really good at identifying one antibody (often IgG), and less accurate or specific for the other (most often IgM).
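To give a feel for what the rt-PCR ranges above could mean in practice, the sketch below plugs the two extremes into the Bayes calculation used at the top of this post. It assumes the same 1 in 200 pre-test probability for a randomly selected person; that prior, and the pairing of the lowest FP rate with the lowest FN rate (and highest with highest), are assumptions made only for illustration.

```python
def ppv(prior, fpr, fnr):
    """Post-test probability of infection given a positive result."""
    true_pos = prior * (1.0 - fnr)
    false_pos = (1.0 - prior) * fpr
    return true_pos / (true_pos + false_pos)


PRIOR = 1 / 200  # ASSUMPTION: randomly selected person, as in the opening example

# Most favourable end of the reported ranges: 0.8% FP, 3% FN
best_case = ppv(PRIOR, fpr=0.008, fnr=0.03)
# Least favourable end of the reported ranges: 7.9% FP, 30% FN
worst_case = ppv(PRIOR, fpr=0.079, fnr=0.30)

print(f"post-test probability between {worst_case:.1%} and {best_case:.1%}")
```

Under these assumptions the chance that a randomly selected person who tests positive really has the virus lies somewhere between roughly 4% and 38%, which shows how much the uncertainty about test accuracy matters.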

See also:




  2. Dear Prof. Fenton,

    I have a couple of questions about your Bayes presentation. I am familiar with Bayesian belief networks and used them in my own PhD work.

    First your assumption of a 2% false positive rate. If this is the case, how can you explain the ONS survey results? In the recent ONS infection survey update they found 419 positives in 219732 swab tests over the last six weeks. So even if all of those were false positives, that would, by my calculations give a false positive rate of just 0.14%. (Though I must admit from my own experience of ROC curve analysis, a specificity of 99.86% seems exceptionally high). They also estimate 1 in 500 with the virus, not 1 in 200 as in your example.

    However, my real problem with the presentation is this: Your "Sarah" is assumed to be a randomly selected member of the population. However, in the daily test positive results published, it would normally be the case that a person goes for a test for a reason - they might have developed symptoms and wanted to get it checked, or they might have been contact-traced. This would imply that a person going for a test (excluding random surveys) would have a much higher prior probability of having Covid than a random member of the population, either by virtue of having symptoms, or having been in contact with someone who has the virus.

    In other words, the set of people tested in Pillar 1 and 2 is not a typical random sample, and you would have to adjust the pre-test probability (Prior) accordingly in the Bayes theorem calculation. I did some calculations with Bayes theorem given the ONS estimate of FPR, and the current prevalence of the disease, and found that it gave a Positive Predictive Value of about 98%, much greater than the 17% in your example.

    I commented on this on Carl Heneghan's Twitter feed when he raised this issue, and wrote to one of his post-docs who is a former colleague of mine, but received no reply.

    I'd be interested to hear your comments - what have I missed?

    1. Sorry I don't get alerts to these blog posts and have only just seen this. Will respond later.

    2. Regarding your first question: "how can there be a 2% FP rate when the proportion of positives reported is 0.14%?". You partly addressed this in your second comment, but another key point is that I understand that any positive found - if the ct value used was >40 (which is apparently common) - is supposed to be retested. So the reported number of people testing positive does not really tell us a lot about the overall test FP. Regarding your second question (the 'real problem') of Sarah, I thought I explicitly dealt with that near the beginning of the article in the discussion about the 'prior'.

  3. I want to add something to the above comment.

    It has now become clear that the recent surge in cases is largely university students. Northumbria reported 770 positives but only 78 symptomatic. Since the govt guidance if you're contact-traced is to self-isolate and only get a test if you develop symptoms, the question has to be asked why on earth are they testing people who don't have symptoms? In which case your argument about false positives may well come into consideration.

    Also it occurs to me, regarding the very high specificity deduced by ONS (originally they reported 50-60 cases in 116,000 swab tests, or at least 99.96% specificity), that it may well be the case that for their survey they carried out the test analysis under ideal conditions, and that this high specificity may not be achieved in private labs struggling to get the demand through quickly, leading to greater opportunities for human error, cross-contamination etc.

  4. Do you think PCR tests have improved since you published this post? I'm looking to fly soon from Liverpool and need to take a test; any feedback is most welcome.