In our new book, we cite the famous Harvard Medical School experiment where doctors and medical students were asked the following question.
"One in a thousand people has a particular heart disease. There is a test to detect this disease. The test is 100% accurate for people who have the disease and is 95% accurate for those who don't (this means that 5% of people who do not have the disease will be wrongly diagnosed as having it). If a randomly selected person tests positive, what is the probability that the person actually has the disease?"
The answer (as explained here) is a bit less than 2% (which is interesting because most of the people in the study gave the answer as 95%).
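For readers who want to check the figure, here is a minimal sketch of the Bayes' theorem calculation using the numbers in the question (prior of 1 in 1000, 100% true positive rate, 5% false positive rate). The function name and parameter names are my own, chosen for illustration.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | one positive test) by Bayes' theorem."""
    # Probability of having the disease AND testing positive
    true_pos = prior * sensitivity
    # Probability of not having the disease AND testing positive
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)


p = posterior_given_positive(prior=0.001, sensitivity=1.0, false_positive_rate=0.05)
print(f"{p:.3%}")  # prints 1.963%
```

The false positives among the 999 healthy people in every thousand swamp the single true positive, which is why the answer is so far from 95%.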
Now a reader has posed the following problem:
I understand the 2% answer of a random person having the heart disease. What is the probability of that person having the disease if a 2nd test comes back positive? What about a 3rd test?
I have added this as an exercise to Chapter 6 of the book and provided a full answer - using a Bayesian network. In summary, if the tests are really independent (with the same level of accuracy) then if two tests are positive the probability of the disease rises to 28.592%; and when three tests are positive it rises to 88.899%.
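The independent case does not strictly need a Bayesian network: it can be reproduced by applying Bayes' theorem repeatedly, using the posterior after each positive test as the prior for the next. The sketch below does this; it assumes every test has the same 100% true positive rate and 5% false positive rate, and the function name is illustrative rather than anything from the book.

```python
def posterior_after_positives(n, prior=0.001, sensitivity=1.0,
                              false_positive_rate=0.05):
    """P(disease | n independent positive tests), updating one test at a time."""
    p = prior
    for _ in range(n):
        numerator = p * sensitivity
        # The posterior from this test becomes the prior for the next one
        p = numerator / (numerator + (1 - p) * false_positive_rate)
    return p


for n in (1, 2, 3):
    print(n, f"{posterior_after_positives(n):.3%}")
# prints:
# 1 1.963%
# 2 28.592%
# 3 88.899%
```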
However, the tests can be dependent on each other, and it is also possible that some personal features of the patient lead to common test errors. Both situations (with particular assumed prior probabilities for the dependencies) lead to a far lower probative value of the multiple positive test results. As shown in the solution, we get:
When the tests are directly dependent and two are positive the probability of disease only increases to 3.229%; with all three positive the probability only increases to 4.002%.
When there is a common source of error and two tests are positive the probability of disease increases to 13.805%; with all three tests positive the probability increases to 64.023%.
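To see why a common source of error weakens repeated positive results, here is a rough sketch of one possible common-cause model. It is not the Bayesian network from the book and it will not reproduce the figures above: the hidden patient trait, its 2% frequency, and its 90% false positive rate are hypothetical values picked purely for illustration. The false positive rate for patients without the trait is derived so that the overall rate for a single test stays at 5%.

```python
def posterior_independent(n, prior=0.001, fpr=0.05):
    """P(disease | n positives) when test errors are independent."""
    return prior / (prior + (1 - prior) * fpr ** n)


def posterior_common_cause(n, prior=0.001, marginal_fpr=0.05,
                           trait_prob=0.02, trait_fpr=0.9):
    """P(disease | n positives) when a hidden trait drives false positives.

    trait_prob and trait_fpr are hypothetical illustration values.
    The true positive rate is assumed to be 100%, as in the original question.
    """
    # False positive rate without the trait, chosen so that a single test
    # still has an overall false positive rate of marginal_fpr
    base_fpr = (marginal_fpr - trait_prob * trait_fpr) / (1 - trait_prob)
    # P(n positives | no disease), averaged over having or not having the trait
    p_pos_healthy = (trait_prob * trait_fpr ** n
                     + (1 - trait_prob) * base_fpr ** n)
    return prior / (prior + (1 - prior) * p_pos_healthy)


for n in (1, 2, 3):
    print(n, f"{posterior_independent(n):.3%}", f"{posterior_common_cause(n):.3%}")
```

With one test the two models agree, because the overall false positive rate is the same. With two or three positives the common-cause posterior is lower, since a healthy patient with the trait is likely to fail every test, so each extra positive adds less evidence.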