If a screening test is 90% accurate, and your result comes back positive, what are the chances it is a false positive, asks Michael Blastland in his regular column.
Browsing the web recently, I found a fascinating article about screening for terrorists and it's made me think about accuracy and uncertainty.
Imagine you've invented a machine to detect terrorists. It's good, about 90% accurate. You sit back with pride and think of the terrorists trembling.
Conventional lie-detector or polygraph accuracy has been claimed to be 90% but this is doubtful. Most independent experts think it's more like 60% - not much better than tossing a coin.
But your invention is the real deal, it really is 90% accurate. It's quick, light, portable and works by detecting patterns of brain activity and facial movement known to match terrorist intent.
You're in the Houses of Parliament demonstrating the device to MPs when you receive urgent information from MI5 that a potential attacker is in the building. Security teams seal every exit and all 3,000 people inside are rounded up to be tested.
The first 30 pass. Then, dramatically, a man in a mac fails. Police pounce, guns point.
How sure are you that this person is a terrorist?
The answer: about 0.3%.
If 3,000 people are tested, and the test is 90% accurate, it is also 10% wrong. So it will probably flag 301 people as terrorists - about 300 of them by mistake and one correctly. You won't know from the test which is the real terrorist. So the chance that our man in the mac is the real thing is 1 in 301.
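The arithmetic can be checked in a few lines of Python. As in the column's example, the single 90% figure is assumed to apply equally to the guilty and the innocent, and the one real terrorist is assumed to be caught:

```python
# One real terrorist among 3,000 people; the test is right 90% of
# the time, so it wrongly flags 10% of the innocent.
population = 3000
accuracy = 0.90

innocents_flagged = round((population - 1) * (1 - accuracy))  # ~300 false positives
terrorists_flagged = 1                                        # the real one, assumed caught

# Chance that any one flagged person is the real terrorist:
chance = terrorists_flagged / (terrorists_flagged + innocents_flagged)
print(f"{chance:.1%}")  # about 0.3%
```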
That a good test can leave us so uncertain about any individual is a head-spinner to many. The problem is the false positives: tests that say you've found what you are looking for but are wrong, and which wreak particular havoc with the results when what you are looking for is rare. That means most of your mistakes apply to those you are not looking for.
Go Figure has been puzzling over how to make all this more intuitive and invites readers to send in their own ideas.
Here are a couple of suggestions.
The first is to visualise the numbers. In the picture below, four pixels = 10,000 people. The whole area is the population of the United States - about 300m people. The dark blue area is roughly how many would be suspected of terrorism by a screening process with 90% accuracy - about 30m. On this scale, the area representing the number who are real terrorists - let's say 300 people, of whom 30 would be missed - is too small to see on screen so we've blown up one pixel to show the proportion.
The second suggestion is that whenever we discuss screening, be it of terrorists, HIV, cancer or anything else, we should try to refocus. Any mention of screening for terrorists causes all our attention immediately to zoom into those who really are terrorists. We think of the individuals and how a 90% accurate test would work on one of them. We zoom into the white area and forget the blue.
Refocus. Get into the habit of thinking about screening the light blue area too.
How would this work in practice? Whenever we hear what's being screened for, we should switch it around to think about the opposite. So, screening for terrorists with 90% accuracy? Think about screening all the non-terrorists for innocence - and being wrong about 10 people in every 100. Imagine them all, virtually the whole population, 10% of whom might become suspects.
Screening for HIV with 99.9% accuracy? Switch it around. Think also about screening the millions of non-HIV people and being wrong about one person in every 1,000.
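The switch-it-around habit can be written as a small function. The prevalence figure below is an illustrative assumption for the sake of the arithmetic, not real epidemiological data:

```python
def share_of_positives_that_are_false(accuracy, prevalence):
    """Of everyone who tests positive, what fraction is a false alarm?

    Assumes, as the column does, a single 'accuracy' figure that
    applies to both the affected and the unaffected.
    """
    true_pos = prevalence * accuracy
    false_pos = (1 - prevalence) * (1 - accuracy)
    return false_pos / (true_pos + false_pos)

# A 99.9% accurate test, with an assumed prevalence of 1 in 1,000:
print(share_of_positives_that_are_false(0.999, 0.001))  # 0.5 - half the positives are wrong
```

Even at 99.9% accuracy, when the condition is as rare as the test is accurate, a positive result is no better than a coin flip.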
For another visually captivating method, try the brilliant animations on the Understanding Uncertainty website - read the first page then click on "testing" - which encourage us to think of real people rather than percentages.
See too data-viz guru Howard Wainer's discussion of false positives.
• "We don't know."
It has become a refrain, the answer to almost every question. I'm discussing swine flu with someone who is studying how it spreads - someone based at one of our leading medical institutions, highly experienced and capable, who might have been expected to know quite a lot.
"We don't know."
I'm not asking for clairvoyance; this is not about what will happen in the future. All I want to know - all this researcher wants to know but can't find out - is simple stuff: how many people who had swine flu in the first few months since it emerged are believed to have caught it abroad? What proportion of people with reported swine flu have been hospitalised? Do they tend to be younger or older? How soon after the first symptoms do they start antiviral treatment?
Someone, somewhere might have a slightly better idea of some of this, but my academic friend, whose job it is to try to understand the illness, is exasperated by the difficulty of finding out the basics.
Statistical models of the spread of disease are never perfect, but they can help. If they are to be remotely useful, they need some reasonable numbers to start with. Otherwise, as the old adage has it: rubbish in, rubbish out.
Although there's a rough total of reported cases, we don't know how many there have really been, and how often these are serious, because we have little idea how many people have sub-clinical symptoms. Little idea how many treat themselves without reference to the health service. Little idea what proportion of the total finish up in hospital. Little idea how accurate the diagnoses are now that diagnosis is no longer confirmed with a blood test.
The numbers you see quoted in the media are bound to be crude. How crude, we don't know.
As so often with data, it is the simple business of counting things and keeping consistent, accurate records that turns out to be where the glitches occur. That's just a lot harder than it seems. Not only do we not know where we are going to be with swine flu in a few months' time, we don't really know where we are.
But there's also a perverse comfort in some of this. If there is a huge amount of mild swine flu we don't know about, the proportion of cases that are serious is correspondingly reduced.
The way I like to explain false positives to people is by what effect it would have on them - for 90% accuracy it means that everyone could have someone in their immediate family (parents, children, siblings) labelled as a terrorist. Make it personal and people start to really comprehend the problem.
Peter Clarke, Auckland, NZ
You could try the visualisation method from slide five of
which deals with forecasting extreme rain events. Of course, if your event is rare enough, the most accurate forecast available will simply be to say "no" every time (e.g. no-one is a terrorist). More usefully, your forecasting method should be as accurate as the probability of NOT finding the event (e.g. 90% accurate is fine for finding something which has a probability of 10%). Candy Spillard, York, UK
Ninety per cent accuracy means that 90 times in 100 the machine will be correct ABOUT ANY ONE PERSON; 10 times in 100 it will be incorrect.
Jelani Crue, UK
You discuss the problems around using a detection system with a 90% accuracy. It should be noted that these problems diminish rapidly if you have a second independent system with a similar level of accuracy, and obviously concentrate on the double positives.
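Lawson G's suggestion can be sketched in Python. It assumes the two tests' errors really are independent - a big assumption in practice, since two detectors reading the same signals would tend to fail on the same people:

```python
population = 3000
accuracy = 0.90

# Chances of being flagged by both of two independent tests:
p_guilty_flagged_twice = accuracy ** 2              # 0.81 for the real terrorist
p_innocent_flagged_twice = (1 - accuracy) ** 2      # 0.01 for each innocent

expected_true = 1 * p_guilty_flagged_twice                    # ~0.81
expected_false = (population - 1) * p_innocent_flagged_twice  # ~30 innocents fail both

chance = expected_true / (expected_true + expected_false)
print(f"{chance:.1%}")  # roughly 2.6% - better than 0.3%, but still far from certainty
```

So double testing helps by roughly a factor of ten here, but the double positive is still overwhelmingly likely to be innocent.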
Lawson G, Taunton
Of course in reality you let the MPs, staff, families etc. go home, then just test the people that are left (the guys in the macs). Just as you don't subject random members of the public to lie detector tests, nor do OAP nuns seek HIV tests.
Finn, London, UK
The very term 90% accurate is the sort of thing that might get trumpeted in a media headline but is actually meaningless. Does this mean it catches 9 out of 10 terrorists? Or does it mean it catches 1 out of 10 innocent people? You have assumed the latter to make your point, but in real world situations the chance of false positives is not the same as the chance of false negatives. In fact great effort goes into minimising one and maximising the other. Didn't you make this point in a very early column, in a medical context? It should be made again here.
Ian Nartowicz, Stockport, England
Tests used in medicine often have a specificity and sensitivity rather than just "accuracy"; I think on the whole this is more useful, and reflects the trade-offs inherent in many such tests between being too sensitive (finding all your terrorists, but picking up a lot of innocent people) and too specific (only finding terrorists, but missing a few of them).
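Matthew's distinction can be made concrete with the standard definitions (sensitivity = chance of catching a real case, specificity = chance of clearing an innocent one). A sketch, using the column's 1-in-3,000 scenario:

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Chance that a positive result is a true positive."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

base = positive_predictive_value(0.9, 0.9, 1 / 3000)
sharper = positive_predictive_value(0.9, 0.99, 1 / 3000)  # fewer innocents flagged
print(f"{base:.1%} -> {sharper:.1%}")  # about 0.3% -> about 2.9%
```

Note that raising specificity from 90% to 99% does far more for a rare condition than raising sensitivity would, which is exactly the trade-off the comment describes.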
Matthew, Coventry, UK
The concept of false positive and false negative (and true positive and negative) results from any test is hugely important and needs to be better explained to the general public. Only then can discussions about e.g. breast cancer screening or any mass testing process be properly understood.
DavidF, Watford, England