The cost of false positives in detection (lessons from public health)
More is not always better. This is especially true for screening and detection systems.
False positives carry a hidden cost: they can lead users, administrators, or managers to work around or disable the detection/protection mechanism entirely. Here are a few publicized examples of false positives in information security:
- invalid self-signed SSL certificate warnings in web browsers
- false positives in intrusion detection and prevention
- over-classification of data or documents
- false positives in software static code analysis (for an in-depth discussion, see the empirical research on Google’s experience with the open-source FindBugs tool)
- Windows Vista “Nagware” (a.k.a. “Security Through Endless Warning Dialogs”)
We need to be able to steer away from policies, designs, or controls whose detection/prevention costs exceed their benefits. No security measurement or management program can be considered complete unless it includes an assessment of the likely costs of false positives.
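To make that cost/benefit comparison concrete, here is a minimal back-of-the-envelope sketch. All of the numbers (alert volume, triage cost, loss prevented) are hypothetical, invented purely for illustration; in practice they would come from your own measurement program.

```python
# Back-of-the-envelope comparison of a control's operating cost vs. its benefit.
# Every number below is hypothetical, chosen only to illustrate the arithmetic.

alerts_per_day = 200                # alerts the control raises per day
triage_cost = 25.0                  # analyst cost (USD) to investigate one alert
annual_loss_prevented = 900_000.0   # estimated loss the control avoids per year

# Total annual cost of handling alerts (most of which may be false positives)
annual_alert_cost = alerts_per_day * 365 * triage_cost

print(f"Annual alert-handling cost: ${annual_alert_cost:,.0f}")
print(f"Annual loss prevented:      ${annual_loss_prevented:,.0f}")
if annual_alert_cost > annual_loss_prevented:
    print("Control is net-negative: its costs exceed its benefits.")
else:
    print("Control is net-positive.")
```

With these made-up figures the control costs $1,825,000 per year to operate but prevents only $900,000 in losses, so it fails the very test this section argues for.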
We can learn lessons from recent pronouncements from public health organizations: one on mammograms for breast cancer screening, and the other on Pap tests for cervical cancer screening. Both are the result of statistical analysis of the total costs and total benefits of testing. Both reports recommend less frequent and/or later testing in most cases, because the total cost of frequent testing (including false positives) exceeds the benefit in risk reduction. Here are quotes from summary articles:
On Mammograms: “While many women do not think a screening test can be harmful, medical experts say the risks are real. A test can trigger unnecessary further tests, like biopsies, that can create extreme anxiety. And mammograms can find cancers that grow so slowly that they never would be noticed in a woman’s lifetime, resulting in unnecessary treatment. Over all, the report says, the modest benefit of mammograms — reducing the breast cancer death rate by 15 percent — must be weighed against the harms. And those harms loom larger for women in their 40s, who are 60 percent more likely to experience them than women 50 and older but are less likely to have breast cancer, skewing the risk-benefit equation.” [emphasis added]
On Pap testing: “The tradition of doing a Pap test every year has not been supported by recent scientific evidence,” Alan G. Waxman, MD, of the University of New Mexico in Albuquerque, said in a statement. “A review of the evidence to date shows that screening at less frequent intervals prevents cervical cancer just as well, has decreased costs, and avoids unnecessary interventions that could be harmful.” [emphasis added]
Similar conclusions have been reached for other medical screening tests, including colonoscopy, the PSA test (for prostate cancer), the chest X-ray (lung cancer screening for smokers), and the full-body scan (for everything!). In nearly all of these cases, the forces promoting earlier and more frequent testing ignored or downplayed the consequences of false positives.
If only the information security community had as much well-organized data and as many well-controlled tests and experiments as our public health brethren, we could make decisions based on evidence rather than on prevalent beliefs. That is the direction we need to go.
[Update: Here’s a good article from the Wall Street Journal on the cost aspects of risk/benefit analysis in these cases. Great quote: “Americans feel that in health care, more is always better and more means better outcomes,” she said. “That’s just not true, but it’s counterintuitive to a lot of people.”]
[Update 2: Bruce Schneier has a good post on the significance of false positives in evaluating detection mechanisms. In the second half of the post, he gives a fairly clear example of how even a “high quality” detection system (= very low false-positive rate) can still yield poor results when the underlying phenomena are very rare, even if you have huge piles of data. Great line: “It’s a needle-in-a-haystack problem, and throwing more hay on the pile doesn’t make that problem any easier.”]
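The base-rate effect Schneier describes is easy to verify with arithmetic. The sketch below uses illustrative numbers of my own choosing (not figures from his post): even a detector that flags only 0.01% of benign events drowns the handful of real detections in false alarms when the target phenomenon occurs once in a million events.

```python
# Base-rate effect: a detector with a very low false-positive rate still
# produces mostly false alarms when the phenomenon it detects is rare.
# All numbers are illustrative, not taken from the cited post.

events_per_day = 10_000_000   # observations scanned per day
base_rate = 1e-6              # fraction of events that are truly malicious
true_positive_rate = 0.99     # detector catches 99% of real events
false_positive_rate = 1e-4    # detector flags 0.01% of benign events

real_events = events_per_day * base_rate          # 10 real events
benign_events = events_per_day - real_events

true_alerts = real_events * true_positive_rate    # ~9.9 real alerts
false_alerts = benign_events * false_positive_rate  # ~1,000 false alarms

precision = true_alerts / (true_alerts + false_alerts)
print(f"True alerts per day:  {true_alerts:.1f}")
print(f"False alerts per day: {false_alerts:.1f}")
print(f"Fraction of alerts that are real: {precision:.2%}")
```

With these assumptions, roughly 99% of all alerts are false positives, despite the “high quality” of the detector, and scanning more events (more hay) only scales both numbers up proportionally without improving the ratio.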