Shostack + Friends Blog Archive


Breach Analysis: Data Source biases

Bob Rudis has an fascinating and important post “Once More Into The [PRC Aggregated] Breaches.” In it, he delves into the various data sources that the Privacy Rights Clearinghouse is tracking.

In doing so, he makes a strong case that data source matters, or as Obi-Wan said, “Luke, you’re going to find that many of the truths we cling to depend greatly on our own point of view:”

Breach count metatype year 530x353

I don’t want to detract from the work Bob’s done. He shows pretty clearly that human and accidental factors are exceeding technical ones as a source of incidents that reveal PII. Without detracting from that important result, I do want to add two points.

First, I reported a similar result in work released in Microsoft SIR v11, “Zeroing in on Malware Propagation Methods.” Of course, I was analyzing malware, rather than PII incidents. We need to get away from the idea that security is a purely technical problem.

Second, it’s time to extend our reporting regimes so that there’s a single source for data. The work done by non-profits like the Open Security Foundation and the Privacy Rights Clearinghouse has been awesome. But these folks are spending a massive amount of energy to collect data that ought to be available from a single source.

As we talk about mandatory breach disclosure and reporting, new laws should create and fund a single place where those reports must go. I’m not asking for additional data here (although additional data would be great). I’m asking that the reports we have now all go to one additional place, where an authoritative record will be published.

Of course, anyone who studies statistics knows that there’s often different collections, and competition between resources. You can get your aircraft accident data from the NTSB or the FAA. You can get your crime statistics from the FBI’s Unified Crime Reports or the National Crime Victimization Survey, and each has advantages and disadvantages. But each is produced because we consider the data an important part of overcoming the problem.

Many nations consider cyber-security to be an important problem, and it’s an area where new laws are being proposed all the time. These new laws really must make the data easier for more people to access.

One comment on "Breach Analysis: Data Source biases"

  • Chris says:

    I agree. In the meantime, I’d like to see Verizon’s VERIS framework used to uniformly code the incidents we do know about. I’m not sure whether their terms and conditions permit this, at least for some entities that might do it.

Comments are closed.