Shostack + Friends Blog Archive


Informed discussion? Cool!

David Litchfield examines some public breach data and concludes that

Word documents and spreadsheets mistakenly left on a web server or indexed by a search engine account for 20.6% of the 276 breaches, both physical and digital, recorded up to the 23rd of October.

He further surmises that the proportion may be even higher, since the bad guys don’t alert data custodians when Google serves up social security numbers.
I looked at this very question for the FIRST presentation referred to in my previous post. There, breach incidents affecting entities based in New York, and reported by at least one of several sources (including the state itself) were examined.

The upshot? As my notes for the presentation said:

[A]t least in terms of numbers of breach incidents, equipment or media loss and unintended online exposure (such as with a misconfigured web server) are the main sources of exposure. Indeed, results from the New York dataset and the New York cases from the University of Washington dataset are statistically indistinguishable, each showing 60-65% of breaches due to lost or stolen media and 15-25% exposed online.
By way of comparison, DLDOS shows almost exactly 50% (180 of 362) of recorded 2006 incidents being due to lost or stolen equipment or media, and Hasan and Yurcik report 36% of incidents from the period 01-05-2005 through 06-05-2006. North Carolina’s breach notification log (obtained via an open records request) shows that 53 of 107 incidents (50%) involved lost or stolen media/hardware.
New Hampshire records from December 2006 to June 2007 show 54% of incidents (N=51) due to lost or stolen equipment or media (67% of affected firms, since one stolen laptop had 13 firms’ data!).
Since so many of the cases reported to NY involve small numbers of persons affected, one might think that the “small incidents” differ from the rest. However, when small (defined as 99 persons affected or fewer) incidents are excluded, the breakdown of breach mechanisms is statistically indistinguishable from an examination with all cases included.
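
The notes do not say which test sits behind "statistically indistinguishable". As a rough illustration only, here is a minimal sketch that assumes a pooled two-proportion z-test and applies it to two sets of counts quoted above (DLDOS: 180 of 362; North Carolina: 53 of 107). The function name `two_proportion_z` is hypothetical, and the choice of test is an assumption, not the method used for the presentation.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sided z-test for equality of two proportions.

    Assumed test for illustration; the original notes do not name one.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    # Standard error of the difference under the null of equal proportions
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Lost or stolen equipment/media counts quoted in the text
p1, p2, z, p_value = two_proportion_z(180, 362, 53, 107)
print(f"DLDOS {p1:.1%} vs North Carolina {p2:.1%}: z = {z:.2f}, p = {p_value:.2f}")
```

A large p-value here would mean the two shares cannot be told apart at these sample sizes, which is the sense in which the datasets are called indistinguishable. It does not show that the underlying breach populations are identical.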

So far, Litchfield is right on the money, and as a database guru rather than a burglar alarm salesman he focuses on logical, not physical, security. However, at least for the breaches I looked at (which, for methodological reasons, involved only entities in NY), about 99% of the exposed records were due to lost or stolen computers or media.
The folks in the whole-disk encryption business are probably on the case, but I wanted to point out that accidental publishing is a minor exposure source, as measured by records exposed (IMO).
Two final observations:
1. Not all exposures are equal. Having your backup tape out on the sidewalk is bad. Having it indexed by Google is worse.
2. Litchfield’s contribution would not have been possible without data on breaches. I know that as an opinion leader, his words will resonate. That means that because of data availability, security just got better. Feels good, doesn’t it?

4 comments on "Informed discussion? Cool!"

  • You said – “That means that because of data availability, security just got better. Feels good, doesn’t it?”
    Not sure I understand how. We have more data, but until we know this data leads to fewer breaches, we haven’t really improved security, have we?

  • Chris says:

    “…his words will resonate” was intended to mean “his words will lead others to behave differently”.

  • Andy Steingruebl says:

    Ok, fair enough. I’m having a literal morning…

  • Nik says:

    20.6%? So that’s 56.856 breaches then… 🙂
