The "Human Action" argument is not even wrong
Several commenters on my post yesterday have put forth some form of the argument that hackers are humans, humans are unpredictable, and therefore, information security cannot have a Nate Silver.
This is a distraction, as a moment’s reflection will show.
Muggings, rapes and murders all depend on the actions of unpredictable humans, and we can, in aggregate, study them. We can see if they are rising or falling. We can debate if one or another methodology is a superior way of measuring them. (For example, should we rely on police reports or survey people and see who’s been victimized?)
Now, internet crimes are different from non-internet crimes in a couple of important ways. It is far harder to properly attribute the crimes to particular actors because the crimes are mediated by computers and networks. Another difference is that people generally don’t report internet crime to the police, and the police often suggest sweeping the crime under the rug. (This may relate to the challenges and expense of investigating internet crimes.) It’s possible that there are other differences.
But no one bringing up the internet exceptionalism argument has explained why we can’t change the lack of reporting, why we can’t study repeated events in the aggregate, or why information security can’t have a Nate Silver.
Nor has anyone explained how internet crime differs from non-internet crime in ways which make it unmeasurable. My question was really intended as a provocation, to get people to think about measurability in our field. But a reasonable objection is that I am hand-waving with respect to what we would have our Nate Silver do, and so I want to be a bit more specific about that.
The best exemplar of this was Martin McKeay asking “Give me an example of what you think we should be able to predict.” I think we should be able to predict the number of vulnerabilities discovered, the number of malware infections per million machines, or the odds that a web server in the top N sites will be serving up attack code. I think we should be able to discuss the odds that a given SSN has been leaked, and how that impacts its (ab)use as an authenticator. I also think (see my article, “The Evolution of Information Security”) that we should be able to say that organizations that invest in defense X have fewer incidents than those that invest in Y.
[Edited to add: The reason that I didn’t want to give an example of what the Nate Silver of infosec would measure is to avoid debates in the weeds about one or the other of those things. I think what is usefully measured will surprise many people, including me. By asking why in general, I want to encourage people to think about the over-arching problems, and I hope that we’ll hear more solutions to the general problem than we did yesterday.]
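For readers who want to see how simple the arithmetic behind two of those examples is, here is a minimal sketch in Python: infections per million machines, and a comparison of incident rates between organizations using defense X and those using defense Y. Everything in it is an assumption made for illustration. The `Cohort` type, the function names and the counts are all invented, and a real analysis would also need to deal with reporting bias and uncertainty, which this does not attempt.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """Aggregate counts for one observed group (hypothetical structure)."""
    name: str
    incidents: int   # incidents (e.g. infections) seen during the period
    population: int  # machines or organizations observed during the period


def rate_per_million(cohort: Cohort) -> float:
    """Incidents per million members of the cohort."""
    if cohort.population <= 0:
        raise ValueError("population must be positive")
    return cohort.incidents * 1_000_000 / cohort.population


def rate_ratio(a: Cohort, b: Cohort) -> float:
    """Incident rate of a divided by that of b; below 1.0 means a did better."""
    rate_b = rate_per_million(b)
    if rate_b == 0:
        raise ValueError("comparison cohort has no incidents; ratio undefined")
    return rate_per_million(a) / rate_b


if __name__ == "__main__":
    # Made-up counts, purely to exercise the functions; not real data.
    defense_x = Cohort("defense X", incidents=30, population=200_000)
    defense_y = Cohort("defense Y", incidents=90, population=300_000)

    for cohort in (defense_x, defense_y):
        print(cohort.name, rate_per_million(cohort), "incidents per million")
    print("rate ratio X/Y:", rate_ratio(defense_x, defense_y))
```

None of this depends on predicting what any individual attacker will do. It only needs counts of events and counts of things that could have been affected, which is exactly the kind of data better reporting would provide.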
Here’s a fine example of what is possible regarding statistical analysis of “human action” — in this case, it’s “shots fired” by private contractors in Iraq, drawing from 4,500 pages of released documents: http://overview.ap.org/blog/2012/02/iraq-security-contractors/
Methods: http://overview.ap.org/blog/2012/02/private-security-contractors-in-iraq-analysis/