Calls for an NTSB?

In September, Steve Bellovin and I asked “Why Don’t We Have an Incident Repository?.”

I’m continuing to do research on the topic, and I’m interested in putting together a list of such things. I’d like to ask you for two favors.

First, if you remember such things, can you tell me about it? I recall “Computers at Risk,” the National Cyber Leap Year report, and the Bellovin & Neumann editorial in IEEE S&P. Oh, and “The New School of Information Security.” But I’m sure there have been others.

In particular, what I’m looking for are calls like this one in Computers at Risk (National Academies Press, 1991):

3a. Build a repository of incident data. The committee recommends that a repository of incident information be established for use in research, to increase public awareness of successful penetrations and existing vulnerabilities, and to assist security practitioners, who often have difficulty persuading managers to invest in security. This database should categorize, report, and track pertinent instances of system security-related threats, risks, and failures. […] One possible model for data collection is the incident reporting system administered by the National Transportation Safety Board… (chapter 3)

Second, I am trying to do searches such as “cites “Computers at Risk” and contains ‘NTSB’.” I have tried without luck to do this on Google Scholar, Microsoft Academic and Semantic Scholar. Only Google seems to be reliably identifying that report. Is there a good way to perform such a search?

Do Games Teach Security?

There’s a new paper from Mark Thompson and Hassan Takabi of the University of North Texas. The title captures the question:
Effectiveness Of Using Card Games To Teach Threat Modeling For Secure Web Application Developments

Gamification of classroom assignments and online tools has grown significantly in recent years. There have been a number of card games designed for teaching various cybersecurity concepts. However, effectiveness of these card games is unknown for the most part and there is no study on evaluating their effectiveness. In this paper, we evaluate effectiveness of one such game, namely the OWASP Cornucopia card game which is designed to assist software development teams identify security requirements in Agile, conventional and formal development
processes. We performed an experiment where sections of graduate students and undergraduate students in a security related course at our university were split into two groups, one of which played the Cornucopia card game, and one of which did not. Quizzes were administered both before and after the activity, and a survey was taken to measure student attitudes toward the exercise. The results show that while students found the activity useful and would like to see this activity and more similar exercises integrated into the classroom, the game was not easy to understand. We need to spend enough time to familiarize the students with the game and prepare them for the exercises using the game to get the best results.

I’m very glad to see games like Cornucopia evaluated. If we’re going to push the use of Cornucopia (or Elevation of Privilege) for teaching, then we ought to be thinking about how well they work in comparison to other techniques. We have anecdotes, but to improve, we must test and measure.

Usable Security: History, Themes, and Challenges (Book Review)

Simson Garfinkel and Heather Lipford’s Usable Security: History, Themes, and Challenges should be on the shelf of anyone who is developing software that asks people to make decisions about computer security.

We have to ask people to make decisions because they have information that the computer doesn’t. My favorite example is the Windows “new network” dialog, which asks what sort of network you’re connecting, home or coffee shop. The information is used to configure the firewall. My least favorite example is phishing, where people are asked to make decisions about technical minutiae before authenticating. Regardless, we are not going to entirely remove the need for people to make decisions about computer security. So we can either learn to gain their participation in more effective ways, or we can accept a very high failure rate. The former option is better, and this book is a substantial contribution.

It’s common for designers to throw up their hands at these challenges, saying things like “given a choice between security and dancing babies, people will choose dancing babies every time,” or “you can’t patch human stupidity.” However, in a recently published study by Google and UCSD, they found that the best sites only fooled 45% of the people who clicked through, while overall only 13% did. (There’s a good summary of that study available.) Claiming that “people will choose dancing babies 13% of the time” just doesn’t seem like a compelling argument against trying.

This slim book is a review of the academic work that’s been published, almost entirely in the last 20 years, on how people interact with information security systems. It summarizes and contextualizes the many things we’ve learned, mistakes that have been made, and does so in a readable and concise way. The book has six chapters:

  • Intro
  • A brief history
  • Major Themes in UPS Academic Research
  • Lessons Learned
  • Research Challenges
  • Conclusion/The Next Ten Years

The “Major themes” chapter is 61 or so pages, which is over half of the 108 pages of content. (The book also has 40 pages of bibliography). Major themes include authentication, email security and PKI, anti-phishing, storage, device pairing, web privacy, policy specification, mobile, social media and security administration.

The “Lessons Learned” chapter is quite solid, covering “reduce decisions,” “safe and secure defaults,” “provide users with better information, not more information,” “users require clear context to make good decisions,” “information presentation is critical” and “education works but has limits.” I have a quibble, which is Sasse’s concept of mental ‘compliance budgets’ is also important, and I wish it were given greater prominence. (My other quibble is more of a pet peeve: the term “user” where “people” would serve. Isn’t it nicer to say “people require clear context to make good decisions”?) Neither quibble should take away from my key message, which is that this is an important new book.

The slim nature of the book is, I believe, an excellent usability property. The authors present what’s been done, lessons that they feel can be taken away, and move to the next topic. This lets you the reader design, build or deploy systems which help the person behind the keyboard make the decisions they want to make. To re-iterate, anyone building software that asks people to make decisions should read the lessons contained within.

Disclaimer: I was paid to review a draft of this book, and my name is mentioned kindly in the acknowledgements. I am not being paid to write or post reviews.

[Updated to correct the sentence about the last 20 years.]

Indicators of Impact — Ground Truth for Breach Impact Estimation

Ice bag might be a good Indicator of Impact for a night of excess.
Ice bag might be a good ‘Indicator of Impact’ for a night of excess.

One big problem with existing methods for estimating breach impact is the lack of credibility and reliability of the evidence behind the numbers. This is especially true if the breach is recent or if most of the information is not available publicly.  What if we had solid evidence to use in breach impact estimation?  This leads to the idea of “Indicators of Impact” to provide ‘ground truth’ for the estimation process.

It is premised on the view that breach impact is best measured by the costs or resources associated with response, recovery, and restoration actions taken by the affected stakeholders.  These activities can included both routine incident response and also more rare activities.  (See our paper for more.)  This leads to to ‘Indicators of Impact’, which are  evidence of the existence or non-existence of these activities. Here’s a definition (p 23 of our paper):

An ‘Indicator of Impact’ is an observable event, behavior, action, state change, or communication that signifies that the breached or affected organizations are attempting to respond, recover, restore, rebuild, or reposition because they believe they have been harmed. For our purposes, Indicators of Impact are evidence that can be used to estimate branching activity models of breach impact, either the structure of the model or key parameters associated with specific activity types. In principle, every Indicator of Impact is observable by someone, though maybe not outside the breached organization.

Of course, there is a close parallel to the now-widely-accepted idea of “Indicators of Compromise“, which are basically technical traces associated with a breach event.  There’s a community supporting an open exchange format — OpenIoC.  The big difference is that Indicators of Compromise are technical and are used almost exclusively in tactical information security.  In contrast, Indicators of Impact are business-oriented, even if they involve InfoSec activities, and are used primarily for management decisions.

From the Appendix B, here are a few examples:

  • Was there a forensic investigation, above and beyond what your organization would normally do?
  • Was this incident escalated to the executive level (VP or above), requiring them to make resource decisions or to spend money?
  • Was any significant business process or function disrupted for a significant amount of time?
  • Due to the breach, did the breached organization fail to meet any contractual obligations with it customers, suppliers, or partners? If so, were contractual penalties imposed?
  • Were top executives or the Board significantly diverted by the breach and aftermath, such that other important matters did not receive sufficient attention?

The list goes on for three pages in Appendix B but we fully expect it to grow much longer as we get experience and other people start participating.  For example, there will be indicators that only apply to certain industries or organization types.  In my opinion, there is no reason to have a canonical list or a highly structured taxonomy.

As signals, the Indicators of Impact are not perfect, nor do they individually provide sufficient evidence.  However, they have the very great benefit of being empirical, subject to documentation and validation, and potentially observable in many instances, even outside of InfoSec breach events.  In other words, they provide a ‘ground truth’ which has been sorely lacking in breach impact estimation. When assembled as a mass of evidence and using appropriate inference and reasoning methods (e.g. see this great book), Indicators of Impact could provide the foundation for robust breach impact estimation.

There are also applications beyond breach impact estimation.  For example, they could be used in resilience planning and preparation.  They could also be used as part of information sharing in critical infrastructure to provide context for other information regarding threats, attacks, etc. (See this video of a Shmoocon session for a great panel discussion regarding challenges and opportunities regarding information sharing.)

Fairly soon, it would be good to define a light-weight standard format for Indicators of Impact, possibly as an extension to VERIS.  I also think that Indicators of Impact could be a good addition to the up-coming NIST Cybersecurity Framework.  There’s a public meeting April 3rd, and I might fly out for it.  But I will submit to the NIST RFI.

Your thoughts and comments?

New paper: "How Bad Is It? — A Branching Activity Model for Breach Impact Estimation"

Adam just posted a question about CEO “willingness to pay” (WTP) to avoid bad publicity regarding a breach event.  As it happens, we just submitted a paper to Workshop on the Economics of Information Security (WEIS) that proposes a breach impact estimation method that might apply to Adam’s question.  We use the WTP approach in a specific way, by posing this question to all affected stakeholders:

Ex ante, how much would you be willing to spend on response and recovery for a breach of a particular type?  Through what specific activities and processes?”

We hope this approach can bridge theoretical and empirical research, and also professional practice.  We also hope that this method can be used in public disclosures.

Paper: How Bad is it? – A Branching Activity Model to Estimate the Impact of Information Security Breaches

Infographic from the example in the paper
Infographic from the example in the paper

In the next few months we will be applying this to half a dozen historical breach episodes to see how it works out.  This model will also probably find its way into my dissertation as “substrate”.  The dissertation focus is on social learning and institutional innovation.

Comments and feedback are most welcome.

Base Rate & Infosec

At SOURCE Seattle, I had the pleasure of seeing Jeff Lowder and Patrick Florer present on “The Base Rate Fallacy.” The talk was excellent, lining up the idea of the base rate fallacy, how and why it matters to infosec. What really struck me about this talk was that about a week before, I had read a presentation of the fallacy with exactly the same example in Kahneman’s “Thinking, Fast and Slow.” The problem is you have a witness who’s 80% accurate, describing a taxi as orange; what are the odds she’s right, given certain facts about the distribution of taxis in the city?

I had just read the discussion. I recognized the problem. I recognized that the numbers were the same. I recalled the answer. I couldn’t remember how to derive it, and got the damn thing wrong.

Well played, sirs! Game to Jeff and Patrick.

Beyond that, there’s an important general lesson in the talk. It’s easy to make mistakes. Even experts, primed for the problems, fall into traps and make mistakes. If we publish only our analysis (or worse, engage in information sharing), then others can’t see what mistakes we might have made along the way.

This problem is exacerbated in a great deal of work by a lack of a methodology section, or a lack of clear definitions.

The more we publish, the more people can catch one anothers errors, and the more the field can advance.

Time for an Award for Best Data?

Yesterday, DAn Kaminsky said “There should be a yearly award for Best Security Data, for the best collection and disbursement of hard data and cogent analysis in infosec.” I think it’s a fascinating idea, but think that a yearly award may be premature. However, what I think is sorta irrelevant, absent data. So I’m looking for data on the question, do we have enough good data to issue an award yearly?

Please nominate in the comments.

Also, please discuss what the criteria should be.

Sharing Research Data

I wanted to share an article from the November issue of the Public Library of Science, both because it’s interesting reading and because of what it tells us about the state of security research. The paper is “Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results.” I’ll quote the full abstract, and encourage you to read the entire 6 page paper.

The widespread reluctance to share published research data is often hypothesized to be due to the authors’ fear that reanalysis may expose errors in their work or may produce conclusions that contradict their own. However, these hypotheses have not previously been studied systematically.

Methods and Findings
We related the reluctance to share research data for reanalysis to 1148 statistically significant results reported in 49 papers published in two major psychology journals. We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.

Our findings on the basis of psychological papers suggest that statistical results are particularly hard to verify when reanalysis is more likely to lead to contrasting conclusions. This highlights the importance of establishing mandatory data archiving policies.

Despite the fact that the research was done on papers published in psychology journals, it can teach us a great deal about the state of security research.

First, the full paper is available for free online. Compare and contrast with too many venues in information security.

Second, the paper considers and tests alternative hypotheses:

Although our results are consistent with the notion that the reluctance to share data is generated by the author’s fear that reanalysis will expose errors and lead to opposing views on the results, our results are correlational in nature and so they are open to alternative interpretations. Although the two groups of papers are similar in terms of research fields and designs, it is possible that they differ in other regards. Notably, statistically rigorous researchers may archive their data better and may be more attentive towards statistical power than less statistically rigorous researchers. If so, more statistically rigorous researchers will more promptly share their data, conduct more powerful tests, and so report lower p-values. However, a check of the cell sizes in both categories of papers (see Text S2) did not suggest that statistical power was systematically higher in studies from which data were shared. [Ed: “Text S2” is supplemental data considering the discarded hypothesis.]

But most important, what does it say about the quality of the data we so avariciously hoard in information security? Could it have something to do with higher prevalence of apparent errors?

Probably not. It might surprise you to hear me saying that, but hear me out. We almost never have hypotheses to test, and so our ability to perform statistical re-analysis is almost irrelevant. We’re much for fond of saying things like “It calls the same DLLs as Stuxnet, so it’s clearly also by the Israelis.” Actually, there are several implied hypotheses in there:

  1. No code by different authors calls the same DLL
  2. No code calls any undocumented APIs
  3. Stuxnet DLLs are not documented

Stuxnet being written by the Israelis is clearly not a hypothesis, but a fact, as documented by Nostradamus.

More seriously, read the paper, see how good science is done, and ask if anyone is holding us back but ourselves.

Thanks to Cormac Herley for the pointer.

Paper: The Security of Password Expiration

The security of modern password expiration: an algorithmic framework and empirical analysis, by Yingian Zhang, Fabian Monrose and Michael Reiter. (ACM DOI link)

This paper presents the first large-scale study of the success of password expiration in meeting its intended purpose, namely revoking access to an account by an attacker who has captured the account’s password. Using a dataset of over 7700 accounts, we assess the extent to which passwords that users choose to replace expired ones pose an obstacle to the attacker’s continued access. We develop a framework by which an attacker can search for a user’s new password from an old one, and design an efficient algorithm to build an approximately optimal search strategy. We then use this strategy to measure the difficulty of breaking newly chosen passwords from old ones. We believe our study calls into question the merit of continuing the practice of password expiration.

This is the sort of work that we at the New School love. Take a best practice recommended by just about everyone for what seems like excellent reasons, and take notice of the fact that human beings are going to game your practice. Then get some actual data, and see how effective the practice is.

Unfortunately, we lack data on rates of compromise for organizations with different password change policies. So it’s hard to tell if password policies actually do any good, or which ones do good. However, we can guess that not making your default password “stratfor” is a good idea.

ACM gets a link because they allow you to post copies of your own papers, rather than inhibiting the progress of science by locking it all up.