Calls for an NTSB?

In September, Steve Bellovin and I asked “Why Don’t We Have an Incident Repository?”

I’m continuing to research the topic, and I’m interested in putting together a list of such calls. I’d like to ask you for two favors.

First, if you remember such calls, can you tell me about them? I recall “Computers at Risk,” the National Cyber Leap Year report, and the Bellovin & Neumann editorial in IEEE S&P. Oh, and “The New School of Information Security.” But I’m sure there have been others.

In particular, what I’m looking for are calls like this one in Computers at Risk (National Academies Press, 1991):

3a. Build a repository of incident data. The committee recommends that a repository of incident information be established for use in research, to increase public awareness of successful penetrations and existing vulnerabilities, and to assist security practitioners, who often have difficulty persuading managers to invest in security. This database should categorize, report, and track pertinent instances of system security-related threats, risks, and failures. […] One possible model for data collection is the incident reporting system administered by the National Transportation Safety Board… (chapter 3)

Second, I am trying to do searches such as “cites ‘Computers at Risk’ and contains ‘NTSB’.” I have tried without luck to do this on Google Scholar, Microsoft Academic, and Semantic Scholar. Only Google seems to reliably identify that report. Is there a good way to perform such a search?

Dear Mr. President

U.S. President Barack Obama says he’s “concerned” about the country’s cyber security and adds, “we have to learn from our mistakes.”

Dear Mr. President, what actions are we taking to learn from our mistakes? Do we have a repository of mistakes that have been made? Do we have a “capability” for analysis of these mistakes? Do we have a program where security experts can gain access to the repository, to learn from it?

I’ve written extensively on this problem, here on this blog, and in the book from which it takes its name. We do not have a repository of mistakes. We do not have a way to learn from those mistakes.

I’ve got to wonder why that is, and what the President thinks we’re doing to learn from our mistakes. I know he has other things on his mind, and I hope that our officials who can advise him directly take this opportunity to say “Mr. President, we do not learn from our mistakes.”

(Thanks to Chris Wysopal for the pointer to the comment.)

Usable Security: History, Themes, and Challenges (Book Review)

Simson Garfinkel and Heather Lipford’s Usable Security: History, Themes, and Challenges should be on the shelf of anyone who is developing software that asks people to make decisions about computer security.

We have to ask people to make decisions because they have information that the computer doesn’t. My favorite example is the Windows “new network” dialog, which asks what sort of network you’re connecting to: work, home, or coffee shop. The information is used to configure the firewall. My least favorite example is phishing, where people are asked to make decisions about technical minutiae before authenticating. Regardless, we are not going to entirely remove the need for people to make decisions about computer security. So we can either learn to gain their participation in more effective ways, or we can accept a very high failure rate. The former option is better, and this book is a substantial contribution.
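
Back to the “new network” example for a moment: here’s a minimal sketch of the idea that one human answer supplies context the computer lacks, and drives the rest of the configuration. The profile names and settings below are hypothetical, purely illustrative, not Windows’ actual behavior:

```python
# Hypothetical sketch: the person supplies context the OS lacks,
# and that single decision drives the firewall configuration.
FIREWALL_PROFILES = {
    "home":        {"inbound_sharing": True,  "network_discovery": True},
    "work":        {"inbound_sharing": True,  "network_discovery": True},
    "coffee shop": {"inbound_sharing": False, "network_discovery": False},
}

def configure_firewall(network_type: str) -> dict:
    """Map one human answer to a full firewall profile."""
    return FIREWALL_PROFILES[network_type]

print(configure_firewall("coffee shop"))
# {'inbound_sharing': False, 'network_discovery': False}
```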

It’s common for designers to throw up their hands at these challenges, saying things like “given a choice between security and dancing babies, people will choose dancing babies every time,” or “you can’t patch human stupidity.” However, a recently published study by Google and UCSD found that even the best phishing sites only fooled 45% of the people who clicked through to them, while overall only 13% did. (There’s a good summary of that study available.) Claiming that “people will choose dancing babies 13% of the time” just doesn’t seem like a compelling argument against trying.

This slim book is a review of the academic work that’s been published, almost entirely in the last 20 years, on how people interact with information security systems. It summarizes and contextualizes the many things we’ve learned and the mistakes that have been made, and it does so in a readable and concise way. The book has six chapters:

  • Intro
  • A brief history
  • Major Themes in UPS (usable privacy and security) Academic Research
  • Lessons Learned
  • Research Challenges
  • Conclusion/The Next Ten Years

The “Major Themes” chapter is 61 or so pages, which is over half of the 108 pages of content. (The book also has 40 pages of bibliography.) Major themes include authentication, email security and PKI, anti-phishing, storage, device pairing, web privacy, policy specification, mobile, social media, and security administration.

The “Lessons Learned” chapter is quite solid, covering “reduce decisions,” “safe and secure defaults,” “provide users with better information, not more information,” “users require clear context to make good decisions,” “information presentation is critical,” and “education works but has limits.” I have a quibble: Sasse’s concept of a mental ‘compliance budget’ is also important, and I wish it were given greater prominence. (My other quibble is more of a pet peeve: the use of “user” where “people” would serve. Isn’t it nicer to say “people require clear context to make good decisions”?) Neither quibble should take away from my key message, which is that this is an important new book.

The slim nature of the book is, I believe, an excellent usability property. The authors present what’s been done and the lessons they feel can be taken away, then move to the next topic. This lets you, the reader, design, build, or deploy systems that help the person behind the keyboard make the decisions they want to make. To reiterate: anyone building software that asks people to make decisions should read the lessons contained within.

Disclaimer: I was paid to review a draft of this book, and my name is mentioned kindly in the acknowledgements. I am not being paid to write or post reviews.

[Updated to correct the sentence about the last 20 years.]

Modeling Attackers and Their Motives

There are a number of reports out recently, breathlessly presenting their analysis of one threatening group of baddies or another. You should look at the reports for facts you can use to assess your systems, such as filenames, hashes and IP addresses. Most readers should, at most, skim their analysis of the perpetrators. Read on for why.

There are a number of surface reasons you might reject or ignore these reports. For example, these reports are funded by marketing. Even if they are, that’s not a reason to reject them: the baker does not bake bread for fun, and the business goal of marketing can still give us useful information. You might reject them for their abuse of adjectives like “persistent,” “stealthy,” or “sophisticated.” (I’m tempted to just compile a word cloud and drop it in place of writing.) No, the reason to only skim these reports is what the analysis does to your chances of success. There are two self-inflicted wounds that often happen when people focus on attackers:

  • You miss attackers
  • You misunderstand what the attackers will do

You may get a vicarious thrill from knowing who might be attacking you, but that thrill is likely to make those details salient to your conscious mind, anchoring your attention on them and causing you to miss other attackers. Similarly, you might get attached to the details of how they attacked last year, and not notice how those details change.

Now, you might think that your analysis won’t fall into those traps, but let me be clear: the largest, best-funded analysis shops in the world routinely make serious and consequential mistakes about their key areas of responsibility. The CIA didn’t predict the collapse of the Soviet Union, and it didn’t predict the rise of ISIS.

If your organization believes that it’s better at intelligence analysis than the thousands of people who work in US intelligence, then please pay attention to my raised eyebrow. Maybe you should be applying that analytic awesomesauce to your core business, maybe it is your core business, or maybe you should be carefully combing through the reports and analysis to update your assessments of where these rapscallions shall strike next. Or maybe you’re over-estimating your analytic capabilities.

Let me lay it out for you: the “sophisticated” attackers are using phishing to get a foothold, then dropping malware which talks to C&C servers in various ways. The phishing has three important variants you need to protect against: links to exploit web pages, documents containing exploits, and executables disguised as documents. If you can’t reliably prevent those things, detect them when you’ve missed, and respond when you discover you’ve missed, then digging into the motivations of your attackers may not be the best use of your time.
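
To make the “detect them when you’ve missed” point concrete for the third variant, here’s a minimal sketch using a simple heuristic of my own (not drawn from any of these reports): flag files whose extension claims “document” but whose header bytes say Windows executable.

```python
import pathlib

# Extensions that claim "I am a document."
DOCUMENT_EXTENSIONS = {".doc", ".docx", ".pdf", ".rtf", ".xls", ".xlsx"}

def looks_like_disguised_executable(path: pathlib.Path) -> bool:
    """Flag document-named files whose content starts like a PE executable.

    Windows executables begin with the two bytes b"MZ"; real documents
    don't. This is one cheap heuristic, not a complete detector.
    """
    if path.suffix.lower() not in DOCUMENT_EXTENSIONS:
        return False
    with path.open("rb") as handle:
        return handle.read(2) == b"MZ"
```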

The indicators that can help you find the successful attacks are an important value from these reports, and that’s what you should use them for. Don’t get distracted by the motivations.

Published Data Empowers

There’s a story over at Bloomberg, “Experian Customers Unsafe as Hackers Steal Credit Report Data.” And much as I enjoy picking on the credit reporting agencies, what I really want to talk about is how the story came to light.

The cyberthieves broke into an employee’s computer in September 2011 and stole the password for the bank’s online account with Experian Plc, the credit reporting agency with data on more than 740 million consumers. The intruders then downloaded credit reports on 847 people, said Dana Pardee, a branch manager at the bank. They took Social Security numbers, birthdates and detailed financial data on people across the country who had never done business with Abilene Telco, which has two locations and serves a city of 117,000.

The incident is one of 86 data breaches since 2006 that expose flaws in the way credit-reporting agencies protect their databases. Instead of directly targeting Experian, Equifax Inc. and TransUnion Corp., hackers are attacking affiliated businesses, such as banks, auto dealers and even a police department that rely on reporting agencies for background credit checks.

This approach has netted more than 17,000 credit reports taken from the agencies since 2006, according to Bloomberg.com’s examination of hundreds of pages of breach notification letters sent to victims. The incidents were outlined in correspondence from the credit bureaus to victims in six states — Maine, Maryland, New Hampshire, New Jersey, North Carolina and Vermont. The letters were discovered mostly through public-records requests by a privacy advocate who goes by the online pseudonym Dissent Doe…

There are three key lessons. The first is for those who still say “anonymized, of course.” The second is for those who are ok with naming the victims, and who think we’ve mined this ore and should move on to other things.

So, the first lesson: what enabled us to learn this? Obviously, it’s work by Dissent, but it’s more than that. It’s breach disclosure laws. We don’t anonymize the breaches; we report them.

These sorts of serendipitous discoveries are only possible when breaches and their details are reported. We don’t know in advance which details will matter, so ensuring that we get descriptions of what happened is highly important. From those descriptions, we discover new things.

The second lesson is that this hard work is being done by volunteers, working with an emergent resource. (Dissent’s post on her work is here.) There are lots of good questions about what a breach law should be. Some proposals for 24-hour notice appear to have been drafted by people who’ve never talked to anyone who’s investigated a breach. There are interesting questions about active investigations, and about those few cases where revealing information about the breach could enable attackers to hurt others. But it seems reasonably obvious that the effort put into gathering data from many services is highly inefficient. That data ought to be available in one place, so that researchers like Dissent can spend their time learning new things.

The final lesson is one that we at the New School have been talking about for a while: public data transforms our profession and our ability to protect people. If I may borrow a line, we’re not at the beginning of the end of that process; we’re at the end of the beginning, and what comes next is going to be awesome.

Base Rate & Infosec

At SOURCE Seattle, I had the pleasure of seeing Jeff Lowder and Patrick Florer present on “The Base Rate Fallacy.” The talk was excellent, laying out the idea of the base rate fallacy and how and why it matters to infosec. What really struck me about this talk was that about a week before, I had read a presentation of the fallacy, with exactly the same example, in Kahneman’s “Thinking, Fast and Slow.” The problem: you have a witness who’s 80% accurate describing a taxi as orange; what are the odds she’s right, given certain facts about the distribution of taxis in the city?

I had just read the discussion. I recognized the problem. I recognized that the numbers were the same. I recalled the answer. I couldn’t remember how to derive it, and got the damn thing wrong.
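
For the record, the derivation is just Bayes’ theorem. Here’s a minimal sketch, assuming the numbers from Kahneman’s version of the problem (15% of cabs are the color the witness named, 85% are not; the witness is 80% accurate):

```python
# Kahneman's taxi problem, worked via Bayes' theorem.
# Assumed numbers from "Thinking, Fast and Slow"; the colors differ
# across retellings, but the arithmetic is the same.
p_orange = 0.15            # base rate: fraction of cabs that are orange
p_other = 1 - p_orange     # fraction that are some other color
accuracy = 0.80            # witness names the color correctly 80% of the time

# P(witness says "orange") = correct identifications + misidentifications
p_says_orange = accuracy * p_orange + (1 - accuracy) * p_other

# P(cab is orange | witness says "orange")
posterior = (accuracy * p_orange) / p_says_orange
print(f"{posterior:.2f}")  # 0.41 -- far below the witness's 80% accuracy
```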

Well played, sirs! Game to Jeff and Patrick.

Beyond that, there’s an important general lesson in the talk. It’s easy to make mistakes. Even experts, primed for the problems, fall into traps and make mistakes. If we publish only our analysis (or worse, engage in information sharing), then others can’t see what mistakes we might have made along the way.

This problem is exacerbated in a great deal of work by the lack of a methodology section, or the lack of clear definitions.

The more we publish, the more people can catch one another’s errors, and the more the field can advance.

Active Defense: Show me the Money!

Over the last few days, there’s been a lot of folks in my twitter feed talking about “active defense.” Since I can’t compress this into 140 characters, I wanted to comment quickly: show me the money. And if you can’t show me the money, show me the data.

First, I’m unsure what’s actually meant by active defense. Do the folks arguing have a rough consensus on what’s in and what’s out? If not, a definition (or several) would be useful, just so others can follow the argument.

So anyway, my questions:

  1. Do organizations that engage in Active Defense suffer fewer incidents than those who don’t?
  2. Do organizations that engage in Active Defense see smaller cost-per-incident when using it than when not? (or in comparison to other orgs?)
  3. How much does an Active Defense program cost?
  4. Is it the lowest-cost way to achieve the outcomes from 1 & 2, compared with other approaches? (See the sketch below.)
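
To make question 4 concrete, here’s a minimal sketch of the comparison I’m asking for. Every number below is a placeholder I made up purely to show the shape of the question; none of it is data:

```python
def expected_annual_cost(program_cost, incidents_per_year, avg_cost_per_incident):
    """Program spend plus expected incident losses, per year."""
    return program_cost + incidents_per_year * avg_cost_per_incident

# Placeholder numbers, purely illustrative: does spending on active
# defense beat spending the same money on, say, detection and response?
active_defense = expected_annual_cost(500_000, 2, 100_000)   # 700,000
detect_respond = expected_annual_cost(500_000, 3, 50_000)    # 650,000
print(active_defense, detect_respond)
```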

I’m sure some of the folks advocating active defense in this age of SEC-mandated incident disclosure can point to incidents, impacts and outcomes.

I look forward to learning more about this important subject.

Why Sharing Raw Data is Important

Bob Rudis has a nice post up, “Off By One: The Importance Of Fact Checking Breach Reports,” in which he points out some apparent errors in the Massachusetts 2011 breach report, and also provides some graphs.

Issues like this are why it’s important to release data. Doing so enables independent error checking, and it also allows people to slice and dice the issues in ways that are otherwise accessible only to a privileged few with the raw numbers.

Time for an Award for Best Data?

Yesterday, Dan Kaminsky said, “There should be a yearly award for Best Security Data, for the best collection and disbursement of hard data and cogent analysis in infosec.” I think it’s a fascinating idea, but a yearly award may be premature. However, what I think is sorta irrelevant, absent data. So I’m looking for data on the question: do we have enough good data to issue an award yearly?

Please nominate in the comments.

Also, please discuss what the criteria should be.