Just Culture and Information Security

Yesterday Twitter revealed they had accidentally stored plain-text passwords in some log files. There was no indication the data was accessed and users were warned to update their passwords. There was no known breach, but Twitter went public anyway, and was excoriated in the press and… on Twitter.

This is a problem for our profession and industry. We get locked into a cycle where any public disclosure of a breach or security mistake results in…

Well, you can imagine what it results in, or you can go read “The Security Profession Needs to Adopt Just Culture” by Rich Mogull. It’s a very important article, and you should read it, and the links, and take the time to consider what it means. In that spirit, I want to reflect on something I said the other night. I was being intentionally provocative, and perhaps crossed the line away from being just. What I said was a password management company had one job, and if they expose your passwords, you should not use their password management software.

Someone else in the room, coming from a background where they have blameless post-mortems, challenged my use of the phrase ‘you had one job,’ and praised the company for coming forward. And I’ve been thinking about that, and my take is, the design where all the passwords are at a single site is substantially and predictably worse than a design where the passwords are distributed in local clients and local data storage. (There are tradeoffs. With a single site, you may be able to monitor for and respond to unusual access patterns rapidly, and you can upgrade all the software at once. There is an availability benefit. My assessment is that the single-store design is not worth it, because of the catastrophic failure modes.)

It was a fair criticism. I’ve previously said “we live in an ‘outrage world’ where it’s easier to point fingers and giggle in 140 characters and hurt people’s lives or careers than it is to make a positive contribution.” Did I fall into that trap myself? Possibly.

In “Just Culture: A Foundation for Balanced Accountability and Patient Safety,” which Rich links, there’s a table in Figure 2, headed “Choose the column that best describes the caregiver’s action.” In reading that table, I believe that a password manager with central storage falls into the reckless category, although perhaps it’s merely risky. In either case, the system leaders are supposed to share in accountability.

Could I have been more nuanced? Certainly. Would it have carried the same impact? No. Justified? I’d love to hear your thoughts!

Why is “Reply” Not the Strongest Signal?

So apparently my “friends” at outlook.com are marking my email as junk today, with no explanation. They’re doing this to people who have sent me dozens of emails over the course of months or years.

Why does no spam filter seem to take repeated conversational turns into account? Is there a stronger signal that I want to engage with someone than…repeatedly engaging?
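To make the idea concrete, here's a minimal sketch of the kind of feature I mean. The scoring function, weights, and cap are entirely hypothetical, not any real filter's internals:

```python
# Hypothetical sketch: weight prior conversational turns into a spam score.
# The function name, weights, and cap are illustrative, not a real filter's API.

def spam_score(base_score: float, replies_to_sender: int) -> float:
    """Discount a message's spam score based on how many times the
    recipient has previously replied to this sender."""
    # Each prior reply is strong evidence of a wanted correspondent;
    # cap the discount so a compromised account can't send unlimited spam.
    discount = min(replies_to_sender * 0.15, 0.6)
    return max(base_score - discount, 0.0)

# A borderline message (0.5) from someone I've answered ten times
# should land in the inbox, not junk.
print(spam_score(0.5, 10))  # → 0.0
```

The point isn't the particular numbers; it's that a history of repeated engagement should dominate weaker signals.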

Consultants Say Their Cyber Warnings Were Ignored

Back in October 2014, I discussed a pattern of “Employees Say Company Left Data Vulnerable,” and it’s a pattern that we’ve seen often since. Today, I want to discuss the consultant’s variation on the story. This is less common, because generally smart consultants don’t comment on the security of their consultees. In this case, it doesn’t seem like the consultant’s report was leaked, but people are discussing it after a high-profile issue.

In brief, the DNC was hacked, probably by Russian intelligence, and emails were given to Wikileaks. Wikileaks published them without redacting things like credit card numbers or social security numbers. The head of the DNC has stepped down. (This is an unusual instance of someone losing their job, which is rare post-breach. However, she did not lose her job because of the breach, she lost it because the breach included information about how her organization tilted the playing field, and how she lied about doing so.)

This story captures a set of archetypes. I want to use this story as a foil for those archetypes, not to critique any of the parties. I’ll follow the pattern from “employees vs company” and present those three sections: “I told you so,” “potential spending,” and “how to do better.” I’ll also comment on preventability and “shame.”

Was it preventable?

Computer security consultants hired by the DNC made dozens of recommendations after a two-month review, the people said. Following the advice, which would typically include having specialists hunt for intruders on the network, might have alerted party officials that hackers had been lurking in their network for weeks… (“Democrats Ignored Cybersecurity Warnings Before Theft,” Michael Riley, Bloomberg.)

People are talking about this as if the DNC was ever likely to stop Russian intelligence from breaking into their computers. That’s a very, very challenging goal, one at which both US and British intelligence have failed. (And as I write this, an FBI agent has been charged with espionage on behalf of China.) There’s a lot of “might,” “could have,” and other words that say “possible” without assessing “probable.”

I told you so!

The report included “dozens of recommendations,” some of which, such as “taking special precautions to protect any financial information related to donors,” might be a larger project than a PCI compliance initiative. (The logic is that financial information collected by a party is more than just card numbers; there seem to be a lot of SSNs in the data as well.) If one recommendation is “get PCI compliant,” then “dozens of recommendations” might be a Sisyphean task, or perhaps the Augean stables are a better analogy. In either case, only in mythology can the actions be completed.

Missing from the discussion I’ve seen so far is any statement of what was done. Did the organization do the top-5 things the consultants said to do? (Did they even break things out into a top-5?)

Potential Spending

The review found problems ranging from an out-of-date firewall to a lack of advanced malware detection technology on individual computers, according to two of the people familiar with the matter.

It sounds like “advanced malware detection technology” would be helpful here, right? Hindsight is 20/20. An out-of-date firewall? Was it missing security updates (which would be worrisome, but less worrisome than one might think, depending on what those updates fix), or was it just not the latest revision? If it’s not the latest revision, it can probably still do its job. In carefully reading the article, I saw no evidence that any single recommendation, or even implementing all of them, would have prevented the breach.

The DNC is a small organization. They were working with a rapidly shifting set of campaign workers from the Sanders and Clinton campaigns. I presume they’re also working on a great many state races, and the organizations those politicians are setting up.

I do not believe that doing everything in the consultant’s report could reasonably be expected to prevent a break-in by a determined mid-sized intelligence agency.


“Shame on them. It looks like they just did the review to check a box but didn’t do anything with it,” said Ann Barron-DiCamillo, who was director of US-Cert, the primary agency protecting U.S. government networks, until last February. “If they had acted last fall, instead of those thousands of e-mails exposed it might have been much less.”

Via Meredith Patterson, I saw “The Left’s Self-Destructive Obsession with Shame,” and there’s an infosec analog. Perhaps they would have found the attackers if they’d followed the advice, perhaps not. Does adding shame work to improve the cybers? If it did, it should have done so by now.

How to do better

I stand by what I said last time. The organization has paid the PR price, and we have learned nothing. What a waste. We should talk about exactly what happened at a technical level.

We should stop pointing fingers and giggling. It isn’t helping us. In many ways, the DNC is not so different from thousands of other small or mid-size organizations who hold sensitive information. Where is the list of effective practice for them to follow? How different is the set of recommendations in this instance from other instances? Where’s the list of “security 101” that the organization should have followed before hiring these consultants? (My list is at “Security 101: Show Your List!”)

We should define the “smoke alarms and sprinklers” of cyber. Really, why does an organization like the DNC need to pay $60,000 for a cybersecurity policy? It’s a little hard to make sense of, but I think that the net operating expenditures of $5.1m is a decent proxy for the size of the organization, and (if that’s right) 1% of their net operating expenses went to a policy review. Was that a good use of money? How else should they have spent that? What concrete things do we, as a profession or as a community, say they should have done? Is there a reference architecture, or a system in a box which they could have run that we expect would resist Russian intelligence?

We cannot expect every small org to re-invent this wheel. We have to help them do better.

PCI & the 166816 password

This was a story back around RSA, but I missed it until RSnake brought it up on Twitter: “[A default password] can hack nearly every credit card machine in the country.” The simple version is that Charles Henderson of Trustwave found that “90% of the terminals of this brand we test for the first time still have this code.” (Slide 30 of RSA deck.) Wow.

Now, I’m not a fan of the “ha-ha in hindsight” or “that’s security 101!” responses to issues. In fact, I railed against it in a blog post in January, “Security 101: Show Your List!”

But here’s the thing. Credit card processors have a list. It’s the Payment Card Industry Data Security Standard. That standard is imposed, contractually, on everyone who processes payment cards through the big card networks. In version 3, requirement 2 is “Do not use vendor-supplied defaults for system passwords.” This is not an obscure sub-bullet. As far as I can tell, it is not a nuanced interpretation, or even an interpretation at all. In fact, testing procedure 2.1.a of v3 of the standard says:

2.1.a Choose a sample of system components, and attempt to
log on (with system administrator help) to the devices and
applications using default vendor-supplied accounts and
passwords, to verify that ALL default passwords (including
those on … POS terminals…) have been changed. [I’ve elided a few elements of the list for clarity.]
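That testing procedure is mechanical enough to sketch in a few lines. This is purely an illustration, not an assessor’s tool; the credential list and the `try_login` hook are stand-ins for whatever a real audit would use:

```python
# Illustrative sketch of PCI testing procedure 2.1.a: attempt vendor-default
# credentials against a sample of devices. The credential pairs and the
# try_login callable are hypothetical stand-ins, not a real terminal API.

DEFAULT_CREDENTIALS = [("admin", "166816"), ("admin", "Z66816")]

def audit_defaults(devices, try_login):
    """Return the devices that still accept a vendor-default password."""
    failing = []
    for device in devices:
        for user, password in DEFAULT_CREDENTIALS:
            if try_login(device, user, password):
                failing.append(device)
                break  # one successful default login is enough to flag it
    return failing
```

Anything this check flags is a straight requirement-2 failure; there’s no interpretation involved.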

Now, the small merchant may not be aware that their terminal has a passcode. They may have paid someone else to set it up. But shouldn’t that person have set it up properly? The issue is not simply that the passcodes are not reset; the issue I’m worried about is that the system appears broken. We appear to have evidence that to get security right, the system requires activity by busy, possibly undertrained people. Why is that still required ten years into PCI?

This isn’t a matter of “checklists replacing security.” I have in the past railed against checklists (including in the book this blog is named after). But after I read “The Checklist Manifesto”, I’ve moderated my views a bit, such as in “Checklists and Information Security.” Here we have an example of exactly what checklists are good for: avoiding common, easy-to-make and easy-to-check mistakes.

When I raised some of these questions on Twitter someone said that the usual interpretation is that the site selects the sample (where they lack confidence). And to an extent, that’s understandable, and I’m working very hard to avoid hindsight bias here. But I think it’s not hindsight bias to say that a sample should be a random sample unless there’s a very strong reason to choose otherwise. I think it’s not hindsight bias to note that your financial auditors don’t let you select which transactions are audited.
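For what it’s worth, unbiased sampling is nearly a one-liner in most languages. A sketch, with a hypothetical list of terminals:

```python
import random

# Sketch: let the assessor, not the audited site, pick the sample.
# The device list is hypothetical; the seed parameter just makes
# the selection reproducible for the audit record.

def pick_sample(devices, k, seed=None):
    """Select an audit sample uniformly at random, so the audited site
    can't steer the assessor toward its best-configured devices."""
    rng = random.Random(seed)
    return rng.sample(devices, min(k, len(devices)))
```

The hard part isn’t the code; it’s making random selection the default expectation in the standard.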

Someone else pointed out that it’s “first time audits,” which is ok, but why, a decade after PCI 1.0, is this still an issue? Shouldn’t the vendor have addressed this by now? Admittedly, it may be hard to manage device PINs at scale — if you’re Target with tens of thousands of PIN pads, and your techs need access to the PINs, what do you do to ensure that the tech has access while at the register, but not after they’ve quit? But even given such challenges, shouldn’t the overall payment card security system be forcing a fix to such issues?

All bellyaching and sarcastic commentary aside, if the PCI process isn’t catching this, what does that tell us? Serious question. I have ideas, but I’m really curious as to what readers think. Please keep it professional.

Security 101: Show Your List!

Lately I’ve noted a lot of people quoted in the media after breaches saying “X was Security 101. I can’t believe they didn’t do X!” For example, “I can’t believe that LinkedIn wasn’t salting passwords! That’s security 101!”

Now, I’m unsure if that’s “security 101” or not. I think security 101 for passwords is “don’t store them in plaintext”, or “don’t store them with a crypto algorithm you designed”. Ten years ago, it would have included salting, but with the speed of GPU crackers, maybe it doesn’t anymore. A good library would probably still include it. Maybe LinkedIn was spending more on preventing XSS or SQL injection, and that pushed password storage off their list. Maybe that’s right, maybe it’s wrong. To tell you the truth, I don’t want to argue about it.
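To be concrete about what I’d call the 101 baseline: a salted, deliberately slow key-derivation function from a standard library, never plaintext and never a home-grown algorithm. A minimal sketch in Python:

```python
import hashlib
import hmac
import os

# Sketch of the password-storage baseline: salted PBKDF2 from the standard
# library. The iteration count is illustrative; tune it for your hardware.

def hash_password(password: str, *, iterations: int = 200_000):
    """Return (salt, iterations, digest) for storage."""
    salt = os.urandom(16)  # per-password random salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest

def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, digest)
```

A good library (bcrypt, scrypt, Argon2) wraps all of this up for you; the point is that the salt and the slow KDF come for free, so there’s no reason to skip them.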

What I want to argue about is the backwards looking nature of these statements. I want to argue because I did some searching, and not one of those folks I searched for has committed to a list of security 101, or what are the “simple controls” every business should have.

This is important because otherwise, hindsight is 20/20. It’s easy to say in hindsight that an organization should have done A or B or C. It’s harder to offer up a complete list in advance, and harder yet to justify the budget required to deploy and operate it.

So I’m going to make three requests for 2015:

  • If you’re an expert (or even play one on the internet), and if you want to say “X is Security 101,” please publish your full list of what’s “101.”
  • If you’re a reporter and someone tells you “X is security 101” please ask them for their list.
  • Finally, if you’re someone who wants to see security improve, and you hear claims about “101”, please ask for the list.

Oh, and since it’s sauce for the gander, here’s my list for individuals:

  • Stay up to date–get most of your machines on the latest revisions of software and get patches for security issues installed, especially in your browser and AV software.
  • Use a firewall that blocks most inbound traffic.
  • Ensure you have a working backup of your data.

(There are complex arguments about AV software, and a lack of agreements about how to effectively test it. Do you need it? Will it block the wildlist? There’s nuance, but that nuance doesn’t play into a 101 list. I won’t be telling people not to use AV software.)

*By “lately,” I meant in 2012, when I wrote this, right after the LinkedIn breach. But I recently discovered that I hadn’t posted it.

[Update: I’m happy to see Ira Winkler and Araceli Treu Gomes took up the idea in “The Irari rules for declaring a cyberattack ‘sophisticated’.” Good for them!]

Threat Modeling At a Startup

I’ve been threat modeling for a long time, and at Microsoft, had the lovely opportunity to put some rigor into not only threat modeling, but into threat modeling in a consistent, predictable, repeatable way. Because I did that work at Microsoft, sometimes people question how it would work for a startup, and I want to address that from having done it. I’ve done threat modeling for startups like Zero Knowledge (some of the output is in the Freedom security whitepaper), and while a consultant or advisor, for other startups who decided against the open kimono approach.

Those threat modeling sessions have taken anywhere from a few hours to a few days, and the same techniques that we developed to make threat modeling more effective can be used to make it more efficient. One key aspect is to realize that what security folks call threat modeling comprises several related tasks, and each of them can be done in different ways. Even more important, only one of the steps is purely focused on threats.

I’ve seen a four-question framework work at all sorts of organizations. The four questions are: “what are we building/deploying,” “what could go wrong”, “what are we going to do about it” and “did we do an acceptable job at this?”

The “what are we building” question is about creating a shared model of the system. This can be done on a napkin or a whiteboard, or for a large complex system which has grown organically, can involve people working for months. In each case, there’s value beyond security. Your code is cleaner, more consistent, more maintainable, and sometimes even faster when everyone agrees on what’s running where. Discussing trust boundaries can be quick and easy, and understanding them can lead to better marshaling and fewer remote calls. Of course, if your code is spaghetti and no one’s ever bothered to whiteboard it, this step can involve a lot of discussion and debate, because it will lead to refactoring. Maybe you can launch without that refactoring, maybe it’s important to do it now. It can be hard to tell if no one knows how your system is put together.

Understanding “what could go wrong” can happen in as little as an hour for people new to security using the Elevation of Privilege game. You should go breadth-first here, getting an understanding of where the threats might cluster, and if you need to dig deeper.

What are you going to do about it? You’re going to track issues as bugs. (Startups should not spend energy thinking about bugs and flaws as two separate things with their own databases.) If your bugs are post-its, that’s great. If you’re using a database, fine. Just track the issues. Some of them will get turned into tests. If you’re using Test-Driven Development, threat modeling is a great way to help you think about security testing and where it needs to focus. (If you’re not using TDD, it’s still a great way to focus your security testing.)
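A finding from a threat modeling session doesn’t need any more structure than an ordinary bug. As a sketch (the field names and risk scores are illustrative, not any particular tracker’s schema):

```python
# Sketch: a threat-model finding is just a bug with a threat attached.
# Field names ("threat", "risk", "status") are hypothetical examples.

def open_findings(findings):
    """Return unresolved findings, highest risk first, so triage is obvious."""
    return sorted(
        (f for f in findings if f["status"] == "open"),
        key=lambda f: f["risk"],
        reverse=True,
    )

findings = [
    {"threat": "Spoofing: no auth on admin API", "risk": 9, "status": "open"},
    {"threat": "Tampering: logs world-writable", "risk": 5, "status": "fixed"},
    {"threat": "Info disclosure: verbose error pages", "risk": 6, "status": "open"},
]
```

Whether that lives in post-its, a spreadsheet, or your existing bug tracker matters far less than actually tracking it.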

Total time spent for a startup can be as low as a few hours, especially if you’ve been thinking about software architecture and boundaries along the way. I’ve never spent more than a few days on threat modeling with a small company, and never heard it called a poor use of time. Of course, it’s easy to say that each thing that someone wants a startup to do “only takes a little time,” but all those things can add up to a substantial distraction from the core product. However, a media blow-up or a Federal investigation can also distract you, and distract you for a lot longer. A few hours of threat modeling can help you avoid the sorts of problems that distracted Twitter or Snapchat. The landscape continues to change rapidly post-Snowden, and threat modeling is the fastest way to consider what security investments a startup should be making.

[Update, Feb 5: David Cowan has a complementary document, “Security for Startups in 10 Steps.” He provides context and background on Techcrunch.]

Account Recovery Fail

“Please note that your password will be stored in clear text in our database which will allow us to send it back to you in case you lost it. Try avoid using the same password as accounts you may have in other systems.” — a security conference’s speaker website

This is a silly pattern. At least these folks realize it’s a bad idea, and warn the person, but that’s the wrong approach, since it relies on people reading text, and then relies on them being willing to make up and recall yet another password.

The origin of this problem is a broken model of what people want from their online accounts. As I discuss in chapter 14 of Threat Modeling: Designing for Security, the right goal is almost always account recovery, not password recovery. It’s irksome to see a security conference get this so wrong.

To be fair, the person who selected the conference management system was likely not a security expert, and had other functional goals on which they were focused. It would be easier to be annoyed if someone had a comprehensive and precise list of security problems which they could look for. But while we’re being fair, this isn’t like an XSS or SQLi issue, which requires some skill to discover. It’s right there in the primary workflow.

How to Ask Good Questions at RSA

So this week is RSA, and I wanted to offer up some advice on how to engage. I’ve already posted my “BlackHat Best Practices/Survival kit.”

First, if you want to ask great questions, pay attention. There are few things more annoying than a question that was answered while the questioner was tweeting, and you don’t want to be that person.

Second, if you want to ask a good question, ask a question that you think others will want to hear answered. If your question is narrow, go up to the speaker afterwards.

Now, there are some generic best practice questions that I love to ask, and want to encourage you to ask.

  • You claimed “X”, but didn’t explain why. Could you briefly cover your methodology and data for that claim?
  • You said “X” is a best practice. Can you cover what practices you would cut to ensure there’s resources available to do “X”?
  • You said “if you get breached, you’ll go out of business.” Last year, 2600 companies announced data breaches. How many of them are out of business?
  • You said that “X” dramatically increased your organization’s security. Since we live in an era of ‘assume breach’, can I assume that your organization is now committed to publishing details of any breaches that happen despite X?
I’m sure there are other good questions; please share your favorites, and I’ll try for a new post tomorrow.

HHS & Breach Disclosure

There’s good analysis at “HHS breach investigations badly backlogged, leaving us in the dark.”

To say that I am frequently frustrated by HHS’s “breach tool” would be an understatement. Their reporting form and coding often makes it impossible to know – simply by looking at their entries – what type of breach occurred. Consider this description from one of their entries:

“Theft, Unauthorized Access/Disclosure”,”Laptop, Computer, Network Server, Email”

So what happened there? What was stolen? Everything? And what types of patient information were involved?

Or how about this description:

“Unauthorized Access/Disclosure,Paper”

What happened there? Did a mailing expose SSN in the mailing labels or did an employee obtain and share patients’ information with others for a tax refund fraud scheme? Your guess is as good as mine. And HHS’s breach tool does not include any data type fields that might let us know whether patients’ SSN, Medicare numbers, diagnoses, or other information were involved.

What can I say but, I agree?

Disclosures should talk about the incident and the data. Organizations are paying the PR cost, let’s start learning.

The incident should be specified using either the Broad Street taxonomy (covered in the MS Security Intel Report here) or Veris. It would be helpful to include details like the social engineering mail used (so we can study tactics), and detection rates for the malware, from something like VirusTotal.

For the data, it would be useful to explain (as Dissent says) what was taken. This isn’t simply a matter of general analysis, but can be used for consumer protection. For example, if you use knowledge-based backup authentication, then knowing that every taxpayer in South Carolina has had their addresses exposed tells you something about the efficacy of a question about what address you lived in in 2000. (I don’t know if that data was exposed in the SC tax breach, I’m just picking a recent example.)

Anyway, the post is worth reading, and the question of how we learn from breaches is worth discussing in depth.