Conway’s Law and Software Security

In “Conway’s Law: does your organization’s structure make software security even harder?,” Steve Lipner mixes history and wisdom:

As a result, the developers understood pretty quickly that product security was their job rather than ours. And instead of having twenty or thirty security engineers trying to “inspect (or test) security in” to the code, we had 30 or 40 thousand software engineers trying to create secure code. It made a big difference.

The DREAD Pirates

Then he explained the name was important for inspiring the necessary fear. You see, no one would surrender to the Dread Pirate Westley.

The DREAD approach was created early in the security pushes at Microsoft as a way to prioritize issues. It’s not a very good way, but you see, no one would surrender to the Bug Bar Pirate, Roberts. And so the approach keeps going, despite its many problems.

There are many properties one might want in a bug ranking system for internally found bugs. They include:

  • A cool name
  • A useful mnemonic
  • A reduction in argument about bugs
  • Consistency between raters
  • Alignment with intuition
  • Immutability of ratings: the bug is rated once, and then is unlikely to change
  • Alignment with post-launch/integration/ship rules

DREAD certainly meets the first of these, and perhaps the next two. And it was an early attempt at a multi-factor rating of bugs. But DREAD brings many problems that newer approaches deal with.

The most problematic aspect of DREAD is that there’s little consistency, especially in the middle. What counts as a 6 damage versus 7, or 6 versus 7 exploitability? Without calibration, different raters will not be consistent. Each of the scores can be mis-estimated, and there’s a tendency to underestimate things like discoverability of bugs in your own product.

The second problem is that you set an arbitrary bar for fixes, for example, everything above a 6.5 gets fixed. That makes the distinction between a 6 and a 7 sometimes matter a lot. The score does not relate to what needs to get fixed when found externally.

This illustrates why Discoverability is an odd thing to bring into the risk equation. You may have a discoverability of “1” on Monday, and 10 on Tuesday. (“Thanks, Full-Disclosure!”) So something could have a 5.5 DREAD score because of low discoverability but require a critical update. Suddenly the DREAD score of the issue is mutable. So it’s hard to use DREAD on an externally discovered bug, or one delivered via a bug bounty. So now you have two bug-ranking systems, and what do you do when they disagree? This happened to Microsoft repeatedly, and led to the switch to a bug bar approach.
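The Monday/Tuesday problem is easy to see in a few lines of code. This is only a sketch: it assumes the common convention of scoring each factor from 1 to 10 and averaging the five, which the discussion above does not spell out, and the individual ratings are made up for illustration. The 6.5 bar is the example bar from the text.

```python
from dataclasses import dataclass, replace

FIX_BAR = 6.5  # example bar from the text: everything above it gets fixed


@dataclass(frozen=True)
class DreadRating:
    """One rater's DREAD scores; each factor is assumed to run from 1 to 10."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def score(self) -> float:
        # Assumption: the overall score is the plain mean of the five factors.
        factors = (self.damage, self.reproducibility, self.exploitability,
                   self.affected_users, self.discoverability)
        return sum(factors) / len(factors)

    def above_bar(self) -> bool:
        return self.score() > FIX_BAR


# Illustrative ratings only. Nothing about the bug changes between the
# two days except that someone published it.
monday = DreadRating(damage=7, reproducibility=6, exploitability=6,
                     affected_users=7, discoverability=1)
tuesday = replace(monday, discoverability=10)

print(monday.score(), monday.above_bar())    # 5.4 False
print(tuesday.score(), tuesday.above_bar())  # 7.2 True
```

The same bug sits below the bar on Monday and well above it on Tuesday, which is the mutability problem in miniature.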

Affected users is also odd: does an RCE in Flight Simulator matter less than one in Word? Perhaps in the grand scheme of things, but I hope the Flight Simulator team is fixing their RCEs.

Stepping beyond the problems internal to DREAD to DREAD within a software organization, it only measures half of what you need to measure. You need to measure both the security severity and the fix cost. Otherwise, you run the risk of finding something with a DREAD of 10, but it’s a key feature (Office Macros), and so it escalates, and you don’t fix it. There are other issues which are easy to fix (S3 bucket permissions), and so it doesn’t matter if you thought discoverability was low. This is shared by other systems, but the focus on a crisp line in DREAD, everything above a 6.5 gets fixed, exacerbates the issue.

For all these reasons, with regard to DREAD? Fully skeptical, and I have been for over a decade. If you want to fix these things, the thing to do is not create confusion by saying “DREAD can also be a 1-3 system!”, but to version and revise DREAD, for example, by releasing DREAD 2. I’m exploring a similar approach with DFDs.

I’m hopeful that this post can serve as a collection of reasons to not use DREAD v1, or properties that a new system should have. What’d I miss?

NTSB on Uber (Preliminary)

The NTSB has released “Preliminary Report Highway HWY18MH010,” on the Uber self-driving car which struck and killed a woman. I haven’t had a chance to read the report carefully.

Brad Templeton has excellent analysis of the report at “NTSB Report implies serious fault for Uber in fatality” (and Brad’s writings overall on the subject have been phenomenal.)

A few important things to note, cribbed from Brad.

  • The driver was not looking at her phone, but a screen with diagnostic information from the self-driving systems.
  • The car detected a need to brake with approximately enough time to stop had it automatically applied the brakes.
  • That system was turned off for a variety of reasons that look bad (in hindsight, and probably could have been critiqued at the time).

My only comment right now is wouldn’t it be nice to have this level of fact finding in the world of cyber?

Also, it’s very clear that the vehicle was carefully preserved. Can anyone say how the NTSB and/or Uber preserved the data center, cloud or other remote parts of the computer systems involved, including the algorithms that were deployed that day (versus reconstructing them later)?

Threat Model Thursday: Talking, Dialogue and Review

As we head into RSA, I want to hold the technical TM Thursday post, and talk about how we talk to others in our organizations about particular threat models, and how we frame those conversations.

I’m a big fan of the whiteboard-driven dialogue part of threat modeling. That’s where we look at a design, find issues, and make tradeoffs together with developers, operations, and others. The core is the tradeoff: if we do this, it has this effect. I’m borrowing here John Allspaw’s focus on the social nature of dialogue: coming together to explore ideas. It’s rare to have a consultant as an active participant in these dialogues, because a consultant does not have ‘skin in the game,’ they do not carry responsibility for the tradeoffs. These conversations involve a lot of “what about?” and “what if” statements, and active listening is common.

Let me contrast that with the “threat model review.” When reviews happen late in a cycle, they are unlikely to be dialogues about tradeoffs, because the big decisions have been made. At their best, they are validation that the work has been done appropriately. Unfortunately, they frequently devolve into tools for re-visiting decisions that have been made, or arguments for bringing security in next time. Here, outside consultants can add a lot of value, because they’re less tied to the social aspects of the conversation and can offer a “review” or “assessment.” These conversations involve a lot of “why” and “did you” questions. They often feel inquisitorial, investigatory and judgmental. Those being questioned often spend time explaining the tradeoffs that were made, and recording those tradeoff discussions was rarely a priority as decisions were made.

These social frames interleave with the activities and deliverables involved in threat modeling. We can benefit from a bit more reductionism in taking ‘threat modeling’ down to smaller units so we can understand and experiment. For example, my colleagues at RISCS refer to “traditional threat modeling approaches,” and we can read that lots of ways. At a technical level, was that an attacker-centric approach grounded in TARA? STRIDE-per-element? At a social level, was it a matter of security champs coming in late and offering their opinions on the threat modeling that had been done?

So I can read the discussion about the ThoughtWorks “Sensible Conversations” as a social shift from a review mode to a dialogue mode, in which case it seems very sensible to me, and I can read it as being about the technical shift to their attacker/asset cards. My first read is that their success is more about the social shift which is the headline. The technical shift (or shifts) may be a part of enabling that by saying “hey, let’s try a different approach.”

Image: Štefan Štefančík. Thanks to FS & SW for feedback on the draft.

Security Engineering: Computers versus Bridges

Joseph Lorenzo Hall has a post at the Center for Democracy and Technology, “Taking the Pulse of Security Research.” One part of the post is an expert statement on security research, and I’m one of the experts who has signed on.

I fully support what CDT chose to include in the statement, and I want to go deeper. The back and forth of design and critique is not only a critical part of how an individual design gets better, but fields in which such criticism is the norm advance faster.

A quick search in Petroski’s Engineers of Dreams: Great Bridge Builders and the Spanning of America brings us the following. (The Roeblings built the Brooklyn Bridge, Lindenthal had proposed a concept for the crossing, which lost to Roebling’s, and he built many others.)

In Lindenthal’s case, he was so committed to the suspension concept for bridging the Hudson River that he turned the argument naturally and not unfairly to his use. Lindenthal admitted, for example, that it was “a popular assumption that suspension bridges cannot be well used for railroad purposes,” and further conceded that throughout the world there was only one suspension bridge then carrying railroad tracks, Roebling’s Niagara Gorge Bridge, completed in 1854, over which trains had to move slowly. However, rather than seeing this as scant evidence for his case, Lindenthal held up as a model the “greater moral courage and more abiding faith in the truth of constructive principles” that Roebling needed to build his bridge in the face of contemporary criticism by the “most eminent bridge engineers then living.” In Lindenthal’s time, three decades later, it was not merely a question of moral courage; “nowadays bridges are not built on faith,” and there was “not another field of applied mechanics where results can be predicted with so much precision as in bridges of iron and steel.” (“Engineers of Dreams: Great Bridge Builders and the Spanning of America,” Henry Petroski)

Importantly for the case which CDT is making, over the span of thirty years, we went from a single suspension bridge to “much precision” in their construction. That progress happened because criticisms and questions are standard while a bridge is proposed, and if it fails, there are inquests and inquiries as to why.

In his The Great Bridge: The Epic Story of the Building of the Brooklyn Bridge, David McCullough describes the prolonged public discussion of the engineering merits:

It had been said repeatedly by critics of the plan that a single span of such length was impossible, that the bridge trains would shake the structure to pieces and, more frequently, that no amount of calculations on paper could guarantee how it might hold up in heavy winds, but the odds were that the great river span would thrash and twist until it snapped in two and fell, the way the Wheeling Bridge had done (a spectacle some of his critics hoped to be on hand for, to judge by the tone of their attacks).

The process of debating plans for a bridge strengthens, not weakens, the resulting structure. Both books are worth reading as you think about how to advance the field of cybersecurity.

Image credit: Cleveland Electric, on their page about a fiber optic structural monitoring system which they retro-fitted onto the bridge in question.

Gartner on DevSecOps Toolchain

I hadn’t seen “Integrating Security Into the DevSecOps Toolchain,” which is a Gartner piece that’s fairly comprehensive, grounded and well thought through.

If you enjoyed my “Reasonable Software Security Engineering,” then this Gartner blog does a nice job of laying out important aspects which didn’t fit into that ISACA piece.

Thanks to Stephen de Vries of Continuum for drawing my attention to it.

Ries on Gatekeepers

Eric Ries wrote the excellent book Lean Startup. In a recent interview with Firstround, he talks about how to integrate gatekeeping functions into a lean business.

There is a tremendous amount of wisdom in there, and almost all of it applies to security. The core is that the gatekeeper has compassion for the work and ambiguity of engineering, and that compassion comes from being embedded into the work.

Engineering involves starting with problem statements that are incomplete or inaccurate, and dialog about those problems leading to refinement of the understanding of both the problem and the solution. It’s hard to do that from a remote place in the organization.

This is an argument for what Ries calls embedding, which is appropriate for some gatekeeping functions. What’s more important for security is “a seat at the table.” They’re importantly different. Embedding is a matter of availability when a problem comes up where we need the voice of legal or finance. A seat at the table is that the person is invited to the meetings where the problems and solutions are being refined. That happens naturally when the person invited is a productive contributor. Many functions, from program management to test to usability have won a seat at the table, and sometimes lost it as well.

The first hurdle to a seat at the table, and the only one which is non-negotiable, is productive engagement. “We get more done because we invite Alice to our meetings.” That more might be shipping faster, it might be less rework, it might be higher quality. It is always things which matter to the organization.

The more productive the engagement, the more willing people will be to overlook soft skills issues. The famed BOFH doesn’t get a seat at the table, because as much as IT might want one, he’s abusive. Similarly, security people will often show up and say things like “one breach could sink the company,” or “your design is crap.” Hyperbole, insults, anger, all of the crassly negative emotions will cost not just Angry Bob but the whole security team their seat. These are behaviors that get drawn to the attention of management or even HR. They limit careers, and they also make it hard to give feedback. Who wants to get insulted when you’re trying to help someone? They limit teams. Who wants to work with people like that?

There are other, less crass behaviors with similar effect: not listening, not delivering on time, not taking on work that needs taking on. These soft skills will not get you to the table, but they’ll ease the journey, and most importantly, get you the feedback you may need to get there. But if you are in a gatekeeper role today, or if your security team aspires to rise to the point where you have a rope you can pull to stop the production line, the new article on gatekeepers by Mr. Ries is well worth your time.

One of the aspects of the post that’s worthwhile is providing crisp guidance, which reminds me of what Izar Tarandach talked about at Appsec 2018. (My notes, the video.)

Photo by Aryok Mateus.

Threat Model Thursday: Synopsys

There’s an increasing — and valuable — trend to publish sample threat models. These might be level sets for customers: “we care about these things.” They might be reassurance for customers: “we care about these things.” They might be marketing, they might serve some other purpose. All are fine motives, and whatever the motive, publishing them gives us an opportunity to look and compare the myriad ways models are created, recorded and used. And so I’m kicking off a series I’m calling “threat modeling Thursdays” to do exactly that.

Up front, let me be clear that I’m looking at these to learn and to illustrate. It’s a dangerous trap to think that “the way to threat model is.” There’s more than one way to do it, as the Perl mavens say. Nothing here is intended to say “this is better or worse.” Rather, I want to say things like “if you’re a consultant, starting with scope is more important than when you’re a developer.”

So today’s model, kicking off the series, comes to us from Synopsys, in a blog post titled “The 5 pillars of a successful threat model.” And again, what’s there is great, and what’s there is very grounded in their consulting practice.

Thus, step 1 includes “define the scope and depth. Once a reasonable scope is determined with stakeholders, it needs to be broken down in terms of individual development teams…” Well, sure! That’s one way to do it. If your threat models are going to be executed by consultants, then it’s essential. And if your threat models are going to be done as an integral part of development, scoping is often implicit. But it’s a fine way to start answering the question of “what are we working on?”

Step 2 is “Gain an understanding of what is being threat modeled.” This is also aligned with my question 1, “what are we working on.”

A diagram of a system

The diagram is great, and I initially wanted the internet trust boundary to be more pronounced, but leaving it the same as the other boundaries is a nice way to express “threats come from everywhere.”

The other thing I want to say about the diagram is that it looks like a nice consulting deliverable. “We analyzed the system, discovered these components, and if there’s stuff we missed, you should flag it.” And again, that’s a reasonable choice. In fact, any other choice would be unreasonable from consultants. And there are other models. For example, a much less formal whiteboard model might be a reasonable way for a team starting to threat model to document and align around an understanding of “what we’re building.” The diagrams Synopsys present take more time than the less formal ones. They also act as better, more formal records. There are scenarios where those more formal records are important. For example, if you expect to have to justify your choices to a regulator, a photo of a whiteboard does not “convey seriousness.”

Their step 3 is to model the attack possibilities. Their approach here is a crisp version of the “asset/entry point” that Frank Swiderski and Window Snyder present in their book. “Is there any path where a threat agent can reach an asset without going through a control?”
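That question is a graph reachability check, and a small sketch makes it concrete. Everything in the example system below (the node names, the flows, the single control) is hypothetical and chosen only for illustration. It is not Synopsys’s model or tooling, just one way the question could be asked of a diagram once it is written down as data.

```python
from collections import deque

# Hypothetical system: keys are nodes, values are the nodes they can reach.
FLOWS = {
    "unauthorized internal user": ["admin console", "file share"],
    "admin console": ["authentication check"],
    "authentication check": ["customer database"],
    "file share": ["customer database"],
}
# Hypothetical set of nodes that count as controls.
CONTROLS = {"authentication check"}


def uncontrolled_path(flows, controls, agent, asset):
    """Breadth-first search from agent to asset that never passes through
    a control. Returns the first such path found, or None if there is none."""
    queue = deque([[agent]])
    seen = {agent}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == asset:
            return path
        for nxt in flows.get(node, []):
            if nxt in seen or nxt in controls:
                continue  # already visited, or blocked by a control
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


path = uncontrolled_path(FLOWS, CONTROLS,
                         "unauthorized internal user", "customer database")
print(path)
```

In this made-up system the search finds the route through the file share, which bypasses the authentication check. Treating the file share as a control (or putting one in front of it) makes the function return None.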

They draw in assets, threat agents and controls here, and while I’m not an advocate of including them in diagrams (it makes for a lot of complexity), using two diagrams lets you study the system, then look at a more in depth version, which works nicely. Also, their definitions of threat agents are pretty interesting, for example, “unauthorized internal user.” It says nothing of their motivation or capabilities, just their starting position and privileges. Compare and contrast that with a threat persona like “Sean “Keech” Purcell – Defacer.” (Keech is one of the personas created by Aucsmith, Dixon, and Martin-Emerson.)

Synopsys’s step 3 and their step 4, “interpret the threat model,” together answer the question “what can go wrong?” Here I do want to mildly critique their use of the word “the.” There are at least four models in play in the threat modeling activity (System, assets, agents, and controls are all modeled.) There’s strength in thinking of threat modeling as a collection of activities. Calling a particular something ‘the threat model’ is both very common and needlessly restrictive.

Their step 5 is to “create a traceability matrix to record missing or weak controls.” This is a fine input to the question that the readers of that matrix will ask, “what are we going to do about it?” Which happens to be my question 3. They have a somewhat complex analytic frame of a threat agent targets an asset via an attack over a surface… Also interesting in the traceability matrix is the presentation of “user credentials” as an attack goal. I treat those as ‘stepping stones,’ rather than goals. Also, in their discussion of the traceability matrix, we see handoffs: “it takes time and repetition to become proficient at [threat modeling],” and “With experience, you’ll be able to develop a simplified traceability matrix.” These are very important points: how we threat model is not simply a function of our job function, it’s also a function of experience, and the ways in which we work through issues change as we gain experience. There’s another trap in thinking the ways that work for an experienced expert will work for a novice, and the support tools that someone new to threat modeling may use will hinder the expert.

Lastly, they have no explicit analog to my step 4, “did we do a good job?” I believe that has nothing to do with different interests in quality, but rather that the threat model deliverable with their logo on it will go through stages of document preparation, review and critique, and so that quality check is an implicit one in their worlds.

To close, threat modeling shares the property, common in security, that secrecy makes it harder to learn. I have a small list of threat models to look at, and if you know of some that we can look at together, I would love to hear that, or other feedback you might have on what we can learn from this model.

Speculative Execution Threat Model

There’s a long and important blog post from Matt Miller, “Mitigating speculative execution side channel hardware vulnerabilities.”

What makes it important is that it’s a model of these flaws, and helps us understand their context and how else they might appear. It’s also nicely organized along threat modeling lines.

What can go wrong? There’s a set of primitives (conditional branch misprediction, indirect branch misprediction, and exception delivery or deferral). These are built into windowing gadgets and disclosure gadgets.

There are also models for mitigations, including classes of ways to prevent speculative execution, removing sensitive content from memory and removing observation channels.