The slides from my Blackhat talk, “Threat Modeling in 2018: Attacks, Impacts and Other Updates” are now available either as a PDF or online viewer.
Since I wrote my book on the topic, people have been asking me “what’s new in threat modeling?” My Blackhat talk is my answer to that question, and it’s been taking up the time that I’d otherwise be devoting to the series.
As I’ve been practicing my talk*, I’ve discovered that there’s more that’s new than I thought, and I may not be able to fit everything I want to talk about into 50 minutes. But it’s coming together nicely.
The current core outline is:
- What are we working on
- The fast moving world of cyber
- The agile world
- Models are scary
- What can go wrong? Threats evolve!
- Machine Learning
And of course, because it’s 2018, there’s cat videos and emoji to augment logic. Yeah, that’s the word. Augment. 🤷‍♂️
The talk is Wednesday, August 8 at 2:40 PM.
* Oh, and a note to anyone speaking anywhere, especially at large events like Blackhat — as the speaker resources say: practice, practice, practice.
So this week’s Threat Model Thursday is simply two requests:
- What would you like to see in the series?
- What would you like me to cover in my Blackhat talk, “Threat Modeling in 2018?”
“Attacks always get better, and that means your threat modeling needs to evolve. This talk looks at what’s new and important in threat modeling, organizes it into a simple conceptual framework, and makes it actionable. This includes new properties of systems being attacked, new attack techniques (like biometrics confused by LEDs) and a growing importance of threats to and/or through social media platforms and features. Take home ways to ensure your security engineering and threat modeling practices are up-to-date.”
Over at the Leviathan blog, Crispin Cowan writes about “The Calculus Of Threat Modeling.” Crispin and I have worked together over the years, and our approaches are explicitly aligned around the four question frame.
What are we working on?
One of the places where Crispin goes deeper is definitional. He’s very precise about what a security principal is:
A principal is any active entity in a system with access privileges that are in any way distinct from some other component it talks to. Corollary: a principal is defined by its domain of access (the set of things it has access to). Domains of access can, and often do, overlap, but that they are different is what makes a security principal distinct.
This also leads to definitions of attack surface (where principals interact), trust boundaries (the sum of the attack surfaces) and security boundaries (trust boundaries for which the engineers will fight). These are more precisely defined than the ones I tend to use, and I think it’s a good set of definitions, or perhaps a good step forward in the discussion if you disagree.
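Because these definitions are so crisp, they’re nearly executable. Here’s a minimal Python sketch (my own illustration, not anything from Crispin’s post) that treats a principal as being defined by its domain of access, and derives attack surfaces from the places where distinct principals interact:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Principal:
    """An active entity, defined by its domain of access."""
    name: str
    domain: frozenset  # the set of things it has access to

def attack_surfaces(connections):
    """An attack surface exists wherever two principals whose
    domains of access differ interact."""
    return [(a, b) for a, b in connections if a.domain != b.domain]

# Hypothetical example: a browser process talking to a renderer and
# a helper. The names and domains are mine, for illustration only.
browser = Principal("browser", frozenset({"disk", "network", "ui"}))
renderer = Principal("renderer", frozenset({"network"}))
helper = Principal("helper", frozenset({"network"}))  # same domain as renderer

connections = [(browser, renderer), (renderer, helper)]

# The trust boundary is the sum of the attack surfaces.
trust_boundary = attack_surfaces(connections)
for a, b in trust_boundary:
    print(f"attack surface: {a.name} <-> {b.name}")
# Only browser <-> renderer prints: renderer and helper share a domain,
# so by the corollary they are the same security principal.
```

Note how the corollary falls out of the model: the renderer and helper collapse into a single principal, which connects to the “collapse together any nodes that are actually executing as the same security principal” editing step discussed below.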
What can go wrong?
His approach adds much more explicit description of the principals who own elements of the diagram, and several self-check steps (“Ask again if we have all the connections…”). I think of these as part of “did we do a good job?” and it’s great to integrate such checks on an ongoing basis, rather than treating them as a step at the end.
What are we going to do about it?
Here Crispin covers assessing complexity and mitigations. Assessing complexity is an interesting approach — a great many vulnerabilities appear on the most complex interfaces, and I think it’s a useful strategy, similar to ‘easy fixes first’ as a prioritization approach.
He also has “c. Be sure to take a picture of the white board after the team is done describing the system.” “d. Go home and create a threat model diagram.” These are interesting steps, and I think they deserve some discussion as to form (I think this is part of ‘what are we working on?’) and function. As to function, we already have “a threat model diagram,” and a record of it, in the picture of the whiteboard. I’m nitpicking here for two very specific reasons. First, the implication that what was done isn’t a threat model diagram isn’t accurate, and second, as the agile world likes to ask, “why are you doing this work?”
I also want to ask: is there a reason to go from whiteboard to Visio? Also, as Crispin says, he’s not simply transcribing; he’s doing some fairly nuanced technical editing, such as “Collapse together any nodes that are actually executing as the same security principal.” That means you can’t hand off the work to a graphic designer; you need an expensive security person to re-consider the whiteboard diagram. There are times that’s important: if the diagram will be shown widely across many meetings, if it will go outside the organization (say, to regulators), or if the engineering process is waterfall-like.
Crispin says that tools are substitutes for expertise, and that (a? the?) best practice is for a security expert and the engineers to talk. I agree, this is a good way to do it — I also like to train the engineers to do this without security experts each time.
And that brings me to the we/you distinction. Crispin conveys the four question frame in the second person (what are you doing, what did you do about it), and I try to use the first person plural (we; what are we doing). Saying ‘we’ focuses on collaboration, on dialogue, on exploration. Saying ‘you’ frames this as a review, a discussion, and who knows, possibly a fight. Both of us used the ‘you’ frame at a prior employer, and today, when I consult, I use it because I’m really not part of the doing team.
That said, I think this was a super-interesting post for the definitions, and for showing the diagram evolution and the steps taken from a whiteboard to a completed, colored diagram.
The image is the frontispiece of Leviathan by Thomas Hobbes, with its famous model of the state, made up of the people.
Continuum has released a video of me and Stuart Winter-Tear in conversation at the Open Security Summit:
“At the recent Open Security Summit we had the great pleasure of interviewing Adam Shostack about his keynote presentation “A seat at the table” and the challenge of getting security involved in product and application design. We covered numerous topics from the benefits brought to business by threat modeling to pooping unicorns.”
For Threat Model Thursday, I want to use current events here in Seattle as a prism through which we can look at technology architecture review. If you want to take this as an excuse to civilly discuss the political side of this, please feel free.
Seattle has a housing and homelessness crisis. The cost of a house has risen nearly 25% above the 2007 market peak, and has roughly doubled in the 6 years since April 2012. Fundamentally, demand has outstripped supply and continues to do so. As a city, we need more supply, and that means evaluating the value of things that constrain supply. This commentary from the local Libertarian party lists some of them.
The rules on what permits are needed to build a residence, what housing is acceptable, or how many unrelated people can live together (no more than eight) are expressions of values and priorities. We prefer that developers build no housing at all rather than housing that doesn’t comply with the 32 pages of neighborhood design guidelines from the city’s Office of Planning and Community Development. We prefer to bring developers back after a building is built if the siding is not the agreed color. These are choices that express the values of the city. And because I’m not a housing policy expert, I may miss some of the nuances, but I can see the effect of the policies overall.
Let’s transition from the housing crisis here in Seattle to the architecture crisis that we face in technology.
No, actually, I’m not quite there. The city killed micro-apartments, only to replace them with … artisanal micro-houses. Note the variation in size and shape of the two houses in the foreground. Now, I know very little about construction, but I’m reasonably confident that if you read the previous piece on micro-housing, many of the concerns regulators were trying to address apply to “True Hope Village,” the construction pictured above. I want you, dear reader, to read the questions about how we deliver housing in Seattle, and treat them as a mirror into how your organization delivers software. Really, please, go read “How Seattle Killed Micro-Housing” and the “Neighborhood Design Guidelines” carefully. Not because you plan to build a house, but as a mirror of your own security design guidelines.
They may be no prettier.
In some companies, security is valued, but has no authority to force decisions. In others, there are mandatory policies and review boards. We in security have fought for these mandatory policies because without them, products ignored security. And similarly, we have housing rules because of unsafe, unsanitary or overcrowded housing, and to reduce the blight of slums.
Security has design review boards which want to talk about the color of the siding a developer installed on the now-live product. We have design regulation which kills apodments and tenement housing, and then glorifies tiny houses. From a distance, these rules make no sense. I didn’t find them sensible myself. I remember a meeting with the Microsoft Crypto board. I went in with some very specific questions regarding parameters and algorithms. Should we use this hash algorithm or that one? The meeting took less than five minutes to go off the rails with suggestions about non-cryptographic architecture. I remember shipping the SDL Threat Modeling Tool: going through the roughly five policy-tracking tools we had at the time, and discovering at the very last minute that we had extra rules that weren’t documented in the documents I’d found at the start. It drives a product manager nuts!
Worse, rules expand. From the executive suite, if a group isn’t growing, maybe it can shrink? From a security perspective, the rapidly changing threat landscape justifies new rules. So there’s motivation to ship new guidelines that, in passing, spend a page explaining all the changes that are taking place. And then I see “Incorporate or acknowledge the best features of existing early to mid-century buildings in new development.” What does that mean? What are the best features of those buildings? How do I acknowledge them? I just want to ship my peer-to-peer blockchain features! And nothing in the design review guidelines is clearly objectionable. But taken as a whole, they create a complex, unpredictable, and thus expensive path to delivery.
We express values explicitly and implicitly. In Seattle, implicit expression of values has hobbled the market’s ability to address a basic human need. One of the reasons that embedding is effective is that embedded gatekeepers can advise and interpret in relation to real questions. Embedding expresses the value of collaboration, of dialogue over review. Does your security team express that security is more important than product delivery? Perhaps it is. When Microsoft stood down product shipping for security pushes, it was an explicit statement. Making your values explicit and debating prioritization is important.
What side effects do your security rules have? What rule is most expensive to comply with? What initiatives have you killed, accidentally or intentionally?
Today’s Threat Model Thursday is a look at “Post-Spectre Threat Model Re-Think,” from a dozen or so folks at Google. As always, I’m looking at this from the perspective of what we can learn, and to encourage dialogue around what makes for a good threat model.
What are we working on?
From the title, I’d assume Chromium, but there’s a fascinating comment in the introduction that this is wider: “any software that both (a) runs (native or interpreted) code from more than one source; and (b) attempts to create a security boundary inside a single address space, is potentially affected.” This is important, and in fact, why I decided to highlight the model. The intro also states, “we needed to re-think our threat model and defenses for Chrome renderer processes.” In the problem statement, they mention that there are other, out-of-scope variants, such as “a renderer reading the browser’s memory.”
It would be helpful to me, and probably others, to diagram this, both for the Chrome case (the relationship between browser and renderer) and the broader case of that other software, because the modern web browser is a complex beast. As James Mickens says:
who weren’t even INVITED to the party…
What can go wrong?
There is a detailed set of ways that confidentiality breaks current boundaries. Most surprising to me is the claim that clock jitter is not as useful as we’d expect, and even enumerating all the clocks is tricky! (WebKit seems to have a different perspective, that reducing timer precision is meaningful.)
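To make the timer discussion concrete, here’s a minimal Python sketch of precision reduction as a mitigation. This is my illustration of the general technique, not WebKit’s or Chromium’s actual code, and the 1 ms resolution is an assumption:

```python
import time

RESOLUTION_MS = 1.0  # hypothetical coarsened resolution

def coarse_now_ms():
    """Return the current time, rounded down to a coarse grid, so two
    nearby reads can't resolve the fine timing a cache side channel needs."""
    raw = time.perf_counter() * 1000.0
    return raw - (raw % RESOLUTION_MS)

# Two reads in quick succession now look identical...
a, b = coarse_now_ms(), coarse_now_ms()
print(a == b)  # usually True at this resolution

# ...but note the catch the Google doc raises: an attacker can build
# their own clock (e.g., a counting thread), which is why "enumerate
# and coarsen all the clocks" is so tricky.
```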
There is also an issue of when to pass autofilled data to a renderer, and a goal of “Ensure User Intent When Sending Data To A Renderer.” This is good, but usability may depend on normal people understanding that their renderer and browser are different. That’s mitigated by taking user gestures as evidence of intent. That seems like a decent balancing of usability and security, but as I watch people using devices, I see a lot of gesturing to explore and discover the rapidly changing meanings of gestures, both within applications and across different applications and passwords.
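The gating idea is simple enough to sketch. Here’s a toy Python version of gesture-gated autofill; the class name and the one-second window are my assumptions, for illustration only:

```python
import time

GESTURE_WINDOW_S = 1.0  # hypothetical freshness window

class AutofillGate:
    """Hold back autofill data until the browser process has seen a
    recent, trusted user gesture (evidence of intent)."""

    def __init__(self):
        self._last_gesture = None

    def record_user_gesture(self):
        # Called on a trusted input event in the browser process,
        # never on anything the renderer merely claims happened.
        self._last_gesture = time.monotonic()

    def may_release_autofill(self):
        if self._last_gesture is None:
            return False
        return time.monotonic() - self._last_gesture < GESTURE_WINDOW_S

gate = AutofillGate()
print(gate.may_release_autofill())  # False: no gesture, no autofill
gate.record_user_gesture()
print(gate.may_release_autofill())  # True: a recent gesture counts as intent
```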
What are we going to do about it?
As a non-expert in browser design, I’m not going to attempt to restate the mitigations. Each of the defensive approaches is presented with clear discussion of its limitations and the current intent. This is both great to see, and hard to follow for those not deep in browser design. That form of writing is probably appropriate, because otherwise the meaning gets lost in verbosity that’s not useful to the people most impacted. I would like to see more out-linking as an aide to those trying to follow along.
Did we do a good job?
I’m very glad to see Google sharing this because we can see inside the planning of the architects, the known limits, and the demands on the supply chain (changes to compilers to reduce gadgets, changes to platforms to increase inter-process isolation), and in the end, “we now assume any active code can read any data in the same address space. The plan going forward must be to keep sensitive cross-origin data out of address spaces that run untrustworthy code.” Again, that’s more than just browsers. If your defensive approaches, mitigations or similar sections are this clear, you’re doing a good job.
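That closing principle is essentially site isolation: give each site its own address space, so untrustworthy code never shares one with another site’s data. Here’s a toy Python model of the idea (not Chromium’s actual process allocator; real site isolation keys on eTLD+1, not the bare hostname):

```python
from urllib.parse import urlparse

class ProcessPool:
    """Toy per-site process assignment: code from one site never shares
    an address space with another site's data."""

    def __init__(self):
        self._by_site = {}
        self._next_pid = 1

    def process_for(self, url):
        site = urlparse(url).hostname  # simplification; see note above
        if site not in self._by_site:
            self._by_site[site] = self._next_pid
            self._next_pid += 1
        return self._by_site[site]

pool = ProcessPool()
# Different sites get different address spaces...
assert pool.process_for("https://bank.example/account") != \
       pool.process_for("https://ads.example/tracker")
# ...while the same site shares one.
assert pool.process_for("https://bank.example/help") == \
       pool.process_for("https://bank.example/account")
```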
I have a new essay at Dark Reading, “‘EFAIL’ Is Why We Can’t Have Golden Keys.” It starts:
There’s a newly announced set of issues labeled the “EFAIL encryption flaw” that reduces the security of PGP and S/MIME emails. Some of the issues are about HTML email parsing, others are about the use of CBC encryption. All show how hard it is to engineer secure systems, especially when those systems are composed of many components that had disparate design goals.
Then he explained that the name was important for inspiring the necessary fear. You see, no one would surrender to the Dread Pirate Westley.
The DREAD approach was created early in the security pushes at Microsoft as a way to prioritize issues. It’s not a very good way; you see, no one would surrender to the Bug Bar Pirate, Roberts. And so the approach keeps going, despite its many problems.
There are many properties one might want in a bug ranking system for internally found bugs. They include:
- A cool name
- A useful mnemonic
- A reduction in argument about bugs
- Consistency between raters
- Alignment with intuition
- Immutability of ratings: the bug is rated once, and then is unlikely to change
- Alignment with post-launch/integration/ship rules
DREAD certainly meets the first of these, and perhaps the next two. And it was an early attempt at a multi-factor rating of bugs. But DREAD brings many problems that newer approaches deal with.
The most problematic aspect of DREAD is that there’s little consistency, especially in the middle. What counts as damage of 6 versus 7, or exploitability of 6 versus 7? Without calibration, different raters will not be consistent. Each of the scores can be mis-estimated, and there’s a tendency to underestimate things like the discoverability of bugs in your own product.
The second problem is that you set an arbitrary bar for fixes, for example, everything above a 6.5 gets fixed. That means the distinction between a 6 and a 7 sometimes matters a lot. And the score does not relate to what needs to get fixed when an issue is found externally.
This illustrates why Discoverability is an odd thing to bring into the risk equation. You may have a discoverability of “1” on Monday, and 10 on Tuesday. (“Thanks, Full-Disclosure!”) So something could have a 5.5 DREAD score because of low discoverability but require a critical update. Suddenly the DREAD score of the issue is mutable. So it’s hard to use DREAD on an externally discovered bug, or one delivered via a bug bounty. So now you have two bug-ranking systems, and what do you do when they disagree? This happened to Microsoft repeatedly, and led to the switch to a bug bar approach.
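To see the mutability problem in miniature, here’s the DREAD arithmetic in its common form (five 1–10 ratings, averaged) in a few lines of Python, using the 6.5 bar from the example above; the specific ratings are made up:

```python
def dread_score(damage, reproducibility, exploitability,
                affected_users, discoverability):
    """Classic DREAD: the mean of five 1-10 ratings."""
    return (damage + reproducibility + exploitability +
            affected_users + discoverability) / 5

FIX_BAR = 6.5  # the arbitrary bar from the example above

# Monday: the bug is obscure, so discoverability gets a 1.
monday = dread_score(8, 7, 7, 9, 1)    # 6.4: below the bar, won't fix
# Tuesday: a full-disclosure post lands; discoverability is now 10.
tuesday = dread_score(8, 7, 7, 9, 10)  # 8.2: critical update time

print(monday > FIX_BAR, tuesday > FIX_BAR)  # False True
```

Nothing about the bug changed overnight; only the world around it did, and the score swung from “won’t fix” to “critical.”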
Affected users is also odd: does an RCE in Flight Simulator matter less than one in Word? Perhaps in the grand scheme of things, but I hope the Flight Simulator team is fixing their RCEs.
Stepping beyond the problems internal to DREAD to DREAD within a software organization: it only measures half of what you need to measure. You need to measure both the security severity and the fix cost. Otherwise, you run the risk of finding something with a DREAD of 10, but it’s a key feature (Office Macros), and so it escalates, and you don’t fix it. There are other issues which are easy to fix (S3 bucket permissions), and so it doesn’t matter if you thought discoverability was low. This is shared by other systems, but the focus on a crisp line in DREAD (everything above a 6.5 gets fixed) exacerbates the issue.
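To illustrate measuring both halves, here’s a sketch that ranks a backlog by severity per unit of fix cost. The bugs and ratings are made up, but notice how the cheap, serious fix surfaces first while the severe-but-expensive key feature sinks:

```python
# Hypothetical backlog; severity and fix_cost are illustrative 1-10 ratings.
bugs = [
    {"name": "macro code execution (key feature)", "severity": 10, "fix_cost": 9},
    {"name": "open S3 bucket permissions",         "severity": 7,  "fix_cost": 1},
    {"name": "verbose error page",                 "severity": 3,  "fix_cost": 2},
]

# Rank by severity addressed per unit of fix cost, rather than by a
# severity score and a crisp bar alone.
for bug in sorted(bugs, key=lambda b: b["severity"] / b["fix_cost"], reverse=True):
    print(f'{bug["name"]}: {bug["severity"] / bug["fix_cost"]:.1f}')
```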
For all these reasons, with regards to DREAD? Fully skeptical, and I have been for over a decade. If you want to fix these things, the thing to do is not create confusion by saying “DREAD can also be a 1-3 system!”, but to version and revise DREAD, for example, by releasing DREAD 2. I’m exploring a similar approach with DFDs.
I’m hopeful that this post can serve as a collection of reasons to not use DREAD v1, or properties that a new system should have. What’d I miss?
There’s a recent post on the Google Cloud Platform Blog, “Exploring container security: Isolation at different layers of the Kubernetes stack,” that’s the subject of our next Threat Model Thursday post. As always, our goal is to look and see what we can learn, not to say ‘this is bad.’ There’s more than one way to do it. Also, last time I did a who/what/why/how analysis, which turned out to be pretty time-consuming, so I’m going to avoid that going forward.
The first thing to point out is that there’s a system model that’s intended to support multiple analyses of ‘what can go wrong.’ (“Sample scenario…Time to do a little threat modeling.”) This is a very cool demonstration of how to communicate about security along a supply chain. In this instance, the answers to “what are we working on” vary with who “we” are. That might be the Kubernetes team, or it might be someone using Kubernetes to implement a multi-tenant SaaS workload.
What are we working on?
The answers to this are either Kubernetes or the multi-tenant system. The post includes a nice diagram (reproduced above) of Kubernetes and its boundaries. Speaking of boundaries, they break out security boundaries, which enforce the trust boundaries. I’ve also heard ‘security boundaries’ referred to as ‘security barriers’ or ‘boundary enforcement.’ They also say “At Google, we aim to protect all trust boundaries with at least two different security boundaries that each need to fail in order to cross a trust boundary.”
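That “at least two” rule is easy to turn into a self-check. Here’s a toy Python sketch; the boundary names below are illustrative, not a complete model of Kubernetes:

```python
# Map each trust boundary to the security boundaries that enforce it.
# The entries are illustrative; a real model would come from your diagram.
enforcement = {
    "container <-> container": ["cgroups/namespaces", "seccomp-bpf"],
    "pod <-> pod":             ["network policy", "node kernel"],
    "tenant <-> tenant":       ["cluster RBAC"],  # only one boundary!
}

# Google's stated rule: each trust boundary should require at least
# two independent security boundaries to fail before it's crossed.
for trust_boundary, boundaries in enforcement.items():
    if len(boundaries) < 2:
        print(f"weak: {trust_boundary} relies on a single boundary "
              f"({boundaries[0]})")
```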
But you can use this diagram to help you either improve Kubernetes, or to improve the security of systems hosted in Kubernetes.
What can go wrong?
Isolation failures. You get a resource isolation failure…you get a network isolation failure, everyone gets an isolation failure! Well, no, not really. You only get an isolation fail if your security boundaries fail. (Sorry! Sorry?)
This use of isolation is interestingly different from STRIDE or ATT&CK. In many threat models of userland code on a desktop, the answer is ‘and then they can run code, and do all sorts of things.’ The isolation failure was the end of a chain, rather than the start, and you focus on the spoof, tamper or EoP/RCE that gets you onto the chain. In that sense, isolation may seem frustratingly vague for many readers. But isolation is a useful property to have, and more importantly, it’s what we’re asking Kubernetes to provide.
There’s also mention of cryptomining (and cgroups as a fix) and running untrusted code (use sandboxing). Especially with regards to untrusted code, I’d like to see more discussion of how to run untrusted code, which may or may not be inside a web trust boundary, either semi-safely, which is to say attempting to control its output, or safely, which is to say in a separate web namespace.
What are we going to do about it?
You can use this model to decide what you’re going to do about it. How far up or down the list of isolations should you be? Does your app need its own container, pod, node, cluster?
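One way to make those questions concrete is a small decision helper. The categories and their ordering here are my assumptions, not guidance from the post:

```python
def isolation_for(workload):
    """Map workload properties to a (hypothetical) isolation level,
    working down the list from strongest to weakest."""
    if workload["runs_untrusted_code"]:
        return "own cluster (or sandboxed nodes)"
    if workload["multi_tenant"]:
        return "own node per tenant"
    if workload["handles_sensitive_data"]:
        return "own pod"
    return "own container"

saas = {"runs_untrusted_code": False, "multi_tenant": True,
        "handles_sensitive_data": True}
print(isolation_for(saas))  # own node per tenant
```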
I would like to see more precision in the wording of the controls — what does ‘some’ control-plane isolation mean? Is it a level of effort to overcome, a set of things you can rely on and some you can’t? The crisp expression of these qualities isn’t easy, but the authors are in a better position to express them than their readers. (There may be more on this elsewhere in the series.)
Did we do a good job?
There’s no explicit discussion, but my guess is that the post was vetted by a great many people.
To sum up, this is a great example of using threat modeling to communicate between supplier and customer. By drawing the model and using it to threat model, they help people decide if GCP is right for them, and if so, how to configure it in the most secure way.
What do you see in the models?