Vulnerabilities Equities Process and Threat Modeling

[Update: More at DarkReading, “The Critical Difference Between Vulnerabilities Equities & Threat Equities.”]

The Vulnerabilities Equities Process (VEP) is how the US Government decides whether it will disclose a vulnerability to the manufacturer for fixing. The process has come under a great deal of criticism, because it’s never been clear what’s being disclosed, what fraction of vulnerabilities are disclosed, whether the process is working, or how anyone without a clearance is supposed to evaluate it beyond “we’re from the government, we’re here to help,” or perhaps “I know people who managed this process, they’re good folks.” Neither of those is satisfactory.

So it’s a very positive step that on Wednesday, White House Cybersecurity Coordinator Rob Joyce published “Improving and Making the Vulnerability Equities Process Transparent is the Right Thing to Do,” along with the process. Schneier says “I am less [pleased]; it looks to me like the same old policy with some new transparency measures — which I’m not sure I trust. The devil is in the details, and we don’t know the details — and it has giant loopholes.”

I have two overall questions, and an observation.

The first question is, was the published policy written when we had commitments to international leadership and being a fair dealer, or was it created or revised with an “America First” agenda?

The second question relates to there being four equities to be considered. These are the “major factors” that senior government officials are supposed to consider in exercising their judgement. But, surprisingly, there’s an “additional” consideration. (“At a high level we consider four major groups of equities: defensive equities; intelligence / law enforcement / operational equities; commercial equities; and international partnership equities. Additionally, ordinary people want to know the systems they use are resilient, safe, and sound.”) Does that imply that those officials are not required to weigh public desire for resilient and safe systems? What does it mean that the “additionally” sentence is not an equity being considered?

Lastly, the observation is that the VEP is all about vulnerabilities, not about flaws or design tradeoffs. From the charter, pages 9-10:

The following will not be considered to be part of the vulnerability evaluation process:

  • Misconfiguration or poor configuration of a device that sacrifices security in lieu of availability, ease of use or operational resiliency.
  • Misuse of available device features that enables non-standard operation.
  • Misuse of engineering and configuration tools, techniques and scripts that increase/decrease functionality of the device for possible nefarious operations.
  • Stating/discovering that a device/system has no inherent security features by design.

Threat Modeling is the umbrella term for the security engineering that discovers and deals with these issues. It’s what I spend my days on, because I see that the tremendous effort in dealing with vulnerabilities is paying off: we see fewer of them in well-engineered systems.

In October, I wrote about the fact we’re getting better at dealing with vulnerabilities, and need to think about design issues. I closed:

In summary, we’re doing a great job at finding and squishing bugs, and that’s opening up new and exciting opportunities to think more deeply about design issues. (Emergent Design Issues)

Here, I’m going to disagree with Bruce, because I think that this disclosure shows us an important detail that we didn’t previously know. Publication exposes it, and lets us talk about it.

So, I’m going to double-down on what I wrote in October, and say that we need the VEP to expand to cover those issues. I’m not going to claim that will be easy, that the current approach will translate, or that they should have waited to handle those before publishing. One obvious place it gets harder is the sources and methods tradeoff. But we need the internet to be a resilient and trustworthy infrastructure. As Bill Gates wrote 15 years ago, we need systems that people “will always be able to rely on, [] to be available and to secure their information. Trustworthy Computing is computing that is as available, reliable and secure as electricity, water services and telephony.”

We cannot achieve that goal with the VEP being narrowly scoped. It must evolve to deal with the sorts of flaws and design tradeoffs that threat modeling helps us find.

Photo by David Clode on Unsplash.

Data Flow Diagrams 3.0

In the Brakesec podcast, I used a new analogy for why we need to name our work. When we talk about cooking, we have very specific recipes that we talk about: Julia Child’s beef bourguignon. Paul Prudhomme’s blackened fish. We hope that new cooks will follow the recipes until they get a feel for them, and that they can then start adapting and modifying them, as they generate mental models of what they’re doing.

But when we talk about threat modeling, we don’t label our recipes. We say “this is how to threat model,” as if that’s not as broad as “this is how to cook.”

And in that podcast, I realized that I’ve been guilty of definition drift in how I talk about data flow diagrams. Data flow diagrams (DFDs) are also called ‘threat model diagrams’ because they’re so closely associated with threat modeling. And as I’ve used them over the course of a decade, there have been many questions:

  • Do you start with a context diagram?
  • What’s a multi-process, and when should I use one?
  • Do I really need to draw single-headed arrows? They make my diagram hard to read!
  • Is this process inside this arc? Is an arc the best way to show a trust boundary?
  • Should I color things?

Those questions have led me to initiate changes, such as showing a process as a rounded rectangle (versus a circle), eliminating rules such as “all arrows are uni-directional,” and advocating for trust boundaries as labeled boxes.

What I have not done is been crisp about what these changes are in a way that lets a team say “we use v3 DFDs” the way they might say “we use Python 3.” (ok, no one says either, I know!)

I’m going to retroactively label all of these changes as DFD3.0. DFD v1 was a 1970s construct. DFD2 was the critical addition of trust boundaries. And a version 3 DFD is defined as follows:

  1. It uses 5 symbols. A rectangle represents an external entity, a person or code outside your control. A rounded rectangle represents a process. They’re connected by arrows, which can be single- or double-headed. Data stores are represented by parallel lines. A trust boundary is a closed shape, usually a box. All lines are solid, except those used for trust boundaries, which are dashed or dotted. (There is no “multi-process” symbol in DFD3.)
  2. It must not* depend on the use of color, but can use color for additional information.
  3. All elements should have a label.
  4. You may have a context diagram if the system is complex. One is not required.

* Must, must not, should, and should not are used per IETF norms (RFC 2119).
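
To make the definition concrete, here’s a minimal sketch of what a DFD3 checker could look like, assuming a hypothetical in-memory representation (the class and field names are my invention, not part of DFD3):

    # A hypothetical encoding of the DFD3 rules as data plus checks.
    from dataclasses import dataclass

    # Rule 1: exactly five symbols; there is no multi-process.
    ELEMENT_KINDS = {"external_entity", "process", "data_flow",
                     "data_store", "trust_boundary"}

    @dataclass
    class Element:
        kind: str        # must be one of ELEMENT_KINDS
        label: str = ""  # rule 3: all elements should have a label
        color: str = ""  # rule 2: color may only add information

    def check_dfd3(elements):
        """Return warnings for rule violations (SHOULD-level, per RFC 2119)."""
        warnings = []
        for e in elements:
            if e.kind not in ELEMENT_KINDS:
                warnings.append(f"unknown symbol {e.kind!r}: not one of the 5")
            if not e.label:
                warnings.append(f"unlabeled {e.kind}: all elements should have a label")
        return warnings

    diagram = [
        Element("external_entity", "Browser"),
        Element("process", "Web front end"),
        Element("data_flow", "HTTPS request"),
        Element("data_store", "Orders DB"),
        Element("trust_boundary"),  # missing label, so it gets flagged
    ]
    for warning in check_dfd3(diagram):
        print(warning)

The point is less the code than that a crisp, versioned definition is mechanically checkable, which unlabeled “this is how to diagram” advice is not.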

This also allows us to talk about what might be in a DFD3.1. I know that I usually draw disks with the “drum” symbol, and I see a lot of people using that. It seems like a reasonable addition.


Using specific naming also allows us to fork. If you want to define a different type of DFD, have at it. If we end up with a bunch, we can figure out how to keep things clear. Oh, and speaking of forking, I put this on GitHub: DFD3.

Using specific naming allows us to talk about testing and maturity in the sense of “this is in alpha test.” “This has been used for several years, we took feedback, adjusted, and now it’s release quality.” I think that DFD3 is release quality, but it probably needs some beta testing for the definitions.

Similarly, DREAD has a bunch of problems, including a lack of definition. I use mention of DREAD as a way to see if people are threat modeling well. And one challenge there is that people silently redefine DREAD to mean something other than what it meant when Michael Howard and David LeBlanc talked about it in Writing Secure Code (2nd ed, 2003). If you want to build something new, your customers and users need to understand that it’s new, so they don’t get confused by it. Therefore, you need to give your new thing a new name. You could call it DREAD2, a DRE4D, DRECK, I don’t really care. What I care about is that it’s easily distinguished, and the first step towards that is a new name.

[Update: What’s most important is not the choices that I’ve made for what’s in DFD3, but the grouping of those choices into DFD3, so that you can make your own choices and our tools can compete in the market.]

Emergent Design Issues

It seems like these days, we want to talk about everything in security as if it’s a vulnerability. For example:

German researchers have discovered security flaws that could let hackers, spies and criminals listen to private phone calls and intercept text messages on a potentially massive scale – even when cellular networks are using the most advanced encryption now available.

Experts say it’s increasingly clear that SS7, first designed in the 1980s, is riddled with serious vulnerabilities that undermine the privacy of the world’s billions of cellular customers. The flaws discovered by the German researchers are actually functions built into SS7 for other purposes – such as keeping calls connected as users speed down highways, switching from cell tower to cell tower – that hackers can repurpose for surveillance because of the lax security on the network. (“German researchers discover a flaw that could let anyone listen to your cell calls.” Washington Post, 2014).

But these are not vulnerabilities, because we can have endless debate about whether they should be fixed. (Chrome exposing passwords is another example.) If they’re not vulnerabilities, what are they? Perhaps they’re flaws? One definition of flaws reads:

“Flaws are often much more subtle than simply an off-by-one error in an array reference or use of an incorrect system call,” the report notes. “A flaw might be instantiated in software code, but it is the result of a mistake or oversight at the design level.”


An example of such a flaw noted in the report is the failure to separate data and control instructions and the co-mingling of them in a string – a situation that can lead to injection vulnerabilities. (IEEE Report Reveals Top 10 Software Security Design Flaws)

By this definition, the SS7 issues are probably not flaws, in the sense that the system behavior is not unanticipated. But we don’t know. We don’t know what properties we should expect SS7 to have. For most software, the design requirements, the threat model, are not clear or explicit. Even when they’re explicit, they’re often not public. (Larry Loeb makes the same point here.)

For example, someone decided to write code to run a program on mouse-over in PowerPoint; that code was tested, dialog text was written and internationalized, and so on. Someone documented it, and it’s worth pointing out that the documentation doesn’t apply to PowerPoint 2016. Was there a debate over the security of that feature when it shipped? I don’t know. When it was removed? Probably.

There’s a set of these, and I’m going to focus on how they manifest in Windows for reasons that I’ll get to. Examples include:

The reason I’m looking at these is because design questions like these emerge when a system is successful. Whatever else you want to say about it, Windows was successful and very widely deployed. As a system becomes more successful, the easily exploitable bugs are fixed, and the hard to fix design tradeoffs become relatively more important. As I wrote in “The Evolution of Secure Things:”

It’s about the constant imperfection of products, and how engineering is a response to perceived imperfections. It’s about the chaotic real world from which progress emerges. In a sense, products are never perfected, but express tradeoffs between many pressures, like manufacturing techniques, available materials, and fashion in both superficial and deep ways.

That chaotic real world exposes a set of issues that may or may not have been visible during product design. In threat modeling, identification of issues is the most crucial step. If you fail to identify issues, you will not manage those issues well. Another way to say that is: identifying issues is a necessary but not sufficient step.

The design choices listed above almost all predate threat modeling as a structured practice at Microsoft. But there have been other choices, like Windows Wi-Fi Sense or the new telemetry in Windows 10. We can disagree with those design choices, but it’s clear that there were internal discussions of the right business tradeoffs. So we go back to the definition of a flaw, “a mistake or oversight at the design level.” These were not oversights. Were they design mistakes? That’s harder. The designers knew exactly what they were designing, and the software worked as planned. It was not received as planned, and it is certainly being used in unexpected ways.

There are interesting issues of composition, especially in backup authentication. That problem is being exploited in cryptocurrency thefts:

Mr. Perklin and other people who have investigated recent hacks said the assailants generally succeeded by delivering sob stories about an emergency that required the phone number to be moved to a new device — and by trying multiple times until a gullible agent was found.

“These guys will sit and call 600 times before they get through and get an agent on the line that’s an idiot,” Mr. Weeks said.

Coinbase, one of the most widely used Bitcoin wallets, has encouraged customers to disconnect their mobile phones from their Coinbase accounts.

One can imagine a lot of defenses, but “encouraging” customers not to use a feature may not be enough. As online wallet companies grow, they need to threat model better, and perhaps that entails turning off the feature. (I don’t know their businesses well enough to simply assert an answer.)

In summary, we’re doing a great job at finding and squishing bugs, and that’s opening up new and exciting opportunities to think more deeply about design issues.

PowerPoint Screen capture via Casey Smith.

Threat Modeling “App Democracy”

“Direct Republican Democracy?” is a fascinating post at PrawfsBlawg, a collective of law professors. In it, Michael T. Morley describes a candidate for Boulder City Council with a plan to vote “the way voters tell him,” and discusses how that might not really be representative of what people want, and how it differs from (small-r) republican government. Worth a few moments of your time.

Threat Modeling and Architecture

“Threat Modeling and Architecture” is the latest in a series at Infosec Insider.

After I wrote my last article on Rolling out a Threat Modeling Program, Shawn Chowdhury asked (on LinkedIn) for more information on involving threat modeling in the architecture process. It’s a great question, except it involves the words “threat,” “modeling,” and “architecture.” And each of those words, by itself, is enough to get some people twisted around an axle.

Continue reading “Threat Modeling and Architecture”

Open for Business

Recently, I was talking to a friend who wasn’t aware that I’m consulting, and so I wanted to share a bit about my new life as a consultant!

I’m consulting for companies of all sizes and in many sectors. The services I’m providing include threat modeling training, engineering and strategy work, often around risk analysis or product management.

Some of the projects I’ve completed recently include:

  • Threat modeling training – Engineers learn how to threat model, and how to make threat modeling part of their delivery. Classes range from 1 to 5 days, and are customized to your needs.
  • Process re-engineering for a bank – Rebuilt their approach to a class of risks, increasing security, consistency, and productivity across the org.
  • Feature analysis for a security company – Identified market needs, determined which features fit those needs, and created a compelling and grounded story to bring the team together.

If you have needs like these, or other issues where you think my skills and experience could help, I’d love to hear from you. And if you know someone who might, I’m happy to talk to them.

I have a to-the-point website at associates.shostack.org and some details of my threat modeling services are at associates.shostack.org/threatmodeling.

Star Wars, Star Trek and Getting Root on a Star Ship

It’s time for some Friday Star Wars blogging!

Reverend Robert Ballecer, SJ tweeted: “as a child I learned a few switches & 4 numbers gives you remote code ex on a 23rd century starship.” I responded, asking “When attackers are on the bridge and can flip switches, how long a password do you think is appropriate?”

It went from there, but I’d like to take this opportunity to propose a partial threat model for 23rd century starships.

First, a few assumptions:

  • Sometimes, officers and crewmembers of starships die, are taken prisoner, or are otherwise unable to complete their duties.
  • It is important that the crew can control the spaceship, including software and computer hardware.
  • Unrestricted physical access to the bridge means you control the ship (with possible special cases, and of course the Holodeck, because, lord forgive me, they need to shoot a show every week. Scalzi managed to get a surprisingly large amount from this line of inquiry in Redshirts. But I digress.)

I’ll also go so far as to say that as a derivative of the assumptions, the crew may need a rapid way to assign “Captain” privileges to someone else, and starship designers should be careful to design for that use case.

So the competing threats here are denial of service (and possibly denial of future service) and elevation of privilege. There’s a tension between designing for availability (anyone on the bridge can assume command relatively easily) and proper authorization. My take was that the attackers on the bridge are already close to winning, and so defenses which impede replacing command authority are a mistake.

Now, in responding, I thought that “flipping switches” meant physically being there, because I don’t recall the episode he’s discussing. But in further conversation, what became clear is that the switches can be flipped remotely, which dramatically alters the need for a defense.

It’s not clear what non-dramatic requirement such remote switch flipping serves, and so on balance, it’s easy to declare that the added risk is high and we should not have remote switch flipping. It is always easy to declare that the risk is high, but here I have the advantage that there’s no real product designer in the room arguing for the feature. If there were, we would clarify the requirement, and then probably engineer some appropriate defenses, such as exponential backoff for remote connections. Of course, in a future with layers of virtualization, what counts as a remote connection may be tricky to determine in software.
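
As a sketch of that defense, here’s what exponential backoff on remote command attempts might look like; the function and the numbers are hypothetical, not drawn from any canon:

    def remote_attempt_delay(failed_attempts: int,
                             base_seconds: float = 1.0,
                             cap_seconds: float = 3600.0) -> float:
        """Seconds to wait before accepting the next remote attempt.

        Each failure doubles the wait, capped at an hour, so remotely
        brute-forcing 'a few switches & 4 numbers' slows to a crawl,
        while an officer physically on the bridge is never delayed.
        """
        return min(cap_seconds, base_seconds * (2 ** failed_attempts))

    # After ten failed remote attempts, the next one waits ~17 minutes.
    print(remote_attempt_delay(10))  # 1024.0

Note the asymmetry: the defense only burdens the remote path, which is exactly the path whose requirement is unclear.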

Which brings me to another tweet, by Hongyi Hu, who said he was “disappointed that they still use passwords for authentication in the 23rd century. I hope the long tail isn’t that long! 😛” What can I say but, “we’ll always have passwords.” We’ll just use them for less.

As I’ve discussed, the reason I use Star Wars over Star Trek in my teaching and examples is that no one is confused about the story in the core movies. Here, I made precisely this mistake.

Image: The Spaceship Discovery, rendered by Trekkie5000. Alert readers will recall issues that could have been discovered with better threat modeling.

Organizing threat modeling magic

I was inspired to develop and share my thoughts after Adam’s previous post (magical approaches to threat modeling) regarding the selection of threats and predictions. Since the 140-character limit quickly annoys me, Adam gave me an opportunity to contribute on his blog; thanks to him, I can now explain how I believe in magic during threat modeling.

I have noticed that most of what I do, because it is timeboxed by carbon-based-lifeform constraints, needs to be a finite selection from what appears to me as an infinite array of possibilities. I also enjoy pulling computer-related magic tricks, or guesses, because it’s amusing and more engaging than reading a checklist. Magic, in this case, is either pure luck or based on some skill the spectators can’t see. I like to think I have both.

During the selection phase of what to do, a few tradeoffs have been proposed, such as coverage, time, and skills required. Those are attack-based, and come from knowledge of what an attacker can do. While I think those effectively describe the selection of granular technical efforts, I prefer to look at the attacker’s motivations rather than the constraints they’ll face. And for all that, I have a way of organizing it and showing it.

Attack Tree

When I think about the actual threats to a system, I don’t see a list, but rather a tree. That tree has the ultimate goals on top, and then descends into sub-goals that break down how you get there. It finally ends in leaves, which are the vulnerabilities to be exploited.

Here’s an unfinished example for an unnamed application:

A fun thing to do with a tree is to apply a weight to each branch. In this case the number represents attacker-made tradeoffs, and is totally arbitrary.

If you keep it relatively consistent with itself, you end up with an appropriate weighting system. For this example, let’s say it’s the amount of effort you estimate it takes. You can sum the branches in the tree and get sub-goal weights without having to think about them.

And from that we can get a sum for the root goals:
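
A minimal sketch of that bookkeeping in Python; the goals, sub-goals, and weights here are invented for illustration:

    # Leaves carry arbitrary effort weights; goals sum their branches.
    tree = {
        "steal customer data": {             # root goal
            "compromise the application": {  # sub-goal
                "find a SQLi": 3,            # leaves: vulnerabilities
                "find an auth bypass": 5,
            },
            "compromise an operator": {
                "phish an admin": 4,
            },
        },
    }

    def weight(node):
        """A leaf's weight is its number; a goal's is the sum of its children."""
        if isinstance(node, dict):
            return sum(weight(child) for child in node.values())
        return node

    for goal, subtree in tree.items():
        print(goal, "=", weight(subtree))  # steal customer data = 12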

But then how do I choose to prioritize or just work on something?

I could just say: well, I’m going to do the easiest things first. Maybe finding a SQLi in the application is easier than finding a slow API request, so better to start looking at that.

But when it comes to that decision, I often opt for the most common human behavior: just not making it myself.

With the help of the tree, I just let the actual business reality do the selection of which root goals to pick. By that I mean the literal definition of reality, although nowadays people seem to forget what it really means:

“reality · noun · 1. the world or the state of things as they actually exist, as opposed to an idealistic or notional idea of them.”
– Google Almighty

I never ask the business line if they think they’ll have SQLi, but rather whether they worry more about denial of service or information stealing.

One advantage of that is that those decisions are made at the root goals. The tree is a hierarchy: the higher the level you act at, the bigger the impact you’ll have, like spinning a big cog wheel versus a smaller one.

If you were to pick at each vulnerability one at a time, you’d spin your working wheel a lot while only advancing the root goal a little. Make the selection at the root goals instead, and you’ll see that its impact is far greater for about the same amount of time. That’s efficiency to me.

And that’s how I turn magic into engineering 😀

Of course, in order for it to be proper engineering, the next step would be to QA it. At that point, you can fetch all the checklists and threat repositories you can find, and verify that you covered everything in your tree. Simply add what you missed, and then bask in the glory of perceived completeness.
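
As a sketch of that QA pass (the checklist and the tree’s leaves below are invented):

    # Compare the tree's leaves against an external checklist; anything
    # in the checklist but missing from the tree is a candidate to add.
    checklist = {"SQLi", "XSS", "CSRF", "denial of service"}
    tree_leaves = {"SQLi", "denial of service", "auth bypass"}

    print("missing from tree:", checklist - tree_leaves)  # {'XSS', 'CSRF'}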


For curious practitioners, I used PlantUML to generate the tree examples above. The tool lets you textually define the tree using simple markup, and auto-balances it as you update it. A more detailed example can be found in my Threat Modeling Toolkit presentation.