Threat Model Thursday: Architectural Review and Threat Modeling

For Threat Model Thursday, I want to use current events here in Seattle as a prism through which we can look at technology architecture review. If you want to take this as an excuse to civilly discuss the political side of this, please feel free.

Seattle has a housing and homelessness crisis. The cost of a house has risen nearly 25% above the 2007 market peak, and has roughly doubled in the 6 years since April 2012. Fundamentally, demand has outstripped supply and continues to do so. As a city, we need more supply, and that means evaluating the value of things that constrain supply. This commentary from the local Libertarian party lists some of them.

The rules on what permits are needed to build a residence, what housing is acceptable, or how many unrelated people can live together (no more than eight) are expressions of values and priorities. We prefer that the developers of housing not build housing rather than build housing that doesn’t comply with the city’s Office of Planning and Community Development 32 pages of neighborhood design guidelines. We prefer to bring developers back after a building is built if the siding is not the agreed color. This is a choice that expresses the values of the city. And because I’m not a housing policy expert, I can miss some of the nuances and see the effect of the policies overall.

Let’s transition from the housing crisis here in Seattle to the architecture crisis that we face in technology.

No, actually, I’m not quite there. The city killed micro-apartments, only to replace them with … artisanal micro-houses. Note the variation in size and shape of the two houses in the foreground. Now, I know very little about construction, but I’m reasonably confident that if you read the previous piece on micro-housing, many of the concerns regulators were trying to address apply to “True Hope Village,” construction pictured above. I want you, dear reader, to read the questions about how we deliver housing in Seattle, and treat them as a mirror into how your organization delivers software. Really, please, go read “How Seattle Killed Micro-Housing” and the “Neighborhood Design Guidelines” carefully. Not because you plan to build a house, but as a mirror of your own security design guidelines.

They may be no prettier.

In some companies, security is valued, but has no authority to force decisions. In others, there are mandatory policies and review boards. We in security have fought for these mandatory policies because without them, products ignored security. And similarly, we have housing rules because of unsafe, unsanitary or overcrowded housing. To reduce the blight of slums.

Security has design review boards which want to talk about the color of the siding a developer installed on the now live product. We have design regulation which kills apodments and tenement housing, and then glorifies tiny houses. From a distance, these rules make no sense. I didn’t find it sensible, myself. I remember a meeting with the Microsoft Crypto board. I went in with some very specific questions regarding parameters and algorithms. Should we use this hash algorithm or that one? The meeting took not five whole minutes to go off the rails with suggestions about non-cryptographic architecture. I remember shipping the SDL Threat Modeling Tool, going through the roughly five policy tracking tools we had at the time, discovering at the very last minute that we had extra rules that were not documented in the documents that I found at the start. It drives a product manager nuts!

Worse, rules expand. From the executive suite, if a group isn’t growing, maybe it can shrink? From a security perspective, the rapidly changing threat landscape justifies new rules. So there’s motivation to ship new guidelines that, in passing, spend a page explaining all the changes that are taking place. And then I see “Incorporate or acknowledge the best features of existing early to mid-century buildings in new development.” What does that mean? What are the best features of those buildings? How do I acknowledge them? I just want to ship my peer to peer blockchain features! And nothing in the design review guidelines is clearly objectionable. But taken as a whole, they create a complex and unpredictable, and thus expensive path to delivery.

We express values explicitly and implicitly. In Seattle, implicit expression of values has hobbled the market’s ability to address a basic human need. One of the reasons that embedding is effective is that the embedded gatekeepers can advise, interpret in relation to real questions. Embedding expresses the value of collaboration, of dialogue over review. Does your security team express that security is more important than product delivery? Perhaps it is. When Microsoft stood down product shipping for security pushes, it was an explicit statement. Making your values explicit and debating prioritization is important.

What side effects do your security rules have? What rule is most expensive to comply with? What initiatives have you killed, accidentally or intentionally?

The DREAD Pirates

Then he explained the name was important for inspiring the necessary fear. You see, no one would surrender to the Dread Pirate Westley.

The DREAD approach was created early in the security pushes at Microsoft as a way to prioritize issues. It’s not a very good way, you see no one would surrender to the Bug Bar Pirate, Roberts. And so the approach keeps going, despite its many problems.

There are many properties one might want in a bug ranking system for internally found bugs. They include:

  • A cool name
  • A useful mnemonic
  • A reduction in argument about bugs
  • Consistency between raters
  • Alignment with intuition
  • Immutability of ratings: the bug is rated once, and then is unlikely to change
  • Alignment with post-launch/integration/ship rules

DREAD certainly meets the first of these, and perhaps the second two. And it was an early attempt at a multi-factor rating of bugs. But there are many problems which DREAD brings that newer approaches deal with.

The most problematic aspect of DREAD is that there’s little consistency, especially in the middle. What counts as a 6 damage versus 7, or 6 versus 7 exploitability? Without calibration, different raters will not be consistent. Each of the scores can be mis-estimated, and there’s a tendency to underestimate things like discoverability of bugs in your own product.

The second problem is that you set an arbitrary bar for fixes, for example, everything above a 6.5 gets fixed. That makes the distinction between a 6 and a 7 sometimes matters a lot. The score does not relate to what needs to get fixed when found externally.

This illustrates why Discoverability is an odd things to bring into the risk equation. You may have a discoverability of “1” on Monday, and 10 on Tuesday. (“Thanks, Full-Disclosure!”) So something could have a 5.5 DREAD score because of low discoverability but require a critical update. Suddenly the DREAD score of the issue is mutable. So it’s hard to use DREAD on an externally discovered bug, or one delivered via a bug bounty. So now you have two bug-ranking systems, and what do you do when they disagree? This happened to Microsoft repeatedly, and led to the switch to a bug bar approach.

Affected users is also odd: does an RCE in Flight Simulator matter less than one in Word? Perhaps in the grand scheme of things, but I hope the Flight Simulator team is fixing their RCEs.

Stepping beyond the problems internal to DREAD to DREAD within a software organization, it only measures half of what you need to measure. You need to measure both the security severity and the fix cost. Otherwise, you run the risk of finding something with a DREAD of 10, but it’s a key feature (Office Macros), and so it escalates, and you don’t fix it. There are other issues which are easy to fix (S3 bucket permissions), and so it doesn’t matter if you thought discoverability was low. This is shared by other systems, but the focus on a crisp line in DREAD, everything above a 6.5 gets fixed, exacerbates the issue.

For all these reasons, with regards to DREAD? Fully skeptical, and I have been for over a decade. If you want to fix these things, the thing to do is not create confusion by saying “DREAD can also be a 1-3 system!”, but to version and revise DREAD, for example, by releasing DREAD 2. I’m exploring a similar approach with DFDs.

I’m hopeful that this post can serve as a collection of reasons to not use DREAD v1, or properties that a new system should have. What’d I miss?

Joining the Continuum Team

I’m pleased to share the news that I’ve joined Continuum Security‘s advisory board. I am excited about the vision that Continuum is bringing to software security: “We help you design, build and manage the security of your software solutions.” They’re doing so for both happy customers and a growing community. And I’ve come to love their framing: “Security is not special. Performance, quality and availability is everyone’s responsibility and so is security. After all, who understands the code and environment better than the developers and ops teams themselves?” They’re right. Security has to earn our seat at the table. We have to learn to collaborate better, and that requires software that helps the enterprise manage application security risk from the start of, and throughout, the software development process.

Not Bugs, but Features

“[Mukhande Singh] said “real water” should expire after a few months. His does. “It stays most fresh within one lunar cycle of delivery,” he said. “If it sits around too long, it’ll turn green. People don’t even realize that because all their water’s dead, so they never see it turn green.”
(Unfiltered Fervor: The Rush to Get Off the Water Grid, Nellie Bowles, NYTimes, Dec 29, 2017.)
So those things turning the water green? Apparently, not bugs, but features. In unrelated “not understanding food science” news, don’t buy the Mellow sous vide machine. Features.

The Fights We Have to Fight: Fixing Bugs

One of the recurring lessons from Petroski is how great engineers overcome not only the challenges of physical engineering: calculating loads, determining build orders, but they also overcome the real world challenges to their ideas, including financial and political ones. For example:

Many a wonderful concept, beautifully drawn by an inspired structural artist, has never risen off the paper because its cost could not be justified. Most of the great bridges of the nineteenth century, which served to define bridge building and other technological achievements for the twentieth century, were financed by private enterprise, often led by the expanding railroads. Engineers acting as entrepreneurs frequently put together the prospectuses, and in some cases almost single-handedly promoted their dreams to the realists. […] Debates over how to pay for them were common. (Engineers of Dreams: Great Bridge Builders and the Spanning of America, Henry Petroski)

Many security professionals have a hobby of griping that products get rushed to market, maybe to be secured later. We have learned to be more effective at building security in, and in doing so, reduce product costs and increase on-time delivery. But some products were built before we knew how to do that, and others are going to get built by companies who choose not to do that. And in that sense, Collin Greene’s retrospective, “Fixing Security Bugs” is very much worth your time. It’s a retrospective on the Vista security program from a pen-test perspective.

Hacking: Exciting.
Finding bugs: Exciting.
Fixing those bugs: Not exciting.
The thing is, the finish line for our job in security is getting bugs fixed¹, not just found and filed. Doing this effectively is not a technology problem. It is a communications, organizational² and psychology problem.

I joined Microsoft while the Vista pen test was finishing up, and so my perspective is complimentary. I’d like to add a few additional perspectives to his points.

First, he asks “is prioritization correct?” After Vista, the SDL team created security bug bars, and then later refined them to align with the MSRC update priorities. That alignment with the MSRC priorities was golden. It made it super-clear that if you didn’t fix this before ship, you were going to have to do an update later. As a security engineer, you need to align your prioritization to the all up delivery priorities. Having everything be “extremely critical,” “very critical,” or “moderately critical” means you don’t know what matters, and so nothing does.

Second, “why security matters” was still a fight to be fought in Vista. By Windows 7, security had completed its “move left.” The spec form contained sections for security and privacy. Threat model review was a gate for start of coding. These process changes happened while developers were “rebelling” against Vista’s “overweight” engineering process. They telegraphed that security mattered to management and executives. As a security engineer, you need to get management to spend time talking about how security is balanced with other priorities.

Third, he points out that escalating to a manager can feel bad, but he’s right: “Often the manager has the most context on priorities.” Management saying “get this fixed” is an expression of prioritization. If you’ve succeeded in your work on “why security matters,” then management will know that they need to reinforce that message. Bringing the issues to them, responsibly, helps them get their job done. If it feels bad to escalate, then it’s worth asking if you have full buy in on security.

Now, I’m talking about security as if it matters to management. More and more, that’s the case. Something in the news causes leadership to say “we have to do better,” and they believe that there are things that they can do. In part that belief is because very large companies have been talking about how to make it work. But when that belief isn’t there, it’s your job as an engineer to, as Petroski says, single-handedly promote your dreams to the realists. Again, Greene’s post is full of good ideas.

Lastly, not everything is a bug. I discussed vulnerabilities versus design recently in “Emergent Design Issues.”

(Photo: https://www.pexels.com/photo/black-and-brown-insect-37733/)

Why is “Reply” Not the Strongest Signal?

So apparently my “friends” at outlook.com are marking my email as junk today, with no explanation. They’re doing this to people who have sent me dozens of emails over the course of months or years.

Why does no spam filter seem to take repeated conversational turns into account? Is there a stronger signal that I want to engage with someone than…repeatedly engaging?

20 Year Software: Engineering and Updates

Twenty years ago, Windows 95 was the most common operating system. Yahoo and Altavista were our gateways to the internet. Steve Jobs just returned to Apple. Google didn’t exist yet. America Online had just launched their Instant Messenger. IPv6 was coming soon. That’s part of the state of software in 1997, twenty years ago. We need to figure out what engineering software looks like for a twenty year lifespan, and part of that will be really doing such engineering, because theory will only take us to the limits of our imaginations.

Today, companies are selling devices that will last twenty years in the home, such as refrigerators and speakers, and making them with network connectivity. That network connectivity is both full internet connectivity, that is, Internet Protocol stacks, and also local network connectivity, such as Bluetooth and Zigbee.

We have very little idea how to make software that can survive as long as AOL IM did. (It’s going away in December, if you missed the story.)

Recently, there was a news cycle around Sonos updating their privacy policy. I don’t want to pick on Sonos, because I think what they’re experiencing is just a bit of the future, unevenly distributed. Responses such as this were common: “Hardware maker: Give up your privacy and let us record what you say in your home, or we’ll destroy your property:”

“The customer can choose to acknowledge the policy, or can accept that over time their product may cease to function,” the Sonos spokesperson said, specifically.

Or, as the Consumerist, part of Consumer Reports, puts it in “Sonos Holds Software Updates Hostage If You Don’t Sign New Privacy Agreement:

Sonos hasn’t specified what functionality might no longer work in the future if customers don’t accept the new privacy policy.

There are some real challenges here, both technical and economic. Twenty years ago, we didn’t understand double-free or format string vulnerabilities. Twenty years of software updates aren’t going to be cheap. (I wrote about the economics in “Maintaining & Updating Software.”)

My read is that Sonos tried to write a privacy policy that balanced their anticipated changes, including Alexa support, with a bit of future-proofing. I think that the reason they haven’t specified what might not work is because they really don’t know, because nobody knows.

The image at the top is the sole notification that I’ve gotten that Office 2011 is no longer getting security updates. (Sadly, it’s only shown once per computer, or perhaps once per user of the computer.) Microsoft, like all tech companies, will cut functionality that it can’t support, like <"a href="https://www.macworld.com/article/1154785/business/welcomebackvisualbasic.html">Visual Basic for Mac and also “end of lifes” its products. They do so on a published timeline, but it seems wrong to apply that to a refrigerator, end of lifeing your food supply.

There’s probably a clash coming between what’s allowable and what’s economically feasible. If your contract says you can update your device at any time, it still may be beyond “the corners of the contract” to shut it off entirely. Beyond economically challenging, it may not even be technically feasible to update the system. Perhaps the chip is too small, or its power budget too meager, to always connect over TLS4.2, needed addresses the SILLYLOGO attack.

What we need might include:

  • A Dead Software Foundation, dedicated to maintaining the software which underlies IoT devices for twenty years. This is not only the Linux kernel, but things like tinybox and openssl. Such a foundation could be funded by organizations shipping IoT devices, or even by governments, concerned about the externalities, what Sean Smith called “the Cyber Love Canal” in The Internet of Risky Things. The Love Canal analogy is apt; in theory, the government cleans up after the firms that polluted are gone. (The practice is far more complex.)
  • Model architectures that show how to engineer devices, such as an internet speaker, so that it can effectively be taken offline when the time comes. (There’s work in the mobile app space on making apps work offline, which is related, but carries the expectation that the app will eventually reconnect.)
  • Conceptualization of the legal limits of what you can sign away in the fine print. (This may be everything; between severability and arbitration clauses, the courts have let contract law tilt very far towards the contract authors, but Congress did step in to write the Consumer Review Fairness Act.) The FTC has commented on issues of device longevity, but not (afaik) on limits of contracts.

What else do we need to build software that survives for twenty years?

The Dope Cycle and a Deep Breath

Back in January, I wrote about “The Dope Cycle and the Two Minutes Hate.” In that post, I talked about:

Not kidding: even when you know you’re being manipulated into wanting it, you want it. And you are being manipulated, make no mistake. Site designers are working to make your use of their site as pleasurable as possible, as emotionally engaging as possible. They’re caught up in a Red Queen Race, where they must engage faster and faster just to stay in place. And when you’re in such a race, it helps to steal as much as you can from millions of years of evolution. [Edit: I should add that this is not a moral judgement on the companies or the people, but rather an observation on what they must do to survive.] That’s dopamine, that’s adrenaline, that’s every hormone that’s been covered in Popular Psychology. It’s a dope cycle, and you can read that in every sense of the word dope.

I just discovered a fascinating tool from a company called Dopamine Labs. Dopamine Labs is a company that helps their corporate customers drive engagement: “Apps use advanced software tools that shape and control user behavior. We know because [we sell] it to them.” They’ve released a tool called Space: “Space uses neuroscience and AI to help you kick app addiction. No shame. No sponsors. Just a little breathing room to help you take back control.” As they say: “It’s the same math that we use to get people addicted to apps, just run backwards.”

Space app
There are some fascinating ethical questions involved in selling both windows and bricks. I’m going to say that you participants in a red queen race might as well learn what countermeasures to their techniques are by building them. Space works as a Chrome plugin and as an iOS and Android App. I’ve installed it, and I like it more than I like another tool I’ve been using (Dayboard). I really like Dayboard’s todo list, but feel that it cuts me off in the midst of time wasting, rather than walking me away.)

The app is at http://youjustneedspace.com/.

As we go into big conferences, it might be worth installing. (Also as we head into conferences, be excellent to each other. Know and respect your limits and those of others. Assume good intent. Avoid getting pulled into a “Drama Triangle.”)

Threat Modeling Password Managers

There was a bit of a complex debate last week over 1Password. I think the best article may be Glenn Fleishman’s “AgileBits Isn’t Forcing 1Password Data to Live in the Cloud,” but also worth reading are Ken White’s “Who moved my cheese, 1Password?,” and “Why We Love 1Password Memberships,” by 1Password maker AgileBits. I’ve recommended 1Password in the past, and I’m not sure if I agree with Agilebits that “1Password memberships are… the best way to use 1Password.” This post isn’t intended to attack anyone, but to try to sort out what’s at play.

This is a complex situation, and you’ll be shocked, shocked to discover that I think a bit of threat modeling can help. Here’s my model of

what we’re working on:

Password manager

Let me walk you through this: There’s a password manager, which talks to a website. Those are in different trust boundaries, but for simplicity, I’m not drawing those boundaries. The two boundaries displayed are where the data and the “password manager.exe” live. Of course, this might not be an exe; it might be a .app, it might be Javascript. Regardless, that code lives somewhere, and where it lives is important. Similarly, the passwords are stored somewhere, and there’s a boundary around that.

What can go wrong?

If password storage is local, there is not a fat target at Agilebits. Even assuming they’re stored well (say, 10K iterations of PBKDF2), they’re more vulnerable if they’re stolen, and they’re easier to steal en masse [than] if they’re on your computer. (Someone might argue that you, as a home user, are less likely to detect an intruder than Agilebits. That might be true, but that’s a way to detect; the first question is how likely is an attacker to break in? They’ll succeed against you and they’ll succeed against Agilebits, and they’ll get a boatload more from breaking into Agilebits. This is not intended as a slam of Agilebits, it’s an outgrowth of ‘assume breach.’) I believe Agilebits has a simpler operation than Dropbox, and fewer skilled staff in security operations than Dropbox. The simpler operation probably means there are fewer usecases, plugins, partners, etc, and means Agilebits is more likely to notice some attacks. To me, this nets out as neutral. Fleishman promises to explain “how AgileBits’s approach to zero-knowledge encryption… may be less risky and less exposed in some ways than using Dropbox to sync vaults.” I literally don’t see his argument, perhaps it was lost in the complexity of writing a long article? [Update: see also Jeffrey Goldberg’s comment about how they encrypt the passwords. I think of what they’ve done as a very strong mitigation; with a probably reasonable assumption they haven’t bolluxed their key generation. See this 1Password Security Design white paper.]

To net it out: local storage is more secure. If your computer is compromised, your passwords are compromised with any architecture. If your computer is not compromised, and your passwords are nowhere else, then you’re safe. Not so if your passwords are somewhere else and that somewhere else is compromised.

The next issue is where’s the code? If the password manager executable is stored on your device, then to replace it, the attacker either needs to compromise your device, or to install new code on it. An attacker who can install new code on your computer wins, which is why secure updates matter so much. An attacker who can’t get new code onto your computer must compromise the password store, discussed above. When the code is not on your computer but on a website, then the ease of replacing it goes way up. There’s two modes of attack. Either you can break into one of the web server(s) and replace the .js files with new ones, or you can MITM a connection to the site and tamper with the data in transit. As an added bonus, either of those attacks scales. (I’ll assume that 1Password uses certificate pinning, but did not chase down where their JS is served.)

Netted out, getting code from a website each time you run is a substantial drop in security.

What should we do about it?

So this is where it gets tricky. There are usability advantages to having passwords everywhere. (Typing a 20 character random password from your phone into something else is painful.) In their blog post, Agilebits lists more usability and reliability wins, and those are not to be scoffed at. There are also important business advantages to subscription revenue, and not losing your passwords to a password manager going out of business is important.

Each 1Password user needs to make a decision about what the right tradeoff is for them. This is made complicated by family and team features. Can little Bobby move your retirement account tables to the cloud for you? Can a manager control where you store a team vault?

This decision is complicated by walls of text descriptions. I wish is that Agilebits would do a better job of crisply and cleanly laying out the choice that their customers can make, and the advantages and disadvantages of each. (I suggest a feature chart like this one as a good form, and the data should also be in each app as you set things up.) That’s not to say that Agilebits can’t continue to choose and recommend a default.

Does this help?

After years of working in these forms, I think it’s helpful as a way to break out these issues. I’m curious: does it help you? If not, where could it be better?

Secure updates: A threat model

Software updates

Post-Petya there have been a number of alarming articles on insecure update practices. The essence of these stories is that tax software, mandated by the government of Ukraine, was used to distribute the first Petya, and that this can happen elsewhere. Some of these stories are a little alarmist, with claims that unnamed “other” software has also been used in this way. Sometimes the attack is easy because updates are unsigned, other times its because they’re also sent over a channel with no security.

The right answer to these stories is to fix the damned update software before people get more scared of updating. That fear will survive long after the threat is addressed. So let me tell you, [as a software publisher] how to do secure upadtes, in a nutshell.

The goals of an update system are to:

  1. Know what updates are available
  2. Install authentic updates that haven’t been tampered with
  3. Strongly tie updates to the organization whose software is being updated. (Done right, this can also enable whitelisting software.)

Let me elaborate on those requirements. First, know what updates are available — the threat here is that an attacker stores your message “Version 3.1 is the latest revision, get it here” and sends it to a target after you’ve shipped version 3.2. Second, the attacker may try to replace your update package with a new one, possibly using your keys to sign it. If you’re using TLS for channel security, your TLS keys are only as secure as your web server, which is to say, not very. You want to have a signing key that you protect.

So that’s a basic threat model, which leads to a system like this:

  1. Update messages are signed, dated, and sequenced. The code which parses them carefully verifies the signatures on both messages, checks that the date is later than the previous message and the sequence number is higher. If and only if all are true does it…
  2. Get the software package. I like doing this over torrents. Not only does that save you money and improve availability, but it protects you against the “Oh hello there Mr. Snowden” attack. Of course, sometimes a belief that torrents have the “evil bit” set leads to blockages, and so you need a fallback. [Note this originally called the belief “foolish,” but Francois politely pointed out that that was me being foolish.]
  3. Once you have the software package, you need to check that it’s signed with the same key as before.
    Better to sign the update and the update message with a key you keep offline on a machine that has no internet connectivity.

  4. Since all of the verification can be done by software, and the signing can be done with a checklist, PGP/GPG are a fine choice. It’s standard, which means people can run additional checks outside your software, it’s been analyzed heavily by cryptographers.

What’s above follows the four-question framework for threat modeling: what are we working on? (Delivering updates securely); what can go wrong? (spoofing, tampering, denial of service); what are we going to do about it? (signatures and torrents). The remaining question is “did we do a good job?” Please help us assess that! (I wrote this quickly on a Sunday morning. Are there attacks that this design misses? Defenses that should be in place?)