
Emergent Design Issues


It seems like these days, we want to talk about everything in security as if it's a vulnerability. For example:

German researchers have discovered security flaws that could let hackers, spies and criminals listen to private phone calls and intercept text messages on a potentially massive scale – even when cellular networks are using the most advanced encryption now available.
...
Experts say it’s increasingly clear that SS7, first designed in the 1980s, is riddled with serious vulnerabilities that undermine the privacy of the world’s billions of cellular customers. The flaws discovered by the German researchers are actually functions built into SS7 for other purposes – such as keeping calls connected as users speed down highways, switching from cell tower to cell tower – that hackers can repurpose for surveillance because of the lax security on the network. ("German researchers discover a flaw that could let anyone listen to your cell calls." Washington Post, 2014).

But these are not vulnerabilities, because we can have endless debate about whether they should be fixed. (Chrome exposing passwords is another example.) If they're not vulnerabilities, what are they? Perhaps they're flaws? One definition of flaws reads:

"Flaws are often much more subtle than simply an off-by-one error in an array reference or use of an incorrect system call," the report notes. "A flaw might be instantiated in software code, but it is the result of a mistake or oversight at the design level."

An example of such a flaw noted in the report is the failure to separate data and control instructions and the co-mingling of them in a string - a situation that can lead to injection vulnerabilities. (IEEE Report Reveals Top 10 Software Security Design Flaws)
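To make that example concrete, here is a minimal sketch of data and control commingled in a string, using SQL. This is my illustration, not code from the report; the table and names are invented for the demo.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

def find_user_unsafe(username):
    # Flaw by design: untrusted data is spliced into the control channel
    # (the SQL text), so input like "x' OR '1'='1" rewrites the query.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = '" + username + "'").fetchall()

def find_user_safe(username):
    # Control (the SQL template) stays separate from data (the parameter),
    # which removes the injection class entirely.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)).fetchall()

print(find_user_unsafe("x' OR '1'='1"))  # returns every row: the data acted as control
print(find_user_safe("x' OR '1'='1"))    # returns nothing: the input stayed data

The unsafe version isn't an off-by-one you can patch; the mistake is the design decision to build the control channel out of untrusted data.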

In this sense, the SS7 issues are probably not "flaws," because the system behavior does not appear to be unanticipated. But we don't know. We don't know what properties we should expect SS7 to have. For most software, the design requirements (the threat model) are not clear or explicit. Even when they are explicit, they're often not public. (Larry Loeb makes the same point here.)

For example, someone decided to write code that runs a program on mouse-over in PowerPoint; that code was tested, dialog text was written and internationalized, and so on. Someone documented it, and it's worth pointing out that the documentation doesn't apply to PowerPoint 2016. Was there a debate over the security of that feature when it shipped? I don't know. When it was removed? Probably.

There's a set of these, and I'm going to focus on how they manifest in Windows for reasons that I'll get to. Examples include:

The reason I'm looking at these is that design questions like these emerge when a system is successful. Whatever else you want to say about it, Windows was successful and very widely deployed. As a system becomes more successful, the easily exploitable bugs get fixed, and the hard-to-fix design tradeoffs become relatively more important. As I wrote in "The Evolution of Secure Things":

It’s about the constant imperfection of products, and how engineering is a response to perceived imperfections. It’s about the chaotic real world from which progress emerges. In a sense, products are never perfected, but express tradeoffs between many pressures, like manufacturing techniques, available materials, and fashion in both superficial and deep ways.

That chaotic real world exposes a set of issues that may or may not have been visible during product design. In threat modeling, identification of issues is the most crucial step. If you fail to identify issues, you will not manage those issues well. Another way to say that is: identifying issues is a necessary but not sufficient step.

The design choices listed above almost all predate threat modeling as a structured practice at Microsoft. But there have been other choices, like Windows Wi-Fi Sense or new telemetry in Windows 10. We can disagree with those design choices, but it's clear that there were internal discussions of the right business tradeoffs. So we go back to the definition of a flaw, "a mistake or oversight at the design level." These were not oversights. Were they design mistakes? That's harder. The designers knew exactly what they were designing, and the software worked as planned. It was not received as planned, and it is certainly being used in unexpected ways.

There are interesting issues of composition, especially in backup authentication. That problem is being exploited in cryptocurrency thefts:

Mr. Perklin and other people who have investigated recent hacks said the assailants generally succeeded by delivering sob stories about an emergency that required the phone number to be moved to a new device — and by trying multiple times until a gullible agent was found.

“These guys will sit and call 600 times before they get through and get an agent on the line that’s an idiot,” Mr. Weeks said.

Coinbase, one of the most widely used Bitcoin wallets, has encouraged customers to disconnect their mobile phones from their Coinbase accounts.

One can imagine a lot of defenses, but "encouraging" customers not to use a feature may not be enough. As online wallet companies grow, they need to threat model better, and perhaps that entails turning off the feature. (I don't know their businesses well enough to simply assert an answer.)
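As a rough sketch of that composition problem (the path names and strength scores below are my own assumptions, not anyone's actual design): once SMS-based recovery is enabled, the account inherits the carrier's weakest support agent, because an attacker only needs the easiest recovery path that is switched on.

# Illustrative strength scores for hypothetical recovery paths;
# the values are assumptions for the sketch, not measurements.
RECOVERY_PATHS = {
    "password": 0.9,
    "hardware_key": 0.99,
    "sms_reset": 0.2,  # inherits the carrier's willingness to port a number
}

def effective_strength(enabled):
    # An attacker picks the easiest enabled path, so the composed account's
    # security collapses to the weakest recovery mechanism left on.
    return min(RECOVERY_PATHS[path] for path in enabled)

print(effective_strength({"password", "hardware_key"}))               # 0.9
print(effective_strength({"password", "hardware_key", "sms_reset"}))  # 0.2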

In summary, we're doing a great job at finding and squishing bugs, and that's opening up new and exciting opportunities to think more deeply about design issues.

PowerPoint screen capture via Casey Smith.

[Update Dec 13: After a conversation with Gary McGraw, I think I may have read the CSD definition of flaw too narrowly.]