Category: research papers

Valuing CyberSecurity Research Datasets

There was a really interesting paper at the Workshop on the Economics of Information Security (WEIS): “Valuing CyberSecurity Research Datasets.”

The paper focuses on the value of the IMPACT data sharing platform at DHS, and how the availability of data shapes the research that’s done.

On its way to that valuation, a very useful contribution of the paper is its analysis of the types of research data that exist and the purposes for which they can be used:

Note that there has been considerable attention paid to information sharing among operators through organizations such as ISACs. In contrast, we examine data provisioning done primarily for research purposes. Cybersecurity data resides on a use spectrum – some research data is relevant for operations and vice versa. Yet, as difficult as it can be to make the case for data sharing among operators, it’s even harder for researchers. Data sharing for research is generally not deemed as important as for operations. Outcomes are not immediately quantifiable. Bridging the gap between operators and researchers, rather than between operators alone, is further wrought with coordination and value challenges. Finally, research data is often a public good, which means it will likely be undervalued by the parties involved.

The paper enumerates the benefits of sharing research data, including advancing scientific understanding, enabling infrastructure, and creating parity in access to ground truth(s) for academics, technology developers, and others who don’t directly gather data. It also enumerates a set of barriers to sharing, including legal and ethical risk, costs, value uncertainty, and incentives.

These issues were highly resonant for me, because our near miss work certainly encounters these issues of value uncertainty and cost as we consider how to move beyond the operational data sharing that ISACs enable.

I’m very glad to see the challenges crystallized in this way, and that’s before we even reach the main goal of the paper, which is to assess how much value we get from sharing data.

While we’re talking about this paper: Robert Lemos has a story at Dark Reading, and Ross Anderson liveblogged the WEIS conference.

Safety and Security in Automated Driving

“Safety First For Automated Driving” is a big, overarching whitepaper from a dozen automotive manufacturers and suppliers.

One way to read it is that the automotive disciplines have strongly developed safety cultures, which generally do not consider cybersecurity problems. This paper is the cybersecurity specialists making the argument that cyber will fit into safety, and showing how to do so.

In a sense, this white paper captures a strategic threat model. What are we working on? Autonomous vehicles. What can go wrong? Security issues of all types. What are we going to do? Integrate with and extend the existing safety discipline. Give specific threat information and mitigation strategies to component designers.

I find some parts of it surprising. (I would find it more surprising if I were to look at a 150 page document and not find anything surprising.)

Contrary to the commonly used definition of a [minimal risk condition (MRC)], which describes only a standstill, this publication expands the definition to also include degraded operation and takeovers by the vehicle operator. Final MRCs refer to MRCs that allow complete deactivation of the automated driving system, e.g. standstill or takeover by the vehicle operator.

One of the “minimal risk” maneuvers listed (table 4) is an emergency stop. And while an emergency stop may certainly be a risk minimizing action in some circumstances, describing it as such is surprising, especially when presented in contrast to a “safe stop” maneuver.

It’s important to remember that driving is incredibly dangerous. In the United States in 2018, an estimated 40,000 people lost their lives in car crashes, and 4.5 million people were seriously injured. (I’ve seen elsewhere that a million of those are hospitalized.) A great many of those injuries are caused by either drunk or distracted drivers, and autonomous vehicles could save many lives, even if imperfect.

Which brings me to a part that I really like: the ‘three dimensions of risk treatment’ figure (Figure 8, shown). Words like “risk” and “risk management” encompass a lot, and this figure is a nice side contribution of the paper.

I also like Figures 27 & 28 (shown), which map risks onto a generic architecture. Having this work available allows systems builders to consider the risks to the various components they’re working on. It also lets us have a conversation about the systematic risks that exist, and allows security experts to ask, “is this the right set of risks for systems builders to think about?”

A chart of system components in an autonomous vehicle

Polymorphic Warnings On My Mind

There’s a fascinating paper, “Tuning Out Security Warnings: A Longitudinal Examination of Habituation Through fMRI, Eye Tracking, and Field Experiments.” (It came out about a year ago.)

The researchers examined what happens in people’s brains when they look at warnings, and they found that:

Research in the fields of information systems and human-computer interaction has shown that habituation—decreased response to repeated stimulation—is a serious threat to the effectiveness of security warnings. Although habituation is a neurobiological phenomenon that develops over time, past studies have only examined this problem cross-sectionally. Further, past studies have not examined how habituation influences actual security warning adherence in the field. For these reasons, the full extent of the problem of habituation is unknown.

We address these gaps by conducting two complementary longitudinal experiments. First, we performed an experiment collecting fMRI and eye-tracking data simultaneously to directly measure habituation to security warnings as it develops in the brain over a five-day workweek. Our results show not only a general decline of participants’ attention to warnings over time but also that attention recovers at least partially between workdays without exposure to the warnings. Further, we found that updating the appearance of a warning—that is, a polymorphic design—substantially reduced habituation of attention.

Second, we performed a three-week field experiment in which users were naturally exposed to privacy permission warnings as they installed apps on their mobile devices. Consistent with our fMRI results, users’ warning adherence substantially decreased over the three weeks. However, for users who received polymorphic permission warnings, adherence dropped at a substantially lower rate and remained high after three weeks, compared to users who received standard warnings. Together, these findings provide the most complete view yet of the problem of habituation to security warnings and demonstrate that polymorphic warnings can substantially improve adherence.

It’s not short, but it’s not hard reading. Worthwhile if you care about usable security.
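If you want a concrete picture of what “polymorphic” means here, the sketch below is my own illustration, not the authors’ implementation: rotate a warning through a few visual variants so that repeated exposures never look quite the same. The variant attributes and rendering are purely illustrative.

```python
# A minimal sketch of the polymorphic-warning idea (illustrative only).
import itertools
import random

VARIANTS = [
    {"border": "red",    "icon": "!",  "layout": "banner"},
    {"border": "orange", "icon": "!!", "layout": "dialog"},
    {"border": "yellow", "icon": "?",  "layout": "inline"},
]

# Shuffle once, then cycle, so consecutive displays never look identical.
_cycle = itertools.cycle(random.sample(VARIANTS, len(VARIANTS)))

def render_warning(message: str) -> str:
    v = next(_cycle)  # pick a different presentation for each display
    return f"[{v['layout']} | border={v['border']} | {v['icon']}] {message}"

for _ in range(3):
    print(render_warning("This app can read your contacts."))
```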

I’m pleased to be able to share work that Shostack & Associates and the Cyentia Institute have been doing for the Global Cyber Alliance. In doing this, we created some new threat models for email, and some new statistical analysis of the value of DMARC in limiting losses from business email compromise (BEC).

It shows that the 1,046 domains that have successfully activated strong protection with GCA’s DMARC tools will save an estimated $19 million to $66 million from limiting BEC in 2018 alone. These organizations will continue to reap that reward, and realize additional savings, in every year they maintain their DMARC deployment.
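For readers who haven’t deployed DMARC: the policy is just a DNS TXT record published at _dmarc.<your-domain>, and “strong protection” generally means a policy of quarantine or reject rather than none. The record below is a sketch with made-up values, plus a minimal parse:

```python
# Illustrative only: the domain and report address below are made up.
record = "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com; pct=100"

# Split the record into its tag=value pairs.
tags = dict(
    part.strip().split("=", 1)
    for part in record.split(";")
    if "=" in part
)

print(tags["p"])    # 'reject': receivers should reject mail that fails DMARC checks
print(tags["rua"])  # where aggregate reports go, which is what enables measurement
```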

Their press release from this morning is here, and the report download is here.

Learning from Near Misses

[Update: Steve Bellovin has a blog post]

One of the major pillars of science is the collection of data to disprove arguments. That data gathering can include experiments, observations, and, in engineering, investigations into failures. One of the issues that makes security hard is that we have little data about large scale systems. (I believe that this is more important than our clever adversaries.) The work I want to share with you today has two main antecedents.

First, in the nearly ten years since Andrew Stewart and I wrote The New School of Information Security, and called for more learning from breaches, we’ve seen a dramatic shift in how people talk about breaches. Unfortunately, we’re still not learning as much as we could. There are structural reasons for that, primarily fear of lawsuits.

Second, last year marked 25 years of calls for an “NTSB for infosec.” Steve Bellovin and I wrote a short note asking why those calls haven’t been answered. We’ve spent the last year asking what else we might do. We’ve learned a lot about other aviation safety programs, and think there are other models that may be better fits for our needs and constraints in the security realm.

Much of that investigation has been a collaboration with Blake Reid, Jonathan Bair, and Andrew Manley of the University of Colorado Law School, and together we have a new draft paper on SSRN, “Voluntary Reporting of Cybersecurity Incidents.”

A good deal of my own motivation in this work is to engineer a way to learn more. The focus of this work, on incidents rather than breaches, and on voluntary reporting and incentives, reflects lessons learned as we try to find ways to measure real world security. The writing and abstract reflect the goal of influencing those outside security to help us learn better:

The proliferation of connected devices and technology provides consumers immeasurable amounts of convenience, but also creates great vulnerability. In recent years, we have seen explosive growth in the number of damaging cyber-attacks. 2017 alone has seen the Wanna Cry, Petya, Not Petya, Bad Rabbit, and of course the historic Equifax breach, among many others. Currently, there is no mechanism in place to facilitate understanding of these threats, or their commonalities. While information regarding the causes of major breaches may become public after the fact, what is lacking is an aggregated data set, which could be analyzed for research purposes. This research could then provide clues as to trends in both attacks and avoidable mistakes made on the part of operators, among other valuable data.

One possible regime for gathering such information would be to require disclosure of events, as well as investigations into these events. Mandatory reporting and investigations would result in better data collection. This regime would also cause firms to internalize, at least to some extent, the externalities of security. However, mandatory reporting faces challenges that would make this regime difficult to implement, and possibly more costly than beneficial. An alternative is a voluntary reporting scheme, modeled on the Aviation Safety Reporting System housed within NASA, and possibly combined with an incentive scheme. Under it, organizations that were the victims of hacks or “near misses” would report the incident, providing important details, to some neutral party. This database could then be used both by researchers and by industry as a whole. People could learn what does work, what does not work, and where the weak spots are.
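To give a flavor of what a voluntary near-miss report might capture, here is a hypothetical sketch of a minimal record; the field names are mine, not the paper’s, and a real ASRS-style system would need far more care around de-identification:

```python
# Hypothetical sketch only; field names are illustrative, not from the paper.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class NearMissReport:
    sector: str                              # e.g. "finance", "healthcare"
    attack_vector: str                       # e.g. "phishing", "exposed credential"
    detected_by: str                         # e.g. "EDR alert", "employee report"
    impact_averted: str                      # what would have happened
    contributing_factors: List[str] = field(default_factory=list)
    narrative: Optional[str] = None          # free text, as in ASRS reports
    reporter_contact: Optional[str] = None   # held only by the neutral party

report = NearMissReport(
    sector="finance",
    attack_vector="phishing",
    detected_by="employee report",
    impact_averted="fraudulent wire transfer",
    contributing_factors=["lookalike domain", "urgent-payment pretext"],
)
```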

Please, take a look at the paper. I’m eager to hear your feedback.

“Comparing the Usability of Cryptographic APIs”

Obstacles Frame

(The abstract:) Potentially dangerous cryptography errors are well documented in many applications. Conventional wisdom suggests that many of these errors are caused by cryptographic Application Programming Interfaces (APIs) that are too complicated, have insecure defaults, or are poorly documented. To address this problem, researchers have created several cryptographic libraries that they claim are more usable; however, none of these libraries have been empirically evaluated for their ability to promote more secure development. This paper is the first to examine both how and why the design and resulting usability of different cryptographic libraries affects the security of code written with them, with the goal of understanding how to build effective future libraries. We conducted a controlled experiment in which 256 Python developers recruited from GitHub attempt common tasks involving symmetric and asymmetric cryptography using one of five different APIs. We examine their resulting code for functional correctness and security, and compare their results to their self-reported sentiment about their assigned library. Our results suggest that while APIs designed for simplicity can provide security benefits—reducing the decision space, as expected, prevents choice of insecure parameters—simplicity is not enough. Poor documentation, missing code examples, and a lack of auxiliary features such as secure key storage, caused even participants assigned to simplified libraries to struggle with both basic functional correctness and security. Surprisingly, the availability of comprehensive documentation and easy-to-use code examples seems to compensate for more complicated APIs in terms of functionally correct results and participant reactions; however, this did not extend to security results. We find it particularly concerning that for about 20% of functionally correct tasks, across libraries, participants believed their code was secure when it was not. Our results suggest that while new cryptographic libraries that want to promote effective security should offer a simple, convenient interface, this is not enough: they should also, and perhaps more importantly, ensure support for a broad range of common tasks and provide accessible documentation with secure, easy-to-use code examples.

It’s interesting that even when developers took care to consider the usability of their APIs, usability testing revealed serious issues. But it’s not surprising: the one constant of usability testing is that people surprise you.
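As a concrete illustration of what “reducing the decision space” looks like (my example, not one of the study’s tasks): a high-level recipe such as Fernet, from the Python cryptography package, makes the cipher, mode, IV, and authentication decisions for you, each of which a lower-level API would hand to the developer as a chance to get it wrong.

```python
# Illustrative example, not drawn from the paper's task set.
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()   # key generation: one call, no parameters to misconfigure
f = Fernet(key)

token = f.encrypt(b"attack at dawn")  # authenticated encryption; IVs handled internally
print(f.decrypt(token))               # b'attack at dawn'; raises InvalidToken if tampered
```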

The paper is: “Comparing the Usability of Cryptographic APIs,” Yasemin Acar (CISPA, Saarland University), Michael Backes (CISPA, Saarland University & MPI-SWS), Sascha Fahl (CISPA, Saarland University), Simson Garfinkel (National Institute of Standards and Technology), Doowon Kim (University of Maryland), Michelle Mazurek (University of Maryland), Christian Stransky (CISPA, Saarland University), The Increasingly-misnamed Oakland Conference, 2017.

“…the Elusive Goal of Security as a Scientific Pursuit”

That’s the subtitle of a new paper by Cormac Herley and Paul van Oorschot, “SoK: Science, Security, and the Elusive Goal of Security as a Scientific Pursuit,” forthcoming in IEEE Security & Privacy.

The past ten years has seen increasing calls to make security research more “scientific”. On the surface, most agree that this is desirable, given universal recognition of “science” as a positive force. However, we find that there is little clarity on what “scientific” means in the context of computer security research, or consensus on what a “Science of Security” should look like. We selectively review work in the history and philosophy of science and more recent work under the label “Science of Security”. We explore what has been done under the theme of relating science and security, put this in context with historical science, and offer observations and insights we hope may motivate further exploration and guidance. Among our findings are that practices on which the rest of science has reached consensus appear little used or recognized in security, and a pattern of methodological errors continues unaddressed.

Cyber Grand Shellphish

There’s a very interesting paper on the Cyber Grand Challenge by team Shellphish. Lots of details about the grand challenge itself, how they designed their software, how they approached the scoring algorithm, and what happened in the room.

There are lots of good details, but perhaps my favorite is:

How would a team that did *nothing* do? That is, if a team connected and then ceased to play, would they fare better or worse than the other players? We ran a similar analysis to the “Never patch” strategy previously (i.e., we counted a CS as exploited for all rounds after its first exploitation against any teams), but this time removed any POV-provided points. In the CFE, this “Team NOP” would have scored 255,678 points, barely *beating* Shellphish and placing 3rd in the CGC.

The reason I like this is that scoring systems are hard. Really, really hard. I know that DARPA spent substantial time and energy on the scoring system, and this outcome happened anyway. We should not judge either DARPA or the contest on that basis, because it was hard to see ahead of time that this would happen: it’s a coincidence of the scores the teams actually achieved.
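To see how a do-nothing strategy can end up near the top, here is a toy model. It is emphatically not DARPA’s scoring formula, just a made-up system that rewards availability, successful defense, and landed exploits, in which patching that hurts performance can cost more than it earns:

```python
# Toy model only; the weights and structure are invented, not DARPA's CFE formula.
def round_score(availability: float, defended: bool, povs_landed: float) -> float:
    defense_multiplier = 2.0 if defended else 1.0
    return availability * defense_multiplier + povs_landed

ROUNDS = 95

def total(availability: float, defend_rate: float, povs_per_round: float) -> float:
    return sum(
        round_score(availability, r < defend_rate * ROUNDS, povs_per_round)
        for r in range(ROUNDS)
    )

# An active team whose patches cost performance and only sometimes hold up.
active = total(availability=0.70, defend_rate=0.5, povs_per_round=0.2)
# A "Team NOP" that never patches, never defends, never attacks.
nop = total(availability=1.00, defend_rate=0.0, povs_per_round=0.0)

print(f"active ~ {active:.0f}, do-nothing ~ {nop:.0f}")  # closer than you might expect
```

The real competition’s dynamics were richer than this, but the underlying tension is the same: standing still is free, and every intervention has a cost that the scoring has to weigh.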
