Threat Modeling: Citizens Versus Systems

Recently, we shared a privacy threat model centered on the people of Seattle, rather than on the technologies they use.

Because of that, we made different scoping decisions than I have on previous projects. I’m working through what those decisions mean.

First, we cataloged how data is being gathered. We didn’t get to “what can go wrong?”, and we didn’t ask about secondary uses or transfers (yet). I think that was the right call for the first project, because the secondary data flows are a can of worms, and drawing them would, frankly, look like a can of worms. We know that most of the data gathered by most of these systems is weakly protected from government agencies. Understanding what secondary data flows can happen will be quite challenging: many organizations don’t disclose them beyond saying “we share your data to deliver and improve the service,” and those that go farther disclose little about the specifics of what data is transferred to whom. So I’d like advice: how would you tackle secondary data flows?
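To make the “can of worms” concrete, here’s a toy sketch of what even a tiny catalog of secondary flows looks like; every system, recipient, and data type below is invented for illustration:

```python
# Toy sketch of a secondary-flow catalog. Every system, recipient,
# and data type below is hypothetical, for illustration only.
secondary_flows = {
    "transit_card":   [("vendor_analytics", "trip history"),
                       ("law_enforcement",  "trip history")],
    "parking_app":    [("ad_network",        "location"),
                       ("payment_processor", "card number")],
    "library_system": [("vendor_analytics",  "borrowing history")],
}

# Even three systems produce a tangle of edges once you draw them.
for system, transfers in secondary_flows.items():
    for recipient, data in transfers:
        print(f"{system} -> {recipient}: {data}")
```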

Second, we didn’t systematically look at the question of what could go wrong. Each of those examinations could be roughly the size and effort of a product threat model. Each requires an understanding of a person’s risk profile: victims of intimate partner violence face different risks than immigrants do. We suspect there are models there, and working on them is a collaborative task. I’d like advice here. Are there good models of different groups and their concerns on which we could draw?

5 thoughts on “Threat Modeling: Citizens Versus Systems”

  1. Have you considered drawing on the experiences of K-12 education with the 1st through 10th Amendments and the threats to them?

  2. Re: getting at the risk profiles, I suspect our colleagues in the social science community have produced some great qualitative models of these sorts of risks. Turning these into personas and using them to support ideation around threat modelling could be worth exploring [disclaimer: I presented a paper on “Persona Cases” at CHI a few years ago illustrating how to do this]. I suspect these personas would also be a good resource for privacy engineers in general.

  3. Adam-
    Yeah wow that’s massive.
    How about doing an individual model for each type of PII? For example, start with the home address, and then do one for the phone number. For each, there are various social engineering attacks that are possible with that data alone.
    Based on the history of previous attacks, we could pair together the different bits of information that were required to perpetrate each attack, and, with a bit of probability math, figure out how often those combinations would occur in a given population. For example, if a phone number and address were all that was necessary to do a bank withdrawal (based on a previous attack), note that. Take a stab at figuring out how many banks would be vulnerable to that attack, and assume everyone has a bank account. Another opportunity to weight things!
    Then take all of the different attacks from the different combinations and give each a weight from 1 to 100. Now guess the population holding each type of asset combination that was necessary for historical attacks… calculate the probability of the necessary intersections to perform all of the attacks that we were able to recall from history. At some point your weight becomes a dollar figure, based on money stolen in previous attacks.
    A little tricky but seems extremely worthwhile!
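    Something like this toy sketch of the pairing step; the attack names, required asset combinations, and weights are all invented for illustration:

    ```python
    # Hypothetical catalog: which combinations of PII enabled which
    # historical attacks, each with an invented 1-100 weight.
    attacks = {
        "fraudulent bank withdrawal": {"assets": {"phone number", "home address"},
                                       "weight": 80},
        "SIM swap":                   {"assets": {"phone number", "date of birth"},
                                       "weight": 60},
        "mail redirect":              {"assets": {"home address"},
                                       "weight": 30},
    }

    def feasible_attacks(exposed):
        """Attacks whose required asset combination is covered by the exposed PII."""
        return [name for name, a in attacks.items() if a["assets"] <= exposed]

    # A person whose phone number and home address have both leaked:
    print(feasible_attacks({"phone number", "home address"}))
    # -> ['fraudulent bank withdrawal', 'mail redirect']
    ```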

    1. Thanks for the idea, Greg! Tell me more about your idea of modeling each type of PII. What exactly would you draw? Would the PII be a line from a person to a ??

      1. Of course, the drawing part:
        My thoughts are probably more high level, but let’s start with the simple case.
        For the simple case, we have the asset (PII) as a data store. We have the person whose PII is at stake as an actor, who volunteers that data through an application layer (a process), which then flows through to the data store. The attacker (another actor) may, at a high level, go directly to the data store or come through the application layer. Of course it varies: if he gained privilege through some pivot point in the datacenter, it would look like direct access to the data store; if he managed to grab the data through CSRF or some other elevation of privilege at the application layer, he would attack through the process.
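        In rough code, that simple case might look like this; it’s just a sketch of the elements and flows, not any particular tool’s format:

        ```python
        # Sketch of the simple case as plain data; names are illustrative,
        # not drawn from any particular threat modeling tool.
        elements = {
            "person":    "actor",       # volunteers their PII
            "attacker":  "actor",
            "app":       "process",     # the application layer
            "pii_store": "data store",  # the PII asset at stake
        }

        flows = [
            ("person",   "app",       "volunteers PII"),
            ("app",      "pii_store", "stores PII"),
            # The two attack paths described above:
            ("attacker", "pii_store", "direct access, e.g. via a datacenter pivot"),
            ("attacker", "app",       "through the process, e.g. CSRF"),
        ]

        for src, dst, label in flows:
            print(f"{src} ({elements[src]}) -> {dst} ({elements[dst]}): {label}")
        ```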
        There are so many applications and situations that it _feels_ like the DFD matters a bit less at this point. It seems useful to build a somewhat complex Venn diagram of

        - the historical, statistical probability of being attacked, on a per-person basis, across all attacks that have happened to a population
        - the average financial impact per _type of_ attack, multiplied by the probability of the attack that caused each particular impact

        to come up with (roughly) probability × damage, to get the average cost per person.
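        With invented numbers, that arithmetic would be something like:

        ```python
        # Invented numbers: per-person annual probability of each attack
        # type, and the average financial damage when it happens.
        attack_stats = {
            "fraudulent bank withdrawal": {"p": 0.002,  "avg_damage": 4000.0},
            "SIM swap":                   {"p": 0.0005, "avg_damage": 9000.0},
        }

        # Average cost per person = sum over attack types of probability * damage.
        expected_cost = sum(s["p"] * s["avg_damage"] for s in attack_stats.values())
        print(f"expected cost per person per year: ${expected_cost:.2f}")  # $12.50
        ```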

        It may be very interesting, from a prevention/threat modeling perspective, to look at the forensics of each perpetration (is that a word?) to help us discern which types of threats, or combinations of attacks, are most powerful when you count the financial damage that resulted. From a multi-system, mass-population perspective that may be a good approach, though it’s not much like a typical threat model diagram.

        There was a bit of Siri damage in my previous post, so it was a little confusing, sorry!
