346,000 Wuhan Citizens’ Secrets

“346,000 Wuhan Citizens’ Secrets” was an exhibition created with $800 worth of data by Deng Yufeng. From the New York Times:

Chinese Personal Data framed

Six months ago, Mr. Deng started buying people’s information, using the Chinese messaging app QQ to reach sellers. He said that the data was easy to find and that he paid a total of $800 for people’s names, genders, phone numbers, online shopping records, travel itineraries, license plate numbers — at a cost of just over a tenth of a penny per person.

“The Personal Data of 346,000 People, Hung on a Museum Wall,” by Sui-Lee Wee and Elsie Chen.
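
As a rough check on the scale involved, the per-person cost can be worked out from the two figures quoted above ($800 and 346,000 people). This is a back-of-the-envelope sketch: the function name is mine, and it assumes the $800 covers exactly the 346,000 records in the exhibition title.

```python
def cost_per_person_cents(total_dollars: float, people: int) -> float:
    """Return the average cost per record, in US cents."""
    return total_dollars * 100 / people


# Figures quoted above; assumed to describe the same set of records.
per_person = cost_per_person_cents(800, 346_000)
print(f"{per_person:.2f} cents per person")  # prints "0.23 cents per person"
```

Either way, it is a small fraction of a cent for each person’s records.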

Citizen Threat Modeling and more data

Last week, in “Threat Modeling: Citizens Versus Systems,” I wrote:

I think that was a right call for the first project, because the secondary data flows are a can of worms, and drawing them would, frankly, look like a can of worms.

Many organizations don’t disclose them beyond saying “we share your data to deliver and improve the service”; those that do go further disclose little about the specifics of what data is transferred to whom.

PayPal Partnerships
Today, via Bruce Schneier, we see that PayPal has disclosed the list of over 600 companies they might share your data with. He rightly asks if that’s unusual. We don’t know. My instinct is that it’s not unusual for a financial multinational.

I’m standing by the questions I asked; the first level of categories in the PayPal list may act as a good third level for our analysis. It will be interesting to see if others use the same categories. If they don’t, the analysis becomes that much more work, because each organization’s categories have to be mapped separately.

Their categories are:

  1. Payment Processors
  2. Audit
  3. Customer Service outsourcing
  4. Credit reference and fraud agencies
  5. Financial products
  6. Commercial partnerships
  7. Marketing and public relations
  8. Operational services
  9. Group companies
  10. Commercial partners
  11. Legal
  12. Agencies

It’s unclear to me how 6 (“Commercial partnerships”) differs from 10 (“Commercial partners”). I say this because I’m curious, not to point and laugh. We should cut PayPal some slack and appreciate that this is a new process to handle a new legal requirement. I’m also curious if 12 (“Agencies”) means “law enforcement agencies” or something else.

Visualization from How PayPal Shares Your Data.

Threat Modeling: Citizens Versus Systems

Recently, we shared a privacy threat model which was centered on the people of Seattle, rather than on the technologies they use.

Because of that, we had different scoping decisions than I’ve made previously. I’m working through what those scoping decisions mean.

First, we cataloged how data is being gathered. We didn’t get to “what can go wrong?” We haven’t asked about secondary uses or transfers yet. I think that was a right call for the first project, because the secondary data flows are a can of worms, and drawing them would, frankly, look like a can of worms. We know that most of the data gathered by most of these systems is weakly protected from government agencies. Understanding what secondary data flows can happen will be quite challenging. Many organizations don’t disclose them beyond saying “we share your data to deliver and improve the service”; those that do go further disclose little about the specifics of what data is transferred to whom. So I’d like advice: how would you tackle secondary data flows?

Second, we didn’t systematically look at the question of what could go wrong. Each of those examinations could be roughly the size and effort of a product threat model. Each requires an understanding of a person’s risk profile: victims of intimate partner violence are at risk differently than immigrants. We suspect there are models there, and working on them is a collaborative task. I’d like advice here. Are there good models of different groups and their concerns on which we could draw?

Threat Modeling Privacy of Seattle Residents

On Tuesday, I spoke at the Seattle Privacy/TechnoActivism 3rd Monday meeting, and shared some initial results from the Seattle Privacy Threat Model project.

Overall, I’m happy to say that the effort has been a success, and opens up a set of possibilities.

  • Every participant learned about threats they hadn’t previously considered. This is surprising in and of itself: there are few better-educated sets of people than those willing to commit hours of their weekends to threat modeling privacy.
  • We have a new way to contextualize the decisions we might make, evidence that we can generate these in a reasonable amount of time, and an example of that form.
  • We learned about how long it would take (a few hours to generate a good list of threats, a few hours per category to understand defenses and tradeoffs), and how to accelerate that. (We spent a while getting really deep into threat scenarios in a way that didn’t help with the all-up models.)
  • We saw how deeply and complexly mobile phones and apps play into privacy.
  • We got to some surprising results about privacy in your commute.

More at the Seattle Privacy Coalition blog, “Threat Modeling the Privacy of Seattle Residents,” including slides, whitepaper and spreadsheets full of data.

The Carpenter Case

On Wednesday, the supreme court will consider whether the government must obtain a warrant before accessing the rich trove of data that cellphone providers collect about cellphone users’ movements. Among scholars and campaigners, there is broad agreement that the case could yield the most consequential privacy ruling in a generation. (“Supreme court cellphone case puts free speech – not just privacy – at risk.”)

Bruce Schneier has an article in the Washington Post, “How the Supreme Court could keep police from using your cellphone to spy on you,” as does Stephen Sachs:

The Supreme Court will hear arguments this Wednesday in Carpenter v. United States, a criminal case testing the scope of the Fourth Amendment’s right to privacy in the digital age. The government seeks to uphold Timothy Carpenter’s conviction and will rely, as did the lower court, on the court’s 1979 decision in Smith v. Maryland, a case I know well.

I argued and won Smith v. Maryland when I was Maryland’s attorney general. I believe it was correctly decided. But I also believe it has long since outlived its suitability as precedent. (“The Supreme Court’s privacy precedent is outdated.”)

I am pleased to have been able to help with an amicus brief in the case, and hope that the Supreme Court uses this opportunity to protect all of our privacy. Good luck to the litigants!

Amicus brief in “Carpenter” Supreme Court Case

“In an amicus brief filed in the U.S. Supreme Court, leading technology experts represented by the Knight First Amendment Institute at Columbia University argue that the Fourth Amendment should be understood to prohibit the government from accessing location data tracked by cell phone providers — “cell site location information” — without a warrant.”

For more, please see “In Supreme Court Brief, Technologists Warn Against Warrantless Access to Cell Phone Location Data.” [Update: Susan Landau has a great blog post “Phones Move – and So Should the Law” in which she frames the issues at hand.]

I’m pleased to be one of the experts involved.

A Privacy Threat Model for The People of Seattle

Some of us in the Seattle Privacy Coalition have been talking about creating a model of a day in the life of a citizen or resident in Seattle, and the way data is collected and used; that is, the potential threats to their privacy. In a typical approach, we focus on a system that we’re building, analyzing or testing. In this model, I think we need to focus on the people, the ‘data subjects.’

I also want to get away from the one-by-one issues, and help us look at the problems we face more holistically.

Feds Sue Seattle over FBI Surveillance

The general approach I use to threat model is based on 4 questions:

  1. What are you working on? (building, deploying, breaking, etc)
  2. What can go wrong?
  3. What are you going to do about it?
  4. Did you do a good job?

I think that we can address the first by building a model of a day, and driving into specifics in each area. For example, get up, check the internet, go to work (by bus, by car, by bike, walking), have a meal out…

One question that we’ll probably have to work on is how to address what can go wrong in a model this general. Usually I threat model specific systems or technologies where the answers are more crisp. Perhaps a way to break it out would be:

  1. What is a Seattleite’s day?
  2. What data is collected, how, and by whom? What models can we create to help us understand? Is there a good balance between specificity and generality?
  3. What can go wrong? (There are interesting variations in the answer based on who the data is about)
  4. What could we do about it? (The answers here vary based on who’s collecting the data.)
  5. Did we do a good job?
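
One way the first two questions might be captured is as a small data structure: a day as a list of activities, each with the data collected, how, and by whom. The sketch below is only an illustration of that shape. The class and function names are mine, and the collection entries are hypothetical placeholders, not results from the project.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Collection:
    data: str       # what is collected
    method: str     # how it is collected
    collector: str  # by whom


@dataclass
class Activity:
    name: str
    collections: List[Collection] = field(default_factory=list)


def collectors_by_activity(day: List[Activity]) -> Dict[str, List[str]]:
    """Map each activity to the sorted, de-duplicated list of its collectors."""
    return {a.name: sorted({c.collector for c in a.collections}) for a in day}


# Hypothetical placeholder entries, for illustration only.
day = [
    Activity("check the internet",
             [Collection("browsing history", "network logs", "internet provider")]),
    Activity("go to work by bus",
             [Collection("trip times", "fare card taps", "transit agency")]),
    Activity("have a meal out",
             [Collection("purchase record", "card payment", "payment processor")]),
]

for activity, collectors in collectors_by_activity(day).items():
    print(activity, "->", ", ".join(collectors))
```

Keeping the collector as its own field matters because the answers to “what could we do about it?” vary with who is collecting the data.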

My main goal is to come away from the exercise with a useful model of the privacy threats to Seattleites. If we can, I’d also like to understand how well this “flipped” approach works.

[As I’ve discussed this, there’s a lot of interest in what comes out and what it means, but I don’t expect that to be the main focus of discussion on Saturday. For example,] There are also policy questions like, “as the city takes action to collect data, how does that interact with its official goal to be a welcoming city?” I suspect that the answer is ‘not very well,’ and that there’s an opportunity for collaboration here across the political spectrum. Those who want to run a ‘welcoming city’ and those who distrust government data collection can all ask how Seattle’s new privacy program will help us.

In any event, a bunch of us will be getting together at the Delridge Library this Saturday, May 13, at 1PM to discuss for about 2 hours, and anyone interested is welcome to join us. We’ll just need two forms of ID and your consent to our outrageous terms of service. (Just kidding. We do not check ID, and I simply ask that you show up with a goal of respectful collaboration, and a belief that everyone else is there with the same good intent.)

The Evolution of Apple’s Differential Privacy

Bruce Schneier comments on “Apple’s Differential Privacy:”

So while I applaud Apple for trying to improve privacy within its business models, I would like some more transparency and some more public scrutiny.

Do we know enough about what’s being done? No, and my bet is that Apple doesn’t know precisely what they’ll ship, and aren’t answering deep technical questions so that they don’t misspeak. I know that when I was at Microsoft, details like that got adjusted as a bigger pile of real data from real customer use informed things. I saw some really interesting shifts surprisingly late in the dev cycle of various products.

I also want to challenge the way Matthew Green closes: “If Apple is going to collect significant amounts of new data from the devices that we depend on so much, we should really make sure they’re doing it right — rather than cheering them for Using Such Cool Ideas.”

But that is a false dichotomy, and would be silly even if it were not. It’s silly because we can’t be sure if they’re doing it right until after they ship it, and we can see the details. (And perhaps not even then.)

But even more important, the dichotomy is not “are they going to collect substantial data or not?” They are. The value organizations get from being able to observe their users is enormous. As product managers observe what A/B testing in their web properties means to the speed of product improvement, they want to bring that same ability to other platforms. Those that learn fastest will win, for the same reasons that first to market used to win.

Next, are they going to get it right on the first try? No. Almost guaranteed. Software, as we learned a long time ago, has bugs. As I discussed in “The Evolution of Secure Things:”

It’s a matter of the pressures brought to bear on the designs of even what (we now see) as the very simplest technologies. It’s about the constant imperfection of products, and how engineering is a response to perceived imperfections. It’s about the chaotic real world from which progress emerges. In a sense, products are never perfected, but express tradeoffs between many pressures, like manufacturing techniques, available materials, and fashion in both superficial and deep ways.

Green (and Schneier) are right to be skeptical, and may even be right to be cynical. We should not lose sight of the fact that Apple is spending rare privacy engineering resources to do better than Microsoft. Near as I can tell, this is an impressive delivery on the commitment to be the company that respects your privacy, and I say that believing that there will be both bugs and design flaws in the implementation. Green has an impressive record of finding and calling Apple (and others) on such, and I’m optimistic he’ll have happy hunting.

In the meantime, we can, and should, cheer Apple for trying.

RSA: Time for some cryptographic dogfood

One of the most effective ways to improve your software is to use it early and often.  This used to be called eating your own dogfood, which is far more evocative than the alternatives. The key is that you use the software you’re building. If it doesn’t taste good to you, it’s probably not customer-ready.  And so this week at RSA, I think more people should be eating the security community’s cryptographic dogfood.

As I evangelize the use of crypto to meet up at RSA, I’ve encountered many problems, such as choice of tool, availability of tool across a set of mobile platforms, cost of entry, etc.  Each of these is predictable, but with dogfooding — forcing myself to ask everyone why they want to use an easily wiretapped protocol — the issues stand out, and the companies that will be successful will start thinking about ways to overcome them.

So this week, as you prep for RSA, spend a few minutes to get some encrypted communications tool. The worst that can happen is you’re no more secure than you were before you read this post.

What Price Privacy, Paying For Apps edition

There’s a new study on what people would pay for privacy in apps. As reported by Techflash:

A study by two University of Colorado Boulder economists, Scott Savage and Donald Waldman, found the average user would pay varying amounts for different kinds of privacy: $4.05 to conceal contact lists, $2.28 to keep their browser history private, $2.12 to eliminate advertising on apps, $1.19 to conceal personal locations, $1.75 to conceal the phone’s ID number and $3.58 to conceal the contents of text messages.

Those numbers seem small, but they’re in the context of app pricing, which is generally a few bucks. If those numbers combine linearly, people being willing to pay up to about $15 more for a private version is a very high valuation. (Of course, the numbers will combine in ways that are not strictly rational. Consumers satisfice.)
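
To make the linear-combination assumption concrete, here is a small sketch that simply adds the six figures quoted from the study. The variable names are mine, and treating the values as additive is the assumption under discussion, not a finding of the study.

```python
# Willingness-to-pay figures as quoted above, in cents to avoid float rounding.
willingness_cents = {
    "conceal contact lists": 405,
    "keep browser history private": 228,
    "eliminate advertising on apps": 212,
    "conceal personal locations": 119,
    "conceal the phone's ID number": 175,
    "conceal the contents of text messages": 358,
}

# Assumption: the values simply add up (the "combine linearly" case).
total_cents = sum(willingness_cents.values())
print(f"${total_cents / 100:.2f}")  # prints "$14.97"
```

Set against apps that generally cost a few bucks, a straight sum of $14.97 is several times the price of the app itself.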

A quick skim of the article leads me to think that they didn’t estimate app maker benefit from these privacy changes. How much does a consumer contact list go for? (And how does that compare to the fines for improperly revealing it?) How much does an app maker make per person whose eyeballs they sell to show ads?