Not-So-Private Metadata


BOB GARFIELD: When Edward Snowden first revealed the NSA's controversial surveillance program last year, much of the NSA's defense of the practice relied on the fact that they were only collecting metadata and, therefore, weren't actually spying on Americans. Here is NSA Chief Keith Alexander, talking last year to John Miller on 60 Minutes.

[CLIP]:

JOHN MILLER: You don’t hear the call.

GEN. KEITH ALEXANDER: You don't hear the call.

JOHN MILLER: You don't see the name.

GEN. KEITH ALEXANDER: You don't see the names.

JOHN MILLER: You just see this number called that number.

GEN. KEITH ALEXANDER: That this number, the "to/from" number, the duration of the call and the date/time group, that's all you get.

[END CLIP]

BOB GARFIELD: So just numbers – phone numbers, time numbers, length numbers, no real privacy concerns, right? Wrong, at least according to the computer scientists at Stanford University, who argue that phone metadata is inherently revealing and have conducted their own simulation of the NSA's metadata collection to prove it.

Jonathan Mayer is a PhD student at Stanford and has been working on the project at Stanford Security Lab.

JONATHAN MAYER: We made an application available to users of Android smartphones that would enable them to upload their own phone metadata to us for analysis. The first step in analyzing the data was mapping numbers to names: names of organizations, names of professional services, names of individuals. And we were able to do that pretty easily. There are a number of public databases you can use, from Yelp, from Google, from Facebook. We categorized the types of organizations that individuals called, and we found a number of fairly sensitive groups in there.
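The number-to-name step Mayer describes can be pictured as a lookup followed by a tally. What follows is only a minimal sketch, not the Stanford team's code: the directory entries, phone numbers, category labels, and the `categorize_calls` helper are all invented for illustration, standing in for whatever public listings a researcher might consult.

```python
from collections import Counter

# Hypothetical directory. The interview mentions public sources such as
# Yelp, Google, and Facebook; every entry below is invented.
DIRECTORY = {
    "+15550100": ("Example Pharmacy", "health"),
    "+15550101": ("Example Credit Union", "financial"),
    "+15550102": ("Example Law Office", "legal"),
}


def categorize_calls(call_log, directory=DIRECTORY):
    """Count calls per category; numbers not in the directory count as 'unknown'."""
    counts = Counter()
    for record in call_log:
        _name, category = directory.get(record["to"], (None, "unknown"))
        counts[category] += 1
    return counts


# Metadata only: the number dialed and the duration, no call content.
call_log = [
    {"to": "+15550100", "duration_s": 120},
    {"to": "+15550100", "duration_s": 45},
    {"to": "+15550102", "duration_s": 600},
    {"to": "+15550199", "duration_s": 30},
]

print(categorize_calls(call_log))
```

Even in this toy form, the tally shows how records containing nothing but numbers and durations turn into a profile of the kinds of organizations a person contacts.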

BOB GARFIELD: The law permits the NSA to also track the contacts of the numbers that have been dialed, right? This is called a hop.

JONATHAN MAYER: That’s right. So, for example, if I called you, the NSA could make a hop over to your phone metadata. Under old rules, the NSA could follow three hops in phone metadata. Newer rules constrain the agency to two.

One of our findings was that two or three hops can be sufficient to reach a substantial proportion of the United States population, by virtue of very popular phone numbers. So, for example, you may have called FedEx. I certainly have called FedEx. That means we’re two hops away from each other.
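The hop idea can be illustrated as a breadth-first search over a graph in which each phone number is a node and each call is an edge. This is a sketch under stated assumptions, not a description of any agency's or the researchers' actual tooling: the names, the toy call list, and the `build_graph` and `within_hops` helpers are hypothetical, and calls are treated as undirected links.

```python
from collections import defaultdict, deque


def build_graph(calls):
    """Treat each call as an undirected link between two numbers."""
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each reachable number to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # hop limit reached; do not expand further
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    distance.pop(start)  # the starting number is not a "hop"
    return distance


# Invented call records: two people who never called each other
# but both called the same popular number.
calls = [
    ("mayer", "fedex"),
    ("garfield", "fedex"),
    ("garfield", "friend"),
]
graph = build_graph(calls)

print(within_hops(graph, "mayer", 2))
print(within_hops(graph, "mayer", 3))
```

With a two-hop limit, the search starting from "mayer" reaches "garfield" through the shared "fedex" node; raising the limit to three also pulls in "friend". A single very popular number links many otherwise unrelated callers in the same way.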

BOB GARFIELD: I want to return to the idea of what inferences the intelligence agencies can actually draw. You began with simply trying to figure out people's romantic relationships. You were also able to predict religious affiliation, financial problems, legal issues, a lot.

JONATHAN MAYER: That's right. A participant in a study who owns a specific brand and model of assault rifle, that was identifiable based on a pattern of calls to local firearm dealers and national customer service lines for a particular firearm manufacturer.

BOB GARFIELD: You found at least one criminal.

JONATHAN MAYER: A participant in our study placed calls to a lumber yard and a locksmith, local hydroponics dealers and a bong shop. The inference, I think, is pretty obvious.

BOB GARFIELD: This person was setting up a marijuana grow house, as far as you can tell. And someone else, someone else ended a pregnancy.

JONATHAN MAYER: That was a, a plausible inference from someone who had a lengthy early morning phone call with someone we were able to identify as her sister, followed by some calls with the local Planned Parenthood location, and then some calls a couple of weeks later, and then a call a month after that. That pattern of calls certainly gives rise to, to an inference.

BOB GARFIELD: In terms of privacy of the individual ordinary American, I guess it almost doesn't matter whether the inferences made from this metadata are accurate or inaccurate.

JONATHAN MAYER: I would agree. Plausible inferences drawn from data, even inaccurate ones, can have a great impact. For example, in the consumer privacy space, one great concern is false inferences getting drawn that might lead to a bad credit score. In fact, we hear about that all the time in the news. Information derived from phone metadata could be used to build a case against an individual to justify later searches of an individual’s communications or even their home or physical possessions.

BOB GARFIELD: Do you have any reason to believe that the NSA is, in fact, using the metadata to make educated guesses about individual behavior? Or is this just a researcher’s parlor game?

JONATHAN MAYER: [LAUGHS] Ha. We can't be certain, of course, of what individual NSA analysts do with the information at their disposal. However, I'm not sure it really matters. The agency has tried to defend its practices on the basis that these programs don't allow for very sensitive inferences about ordinary Americans. That claim is contradicted by the science. Metadata appears to be highly sensitive, and certainly privacy risks arise from the availability of this sort of information, whether or not it's actually used that way.

BOB GARFIELD: And if they weren’t doing it yesterday, in all probability, thanks to you, [LAUGHS] they will be doing it tomorrow.

JONATHAN MAYER: Ha [LAUGHS], I, I hope not.

BOB GARFIELD: Jonathan, thank you so much.

JONATHAN MAYER: Thank you.

BOB GARFIELD: Jonathan Mayer is a PhD candidate in computer science at Stanford University. His metadata research is a project of the Stanford Security Lab.

Hosted by Bob Garfield

Produced by WNYC Studios