Posted by stannenb on Mar 4, 2014.

Big Data: The New Extractive Industry

"The business model of the internet is surveillance." – Bruce Schneier

"Data and data analytics are a powerful new fuel of the American economy." – Secretary of Commerce Penny Pritzker

"Assume you have collected data" is an inauspicious way to start a privacy workshop. But that's how Monday's Big Data Privacy Workshop, organized by MIT and the White House, began. And like extractive industries that take natural resources from the ground, the emerging big data industry is more interested in preserving value than in the condition in which it leaves those from whom that value has been extracted.

The workshop, intended to "advanc[e] the state of the art in technology and practice," is part of a 90-day review announced by President Obama in response to Edward Snowden's disclosures of the data collection practices of the National Security Agency (NSA). It was a full-court press by the administration, featuring Counselor to the President John Podesta and Secretary of Commerce Penny Pritzker. MIT used it as an opportunity to showcase its research on maintaining the security of already-collected information. But privacy interests start prior to the collection of data. On that point, this workshop was virtually silent.

A morning panel discussion of "Big Data Opportunities and Challenges" was long on opportunities and short on challenges. Professor Sam Madden, one of MIT's rising stars in Big Data, spoke about how data from mobile phones could be used to detect risky driving. Associate Professor Manolis Kellis explained the need to access large amounts of data to help understand the interaction between the human genome and disease processes. Professor John Guttag talked about privacy as an obstacle preventing medical researchers from saving lives lost to hospital-acquired infections.

"It is difficult to get a man to understand something, when his salary depends on his not understanding it." – Upton Sinclair

When an MIT expert asserts that privacy of medical records is keeping him from saving lives, one should take notice. But Guttag is not just a disinterested scholar. As his web site says:

Professor Guttag has had long-term consulting relationships with a number of industrial research and advanced development organizations. He has also worked for many years as a consultant specializing in the analysis of information systems related business opportunities and risks. He currently serves on the technical advisory boards of Vanu, Inc., on the Board of Directors of Empirix, Inc.

Similarly, Madden says:

I am the Chief Scientist of Cambridge Mobile Telematics, a Cambridge, MA-based startup that specializes in processing mobile sensor data for telematics, usage-based insurance (UBI), and other applications.

Michael Stonebraker, an Adjunct Professor whom one industry publication labeled the "forefather of Big Data", has interests in at least three different companies in the database industry. Yet the only participant to note his conflict of interest was Daniel Weitzner, who said that he has an interest in a company seeking to commercialize his research. It's not a coincidence that this disclosure came from the only MIT participant who was trained as a lawyer and whose professional experience includes the government and public interest groups.

The MIT sponsor for the event, the Computer Science and Artificial Intelligence Lab's Big Data Initiative, is sponsored by AIG, Alior Bank, British Telecommunications (BT), EMC, Facebook, Huawei, Intel, Microsoft, Quanta, Samsung, SAP, Shell, and Thomson Reuters. No privacy, consumer, or civil liberties groups appear to be involved. In watching the emergence of this academic/industry partnership, we are seeing the seeds of a data industrial complex, and its growing influence. This is an extractive industry, mining our personal information to create wealth, wealth that rarely, if ever, trickles back to those from whom it was extracted. And like the traditional extractive industries, it is supported by our taxes, in this case through a public agency, the Massachusetts Technology Collaborative, which sponsors hack/reduce, an organization whose mission is to "create the talent and the technologies that will shape our future in a big data-driven economy." It's not clear why an industry said to have so much promise, and an organization like hack/reduce sponsored by multiple venture capital firms, as well as Google, IBM, and Microsoft, requires taxpayer support.

It was not until the last panel that the voices of public interest were heard. Latanya Sweeney, a pioneering researcher on how anonymous data isn't really anonymous and now the Chief Technology Officer of the Federal Trade Commission, was the first panelist to speak specifically of harms, of the damage that disclosure of data causes. To that end, she showed how individuals' medical data spread through the entities involved in medical care and billing. Carol Rose, of the ACLU, talked about the intimate nature of some of the data being discussed, noting that location data from cell phones disclose things like affairs, substance abuse treatment, or abortions. But the majority of the panel were advocates for data, dismissing concerns almost as if they were quaint anachronisms.

As a final mark of the limitations of this workshop, one only has to look at the participation of the NSA's Director of Compliance John DeLong, the man in charge when the NSA's largest data breach, Snowden's leaks, took place. Relegated to the last panel, a hypothetical case study of big data usage, DeLong barely spoke and wasn't asked any questions about the NSA. It was not a big deal, DeLong told this reporter; he's often in Cambridge as part of his participation in the MIT Big Data Initiative.

There is no doubt that insight into the human condition through innovative analysis of data will improve the quality of life for some. But there is also no doubt that the improvement won't be equally distributed, and that there is a real risk of making life worse for others. Wealthy people already have privacy, Sweeney noted, and those who require government assistance for food, housing, or medical care have surrendered considerable privacy just to get that assistance. Perhaps it was too much to hope that members of the growing data industrial complex would grapple with the less savory impact of their work, or that a political establishment beholden to large corporations would structure a process that placed people first. But it shouldn't have been too much to hope that MIT, one of whose earlier presidents – Jerome Wiesner – was placed on Richard Nixon's enemies list for opposing government-led technology, could have found its own social justice advocates to widen this discussion. Perhaps Noam Chomsky, whose views broadened a previous MIT Big Data conference, was otherwise engaged.

Image of medical data flows from Latanya Sweeney's "Data Map".

