Earlier this week I went to an excellent discussion put on by danah boyd and her Data & Society Research Institute, entitled “Social, Cultural & Ethical Dimensions of ‘Big Data.’” Right off the top, I have to give major kudos to danah for organizing a fantastic panel that incorporated a great combination of voices – who, not for nothing (indeed, for a lot) were not just a bunch of white dudes (only one white dude, in fact) – from across different disciplines and perspectives. I’ll do a brief play-by-play to set the table for a couple of larger thoughts.
Following a rigorously on-message video from John Podesta and a fairly anodyne talk (well, except for this) from Nicole Wong of the White House Office of Science and Technology Policy, danah led off with introductory remarks and passed off to Anil Dash, who served excellently as moderator (mostly by staying out of the way, as he made a point of noting). Alondra Nelson from Columbia University was first up, giving an account by turns moving, terrifying, and engaging on the state of play and human consequences flowing from DNA databases – both those managed by law enforcement and the privately managed repositories that use loopholes to skirt privacy protections. She was followed by Shamina Singh from the MasterCard Center for Inclusive Growth, who provided several on-the-ground examples of working with governments, NGOs, and poor people to deliver social benefits more efficiently. In particular, she focused on a MasterCard program to provide direct transfers of cash to refugee populations, cutting out the vastly inefficient global aid infrastructure.
Singh was followed by Steven Hodas from the New York City Department of Education, who laid out an illuminating picture of the lifecycle of data in education systems and the ways in which private actors subvert and undermine public privacy, and who – not just a critic – offered a genuinely thought-provoking new way of thinking about how to regulate dissemination of private information. The excellent Kate Crawford batted cleanup, discussing predictive privacy harms and what she called “data due process.” Dash facilitated a very long and almost entirely productive audience question and discussion session (45 minutes, at least), and I left with many more things on my mind than I entered with. I’d had the privilege of listening to eight different speakers, each from a background subtly or radically different from the others. Not once did a speaker follow another just like them, and no small value came in the synthesis of those differing perspectives and those of the audience.
This week also saw the relaunch of FiveThirtyEight.com under its new ESPN/Disney instance. It was launched with a manifesto from founder Nate Silver, entitled “What the Fox Knows,” which is a bit meandering but generally comes down to positioning FiveThirtyEight in opposition to both traditional journalism and scientific research, based on some fairly blithe generalizations about those fields. What it doesn’t quite do, oddly for a manifesto, is state just what FiveThirtyEight is for, other than a sort of process and attitudinal approach. Marx (or even Levine/Locke/Searls/Weinberger) it ain’t.
Silver has come in for no small criticism, and not just from his normal antagonists. Emily Bell laid out the rather less-than-revolutionary staffing makeup of the current raft of new-media startups, led by Ezra Klein, Glenn Greenwald, and Silver. And Paul Krugman detailed some rather serious concerns about Silver’s approach:
you can’t be an effective fox just by letting the data speak for itself — because it never does. You use data to inform your analysis, you let it tell you that your pet hypothesis is wrong, but data are never a substitute for hard thinking. If you think the data are speaking for themselves, what you’re really doing is implicit theorizing, which is a really bad idea (because you can’t test your assumptions if you don’t even know what you’re assuming.)
These two critiques are not unrelated. Bell called out Silver for his desire for a “clubhouse,” and rightly so, because groupthink clubhouses – whether of insiders or outsiders – are the most fertile breeding grounds for implicit theorizing. Krugman revisited and expanded his critique, saying:
I hope that Nate Silver understands what it actually means to be a fox. The fox, according to Archilochus, knows many things. But he does know these things — he doesn’t approach each topic as a blank slate, or imagine that there are general-purpose data-analysis tools that absolve him from any need to understand the particular subject he’s tackling. Even the most basic question — where are the data I need? — often takes a fair bit of expertise.
Which brings me around to the beginning of this post. The value in Monday’s discussion flowed directly from both the diversity – in professional background, gender, ethnicity – and the expertise of the speakers present. They each spoke deeply from a particular perspective, and while “Big Data” was the through-line connecting them, the content that animated their discussion, approach, and theorizing was specific to their experience and expertise. The systems that create data have their own biases and agendas, which only discipline-specific knowledge can help untangle and correct for. There is still no Philosopher’s Stone, but base metals have their own stories. Knowing their essential properties isn’t easy or quick, but little that’s of real and lasting value is easy.