I’ve been hard on social science, even suggesting that “social science” is an oxymoron. Social science has enormous potential, however, especially when it combines rigorous empiricism with a resistance to absolute answers.
The work of Philip Tetlock possesses these qualities, and it addresses a fundamental question: How predictable are social events? His new book Superforecasting: The Art and Science of Prediction, co-written with journalist Dan Gardner, shows that some ordinary people can forecast events more accurately than many so-called experts. Tetlock, a social psychologist at the University of Pennsylvania, visited Stevens on October 14 to talk about Superforecasting. He also responded to a few questions about his research.
Horgan: You’re renowned for showing in your 2005 book Expert Political Judgment how hard it is to predict social phenomena. And yet your new book is much more optimistic about the possibility of accurate prediction. Is there anything in your first book that you take back?
Tetlock: Nothing springs to mind. The contradictions are, in my view, more apparent than real. There are two big geopolitical forecasting-tournament data sets: one linked to Expert Political Judgment, summarizing tournaments that ran from 1985 to 2002, and the other linked to GJP (the Good Judgment Project), otherwise known as the IARPA (Intelligence Advanced Research Projects Activity) tournament, which ran from 2011 to 2015.
There are, of course, important similarities. Both tournaments pose questions about possible futures well specified enough to pass the clairvoyance test. And they ask forecasters to make judgments along probability scales.
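The interview does not name the scoring rule, but tournaments of this kind are standardly scored with the Brier score: the mean squared error between probability forecasts and binary outcomes. This is a minimal illustrative sketch (the function name and the example numbers are mine, not from the tournaments):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and binary outcomes.

    0.0 is a perfect score; a constant 50% forecast earns 0.25; 1.0 is worst.
    """
    assert len(forecasts) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A forecaster who said 80% on two events that happened
# and 30% on one that did not:
score = brier_score([0.8, 0.8, 0.3], [1, 1, 0])
# (0.04 + 0.04 + 0.09) / 3 ≈ 0.057 — much better than the 0.25 of a coin-flipper
```

Lower scores reward both being right and being confident only when right, which is why the probability scale matters: a hedged 60% is scored differently from a brash 95%.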
But there are big differences — and these differences account for the different findings and emphases in interpretation. (Was it Heisenberg who said: “We know nature only as it is exposed to our methods of questioning”? Regardless, that truism is certainly true of forecasting tournaments.)
The cumulative effect of all these differences was that there were more opportunities and incentives for forecasters to shine in the later work than in the earlier work. Consider this list of differences:
(1) the shortest questions in the earlier work (which asked people to look out about one year) were longer than all but the very longest questions in the later work (the vast majority of questions posed to superforecasters required looking out several months, but less than a year);
(2) forecasters in the earlier work wanted anonymity whereas forecasters in the later work wanted to be recognized on leaderboards;
(3) forecasters in the earlier work rarely had opportunities to update their beliefs whereas forecasters in the later work were strongly encouraged to update their probability estimates as often as they felt the news warranted.
Put differently, the much more publicly competitive nature of the IARPA tournaments pressures people to be more open-minded, to be foxier, than they normally are (more so than the Expert Political Judgment tournaments did), because those tournaments raise the reputational risks of closed-mindedness.
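The belief updating described in point (3) is often framed in Bayesian terms: revise a probability by the strength of each piece of news. A minimal sketch using the odds form of Bayes' rule (the function name and numbers are illustrative assumptions, not from the tournaments):

```python
def bayes_update(prior, likelihood_ratio):
    """Update a probability given a likelihood ratio for new evidence,
    i.e. P(evidence | event) / P(evidence | no event), via the odds
    form of Bayes' rule: posterior odds = prior odds * likelihood ratio.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# A forecaster at 30% reads news twice as likely if the event is coming:
p = bayes_update(0.30, 2.0)  # odds 3:7 -> 6:7, i.e. about 0.46
```

Frequent small updates of this kind, rather than rare dramatic reversals, are exactly what a leaderboard-driven tournament makes it cheap and rewarding to do.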
I suppose that is why people who have read both Expert Political Judgment and Superforecasting see the latter book as more upbeat, more about lighting candles than cursing the darkness. That is probably a pretty fair assessment. Deep down, I see the two books as complementary, not contradictory.
Horgan: You have discovered that certain people possess traits that make them “superforecasters,” people who are much better than average at predicting social events. Can these traits be automated, that is, codified in algorithms?
Tetlock: We describe in the book an opportunity to discuss this problem with David Ferrucci, who led the team that created Watson (the artificial-intelligence system that became world Jeopardy! champion). He agreed, for instance, that Watson would have little difficulty answering a question like: Which two Russian leaders traded jobs in the last five years? But he noted it would be quite another matter to answer the question: Will those same Russian leaders change jobs in the next five years? The second question is one that superforecasters would find pretty easy (I think) but that no artificial-intelligence system on the planet today could field in a compelling way.

Why is the second question so much more difficult than the first? Because answering it requires a somewhat intricate causal model of the Russian political system, of the personalities involved, and of the evolving threats and opportunities they are likely to confront. It is not “just” a matter of scanning a massive database and triangulating in on the most plausible Bayesian-estimated answer. I put scare quotes around “just” because I do not in any way want to trivialize what an extraordinary achievement Watson is.
Horgan: Are you a believer in the power of Big Data to revolutionize the social sciences? Will social science ever be as precise and rigorous as physics?
Tetlock: I’m not sure about “revolutionizing” social science, but Big Data will clearly make it possible to answer many categories of questions that were previously unanswerable. We now have massive databases on interpersonal relations (e.g. Facebook), search behavior (Google), consumer behavior (seemingly everywhere). Tangentially: Companies routinely do things to all of us that the human subjects review boards at universities would categorize as unconscionably unethical. Either university review boards are ridiculously hypersensitive or Big Data firms are ridiculously insensitive. I think it is a mix.
Horgan: Would you describe yourself as an optimist or pessimist about the prospects for humanity?
Tetlock: I suppose I would use the term used in Superforecasting: a cautious optimist.
John Horgan directs the Center for Science Writings, which is part of the College of Arts & Letters. A longer version of this Q&A was published on his ScientificAmerican.com blog, “Cross-check.”