I am a big fan of Philip Tetlock. From a recent article in The National Interest:
What experts think—where they fall along the Left-Right spectrum—is a weak predictor of accuracy. But how experts think is a surprisingly consistent predictor. Relative to foxes who are less encumbered by loyalties to an all-encompassing worldview, hedgehogs offer bolder forecasts and, although they hit occasional grand slams, they strike out a lot and wind up with decidedly poorer batting averages.
Readers now know my biases: a deep belief in the need for independent, objective scoring rules for gauging expert accuracy and a deep skepticism of big-idea hedgehogs. …
The best thing I can say for the superpundit model is likely to annoy virtually all of that ilk: they look a lot better when we blend them into a superpundit composite. Aggregation helps. As financial journalist James Surowiecki stressed in his insightful book The Wisdom of Crowds, if you average the predictions of many pundits, that average will typically outperform the individual predictions of the pundits from whom the averages were derived. This might sound magical, but averaging works when two fairly easily satisfied conditions are met: (1) the experts are mostly wrong, but they are wrong in different ways that tend to cancel out when you average; (2) the experts are right about some things, but they are right in partly overlapping ways that are amplified by averaging. Averaging improves the signal-to-noise ratio in a very noisy world. If you doubt this, try this demonstration. Ask several dozen of your coworkers to estimate the value of a large jar of coins. When my classes do this exercise, the average guess is closer to the truth than 80 or 90 percent of the individual guesses. …
Both private- and public-sector prognosticators must master the same tightrope-walking act. They know they need to sound as though they are offering bold, fresh insights into the future not readily available off the street. And they know they cannot afford to be linked to flat-out mistakes. Accordingly, they have to appear to be going out on a limb without actually going out on one. That is why (with the interesting exception of Bueno de Mesquita), they so uniformly appear to dislike affixing “artificially precise” subjective probability estimates to possible outcomes—the only reliable method we have of systematically tracking accuracy across pundits, methods, time and contexts. It is much safer to retreat into the vague language of possibilities and plausibilities—things that might or could happen if various difficult-to-determine preconditions were satisfied. The trick is to attach so many qualifiers to your vague predictions that you will be well positioned to explain pretty much whatever happens. China will fissure into regional fiefdoms, but only if the Chinese leadership fails to manage certain trade-offs deftly, and only if global economic growth stalls for a protracted period, and only if . . . And if you venture specific policy recommendations—such as invading Iraq or deregulating financial markets—make sure to leave yourself the fallback position: “Well, of course, my recommendation was fundamentally sound, but how was I to know that the idiots in charge would implement things so badly? If only they had . . . ”
Having mastered this subtle balancing act, why should these private- and public-sector pundits open their reputations and livelihoods to the unpredictable risks of competing against each other in level-playing-field forecasting exercises? Why stray from the cozy vagueness-zone equilibrium, which seems to be working well enough for the providers of forecasting services, even though the societal outcome is decidedly suboptimal?
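Two of the ideas in the excerpt lend themselves to a quick sketch: the jar-of-coins averaging demonstration, and scoring probabilistic forecasts against outcomes (the Brier score shown here is one standard scoring rule of the kind Tetlock advocates, though the excerpt does not name it). All the numbers below are hypothetical, not data from Tetlock's classes.

```python
import random

random.seed(0)

# --- Part 1: the jar-of-coins demonstration ---
TRUE_VALUE = 500.0   # hypothetical value of the jar
N_GUESSERS = 50

# Each guess is the truth times an independent noisy factor, so the
# guessers are individually wrong in different, partly canceling ways.
guesses = [TRUE_VALUE * random.uniform(0.5, 1.5) for _ in range(N_GUESSERS)]
crowd_average = sum(guesses) / len(guesses)
avg_error = abs(crowd_average - TRUE_VALUE)

# Count how many individuals were closer to the truth than the average.
beat_average = sum(1 for g in guesses if abs(g - TRUE_VALUE) < avg_error)
print(f"crowd average {crowd_average:.1f} (error {avg_error:.1f}); "
      f"{beat_average}/{N_GUESSERS} individuals beat it")

# --- Part 2: a scoring rule for probabilistic forecasts ---
def brier_score(forecasts, outcomes):
    """Mean squared distance between stated probabilities and what
    actually happened (1 = event occurred, 0 = it did not).
    Lower is better; always saying 0.5 scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A bold forecaster vs. a hedged one, on the same five made-up events,
# of which the first three occurred.
outcomes = [1, 1, 1, 0, 0]
bold     = [0.9, 0.9, 0.1, 0.1, 0.1]   # confident, once badly wrong
cautious = [0.6, 0.6, 0.6, 0.4, 0.4]   # vague everywhere
print("bold:", brier_score(bold, outcomes),
      "cautious:", brier_score(cautious, outcomes))
```

The simulation captures the excerpt's point: the averaged guess typically beats most individual guessers because their independent errors partly cancel. And once forecasters state actual probabilities, a scoring rule like this makes their track records comparable, which is exactly the exposure the quoted passage says pundits prefer to avoid.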
(HT: Crooked Timber)