Nate Silver of 538 deserves praise for calling the election in all 50 states, using a fairly simple statistical model and lots of due diligence on the polling data. When the dust settles, I’ll be interested to see a more detailed objective evaluation of the forecast (e.g., some measure of skill, like likelihoods).
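To make that concrete, here's a minimal sketch of the kind of scoring I have in mind, using the Brier score and the log-likelihood of probabilistic state calls against the outcomes. The probabilities in the snippet are invented for illustration only; they are not the actual 538 numbers.

```python
# Toy scoring of a probabilistic election forecast (illustrative numbers only).
import math

forecast = [  # (p, y): p = forecast probability that candidate A wins a state, y = 1 if A won
    (0.95, 1), (0.80, 1), (0.60, 1), (0.50, 0), (0.30, 0), (0.10, 0),
]

n = len(forecast)
brier = sum((p - y) ** 2 for p, y in forecast) / n               # 0 = perfect
log_lik = sum(math.log(p if y else 1 - p) for p, y in forecast)  # higher = better

# Benchmark: an uninformative pundit who calls every state a tossup.
brier_coin = sum((0.5 - y) ** 2 for _, y in forecast) / n
log_lik_coin = n * math.log(0.5)

print(f"forecast:  Brier = {brier:.3f}, log-likelihood = {log_lik:.2f}")
print(f"coin flip: Brier = {brier_coin:.3f}, log-likelihood = {log_lik_coin:.2f}")
```

The coin-flip baseline is roughly what "too close to call" punditry amounts to; a skillful forecast should beat it on both measures.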
Many have noted that his approach stands in stark contrast to big-ego punditry:
- The Daily Show (source of the title)
- Paul Krugman
- RealClimate (don’t miss the comic)
- The Baseline Scenario framed the problem nicely before the election.
Another impressive model-based forecasting performance occurred just days before the election, with successful prediction of Hurricane Sandy’s turn to landfall on the East Coast, almost a week in advance.
On October 22, you blogged that there was a possibility it could hit the East Coast. How did you know that?
There are a few rather reliable global models. They're models that run all the time, all year long, so they don't focus on any one storm. They run for the entire globe, not just for North America. There are two types of runs these models can be configured to do. One is called a deterministic run, and that's where you get one forecast scenario. The other mode, and I think this is much more useful, especially at longer ranges where things become much more uncertain, is ensemble, where 20 or 40 or 50 runs can be done. They are not run at as high a resolution as the deterministic run, otherwise it would take forever, but it's still incredibly helpful to look at 20 runs.
Because you have variation? Do the ensemble runs include different winds, currents, and temperatures?
You can tweak all sorts of things to initialize the various ensemble members: the initial conditions, the inner workings of the model itself, etc. The idea is to account for observational error, model error, and other sources of uncertainty. So you come up with 20-plus different ways to initialize the model and then let it run out in time. And then, given the very realistic spread of options, 15 of those ensemble members all recurve the storm back to the west when it reaches the East Coast, and only five of them take it northeast. That certainly has some information content. And then, one run after the next, you can watch those. If all of the ensemble members start taking the same track, it doesn't necessarily make them right, but it does mean the forecast is more likely to be right. You have much more confidence forecasting a track if the model guidance is in good agreement. If it's a 50/50 split, that's a tough call.
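The ensemble logic is easy to illustrate with a toy. The sketch below (Python; the "dynamics," perturbation sizes, and member count are all invented for illustration and bear no resemblance to a real global model) perturbs the initial conditions of a crude trajectory model, runs each member forward, and counts how many recurve west, which is the kind of tally described above.

```python
# Toy ensemble forecast: perturb initial conditions, run each member forward,
# and count how many take each track. The "dynamics" here are invented for
# illustration only; real global models solve the primitive equations on a 3-D grid.
import random

def run_member(x0, y0, steer, steps=48, dt=1.0):
    """Advance a storm-center position under a crude steering flow.

    Westward steering strengthens as the storm moves north, mimicking a
    blocking pattern that can recurve a storm back toward the coast.
    """
    x, y = x0, y0
    for _ in range(steps):
        u = steer - 0.03 * y   # east-west speed: turns westward as y grows
        v = 0.5                # steady northward drift
        x += u * dt
        y += v * dt
    return x, y

random.seed(0)
n_members = 20
west_count = 0
for _ in range(n_members):
    # Perturb initial position and steering strength to represent
    # observational and model uncertainty.
    x0 = random.gauss(0.0, 0.5)
    y0 = random.gauss(0.0, 0.5)
    steer = random.gauss(0.3, 0.15)
    x_final, _ = run_member(x0, y0, steer)
    if x_final < 0:            # ended up west of its starting longitude
        west_count += 1

print(f"{west_count}/{n_members} members recurve west; "
      f"ensemble probability = {west_count / n_members:.0%}")
```

When most members land on the same track, the forecaster gains confidence; a roughly even split is the "tough call" case.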
It’s not just ego. It’s marketing. If a pundit can dismiss data in a way that appears believable to at least part of the masses, then you can market whatever idea you want. People like to be able to say they picked the winner. I think pundits systematically dismissed the forecasts in order to make the race look as close as possible.
It’s of course impossible to verify, but I bet that if pundits had conceded Obama was likely to win, it wouldn’t have been anywhere near as “close” as it ended up being. Perhaps there is a study of whether people are more likely to vote when they see a 50-50 shot rather than a 10-90 shot. The study below looks like it may control for the effects of “wishful thinking,” though I do not have access to it.
http://poq.oxfordjournals.org/content/76/3/538.abstract
All hail Nate Silver! Mathiness over truthiness!
I am deeply amused by his new celebrity status over faithfully crunching numbers. When was the last time that happened?
On the subject of forecasting fun, I also liked this piece.
http://m.npr.org/news/front/164711093
Bruce – Making stuff up is also vastly cheaper and easier than processing large amounts of data, so if it works, it helps both sides of the bottom line.
Cherilyn – Interesting. I wonder if Lichtman’s forecasts have been formally published in advance? Anyway, nice to have a moment when the press takes note of the tide of BS.