Not all models are wrong.

Box’s famous comment, that “all models are wrong,” gets repeated ad nauseam (even by me). I think it’s essential to be aware of this in the sloppy sciences, but it does a disservice to modeling and simulation in general.

As far as I’m concerned, a lot of models are basically right. I recently worked with some kids on an air track experiment in physics. We timed a sled released from various heights and plotted the data. Then we used a quadratic fit, based on a simple dynamic model of constant acceleration, to predict the next point. We were within a hundredth of a second, as confirmed by video analysis.
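
Here’s a minimal sketch of that kind of exercise in Python, with made-up numbers standing in for the classroom data. Under constant acceleration, position is quadratic in time, so an ordinary least-squares quadratic fit recovers the dynamics and extrapolates to the next point:

```python
import numpy as np

# Illustrative data only, not the actual classroom measurements.
# Constant-acceleration model: x(t) = x0 + v0*t + 0.5*a*t**2,
# so position is quadratic in time.
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])            # timing-gate readings, s
x = np.array([0.000, 0.031, 0.126, 0.282, 0.500])  # sled position, m

coeffs = np.polyfit(t, x, 2)  # least squares: coeffs = [0.5*a, v0, x0]
a = 2 * coeffs[0]
print(f"estimated acceleration: {a:.3f} m/s^2")

# Extrapolate to the next point and compare with the measurement.
print(f"predicted x at t = 2.5 s: {np.polyval(coeffs, 2.5):.3f} m")
```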

Sure, we omitted lots of things, notably air resistance and relativity. But so what? There’s no useful sense in which the model was “wrong,” anywhere near the conditions of the experiment. (Not surprisingly, you can find a few cranks who contest Newton’s laws anyway.)

I think a lot of uncertain phenomena in social sciences operate on a backbone of the same kind of “physics.” The future behavior of the government is quite unpredictable, but there isn’t much uncertainty about accounting, e.g., that increasing the deficit increases the debt.
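
The deficit-debt relationship is just a stock and its flow; a trivial sketch (all numbers hypothetical):

```python
# Debt is the accumulation (integral) of the deficit.
debt = 100.0                       # initial debt, $B
deficits = [5.0, 7.0, 3.0, 8.0]    # annual deficits, $B

for d in deficits:
    debt += d                      # dDebt/dt = deficit, integrated annually
print(f"debt after {len(deficits)} years: {debt}")  # 123.0
```

Larger deficits can only mean larger debt; that part of the future is accounting, not forecasting.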

The domain of wrong but useful models remains large (within an even bigger sea of simple ignorance), but I think more and more things are falling into the category of models that are basically right. The trick is to be able to spot the difference. Some people clearly can’t:

A&G provide no formal method to distinguish between situations in which models yield useful or spurious forecasts. In an earlier paper, they claimed rather broadly,

‘To our knowledge, there is no empirical evidence to suggest that presenting opinions in mathematical terms rather than in words will contribute to forecast accuracy.’ (page 1002)

This statement may be true in some settings, but obviously not in general. There are many situations in which mathematical models have good predictive power and outperform informal judgments by a wide margin.

I wonder how well one could do with verbal predictions of a simple physical system. Score one for the models.

All data are wrong!

Simple descriptions of the Scientific Method typically run like this:

  • Collect data
  • Look for patterns
  • Form hypotheses
  • Gather more data
  • Weed out the hypotheses that don’t fit the data
  • Whatever survives is the truth

There’s obviously more to it than that, but every popular description I’ve seen leaves out one crucial aspect. Frequently, when the hypothesis doesn’t fit the data, it’s the data that’s wrong. This is not an invitation to cherry-pick your data; it’s just recognition of a basic problem, particularly in social and business systems.

Any time you are building an integrated systems model, it’s likely that you will have to rely on data from a variety of sources, with differences in granularity, time horizons, and interpretation. Those data streams have probably never been combined before, and therefore they haven’t been properly vetted. They’re almost certain to have problems. If you’re only looking for problems with your hypothesis, you’re at risk of throwing the good model baby out with the bad data bathwater.

The underlying insight is that data are not really distinct from models; they come from processes that are full of implicit models. Even “simple” measurements like temperature are really complex and assumption-laden, but at least we can easily calibrate thermometers and agree on the definition and scale of Kelvin. This is not always the case for organizational data.
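
To make the point concrete: even a thermometer reading arrives through a calibration model. A thermistor, for instance, reports resistance, which becomes a temperature only via the Steinhart-Hart equation. The coefficients below are typical illustrative values for a 10 kΩ device, not from any particular instrument:

```python
import math

# Steinhart-Hart calibration model: 1/T = A + B*ln(R) + C*ln(R)**3
def thermistor_kelvin(resistance_ohms, A=1.129e-3, B=2.341e-4, C=8.775e-8):
    ln_r = math.log(resistance_ohms)
    return 1.0 / (A + B * ln_r + C * ln_r**3)

print(thermistor_kelvin(10_000.0))  # ~298 K (25 °C) at the nominal point
```

The “measurement” is only as good as the implicit model and its coefficients.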

A winning approach, therefore, is to pursue every lead:

  • Is the model wrong?
    • Does it pass or fail extreme conditions tests, conservation laws, and other reality checks?
    • How, exactly, does it miss the data systematically? (See the sketch after this list.)
    • What feedbacks might explain the shortcomings?
  • Is the data wrong?
    • Do sources agree?
    • Does it mean what people think it means?
    • Are temporal patterns dynamically plausible?
  • If the model doesn’t fit the data, which is to blame?
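
As a concrete instance of the “is the model wrong?” branch, here’s a minimal residual check; model and data are hypothetical stand-ins for simulated output and measurements on a common time grid:

```python
import numpy as np

model = np.array([1.0, 1.9, 3.2, 4.1, 5.3, 5.9])  # simulated output
data  = np.array([1.1, 2.0, 3.0, 4.4, 5.0, 6.2])  # measurements

resid = data - model
print("mean error (bias):", resid.mean())

# Lag-1 autocorrelation of residuals: a large value suggests a systematic
# structural miss rather than random measurement noise.
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print("residual autocorrelation:", r1)
```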

When you’re building a systems model, it’s likely that you’re a pioneer in uncharted territory, and therefore you’ll learn something new and valuable either way.

Are all models wrong?

Artem Kaznatcheev considers whether Box’s slogan, “all models are wrong,” should be framed as an empirical question.

Building on the theme of no unnecessary assumptions about the world, @BlackBrane suggested … a position I had not considered before … entertaining the possibility of a mathematical universe:

[Box’s slogan is] an affirmative statement about Nature that might in fact not be true. Who’s to say that at the end of the day, Nature might not correspond exactly to some mathematical structure? I think the claim is sometimes guilty of exactly what it tries to oppose, namely unjustifiable claims to absolute truth.

I suspect that we won’t learn the answer, at least in my lifetime.

In a sense, the appropriate answer is “who cares?” Whether or not there can in principle be perfect models, the real problem is finding ones that are useful in practice. The slogan isn’t helpful for this. (NIPCC authors seem utterly clueless as well.)

In a related post, AK identifies a 3-part typology of models that suggests a basis for guidance:

  • “Insilications – In physics, we are used to mathematical models that correspond closely to reality. All of the unknown or system dependent parameters are related to things we can measure, and the model is then used to compute dynamics, and predict the future value of these parameters. …
  • Heuristics – … When George Box wrote that “all models are wrong, but some are useful”, I think this is the type of models he was talking about. It is standard to lie, cheat, and steal when you build these sort of models. The assumptions need not be empirically testable (or even remotely true, at times), and statistics and calculations can be used to varying degree of accuracy or rigor. … A theorist builds up a collection of such models (or fables) that they can use as theoretical case studies, and a way to express their ideas. It also allows for a way to turn verbal theories into more formal ones that can be tested for basic consistency. …
  • Abstractions – … These are the models that are most common in mathematics and theoretical computer science. They have some overlap with analytic heuristics, except are done more rigorously and not with the goal of collecting a bouquet of useful analogies or case studies, but of general statements. An abstraction is a model that is set up so that given any valid instantiation of its premises, the conclusions necessarily follow. …”

The social sciences are solidly in the heuristics realm, while a lot of science is in the insilication category. The difficulty is knowing where the boundary lies. Actually, I think it’s a continuum, not a set of categories. One can get some hint by looking at the problem context for models. For example:

| | Known state variables? | Reality checks (conservation laws, etc.)? | Data per concept? | Structural information from more granular observations or models? | Experiments? | Computation? |
|---|---|---|---|---|---|---|
| Physics | yes | lots | lots | yes | yes | often easy |
| Climate | yes | some | some | for many things | not at scale | limited |
| Economics | no | some | some | flaky microfoundations – often lacking or unused | not at scale | limited |

(Ironically, I’m implying a model here, which is probably wrong, but hopefully useful.)

A lot of our most interesting problems are currently at the heuristics end of the spectrum. Some may migrate toward better model performance, and others probably won’t – particularly models of decision processes that willfully ignore models.

Models and metaphors

My last post about metaphors ruffled a few feathers. I was a bit surprised, because I thought it was pretty obvious that metaphors, like models, have their limits.

The title was just a riff on the old George Box quote, “all models are wrong, some are useful.” People LOVE to throw that around. I once attended an annoying meeting where one person said it at least half a dozen times in the space of two hours. I heard it in three separate sessions at STIA (which was fine).

I get nervous when I hear, in close succession, about the limits of formal mathematical models and the glorious attributes of metaphors. Sure, a metaphor (using the term loosely, to include similes and analogies) can be an efficient vehicle for conveying meaning, and might lend itself to serving as an icon in some kind of visualization. But there are several possible failure modes:

  • The mapping of the metaphor from its literal domain to the concept of interest may be faulty (a bathtub vs. a true exponential decay process).
  • The point of the mapping may be missed. (If I compare my organization to the Three Little Pigs, does that mean I’ve built a house of brick, or that there are a lot of wolves out there, or we’re pigs, or … ?)
  • Listeners may get the point, but draw unintended policy conclusions. (Do black swans mean I’m not responsible for disasters, or that I should have been more prepared for outliers?)

These are not all that different from problems with models, which shouldn’t really come as a surprise, because a model is just a special kind of metaphor – a mapping from an abstract domain (a set of equations) to a situation of interest – and neither a model nor a metaphor is the real system.

Models and other metaphors have distinct strengths and weaknesses, though. Metaphors are efficient, cheap, and speak to people in natural language. They can nicely combine system structure and behavior. But that comes at a price of ambiguity. A formal model is unambiguous, and therefore easy to test, but potentially expensive to build and difficult to share with people who don’t speak math. The specificity of a model is powerful, but also opens up opportunities for completely missing the point (e.g., building a great model of the physics of a situation when the crux of the problem is actually emotional).

I’m particularly interested in models for their unique ability to generate reliable predictions about behavior from structure and to facilitate comparison with data (using the term broadly, to include more than just the tiny subset of reality that’s available in time series). For example, if I argue that the number of Facebook accounts grows logistically, according to dx/dt = r*x*(k-x) for certain values of r, k, and x(0), we can agree on exactly what that means. Even better, we can estimate r and k from data, and then check later to verify that the model was correct. Try that with “all the world’s a stage.”
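
That logistic ODE has a closed-form solution, so the estimation step is straightforward with scipy. The observations below are invented for illustration, not actual Facebook data:

```python
import numpy as np
from scipy.optimize import curve_fit

# Closed-form solution of dx/dt = r*x*(k-x):
def logistic(t, r, k, x0):
    return k / (1 + (k - x0) / x0 * np.exp(-r * k * t))

t_obs = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])     # years (illustrative)
x_obs = np.array([1.0, 2.6, 6.0, 11.0, 15.5, 18.0])  # accounts, millions

(r, k, x0), _ = curve_fit(logistic, t_obs, x_obs, p0=[0.05, 20.0, 1.0])
print(f"r = {r:.3f}, k = {k:.1f} million")
# The falsifiable part: future data either track logistic(t, r, k, x0) or not.
```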

If you only have metaphors, you have to be content with not solving a certain class of problems. Consider climate change. I say it’s a bathtub, you say it’s a Random Walk Down Wall Street. To some extent, each is true, and each is false. But there’s simply no way to establish which processes dominate accumulation of heat and endogenous variability, or to predict the outcome of an experiment like doubling CO2, by verbal or visual analogy. It’s essential to introduce some math and data.
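
To see what the math buys, one can render both metaphors as minimal simulations and note that they make distinguishable predictions. Everything below is illustrative; nothing is calibrated to actual climate:

```python
import numpy as np

# Two metaphors made testable: heat as a first-order bathtub driven by
# forcing, vs. temperature as a pure random walk.
rng = np.random.default_rng(1)
n, dt = 200, 1.0
forcing = np.linspace(0.0, 2.0, n)  # steadily rising forcing

bathtub = np.zeros(n)
walk = np.zeros(n)
for i in range(1, n):
    # stock integrates net flow: dT/dt = (forcing - loss*T) / capacity
    bathtub[i] = bathtub[i-1] + dt * (forcing[i] - 0.5 * bathtub[i-1]) / 10.0
    walk[i] = walk[i-1] + rng.normal(0.0, 0.05)  # endogenous drift only

# One trajectory tracks the forcing; the other's variance grows without
# reference to it -- a difference data can adjudicate, and analogy can't.
print(bathtub[-1], walk[-1])
```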

Models alone won’t solve our problems either, because they don’t speak to enough people, and we don’t have models for the full range of human concerns. However, I’d argue that we’re already drowning in metaphors, including useless ones (like “the war on [insert favorite topic]”), and in dire need of models and model literacy to tackle our thornier problems.

Better Lies

Hoisted from the comments, Miles Parker has a nice reflection on modeling in this video, Why Model Reality.

It might be subtitled “Better Lies,” a reference to modeling as the pursuit of better stories about the world that are never quite true (a variation on the famous Box quote, “All models are wrong but some are useful”). A few nice points I picked out along the way:

  • All thinking, even about the future, is retrospective.
  • Big Data is Big Dumb, because we’re collecting more and more detail about a limited subset of reality, and thus suffer from sampling and “if your only tool is a hammer …” bias.
  • A crucial component of a modeling approach is a “bullshit detector” – reality checks that identify problems at various levels on the ladder of inference.
  • Model design is more than software engineering.
  • Often the modeling process is a source of key insights, and you don’t even need to run the model.
  • Modeling is a social process.

Coming back to the comment,

I think one of the greatest values of a model is that it can bring you to the point where you say “There isn’t any way to build a model within this methodology that is not self-contradicting. Therefore everyone in this room is contradicting themselves before they even open their mouths.”

I think that’s close to what Dana Meadows was talking about when she placed paradigms and transcendence of paradigms on the list of places to intervene in systems.

It reminds me of Gödel’s incompleteness theorems. With that as a model, I’d argue that one can construct fairly trivial models that aren’t self-contradictory. They might contradict a lot of things we think we know about the world, but by virtue of their limited expressiveness remain at least true to themselves.

Going back to the elasticity example, if I assert that oilConsumption = oilPrice^epsilon, there’s no internal contradiction as long as I use the same value of epsilon for each proposition I consider. I’m not even sure what an internal contradiction would look like in such a simple framework. However, I could come up with a long list of external consistency problems with the model: dimensional inconsistency, lack of dynamics, omission of unobserved structure, failure to conform to data ….
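
One of those external problems is mechanical to fix: writing the model in reference-point form restores dimensional consistency, though the other criticisms stand. The function and numbers below are hypothetical, for illustration only:

```python
# Reference-point form of the constant-elasticity model; dimensionally
# consistent because price enters only as a ratio.
def oil_consumption(price, ref_price=50.0, ref_consumption=90.0, epsilon=-0.2):
    """Consumption (Mb/d) at price ($/bbl); internally consistent as long
    as epsilon stays fixed across propositions."""
    return ref_consumption * (price / ref_price) ** epsilon

print(oil_consumption(100.0))  # still static: no dynamics, no feedback
```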

In the same way, I would tend to argue that general equilibrium is an internally consistent modeling paradigm that just happens to have relatively little to do with reality, yet is sometimes useful. I suppose that Frank Ackerman might disagree with me, on the grounds that equilibria are not necessarily unique or stable, which could raise an internal contradiction by violating the premise of the modeling exercise (welfare maximization).

Once you step beyond models with algorithmically simple decision making (like CGE), the plot thickens. There’s Condorcet’s paradox and Arrow’s impossibility theorem, the indeterminacy of Arthur’s El Farol bar problem, and paradoxes of zero discount rates on welfare.
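
Condorcet’s paradox, at least, fits in a few lines. With three voters holding the rotated rankings below, every pairwise majority vote is won 2-1, so the collective preference cycles:

```python
# Condorcet's paradox in miniature.
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]

def prefers(ballot, x, y):
    return ballot.index(x) < ballot.index(y)

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    wins = sum(prefers(b, x, y) for b in ballots)
    print(f"{x} beats {y}, {wins} votes to {3 - wins}")  # 2-1 each: a cycle
```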

It’s not clear to me that all interesting models of phenomena that give rise to self-contradictions must be self-contradicting though. For example, I suspect that Sterman & Wittenberg’s model of Kuhnian scientific paradigm succession is internally consistent.

Maybe the challenge is that the universe is self-referential and full of paradoxes and irreconcilable paradigms. Therefore as soon as we attempt to formalize our understanding of such a mess, either with nontrivial models, or trivial models assisting complex arguments, we are dragged into the quagmire of self-contradiction.

Personally, I’m not looking for the cellular automaton that runs the universe. I’m just hoping for a little feedback control on things that might make life on earth a little better. Maybe that’s a paradoxical quest in itself.