Who cares if your model is rubbish?

All models are wrong, so does it matter if your model is more wrong than necessary?

If you’re thinking “no,” you’re not alone (but you won’t like this blog). Some models are used successfully for pure propaganda. Like the Matrix, they can be convincing for those who don’t get too curious about what lies beneath. However, if that’s all we learn to do with models, we’re doomed. The real potential of models is for improving control, by making good contingent predictions of “what might happen if we do X.”

Doing the best possible job of that involves tradeoffs among scope, depth and quality, between formal and informal methods, and between time spent modeling and time spent on everything else – data collection, group process, and execution. It’s difficult to identify the best choices for a given messy situation.

You can hide ugly aspects of a model by embedding it in a fancy interface, or by showing confidence bounds on simulations without examining individual trajectories. But if you take the low (quality) road, you’re cheating yourself, your clients and the world out of a lot of good insight.


  • You’ll make bad decisions.
  • You won’t learn much, or at least you won’t learn much that’s right.
  • You’ll get into trouble when you attempt to reuse or extend the model later.
  • People will find out. Maybe. (Sadly, if the situation is complex enough, they won’t.) Eventually, this may affect your credibility.
  • You will get less recognition for your work. (Models that are too large, and insufficiently robust, are the primary failure mode in the papers I review.)
  • The process will destroy your soul, or at least your brain.

That last point is the dealbreaker for me. I’m into modeling for the occasional glimpses of truth and beauty. Without that, it’s no fun.

An Example

Suppose you’re modeling a city. You have sectors for residents (people), land use (commercial and residential buildings), business, education, transportation, utilities, etc. Individually, it’s “easy” to model these things. Some of the components, like population cohorts, have well-known solutions. There are lots of SD models of schools, hospitals, and other assets. I’ve never actually modeled a sewer system, but I bet I could do a credible job with a few stocks of treatment capacity, water-consuming capital that generates load, and so on.
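
The stock-and-flow logic of a sector like the population cohorts is easy to sketch outside any SD tool. Here’s a minimal Python sketch of a three-cohort aging chain, with entirely hypothetical rates and initial stocks, using the same simple Euler integration an SD package would apply:

```python
# Minimal sketch of a three-cohort population aging chain (all rates and
# initial stocks are hypothetical), integrated with Euler steps.
def simulate(years=50.0, dt=0.25):
    young, adult, senior = 20.0, 60.0, 20.0      # stocks, e.g. thousands of people
    births_per_adult = 0.02                      # fractional birth rate, 1/yr
    maturation_time, aging_time, life_left = 18.0, 47.0, 15.0  # years
    for _ in range(int(years / dt)):
        births = births_per_adult * adult
        maturing = young / maturation_time       # first-order outflows, so a
        aging = adult / aging_time               # stock can't be driven negative
        deaths = senior / life_left
        young += dt * (births - maturing)
        adult += dt * (maturing - aging)
        senior += dt * (aging - deaths)
    return young, adult, senior
```

Even a toy like this supports the basic quality checks discussed below: the units balance (people/yr flows against year time constants), and every outflow is first-order in its own stock.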

I see lots of models that stop here, making a few connections among sectors, but failing to enforce consistency. I’d describe these as “sectors flying in loose formation” rather than an integrated model. The problem is that real-world sectors, components, entities, or whatever you call them don’t have clean boundaries. They interact and overlap in many ways. What you need inside the model of a sector depends on what you need to connect it to, which depends in turn on what the model is for. The number of possible connections among components rises very quickly, in a classic combinatorial explosion. Just checking all the 2-way interactions requires N*(N-1) reality checks for N concepts.

This means that you can’t build a good demographic model in isolation. The people in the city’s age cohorts and the employees in the city’s businesses and government are the same people. The children in the younger cohorts are also the children in the education system. All of these things have to have internally consistent interactions. If you model a policy that attracts more business to the city, those businesses need employees, those employees need housing and utilities, and their children need education. The future attractiveness of the city to business will depend, among other things, on the skills created by that education.

As a result, building an integrated model of population, land use, business, education, transportation and utilities is radically different from, and more difficult than, building independent models of these sectors, due to the number of interactions. Whenever you omit or misrepresent one of the necessary integrating links, you bias the policy response of your model. The effect could be minor – you choose the wrong tax rate due to omitting the costs of in-migration, or it could be massive – you miss the nexus of reinforcing feedbacks that would allow you to create a second Silicon Valley.

If you have a model with 7 sectors, you have two problems. First, are your seven sectors reasonable? Do the units balance, are people conserved, is there first-order negative feedback on outflows? Second, even if each sector passes, you’re not done, because there are (potentially) 42 two-sector interactions to consider, 210 three-sector interactions, and so on. If you’re not spending the majority of your time testing the implications of Sector A’s inputs and outputs on the requirements for Sector B, and vice versa, you’re probably missing something.
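
Those counts follow from simple permutations: the pairs are ordered, since A’s effect on B and B’s effect on A are separate checks. A quick Python check of the arithmetic, nothing more:

```python
from math import perm

# Ordered k-sector interactions among n sectors: n * (n-1) * ... * (n-k+1).
def interaction_checks(n_sectors, k):
    return perm(n_sectors, k)

# For the 7-sector example: perm(7, 2) = 42 pairwise checks,
# and perm(7, 3) = 210 three-sector checks.
```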

Of course, if you decide to leave out 3 sectors to reduce the complexity, you’re also missing something. So, which are worse, the sins of omission (from deliberate simplicity), or the sins of commission (from complexity with errors)? I think it’s hard to know in general, but I always try to err on the side of simplicity (for these reasons). In a simplified model, you essentially assume that certain relationships are 0. In an erroneous complex model, you assume that they are nonzero, but with an unknown (possibly large) gain, and your overall workload is larger, so you have less opportunity to discover the limitations of the model.

The “unknown gain” of undiscovered errors is not the same as a formal investigation of uncertainty. Injecting probabilistic parameters into a poorly understood, complex model does not compensate for quality problems; it merely propagates the uncertainty into a system with dubious fidelity, with correspondingly unpredictable results. Sensitivity analysis is a great way to discover problems in your model, but a lousy way to fix them.

The solution

So, how do you get from a big, messy model to a more elegant, robust version?

1. Do a lot of testing. Nearly anything will do. Pick an important state or decision in the model and disturb it significantly. Does the effect on everything else in the model make sense? (I really mean everything; you can set the graph tool in Vensim to plot every level in your model quickly, if it’s not too huge.)

  • You can formalize your tests as Reality Checks or other scripted experiments, which have advantages of speed and reusability.
  • Calibration is a useful approach to testing. The data may provide only weak constraints on quality, but they’re still important, as is learning about the data.
  • Policy optimization is also useful, because the optimizer will ruthlessly exploit weaknesses of the model to maximize outcomes.
  • The Vensim optimizer will also run a vector of tests on every parameter in the model for you.
  • Randomized sensitivity runs may reveal things you wouldn’t otherwise think to test.
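
A disturbance test of this kind is easy to script outside any particular tool. The sketch below uses a hypothetical two-stock toy model, not any model from this post: shock one state, run long enough for the transient to die out, and assert that everything returns to where it should:

```python
# Hypothetical two-stock toy model: stock A self-regulates toward 4,
# and drains into B, which has its own first-order outflow.
def step(a, b, dt=0.25):
    a_next = a + dt * (1.0 - a / 4.0)        # net inflow closes on a = 4
    b_next = b + dt * (a / 4.0 - b / 8.0)    # inflow from A, outflow b / 8
    return a_next, b_next

def run(a, b, steps=100, shock=0.0):
    a += shock                               # disturb state A at t = 0
    for _ in range(steps):
        a, b = step(a, b)
    return a, b

base = run(4.0, 8.0)                 # start at equilibrium: stays put
shocked = run(4.0, 8.0, shock=2.0)   # shock should decay to the same state
```

In a real model the pattern is the same: perturb a state, then look at every level, because a sector that fails to respond, or responds in the wrong direction, is exactly the missing-link symptom described above.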

2. Simplify. It’s hard to take a big model and make everything as connected and consistent as it needs to be, so something has to go.

  • If you have conflicting cohorts or aging chains, try reducing the number of levels in the structure. You lose fidelity because you have to assume that things are well-mixed within fewer categories, but that’s OK if it lets you do a better job of capturing the big picture.
  • If you can’t adequately model the interactions among water, sewer and electric utilities, merge them into a generic “infrastructure capacity” concept.

3. Divide and conquer. Sometimes a portfolio of small models can span the set of questions people are asking about a system better than one giant model.

  • For example, it may be impractical to combine income inequality, energy efficiency impacts of building capital stock turnover, and climate adaptation all in one model. But you might be able to come up with several submodels that make key points about each problem and some of their interactions.
  • Similarly, you can split the roles of simulation and systems thinking. Generate a huge, but informal, causal loop diagram that captures the full richness of interactions thought to exist in a system. But model only a few key ones, and devote some time to building bridges that help people to appreciate the key dynamics in the simulation in the context of the rich picture.

4. Borrow and adopt. You can make faster progress if you can base your sector models on existing structure. That frees up time to focus on interactions among sectors. However, this strategy must be used with caution: the sectoral components you adopt may simply be incompatible.

5. Share the problem.

  • If the client sees you as the font of all wisdom, capable of modeling anything in a jiffy, you’re in trouble, because you have to respond to endless requests for new scope and detail. Detail helps people to appreciate how their part of a system fits into the big picture, but it’s the enemy of productivity and consistency. You can respond to a request for detail with an equivalent request for information: “how does (new item X) influence …” This will typically be a long list, which will dampen the enthusiasm for the request. If you engage the client with questions about ambiguities in the existing scope early on, you may never get to requests for more.
  • If you view the client as the font of all wisdom, you are compelled to model every concept they can articulate. This is a losing battle; even if they are brilliant, it’s unlikely that they’ve considered all the interactions that must be described in a formal model. Again, including the client in the exploration of “what if” and extreme conditions tests early on will help them to see the system in new ways, and head off the drive to unsubstantiated complexity.

I think the key is to maintain a mindset of humility. In some situations, you can build a model that’s basically right. But yours probably isn’t one of them: the situation is complex and messy. If it were easy, you wouldn’t be working on it. Acknowledge these things, then do the best you can to avoid making a bigger mess: build just as much high-quality structure as you can understand, and bring others along on the journey. Working as a team on manageable pieces of the problem, you can make impressive progress, and attract the resources needed to tackle the intractable parts of the problem later.

4 thoughts on “Who cares if your model is rubbish?”

  1. Tom,

    Interesting considerations. Leads me to wonder if “right and wrong” might not be the best way to think of the quality of a model. What about, for example, the “usefulness” of the model? One might then start thinking of the model along a scale of usefulness instead of a binary/black and white view of right or wrong (which, as you suggest, does not seem effective).

    I would also beware of simplification. Paul Meehl argued brilliantly against the notion of parsimony before he passed away. And, to bring just one of his many perspectives into the current conversation, there is no reason to believe, a priori, that a complex situation may be adequately understood by a simple model. Indeed, the opposite makes much more sense.

    Aside from that, building on your suggestion to “share the problem” in tandem with your suggestion to “divide and conquer,” a more complex model may be better understood by a team, where each person takes responsibility for understanding their sub-model (or part of the whole model) along with the connections between their bailiwick and that of other team members.



    1. Nice thoughts.

      I quite agree that simplification can be carried too far. People do tend to get obsessed with Occam’s Razor and Von Neumann’s elephant wiggling its trunk. Often, I find it easier to get a good operational representation of a system by just adding some detail, rather than messing around trying to figure out an elegant way to aggregate things that really just aren’t the same. For example, we’re planning to disaggregate transportation in ClimateInteractive’s EnROADS model, because of the vastly different electrification potential for road, rail and other vehicles. Agent models are often good examples of detail that leads to clarity.

      Also, the root of some complexity is simply robustness. Real systems have lots of feedbacks that are only active in certain extreme conditions. You can leave them out for clarity, but then the model is valid over a smaller domain.

      OTOH complexity exceeding resources available to build the model is a common failure mode in academic papers I review.
