Sources of Information for Modeling

The traditional picture of information sources for modeling is a funnel. For example, in Some Basic Concepts in System Dynamics (2009), Forrester showed:

I think the diagram, or at least the concept, is much older than that.

However, I think the landscape has changed a lot, with more to come. Generally, the mental database hasn’t changed much, but the numerical database has grown enormously. The funnel isn’t one-dimensional, so the relationships have shifted along some dimensions, but not others.

Notionally, I’d propose that the situation is something like this:

The mental database is still king for variety of concepts and immediacy or salience of information (especially to the owner of the brain involved). And, it still has some weaknesses, like the inability to easily observe, agree on and quantify the constructs included in it. In the last few decades, the numerical database has extended its reach tremendously.

The proper shape of the plot is probably very domain specific. When I drew this, I had in mind the typical corporate or policy setting, where information systems contain only a fraction of the information necessary to understand the organizations involved. But in some areas, the reverse may be true. For example, in earth systems, datasets are vast and include measurements that human senses can’t even make, whereas personal experience – and therefore mental models – is limited and treacherous.

I think I’ve understated the importance of the written database in the diagram above – perhaps I’m missing a dimension characterizing its cumulative nature (compared to the transience of mental databases). There’s also an interesting evolution underway, as tools for text analysis and large language models (like ChatGPT) are making the written database more numerical in nature.

Finally, I think there’s a missing database in the traditional framework, which has growing importance. That’s the database of models themselves. They’ve been around for a long time – especially in the physical sciences, but also corporate spreadsheets and the like. But increasingly, reasonably sophisticated models of organizational components are available as inputs to higher-level modeling for strategic problem solving.

Reading Between the Lines on Forrester’s Perspective on Data

I like Jay Forrester’s “Next 50 Years” reflection, except for his perspective on data:

I believe that fitting curves to past system data can be misleading.

OK, I’ll grant that fitting “curves” – as in simple regressions – may be a waste of time, but that’s a bit of a strawdog. The interesting questions are about fitting good dynamic models that pass all the usual structural tests as well as fitting data.

Also, the mere act of fitting a simple model doesn’t mislead; the mistake is believing the model. Simple fits can be extremely useful for exploratory analysis, even if you later discard the theories they imply.

Having a model give results that fit past data curves may impress a client.

True, though perhaps this is not the client you’d hope to have.

However, given a model with enough parameters to manipulate, one can cause any model to trace a set of past data curves.

This is Von Neumann’s elephant (“with four parameters I can fit an elephant, and with five I can make him wiggle his trunk”). He’s right, but I roll my eyes every time I hear this repeated – it’s a true but useless statement, like “all models are wrong.” Nonlinear dynamic models that pass SD quality checks usually don’t have anywhere near the degrees of freedom needed to reproduce arbitrary behaviors.
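
To make the degrees-of-freedom point concrete, here’s a minimal sketch (Python, with made-up data and a hypothetical two-parameter goal-seeking structure, not anyone’s actual model). A polynomial with as many coefficients as data points can trace any noisy history, while the constrained structure only comes close when it’s roughly right:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 21)
history = 100 * (1 - np.exp(-0.3 * t)) + rng.normal(0, 5, t.size)  # one noisy "history"

# 1) Unconstrained curve fitting: a 20th-degree polynomial has as many
#    coefficients as data points, so it traces the history almost exactly.
poly = np.polynomial.Polynomial.fit(t, history, deg=20)
print("polynomial max |error|:", np.max(np.abs(poly(t) - history)))

# 2) A first-order goal-seeking structure has only two parameters (goal, tau),
#    so it can only fit if the structure is roughly right.
def first_order(t, goal, tau):
    return goal * (1 - np.exp(-t / tau))

params, _ = curve_fit(first_order, t, history, p0=[50.0, 1.0])
print("fitted goal, tau:", params)
print("first-order max |error|:", np.max(np.abs(first_order(t, *params) - history)))
```

A calibrated SD model is much closer to the second case than the first, which is why a good fit is not as cheap as the elephant quip suggests.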

Doing so does not give greater assurance that the model contains the structure that is causing behavior in the real system.

On the other hand, if the model can’t fit the data, why would you think it does contain the structure that is causing the behavior in the real system?

Furthermore, the particular curves of past history are only a special case. The historical curves show how the system responded to one particular combination of random events impinging on the system. If the real system could be rerun, but with a different random environment, the data curves would be different even though the system under study and its essential dynamic character are the same.

This is certainly true. However, the problem is that the particular curve of history is the only one we have access to. Every other description of behavior we might use to test the model is intuitively stylized – and we all know how reliable intuition in complex systems can be, right?

Exactly matching a historical time series is a weak indicator of model usefulness.

Definitely.

One must be alert to the possibility that adjusting model parameters to force a fit to history may push those parameters outside of plausible values as judged by other available information.

This problem is easily managed by assigning strong priors to known parameters in the model calibration process.
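
A minimal sketch of what that looks like, with hypothetical data and parameter names (not a prescription for any particular tool): treat calibration as maximizing a posterior, so the payoff combines the fit to history with penalties that keep parameters near independently known values. The data then can’t drag a known parameter to an implausible extreme unless the evidence for doing so is strong:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
t = np.arange(0.0, 20.0, 1.0)
data = 50 * (1 - np.exp(-t / 4.0)) + rng.normal(0, 2, t.size)  # stand-in for one history

# Independent knowledge about the parameters (interviews, physics, other studies)
prior_mean = {"goal": 60.0, "tau": 5.0}
prior_sd = {"goal": 10.0, "tau": 1.0}

def model(t, goal, tau):
    return goal * (1 - np.exp(-t / tau))

def neg_log_posterior(theta):
    goal, tau = theta
    resid = data - model(t, goal, tau)
    fit_term = np.sum(resid ** 2) / (2 * 2.0 ** 2)  # Gaussian measurement error, sd = 2
    prior_term = ((goal - prior_mean["goal"]) / prior_sd["goal"]) ** 2 / 2 \
               + ((tau - prior_mean["tau"]) / prior_sd["tau"]) ** 2 / 2
    return fit_term + prior_term

result = minimize(neg_log_posterior, x0=[60.0, 5.0], method="Nelder-Mead")
print("MAP estimate (goal, tau):", result.x)
```

The same structure carries over to full Bayesian estimation (e.g., MCMC) when you want parameter distributions rather than a point estimate.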

Historical data is valuable in showing the characteristic behavior of the real system and a modeler should aspire to have a model that shows the same kind of behavior. For example, business cycle studies reveal a large amount of information about the average lead and lag relationships among variables. A business-cycle model should show similar average relative timing. We should not want the model to exactly recreate a sample of history but rather that it exhibit the kinds of behavior being experienced in the real system.

As above, how do we know what kinds of behavior are being experienced, if we only have access to one particular history? I think this comment implies the existence of intuitive data from other exemplars of the same system. If that’s true, perhaps we should codify those as reference modes and treat them like data.

Again, yielding to what the client wants may be the easy road, but it will undermine the powerful contributions that system dynamics can make.

This is true in so many ways. The client often wants too much detail, or too many scenarios, or too many exogenous influences. Any of these can obstruct learning, or break the budget.

These pages are full of data-free conceptual models that I think are valuable. But I also love data, so I have a different bottom line:

  • Data and calibration by themselves can’t make the model worse – you’re adding additional information to the testing process, which is good.
  • However, time devoted to data and calibration has an opportunity cost, which can be very high. So, you have to weigh time spent on the data against time spent on communication, theory development, robustness testing, scenario exploration, sensitivity analysis, etc.
  • That time spent on data is not all wasted, because it’s a good excuse to talk to people about the system, may reveal features that no one suspected, and can contribute to storytelling about the solution later.
  • Data is also a useful complement to talking to people about the system. Managers say they’re doing X; are they actually doing Y instead? Such cases may be revealed by structural problems, but calibration gives you a sharper lens for detecting them.
  • If the model doesn’t fit the data, it might be the data that is wrong or misinterpreted, and this may be an important insight about a measurement system that’s driving the system in the wrong direction.
  • If you can’t reproduce history, you have some explaining to do. You may be able to convince yourself that the model behavior replicates the essence of the problem, superimposed on some useless noise that you’d rather not reproduce. Can you convince others of this?

Limits to Big Data

I’m skeptical of the idea that machine learning and big data will automatically lead to some kind of technological nirvana, a Star Trek future in which machines quickly learn all the physics needed for us to live happily ever after.

First, every other human technology has been a mixed bag, with improvements in welfare coming along with some collateral damage. It just seems naive to think that this one will be different. Such collateral damage, though, is not the primary problem.

Second, I think there are some good reasons to think that problems will get harder at the same rate that machines get smarter. The big successes I’ve seen are localized point prediction problems, not integrated systems with a lot of feedback. As soon as cause and effect are separated in time and space by complex mechanisms, you’re into sloppy-systems territory, where the data may constrain only a few parameter combinations at a time. Making progress in such systems will increasingly require integration of multiple theories and data from multiple sources.
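
As a rough illustration of the sloppy regime (a hypothetical sum-of-exponentials model, not any real system), the eigenvalues of JᵀJ from a least-squares fit typically span several orders of magnitude: the data pin down a few stiff parameter combinations, while the remaining directions stay relatively unconstrained.

```python
import numpy as np

# Hypothetical model: a sum of two exponential decays, a classic "sloppy" case.
t = np.linspace(0, 5, 50)
theta0 = np.array([1.0, 0.5, 2.0, 1.0])  # A1, k1, A2, k2

def model(theta, t):
    A1, k1, A2, k2 = theta
    return A1 * np.exp(-k1 * t) + A2 * np.exp(-k2 * t)

# Finite-difference Jacobian of the model output with respect to the parameters.
eps = 1e-6
J = np.empty((t.size, theta0.size))
for j in range(theta0.size):
    dp = np.zeros_like(theta0)
    dp[j] = eps
    J[:, j] = (model(theta0 + dp, t) - model(theta0 - dp, t)) / (2 * eps)

# Eigenvalues of J'J approximate the curvature of the fit along each parameter
# direction; a wide spread means only a few combinations are well constrained.
eigvals = np.linalg.eigvalsh(J.T @ J)
print("eigenvalues:", eigvals)
print("spread (orders of magnitude):", np.log10(eigvals.max() / eigvals.min()))
```

In systems like this, more data of the same kind mostly sharpens the stiff directions; the loose ones tighten only when other kinds of information – different experiments, priors, structural knowledge – are brought to bear.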

People in domains that have made heavy use of big data increasingly recognize this.

All data are wrong!

Simple descriptions of the Scientific Method typically run like this:

  • Collect data
  • Look for patterns
  • Form hypotheses
  • Gather more data
  • Weed out the hypotheses that don’t fit the data
  • Whatever survives is the truth

There’s obviously more to it than that, but every popular description I’ve seen leaves out one crucial aspect. Frequently, when the hypothesis doesn’t fit the data, it’s the data that’s wrong. This is not an invitation to cherry pick your data; it’s just recognition of a basic problem, particularly in social and business systems.

Any time you are building an integrated systems model, it’s likely that you will have to rely on data from a variety of sources, with differences in granularity, time horizons, and interpretation. Those data streams have probably never been combined before, and therefore they haven’t been properly vetted. They’re almost certain to have problems. If you’re only looking for problems with your hypothesis, you’re at risk of throwing the good model baby out with the bad data bathwater.

The underlying insight is that data is not really distinct from models; it comes from processes that are full of implicit models. Even “simple” measurements like temperature are really complex and assumption-laden, but at least we can easily calibrate thermometers and agree on the definition and scale of Kelvin. This is not always the case for organizational data.

A winning approach, therefore, is to pursue every lead:

  • Is the model wrong?
    • Does it pass or fail extreme conditions tests, conservation laws, and other reality checks?
    • How exactly does it miss following the data, systematically? (See the residual sketch after this list.)
    • What feedbacks might explain the shortcomings?
  • Is the data wrong?
    • Do sources agree?
    • Does it mean what people think it means?
    • Are temporal patterns dynamically plausible?
  • If the model doesn’t fit the data, which is to blame?
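
The residual sketch mentioned above is the kind of minimal, plot-free check I have in mind (hypothetical series, not anyone’s data): summarizing residual bias and lag-1 autocorrelation flags systematic misfit, which points at structure or data problems, rather than the random scatter you’d expect from measurement error alone.

```python
import numpy as np

def residual_diagnostics(data, simulated):
    resid = np.asarray(data) - np.asarray(simulated)
    bias = resid.mean()                      # persistent over- or under-prediction
    r = resid - bias
    lag1 = np.corrcoef(r[:-1], r[1:])[0, 1]  # runs of same-signed error
    return {"bias": bias, "lag1_autocorr": lag1, "rmse": np.sqrt((resid ** 2).mean())}

# Made-up example: a model that rises too slowly under-predicts throughout the
# transient, so residuals show persistent bias and strong autocorrelation.
t = np.linspace(0, 10, 40)
data = 100 * (1 - np.exp(-0.5 * t))
simulated = 100 * (1 - np.exp(-0.3 * t))
print(residual_diagnostics(data, simulated))
```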

When you’re building a systems model, it’s likely that you’re a pioneer in uncharted territory, and therefore you’ll learn something new and valuable either way.