What is accumulation?

The SD Society posted a definition of accumulation on Facebook, and it caught my eye.

This is from the SD Glossary, by David Ford.

accumulation (integration) : a gradual, non-instantaneous increase or decrease of a quantity over time. An accumulator is also referred to as a stock or level and represents the state of a system. To accumulate is the act of increasing and decreasing the size of a state variable (a stock) over time.

I wrote,

I’m not a fan of this definition. Accumulation is not necessarily gradual or non-instantaneous. In fact, it’s quite common to accumulate a flow pulse to produce an abrupt step in a stock. The key feature of accumulation is that it’s, well, cumulative. I’m at a loss for a way to express that without mentioning integration, which won’t help most people. Maybe someone can do better?

I think it’s telling that we don’t have ready words to describe accumulation. That might be a symptom, or a cause, of our problematic mental models about bathtub dynamics and bathtub statistics.

Resorting to “integration” isn’t really helpful, except to the mathematically inclined, who are not, I think, the audience for this kind of description.

The dictionary definition of “cumulative” turns out to be helpful:

increasing by successive additions

With that in mind, I’d propose something like:

  • accumulation : increasing by successive additions, or decreasing by successive subtractions.
  • stock (level) : A variable representing a persistent state in a system, which can be considered the memory of the system. Stocks change by accumulation of flows.
  • flow (rate): A variable that contributes to cumulative change in a stock over time. Flows represent activity or change in a system. A flow may represent the movement of physical quantities between stocks within a system boundary or across the model boundary and thereby into or out of the system (sinks and sources), or the rate of change of a nonphysical or intangible state.

Note that it’s hard to discuss accumulation without also discussing stocks and flows, so I’ve modified all three glossary entries.
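
To illustrate the pulse point above: integrating a brief flow pulse produces an abrupt step in the stock, not a gradual change. Here’s a minimal sketch in Python (not Vensim syntax), with made-up parameter values and simple Euler integration:

```python
# Minimal sketch: Euler integration of a flow into a stock.
# A brief pulse in the flow produces an abrupt step in the stock,
# showing that accumulation is cumulative, not necessarily gradual.
# All parameter values are illustrative, not from any particular model.

dt = 0.125          # time step
t_final = 10.0
pulse_start, pulse_width, pulse_height = 3.0, dt, 1.0 / dt  # unit-area pulse

stock = 0.0
t = 0.0
while t < t_final:
    flow = pulse_height if pulse_start <= t < pulse_start + pulse_width else 0.0
    stock += flow * dt   # accumulation: successive additions of flow * dt
    t += dt
    # the stock jumps from 0 to 1 at t = pulse_start and stays there

print(f"stock at t={t:.2f}: {stock:.3f}")  # ~1.0 after the pulse
```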

What is SD?

Asmeret Naugle, Saeed Langarudi, and Timothy Clancy propose to define System Dynamics in a new paper.

The defining characteristics are: (1) models are based on causal feedback structure, (2) accumulations and delays are foundational, (3) models are equation-based, (4) concept of time is continuous, and (5) analysis focuses on feedback dynamics.

I like the paper, but … not so fast. I think more, and more flexible, criteria are needed. I would use the term “characterize” rather than “define.” The purpose should be to aid recognition of SD, and hopefully good SD, without drawing too tight a box around the field.

I particularly disagree with the inclusion of continuous time. Even though discrete time stinks, I think continuous time is a common but inessential feature, like continuous flows. Many models include occasional discrete events, and sometimes they’re important. Ventity’s actions are explicit discrete events between time steps, and they may modify model structure in ways that are key to an operational representation of reality.
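
To make the discrete-event point concrete, here’s a rough sketch in Python (not Ventity code) of a continuous stock-and-flow model with an occasional discrete event applied between time steps; the logistic structure, the harvest rule, and the parameters are all invented for illustration:

```python
# Sketch: a continuous stock-and-flow model with an occasional discrete event
# applied between time steps (loosely analogous to Ventity actions).
# Structure and parameters are hypothetical.

dt = 0.25
capacity = 100.0
growth_rate = 0.3
stock = 10.0

for step in range(200):
    # continuous part: logistic growth, Euler-integrated
    flow = growth_rate * stock * (1 - stock / capacity)
    stock += flow * dt

    # discrete event between time steps: a "harvest" that removes half the
    # stock whenever it exceeds a threshold
    if stock > 80.0:
        stock *= 0.5

print(f"final stock: {stock:.1f}")
```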

My top-of-mind alternative framework looks like:

I think it’s also helpful to describe things that are not SD:

  • Intertemporal optimization or rational expectations representing behavior
  • Computable general equilibrium
  • Linear regression
  • Linear programming
  • Mixed integer programming
  • Social Network Analysis (static)
  • Discrete ABM
  • Discrete event simulation
  • Equilibrium
  • Simultaneity

Sometimes it’s easier to see the negative space, but there are exceptions to these rules.

I think it’s notable that both frameworks exclude a variety of qualitative systems thinking approaches, like group model building or elicitation methods that create CLDs rather than simulatable models. I’m a big tent fan, and certainly some of the exceptions are common at the SD conference, but does that make them SD?

I think behavior is another challenging feature to describe. In my mind, System Dynamics is almost synonymous with behavioral dynamics. If you’re building an economic model in which agents explicitly know the future (e.g., via intertemporal optimization), it’s not an SD model (though you might be using it as a comparison case for some SD purpose). Yet there’s a strong tradition of prize-winning biomedical models that lack behavior because they lack human agency. These are not easily distinguishable from what other fields might call ODEs or nonlinear dynamics. I would not want to eject those from the field, but neither would I want this to become our focus.

I’ll be interested to see how the conversation evolves on this.

Mental Models vs. Models in the Loop

Timothy Clancy, Saeed P. Langarudi and Raafat Zaini have an interesting new commentary in the SDR.

Never the strongest: reconciling the four schools of thought in system dynamics in the debate on quality

With the passing of Jay Forrester, the field of system dynamics exists at a similar crossroads. Debates of implicit, if not explicit, inheritance and future direction are already breaking out among competing generals. Who owns Forrester’s legacy? Will we proceed down the reference mode of the Macedonian and Mughal Empires—or will we instead seek an alternative reference mode of Alexandria: integration, reconciliation, and mutually recognized coexistence of different schools within the broader field of system dynamics?

We suggest the latter path—and that begins by recognizing at least four, if not more, distinct schools of thought on how to approach system dynamics and the study of complex systems. We believe these schools arise from differing mental models in the field and the consequences that arise in practice from these differences.

I haven’t really absorbed it yet, so I’ll refrain from direct comment, but it did spur me to finish off a draft of some similar thoughts on these questions.

I personally lean very much toward the hard science, data-driven side of the field: what the authors call the Empirical school of thought. But as a policy, I lean toward a big tent view of the field that includes work with low model content (which I don’t equate with low quality).

I think the central tension in the debate has already been posed by JWF and others long ago – all the way back to Industrial Dynamics really. In Some Basic Concepts in System Dynamics (2009), Forrester summarized,

The basic feedback loop in Figure 4 is too simple to represent real-world situations. But simple loops have more serious shortcomings—they are misleading and teach the wrong lessons. Most of our intuitive learning comes from very simple systems. The truths learned from simple systems are often completely opposite from the behavior of more complex systems. A person understands filling a water glass, as in Figure 3. But, if we go to a system that is only five times as complicated, as in Figure 5, intuition fails. A person cannot look at Figure 5 and anticipate the behavior of the pictured system.

Figure 5 from World Dynamics is five times more complicated than Figure 4 in the sense that it has five stocks—the rectangles in the figure. The figure shows how rapidly apparent complexity increases as more system stocks are added.

Mathematicians would describe Figure 5 as a fifth-order, nonlinear, dynamic system. No one can predict the behavior by studying the diagram or its underlying equations. Only by using computer simulation can the implied behavior be revealed.

I think the message is pretty clear here. To solve complex problems, you must formally simulate the system because mental simulations are treacherous. I’d go even one step further, and argue that it’s not sufficient to simulate the system once, figure out where the leverage point is, implement the solution, and toss out the model when you’re done. The simulation needs to become an ongoing part of the loop for model predictive control.
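
Here’s a rough sketch of what “model in the loop” means, in Python rather than any particular SD tool: at each decision point, project the model forward, pick the policy the model says will work best, apply it to the real system, observe the result, and repeat. The plant, the (deliberately imperfect) model, and the objective are all toy stand-ins:

```python
# Sketch of model-predictive control: re-simulate at every decision point
# rather than solving the model once and discarding it.
# The "plant" (real system), the model, and the objective are toy stand-ins.

import random

def plant_step(state, action):
    """The real system: a first-order stock with noise (hypothetical)."""
    return state + (action - 0.2 * state) + random.gauss(0, 0.05)

def model_step(state, action):
    """Our imperfect model of the plant (no noise, slightly wrong drain)."""
    return state + (action - 0.25 * state)

def simulate_cost(state, action, target, horizon=10):
    """Project the model forward under a constant action and score it."""
    cost = 0.0
    for _ in range(horizon):
        state = model_step(state, action)
        cost += (state - target) ** 2
    return cost

target, state = 5.0, 0.0
candidate_actions = [0.0, 0.5, 1.0, 1.5, 2.0]

for t in range(30):
    # model in the loop: choose the action the model says will work best...
    action = min(candidate_actions, key=lambda a: simulate_cost(state, a, target))
    # ...apply it to the real system, observe the new state, and repeat
    state = plant_step(state, action)

print(f"state after 30 periods: {state:.2f} (target {target})")
```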

If that’s the ideal, why settle for anything less? I think there are a number of possible answers.

  1. Even in a perfect world where it’s easy to construct the model-in-the-loop, you need buy-in from the participants in the system to implement the model, and that requires a skillset that’s quite distinct.
  2. While it’s true that no one can intuit the behavior of a 10th-order system, there might be a lot of value in managing low-order components of the system that are amenable to mental simulation or simple decision rules. This might work for two reasons:
    • The complex system is dominated by a few key parameters (as in sloppy systems).
    • Risking global suboptimization by improving locally is better than optimizing nothing (though this might be a matter of luck).
  3. Often no single stakeholder in the system has the resources or authority to implement needed changes. But exposing the connectivity of the system, even if you can’t predict exactly how it works, is sometimes enough to catalyze creation of higher-level structures that enable change in the future.
  4. Not everyone is, or wants to be, a modeler. Moreover some participants in the system may reject models, data, and pretty much everything else since the Enlightenment, but you still have to include them.
  5. The non-modelers, as participants in the system, hold key knowledge that the modelers need.
  6. A qualitative map of a system is a good start towards an eventual quantitative model.
  7. Not every problem is big enough to model.

I’m sure you can think of more. I think these are good reasons to embrace non-model-based work on systems, as long as one refrains from making strong predictions about behavior from incomplete descriptions of behavior. Fortunately that leaves a lot of interesting things to think about.

I think the opposite perspective, that nothing is worth doing without a model and data, requires some counterexamples. Are there instances in which a group mapping exercise, playing a dynamic game, or engaging in cross-functional dialog led to reduced performance? I’m not aware of good examples of this, and certainly not of good diagnoses of the outcome. Attribution in complex systems is notoriously difficult. I think what this suggests is that we need stronger links to the evaluation research community, because we don’t really know what works and what doesn’t. We already have some strength in this area from the dynamic decision-making experiment thread of SD, but … physician, heal thyself.

There is one thing that troubles me though, just beyond the boundaries of our field. It’s climate policy (and related global issues). Most climate policy advocates are in some sense systems thinkers. Many build nice diagrams or use other systemic tools. If you don’t care about systems, it’s hard to see why you’d care about climate to begin with.

Yet … it seems that a substantial fraction of people who are pro-climate policy favor policies that are counterproductive or insufficient. They like low-carbon fuel standards that are unstable, inefficient, and can even increase emissions. They like standards that allocate more property rights to bigger polluters, or simply make it harder to change. They like to impose constraints on new fossil supply that work exactly like OPEC to increase prices and profits for incumbent producers. They subsidize EVs and solar, increasing the incentive to consume energy and congest roads, with benefits accruing to the rich who can afford the capital outlay.

What this means is that my reason #2, “Risking global suboptimization by improving locally is better than optimizing nothing,” isn’t working out too well. I think this is exactly the kind of counterintuitive behavior of social systems that JWF was referring to. I don’t believe you can sort these things out with CLDs or other qualitative methods, except perhaps when they are used as explanatory tools for underlying formal models.

I think the bottom line is that, inside the big tent, the tall pole must remain construction and validation of robust behavioral dynamic models.

Feedback is Interdisciplinary

Quite a while ago, I wrote about modeling the STEM workforce:

An integrated model needs three things: what, how, and why. The “what” is the state of the system – stocks of students, workers, teachers, etc. in each part of the system. Typically this is readily available – Census, NSF and AAAS do a good job of curating such data. The “how” is the flows that change the state. There’s not as much data on this, but at least there’s good tracking of graduation rates in various fields, and the flows actually integrate to the stocks. Outside the educational system, it’s tough to understand the matrix of flows among fields and economic sectors, and surprisingly difficult even to get decent measurements of attrition from a single organization’s personnel records. The glaring omission is the “why” – the decision points that govern the aggregate flows. Why do kids drop out of science? What attracts engineers to government service, or the finance sector, or leads them to retire at a given age? I’m sure there are lots of researchers who know a lot about these questions in small spheres, but there’s almost nothing about the “why” questions that’s usable in an integrated model.

I think the current situation is a result of practicality rather than a fundamental philosophical preference for analysis over synthesis. It’s just easier to create, fund and execute standalone micro research than it is to build integrated models.

According to Jay Forrester, Gordon Brown said it much more succinctly:

The message is in the feedback, and the feedback is inherently interdisciplinary.

AI doesn’t help modelers

Large language model AI doesn’t help with modeling. At least, that’s my experience so far.


DALL-E images from Bing image creator.

On the ACM blog, Bertrand Meyer argues that AI doesn’t help programmers either. I think his reasons are very much compatible with what I found attempting to get ChatGPT to discuss dynamics:

Here is my experience so far. As a programmer, I know where to go to solve a problem. But I am fallible; I would love to have an assistant who keeps me in check, alerting me to pitfalls and correcting me when I err. An effective pair-programmer. But that is not what I get. Instead, I have the equivalent of a cocky graduate student, smart and widely read, also polite and quick to apologize, but thoroughly, invariably, sloppy and unreliable. I have little use for such supposed help.

He goes on to illustrate by coding a binary search. The conversation is strongly reminiscent of our attempt to get ChatGPT to model jumping through the moon.

And then I stopped.

Not that I had succumbed to the flattery. In fact, I would have no idea where to go next. What use do I have for a sloppy assistant? I can be sloppy just by myself, thanks, and an assistant who is even more sloppy than I is not welcome. The basic quality that I would expect from a supposedly intelligent assistant—any other is insignificant in comparison—is to be right.

It is also the only quality that the ChatGPT class of automated assistants cannot promise.

I think the fundamental problem is that LLMs aren’t “reasoning” about dynamics per se (though I used the word in my previous posts). What they know is derived from the training corpus, and there’s no reason to think that it reflects a solid understanding of dynamic systems. In fact there are presumably lots of examples in the corpus of failures to reason correctly about dynamic causality, even in the scientific literature.

This is similar to the reason AI image creators hallucinate legs and fingers: they know what the parts look like, but they don’t know how the parts work together to make the whole.

To paraphrase Meyer, LLM AI is the equivalent of a polite, well-read assistant who lacks an appreciation for complex systems, and aggressively indulges in laundry-list, dead-buffalo thinking about all but the simplest problems. I have no use for that until the situation improves (and there’s certainly hope for that). Worse, the tools are very articulate and confident in their clueless pronouncements, which is a deadly mix of attributes.

Related: On scientific understanding with artificial intelligence | Nature Reviews Physics

Sources of Information for Modeling

The traditional picture of information sources for modeling is a funnel. For example, in Some Basic Concepts in System Dynamics (2009), Forrester showed:

I think the diagram, or at least the concept, is much older than that.

However, I think the landscape has changed a lot, with more to come. Generally, the mental database hasn’t changed too much, but the numerical database has grown a lot. The funnel isn’t 1-dimensional, so the relationships have changed on some axes, but not so much on others.

Notionally, I’d propose that the situation is something like this:

The mental database is still king for variety of concepts and immediacy or salience of information (especially to the owner of the brain involved). And, it still has some weaknesses, like the inability to easily observe, agree on and quantify the constructs included in it. In the last few decades, the numerical database has extended its reach tremendously.

The proper shape of the plot is probably very domain specific. When I drew this, I had in mind the typical corporate or policy setting, where information systems contain only a fraction of the information necessary to understand the organizations involved. But in some areas, the reverse may be true. For example, in earth systems, datasets are vast and include measurements that human senses can’t even make, whereas personal experience – and therefore mental models – is limited and treacherous.

I think I’ve understated the importance of the written database in the diagram above – perhaps I’m missing a dimension characterizing its cumulative nature (compared to the transience of mental databases). There’s also an interesting evolution underway, as tools for text analysis and large language models (ChatGPT) are making the written database more numerical in nature.
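
A trivial sketch of what I mean by the written database becoming more numerical: even simple text analysis reduces a document to numbers that can feed quantitative work. Real pipelines would use embeddings or an LLM, but the idea is the same; the example document and counts below are illustrative only.

```python
# Tiny illustration of "the written database becoming numerical":
# a document reduced to a term-frequency vector for quantitative analysis.

from collections import Counter

doc = "stocks accumulate flows; flows change stocks over time"
tokens = [w.strip(";.,").lower() for w in doc.split()]
term_frequencies = Counter(tokens)

print(term_frequencies)   # Counter({'stocks': 2, 'flows': 2, ...})
```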

Finally, I think there’s a missing database in the traditional framework, which has growing importance. That’s the database of models themselves. They’ve been around for a long time – especially in physical sciences, but also corporate spreadsheets and the like. But increasingly, reasonably sophisticated models of organizational components are available as inputs to higher-level modeling efforts for strategic problem solving.

Scientific Revolutions in Ventity

I’ve long wanted to translate the Sterman-Wittenberg model of Kuhnian paradigm revolutions to Ventity. The original was in Dynamo, and I translated that to Vensim, but neither is really satisfactory, because both require provisioning array space for new paradigms statically, before it’s needed. This means simulating lots of useless 0s, and even worse, looking at them in the output.

The model is about the lifecycle of scientific paradigms, so a central feature is the occasional introduction and evolution of new paradigms, which eventually accumulate enough anomalies to erode confidence, making them vulnerable to the next great idea. So ideally, you’d like to introduce new paradigms dynamically and delete them when they no longer have many adherents. Dynamic creation and deletion of entities is of course a core feature of Ventity – it’s the tool this model has been waiting for all those years.
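
For readers who haven’t used Ventity, the difference is roughly that between pre-allocating a fixed array of paradigm slots (mostly zeros) and maintaining a collection that grows and shrinks at runtime. A loose Python analogy (not Ventity code, and not the actual Sterman-Wittenberg equations; the dynamics here are toy placeholders):

```python
# Loose analogy for dynamic entity creation/deletion (not Ventity code,
# and not the Sterman-Wittenberg equations). Paradigms are created when they
# appear and deleted when adherents run out, so we never simulate or store
# arrays of useless zeros.

import random

class Paradigm:
    def __init__(self, name):
        self.name = name
        self.adherents = 10.0        # illustrative initial value

    def step(self, dt):
        # toy dynamics: random growth or decline in adherents
        self.adherents += self.adherents * random.uniform(-0.3, 0.2) * dt

paradigms = []
for t in range(100):
    # occasionally launch a new paradigm (dynamic creation)
    if random.random() < 0.1:
        paradigms.append(Paradigm(f"paradigm_{t}"))
    for p in paradigms:
        p.step(dt=1.0)
    # drop paradigms that have run out of adherents (dynamic deletion)
    paradigms = [p for p in paradigms if p.adherents > 1.0]

print(f"{len(paradigms)} paradigms survive at the end of the run")
```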

I finally got around to translating my Vensim version to Ventity recently. It works beautifully:

Above, paradigm confidence, showing eight dominant paradigms as well as many smaller paradigms that never rise to dominance. They disappear when they run out of adherents. Below, puzzles under attack for the same paradigms.

Links to the source papers and more notes on the model are in the Vensim library entry. I think the dynamics are generalizable to other aspects of thinking in paradigms, like filter bubbles. The model is also a bit ‘meta’: Ventity, as a distinct modeling paradigm that’s neither in the classical array-based world nor the code-based discrete-agent world, has struggled to win mindshare.

A minor note on use: the Run Config includes two setups: “replicate” and “random”. The “replicate” setup, which is inactive by default, launches paradigms at fixed times given by initialization data from a run of the Vensim version. This makes it possible to compare the simulations without divergence from randomness. However, the randomized run will normally be the more interesting way to work with this model.

The model (requires Ventity, which has a free trial license):

SciRev 15.zip

Computer Collates Climate Contrarian Claims

Coan et al. in Nature have an interesting text analysis of climate skeptics’ claims.

I’ve been at this long enough to notice that a few perennial favorites are missing, perhaps because they date from the 90s, prior to the dataset.

The big one is “temperature isn’t rising” or “the temperature record is wrong.” This has lots of moving parts. Back in the 90s, a key idea was that satellite MSU records showed falling temperatures, implying that the surface station record was contaminated by Urban Heat Island (UHI) effects. That didn’t end well, when it turned out that the UAH code had errors and the trend reversed when they were fixed.

Later UHI made a comeback when the SurfaceStations project crowdsourced an assessment of temperature station quality. Some turned out to be pretty bad. But again, when the dust settled, it turned out that the temperature trend was bigger, not smaller, when poor sites were excluded and TOD was corrected. This shouldn’t have been a surprise, because windy-day analyses and a dozen other things already ruled out UHI, but …

I consider this a reminder of the fact that part of the credibility of mainstream climate science arises not from the fact that models are so good, but because so many alternatives have been tried, and proved so bad, only to rise again and again.