Confronting the dreaded blank canvas

Donella Meadows walks a class through the model conceptualization process in Creating Models from Scratch:

One interesting thing here is that she starts with a causal loop diagram. This is a case where there are some clear physical quantities of interest (people and mosquitoes), so I would probably have started with stocks and flows. (Or, I like to think I would.) But you never know how things are going to go – I can think of other situations where CLDs worked out better. The key is to stay flexible and switch methods as needed, and as the audience requires (see around 30:00 for the stock-flow implementation).

An application of model critique

This long post walks through a real-world critique of an interesting model – this year’s Dana Meadows Award winner.

Testing is the key to making a good model even better.

Model Quality: the High Road

Therefore, in the interest of continuous improvement, I’ll take a hard look at a very interesting model. To get in the spirit, you might want to take a look at How to Critique a Model and my video critique of World Dynamics. I’m taking this model apart not because it’s bad, but because it’s interesting and worth investigating. (Taking apart bad models is sometimes fun too, but the supply of them is overwhelming.)

Before digging in, let me point out that I have no particular expertise in this area, so my critique is purely technical.

Inspection

The model (originally in STELLA, translated to Vensim here) passed my initial sniff test – no strange formulations, ugly spaghetti, cryptic variable names, or unit errors in the original.

However, it turns out that STELLA’s unit checking is not very strict. For example, it permits:

LOG10(Free Cortisol)/LOG10(Ref Free Cortisol)

with cortisol in units of nmol. This is a conceptual error –  logarithms are fundamentally dimensionless. Fortunately, it’s without consequence for model behavior – it just scales the input to a lookup.

Here, a better normalization would be:

LOG10(Free Cortisol/Ref Free Cortisol)

In my translation, I didn’t fix these issues; I suppressed them with a “DMNL LOG10” macro that hides the warning.

STELLA also permits unnormalized lookups without issuing a warning (maybe this is a buried preference somewhere). This is not necessarily an error, but it’s not best practice. It may conceal errors, and makes analysis difficult (more on this below).

One reviewer pointed out that a number of physical processes in the model are represented by goal seeking structures – essentially SMOOTHs. Here, the number of glucocorticoid receptors adjusts toward a level indicated by cortisol levels:


This is basically a shorthand for a real physical process that must involve inflows and outflows, something like this:

The physical representation is potentially better because it’s more operational. It invites more thinking along the lines of “where do these receptors come from?” It exposes one important possibility: asymmetry. The process that increases GR numbers might have a different time constant from the process that decreases GR numbers. However, absent detailed information about GR regulation, I have no idea how to implement such a thing. Maybe no one does: my experience with biological models is that there are always many layers of complexity surrounding the system of interest, and the literature often just scratches the surface.

TIME STEP & PULSE

The first thing I test in most models is to vary TIME STEP to check stability. The usual trick is to halve TIME STEP and see if you get the same answer, but this model already runs for 92,160 time steps (128 steps per hour for 720 hours). I don’t see any delays that are small enough to require that, but you can’t always see implicit time constants in a model. Still, I wonder, could you get away with a coarser step?

If you double the TIME STEP, you immediately see differences:

This suggests that TIME STEP needs to be small (perhaps even smaller than its already-tiny value). But where does this come from? It turns out to be due to the test input. In the original, the external stress consists of a series of IF THEN ELSE statements, like:

IF((TIME>1) AND (TIME<1+Stress_Stimulus_Duration)) THEN (1) ELSE(0) + ...

I implemented this in the Vensim version via the PULSE TRAIN function. But there’s a small problem here: if the stress stimulus duration is not a power of 2, the effective width of the pulse will vary a little bit as you change the TIME STEP (assuming that it is a power of 2, as is usual to minimize roundoff error). That in turn means that the area under the curve of the stress perception inflow to the model varies slightly with the duration of the pulse.

Often, that won’t matter, but because this model has a numerically sensitive threshold, it matters a lot.

It turns out that if you renormalize the external stress input to deliver constant area under the curve (see the updated model for details), you can get away with a TIME STEP of .03125 – 4x bigger. I think one might carry this idea even further, and switch to RK4 Auto integration and make the test input smooth, but I haven’t tried that. Fortunately, all of this concerns the test input to the model alone; the dynamics are nearly unaffected.

Lookup Bounds

Next, I check runtime warnings. This model generates quite a few, all concerning lookups that are out of bounds, like:

WARNING: Lookup out of bounds at 24.125 In -#ProCyt Eff on TRP#- computing -ProCyt Eff on TRP-.

It might be OK to run off the ends of a lookup table, if the slope at the endpoints is zero. But I prefer to suppress these warnings by adding points to the ends of the lookup so that the needed domain is explicitly covered. Most of these turn out to be OK, but I modified them anyway to suppress the warnings:

A few cases are hard to reconcile without more knowledge than I have. For example:

Above, the GR function effect has a small discontinuity. Its input (GR function) seems to be bounded at one, but adding (1,1) to the lookup would cause a break in the slope. I prefer to leave such warnings in place for later review.

Extreme Conditions

My next probe of a model is generally random Synthesim overrides of key stocks and flows, to see whether the model is robust to extreme disturbances. Generally, I’d say that this model is unusually robust, in that it’s hard to get stocks to go negative or produce other undesirable behavior. That’s good. Part of the reason for this may be that many of the relationships in the model are sigmoids, and therefore bounded above and below.

Here’s one example of a test: I replicate the “probable depression” scenario from the paper, and increase the size of the one-time external stress to 100 (2x). Then I look at every stock in the model to see what happens (the stocks are the state of the system, so if you look at those, you know everything). This is easy in Vensim if you create an instance of the Strip Graph that shows all levels:

One thing jumps out at me: the initial response of serotonin is opposite the long term response:

This is not unusual, and it’s mentioned in the paper. However, I got curious about its origins, so I started causal tracing to identify the source of the behavior. The answer is … it’s complicated. But along the way, I do notice some fairly extreme nonlinear behaviors, like this:

Is this realistic, or is it a consequence of lookup table clipping and log(x)/log(y) normalizations? I can’t say for sure, but this tests the limits of what I perceive as reasonable behavior. But then, if systems don’t occasionally surprise you with weird (but real) behavior, you’re not paying attention. This is something I’d flag for further investigation.

Sensitivity

Ultimately, what we want out of this model is to identify interventions that can help people with immune/hormone/mood problems. The obvious way to get that advice out of the model is to do a lot of sensitivity analysis to test alternatives. Generically, we’re interested in two things:

  • Can you change the system state directly in some beneficial way, e.g. by administering a drug that supplies a hormone, or lowering stress?
  • Can you restructure the system, by changing parameters that govern the strength of feedback loops or adding/deleting links?

In a sense, these are all the same thing – a parameter is just a constant state that isn’t in the model (yet), and adding a link is like giving an implicit 0 parameter a nonzero value.

For the answers to make sense, you need a way to (a) influence each state, and (b) change the gain of each loop, preferably independently. In this model, that’s a bit tricky. Consider the effects of ProInflammatory Cytokines:

There are effects on stress, cortisol, and other things. Each has its own sigmoid-shaped lookup table transforming the input to the output. All the inputs are normalized to the same constant, Ref ProCyt. The normalization is good practice, but insufficient for testing purposes, because there’s no independent way to vary the gain on these loops by varying the shapes of the lookups. Yes, it’s possible to simply edit the curves, but that’s impractical for comprehensive, automated experimentation. Two approaches might be helpful:

1. Replace the lookups with parametric curves. Then the parameters can be varied to shift and stretch the functions. This is attractive because you get smooth behavior and a lot of flexibility. However, it’s a lot of work to implement. The functional forms may be arcane, and  you can’t easily visualize them until you run the model. Here are a few sigmoid options I’ve collected over the years:

:MACRO: SSHAPE3(x,slope,lowlim,uplim,x0)
SSHAPE3 = lowlim+(uplim-lowlim)*(1/(1+exp(-4*slope*xe))) ~ Dmnl ~ defaults: lowlim= 0; uplim = 1; slope = 1; x0=1 this gives a symmetric \ S-shape from lowlim to uplim through with 1 being the inflection point and \ derivative =  at this point = slope*(uplim-lowlim) |
xe = MAX(-ZIDZ(25,4*ABS(slope)),MIN(x-x0,ZIDZ(25,4*ABS(slope)))) ~ Dmnl ~ Clip to avoid floating point errors at extreme positive / negative values \ of x |
:END OF MACRO:

:MACRO: SSHAPE2(input)
SSHAPE2 = exp(MAX(-50,MIN(50,input)))/(1+exp(MAX(-50,MIN(50,input)))) ~ Dmnl ~ Exponential s-shaped curve; -infinity -> 0, 0 -> .5, infinity -> 1 |
:END OF MACRO:

:MACRO: SSHAPE(xin,profile)
SSHAPE = IF THEN ELSE( input>0.5, 1-(1-input)^profile*0.5/0.5^profile, input^profile*\ 0.5/0.5^profile) ~ Dmnl ~ S-shaped response, from 0-1 for input from 0-1. Profile should normally be >=1 \ (1=linear; 2=quadratic) Always passes through (0.5, 0.5) |
input = MIN(1,MAX(0,xin)) ~ xin ~ |
:END OF MACRO:

2. Apply scaling parameters around the lookups. For example, if you’re starting with:

output = lookup( input/reference input )

you can add:

output = reference output*lookup( input/reference input )^scale

or

output = reference output*( 1-scale + scale*lookup( input/reference input ))

These don’t give you full control over the upper and lower bounds, slopes and asymptotes of the table, and they might not work when the lookup doesn’t pass through some obvious point like (1,1) or (0,0). So, in some cases you may need to be cleverer, or to choose approach #1 instead.

STELLA lookups, and Vensim’s WITH LOOKUP function, don’t really lend themselves to this treatment – you have to add an additional variable to transform the output downstream of the lookup. That’s why I tend to prefer the original Vensim lookup syntax. However it’s implemented, I think some level of parametric control over lookup usage is essential.

After some noodling, I settled on the following policy:

  • For dimensionless parameters with (apparent) 0-1 bounds, or centered around 1, apply a scaling exponent, so y = y0*lookup(x/x0)^s
  • For parameters bounded below at 0, apply a scaling multiplier, so y = s*lookup(x/x0)
  • For parameters with log inputs, apply a shift of the input, so y = lookup(x/x0+s)
  • Where I couldn’t figure out what do do, or a loop already contains other independent scaling parameters, skip the item

With scaling parameters in place, I ran an all-constants sensitivity analysis on the model, testing the effect of 10% variations in each parameter against the integrated serotonin level over the simulation. I started from 2 cases: the “daily stress” scenario (repeated small events), and the “probable depression” scenario (one large stress event). I then sorted the results by rank of influence on serotonin:

These are interesting in several ways:

  • The model is only moderately sloppy – many parameters have a strong effect, especially in the Daily Stress scenario.
  • There are big differences in sensitivity between the two scenarios, even though they differ only in the test input. This suggests that policies might have to be tailored to the stressor, among other things.
  • Some of the scale parameters on lookups are near the top of the list, confirming that testing lookup tables matters.

From a policy standpoint, you have to know a little more to make sense of these. What matters is not so much the response to a 10% change, but the response to an X% change, where X is the amount you could plausibly move a parameter. For example, preventing degradation of glucocorticoid receptors is clearly important (Ref GR Deg Fraction, top of list). However, the corresponding Permanent Degeneration Time is at the bottom of the list, presumably because a 10% change from 10 hours has only a tiny effect on the time horizon of the simulation. One would have to be more ambitious than that, but it might still be important.

Bottom Line

While there are a few features that could be reexamined, this model stands up to hard use well. It would also have to pass the face validity test with people who actually know something about the system, but given the paper’s citation list, I would anticipate some success on that front.

I think there might be a lot of interesting policy implications lurking in this model, waiting for an intrepid explorer with more subject matter expertise than I have. I think the crucial point here is that the structure identifies a mechanism by which patient outcomes can be strongly path dependent, where positive feedback preserves a bad state long after harmful stimuli are removed. Among other things, this might explain why it’s so hard to treat such patients. That in turn could be a basis for something I’ve observed in the health system – that a lot of doctors find autoimmune diseases mysterious and frustrating, and respond with a variation on the fundamental attribution error – attributing bad outcomes to patient motivation when delayed, nonlinear feedback is responsible.

 

Model Quality: the High Road

There are two ways to go about building a model.

  • Plan A proceeds slowly. You build small or simple, aggregate components, and test each thoroughly before moving on.
  • Plan B builds a rough model spanning the large scope that you think encompasses the problem, then incrementally improves the solution.

Ideally, both approaches converge to the same point.

Plan B is attractive, for several reasons. It helps you to explore a wide range of ideas. It gives a satisfying illusion of rapid progress. And, most importantly, it’s satisfying for stakeholders, who typically have a voracious appetite for detail and a limited appreciation of dynamics.

The trouble is, Plan B does not really exist. When you build a lot of structure quickly, the sacrifice you have to make is ignoring lots of potential interactions, consistency checks, and other relationships between components. You’re creating a large backlog of undiscovered rework, which the extensive SD literature on projects tells us is fatal. So, you’re really on Path C, which leads to disaster: a large, incomprehensible, low-quality model.

In addition, you rarely have as much time as you think you do. When your work gets cut short, only Path A gives you an end product that you can be proud of.

So, resist pressures to include every detail. Embrace elegant simplicity and rich feedback. Check your units regularly, test often, and “always be done” (as Jim Hines puts it). Your life will be easier, and you’ll solve more problems in the long run.

RelatedHow to critique a model (and build a model that withstands critique)

DYNAMO

Today I was looking for DYNAMO documentation of the TRND macro. Lo and behold, archive.org has the second edition of the DYNAMO User Guide online. It reminds me that I was lucky to have missed the punch card era:

… but not quite lucky enough to miss timesharing and the teletype:

The computer under my desk today would have been the fastest in the world the year I finished my dissertation. We’ve come a long way.

Optimization and the Banana of Death

A colleague sent me a model that was yielding puzzling results in policy optimization. The model has multiple optima (not uncommon), so one question of interest is, how many peaks are there, and where? The parameter space is six-dimensional, so this is not practical to work out by intuition (especially for me, with no familiarity with the model).

One way to count the peaks is to use hill climbing optimization from multiple random starting points (Vensim’s Powell method, with multiple start). Then you look for clusters of endpoints that are presumably within a small tolerance of the maximum.

Interestingly, that doesn’t work out very well. After a lot of simulation, it appears that the model has two local optima. But in a large ensemble of simulations, about a third of the results make it to each of the peaks. The remaining third wind up strung out over the parameter space, seemingly at random.

Simple clustering algorithms like kmeans fail to discover the regularity of the results, so they indicate that there are more like half a dozen optima. But if you look at a scatter plot matrix of the solutions across the six dimensions, you quickly see it:

Scatter plat for results along the 6 parameter dimensions, plus the payoff.

Endpoints for 2 of the 6 parameters

Why does this happen? I think there are two reasons. First, the model is somewhat sloppy – two of the dimensions dominate the payoff, and the rest have small effects, and therefore are more difficult to traverse numerically. Second, along the minor dimensions, tradeoffs create a curving valley. Imagine a parabola in the z dimension, extruded along the hyperbolic x-y curve above. The upper surface of a banana is a pretty good model for this in 3D.

The basic idea of direction set optimization methods is to build up a set of “good” search directions, based on earlier searches along the principal axes. This works well if the surface is basically elliptical, like the Easter egg. But it doesn’t work on the banana, because there is no consistent good direction.

Suppose you arrive at a point on the ridge via a series of searches of the axes, with net direction given by the green arrow above. The projection of that net direction (orange) is not a good search direction, because it immediately begins to fall off the side of the ridge. Returning to the original principal axes yields the same problem. In fact, almost any set of directions, other than the lucky one that proceeds along the ridge, is likely to get stuck at an apparent local optimum. I’m pretty sure a gradient method will have the same problem (plus other numerical problems in more general cases).

What to do about this? I think there are several options.

If you know the hyperbolic valley is there, you can transform the dimensions to get rid of it. For example, if you take the logs of the axes, an hyperbola becomes a line. That makes the valley tractable for a direction set search. But it’s unlikely that you know about the curving valley a priori. Resource allocation tradeoffs are likely to create such features, but it’s tough to know when and where. So this is not a very general, or convenient, solution.

Another option is to forget about direction sets and go to a stochastic method. The differential evolution MCMC in Vensim also works as a simulated annealing optimizer. Essentially, you’re taking a motivated random walk on the payoff surface. There’s some willingness to take uphill steps, which prevents you from getting stuck in the curving valley. This approach works pretty well in this case. However, when hill-climbing works, it’s a lot more efficient than evolutionary methods.

I think there’s a third option, which you might call a 2nd order direction set. The basic idea is to estimate not just the progress, but the curvature of progress, over multiple iterations through a set of directions. This makes it possible to guess where the curvy ridge is heading. In general, as soon as you look for higher-order approximations of things, you become more susceptible to noise and numerical pathologies. That might make this a waste of time in some cases.

However, I’m experimenting with this in the context of a new parallel optimization code in Vensim, and it turns out to be computationally cheap to explore a few extra directions on the side, in the hope of getting lucky. So far, results are encouraging. The improvement from making iterative algorithms parallel in Vensim is already massive, and the 2nd-order “banana-killer” seems to add a further 50% improvement in progress along curvy ridges.

All this talk of fruit is making me hungry, so that’s it for now, but there’s much more to come on this frontier.

Sloppy System Dynamics

This post should be required reading for all modelers. And no, I’m not going to reproach sloppy modeling practices. This is much more interesting than that.

Sloppy models are an idea that formalizes a statement Jay Forrester made long ago, in Industrial Dynamics (13.5):

The third and least important aspect of a model to be considered in judging its validity concerns the values for its parameters (constant coefficients). The system dynamics will be found to be relatively insensitive to many of them. They may be chosen anywhere within a plausible range. The few sensitive parameters will be identified by model tests, and it is not so important to know their past values as it is to control their future values in a system redesign.

This remains true when you’re interested in estimation of parameters from data. At Ventana, we rely on the fact that structure and parameters for which you have no measurements will typically reveal themselves in the dynamics, if they’re dynamically important. (There are always pathological cases, where a nonlinearity makes something irrelevant in the past important in the future, but that’s why we don’t base models solely on formal data.)

Now, the required part.  Continue reading “Sloppy System Dynamics”

Discrete Time Stinks

Discrete time modeling is often convenient, occasionally right and frequently treacherous.

You often see models expressed in discrete time, like
Samuelson's multiplier-accelerator
That’s Samuelson’s multiplier-accelerator model. The same notation is ubiquitous in statistics, economics, ABM and many other areas.

Samuelson multiplier-accelerator schematic

So, what’s the problem?

  1. Most of the real world does not happen in discrete time. A few decisions, like electric power auctions, happen at regular intervals, but those are the exception. Most of the time we’re modeling on long time scales relative to underlying phenomena, and we have lots of heterogeneous agents or particles or whatever, with diverse delays and decision intervals.
  2. Discrete time can be artificially unstable. A stable continuous system can be made unstable by simulating at too large a discrete interval. A discrete system may oscillate, where its continuous equivalent would not.
  3. You can’t easily test for the effect of the time time step on stability. Q: If your discrete time model is running with one Excel row per interval, how will you test an interval that’s 1/2 or 1/12 as big for comparison? A: You won’t. Even if it occurs to you to try, it would be too much of a pain.
  4. The measurement interval isn’t necessarily the relevant dynamic time scale. Often the time step of a discrete model derives from the measurement interval in the data. There’s nothing magic about that interval, with respect to how the system actually works.
  5. The notions of stocks and flows and system state are obscured. (See the diagram from the Samuelson model above.) Lack of stock flow consistency can lead to other problems, like failure to conserve physical quantities.
  6. Units are ambiguous. This is a consequence of #5. When states and their rates of change appear on an equal footing in an equation, it’s hard to work out what’s what. Discrete models tend to be littered with implicit time constants and other hidden parameters.
  7. Most delays aren’t discrete. In the Samuelson model, output depends on last year’s output. But why not last week’s, or last century’s? And why should a delay consist of precisely 3 periods, rather than be distributed over time? (This critique applies to some Delay Differential Equations, too.)
  8. Most logic isn’t discrete. When time is marching along merrily in discrete lockstep, it’s easy to get suckered into discrete thinking: “if the price of corn is lower than last year’s price of corn, buy hogs.” That might be a good model of one farmer, but it lacks nuance, and surely doesn’t represent the aggregate of diverse farmers. This is not a fault of discrete time per se, but the two often go hand in hand. (This is one of many flaws in the famous Levinthal & March model.)

Certainly, there are cases that require a discrete time simulation (here’s a nice chapter on analysis of such systems). But most of the time, a continuous approach is a better starting point, as Jay Forrester wrote 50 years ago. The best approach is sometimes a hybrid, with an undercurrent of continuous time for the “physics” of the model, but with measurement processes represented by explicit sampling at discrete intervals.

So, what if you find a skanky discrete time model in your analytic sock drawer? Fear not, you can convert it.

Consider the adstock model, representing the cumulative effects of advertising:

Ad Effect = f(Adstock)
Adstock(t) = Advertising(t) + k*Adstock(t-1)

Notice that k is related to the lifetime of advertising, but because it’s relative to the discrete interval, it’s misleadingly dimensionless. Also, the interval is fixed at 1 time unit, and can’t be changed without scaling k.

Also notice that the ad effect has an instantaneous component. Usually there’s some delay between ad exposure and action. That delay might be negligible in some cases, like in-app purchases, but it’s typically not negligible for in-store behavior.

You can translate this into Vensim lingo literally by using a discrete delay:

Adstock = Advertising + k*Previous Adstock ~ GRPs
Previous Adstock = DELAY FIXED( Adstock, Ad Life, 0 ) ~ GRPs
Ad life = ... ~ weeks

That’s functional, but it’s not much of an improvement. Much better is to recognize that Adstock is (surprise!) a stock that changes over time:

Ad Effect = f(Adstock) ~ dimensionless
Adstock = INTEG( Advertising - Forgetting, 0 ) ~ GRPs
Advertising = ... ~ GRPs/week
Forgetting = Adstock / Ad Life ~ GRPs/week
Ad Life = ... ~ weeks

Now the ad life has a dimensioned real-world interpretation and you can simulate with whatever time step you need, independent of the parameters (as long as it’s small enough).

There’s one fly in the ointment: the instantaneous ad effect I mentioned above. That happens when, for example, the data interval is weekly, and ads released have some effect within their week of release – the Monday sales flyer drives weekend sales, for example.

There are two solutions for this:

  • The “cheat” is to include a bit of the current flow of advertising in the effective adstock, via a “current week effect” parameter. This is a little tricky, because it locks you into the weekly time step. You can generalize that away at the cost of more complexity in the equations.
  • A more fundamental solution is to run the model at a finer time step than the data interval. This gives you a cleaner model, and you lose nothing with respect to calibration (in Vensim/Ventity at least).

Occasionally, you’ll run into more than one delayed state on the right side of the equation, as with the inclusion of Y(t-1) and Y(t-2) in the Samuelson model (top). That generally signals either a delay with a complex structure (e.g., 2nd or higher order), or some other higher-order effect. Generally, you should be able to give a name and interpretation to these states (as with the construction of Y and C in the Samuelson model). If you can’t, don’t pull your hair out. It could be that the original is ill-formulated. Instead, think things through from scratch with stocks and flows in mind.

A tale of Big Data and System Dynamics

I recently worked on a fascinating project that combined Big Data and System Dynamics (SD) to good effect. Neither method could have stood on its own, but the outcome really emphasized some of the strategic limitations of the data-driven approach. Including SD in the project simultaneously lowered the total cost of analysis, by avoiding data processing for things that could be determined a priori, and increased its value by connecting the data to business context.

I can’t give a direct account of what we did, because it’s proprietary, but here’s my best shot at the generalizable insights. The context was health care for some conditions that particularly affect low income and indigent populations. The patients are hard to track and hard to influence.

Two efforts worked in parallel: Big Data (led by another vendor) and System Dynamics (led by Ventana). I use the term “SD” loosely, because much of what we ultimately did was data-centric: agent based modeling and estimation of individual-level nonlinear dynamic models in Vensim. The Big Data vendor’s budget was two orders of magnitude greater than ours, mostly due to some expensive system integration tasks, but partly due to the caché of their brand and flashy approach, I suspect. Continue reading “A tale of Big Data and System Dynamics”

The Ambiguity of Causal Loop Diagrams and Archetypes

I find causal loop diagramming to be a very useful brainstorming and presentation tool, but it falls short of what a model can do for you.

Here’s why. Consider the following pair of archetypes (Eroding Goals and Escalation, from wikipedia):

Eroding Goals and Escalation archetypes

Archetypes are generic causal loop diagram (CLD) templates, with a particular behavior story. The Escalation and Eroding Goals archetypes have identical feedback loop structures, but very different stories. So, there’s no unique mapping from feedback loops to behavior. In order to predict what a set of loops is going to do, you need more information.

Here’s an implementation of Eroding Goals:

Notice several things:

  • I had to specify where the stocks and flows are.
  • “Actions to Improve Goals” and “Pressure to Adjust Conditions” aren’t well defined (I made them proportional to “Gap”).
  • Gap is not a very good variable name.
  • The real world may have structure that’s not mentioned in the archetype (indicated in red).

Here’s Escalation:

The loop structure is mathematically identical; only the parameterization is different. Again, the missing information turns out to be crucial. For example, if A and B start with the same results, there is no escalation – A and B results remain constant. To get escalation, you either need (1) A and B to start in different states, or (2) some kind of drift or self-excitation in decision making (green arrow above).

Even then, you may get different results. (2) gives exponential growth, which is the standard story for escalation. (1) gives escalation that saturates:

The Escalation archetype would be better if it distinguished explicit goals for A and B results. Then you could mathematically express the key feature of (2) that gives rise to arms races:

  • A’s goal is x% more bombs than B
  • B’s goal is y% more bombs than A

Both of these models are instances of a generic second-order linear model that encompasses all possible things a linear model can do:

Notice that the first-order and second-order loops are disentangled here, which makes it easy to see the “inner” first order loops (which often contribute damping) and the “outer” second order loop, which can give rise to oscillation (as above) or the growth in the escalation archetype. That loop is difficult to discern when it’s presented as a figure-8.

Of course, one could map these archetypes to other figure-8 structures, like:

How could you tell the difference? You probably can’t, unless you consider what the stocks and flows are in an operational implementation of the archetype.

The bottom line is that the causal loop diagram of an archetype or anything else doesn’t tell you enough to simulate the behavior of the system. You have to specify additional assumptions. If the system is nonlinear or stochastic, there might be more assumptions than I’ve shown above, and they might be important in new ways. The process of surfacing and testing those assumptions by building a stock-flow model is very revealing.

If you don’t build a model, you’re in the awkward position of intuiting behavior from structure that doesn’t uniquely specify any particular mode. In doing so, you might be way ahead of non-systems thinkers approaching the same problem with a laundry list. But your ability to discover errors, incorporate data and discover leverage is far greater if you can simulate.

The model: wikiArchetypes1b.mdl (runs in any version of Vensim)

Loopy

I just gave Loopy a try, after seeing Gene Bellinger’s post about it.

It’s cool for diagramming, and fun. There are some clever features, like drawing a circle to create a node (though I was too dumb to figure that out right away). Its shareability and remixing are certainly useful.

However, I think one must be very cautious about simulating causal loop diagrams directly. A causal loop diagram is fundamentally underspecified, which is why no method of automated conversion of CLDs to models has been successful.

In this tool, behavior is animated by initially perturbing the system (e.g, increase the number of rabbits in a predator-prey system). Then you can follow the story around a loop via animated arrow polarity changes – more rabbits causes more foxes, more foxes causes less rabbits. This is essentially the storytelling method of determining loop polarity, which I’ve used many times to good effect.

However, as soon as the system has multiple loops, you’re in trouble. Link polarity tells you the direction of change, but not the gain or nonlinearity. So, when multiple loops interact, there’s no way to determine which is dominant. Also, in a real system it matters which nodes are stocks; it’s not sufficient to assume that there must be at least one integration somewhere around a loop.

You can test this for yourself by starting with the predator-prey example on the home page. The initial model is a discrete oscillator (more rabbits -> more foxes -> fewer rabbits). But the real system is nonlinear, with oscillation and other possible behaviors, depending on parameters. In Loopy, if you start adding explicit births and deaths, which should get you closer to the real system, simulations quickly result in a sea of arrows in conflicting directions, with no way to know which tendency wins. So, the loop polarity simulation could be somewhere between incomprehensible and dead wrong.

Similarly, if you consider an SIR infection model, there are three loops of interest: spread of infection by contact, saturation from running out of susceptibles, and recovery of infected people. Depending on the loop gains, it can exhibit different behaviors. If recovery is stronger than spread, the infection dies out. If spread is initially stronger than recovery, the infection shifts from exponential growth to goal seeking behavior as dominance shifts nonlinearly from the spread loop to the saturation loop.

I think it would be better if the tool restricted itself to telling the story of one loop at a time, without making the leap to system simulations that are bound to be incorrect in many multiloop cases. With that simplification, I’d consider this a useful item in the toolkit. As is, I think it could be used judiciously for explanations, but for conceptualization it seems likely to prove dangerous.

My mind goes back to Barry Richmond’s approach to systems here. Causal loop diagrams promote thinking about feedback, but they aren’t very good at providing an operational description of how things work. When you’re trying to figure out something that you don’t understand a priori, you need the bottom-up approach to synthesize the parts you understand into the whole you’re grasping for, so you can test whether your understanding of processes explains observed behavior. That requires stocks and flows, explicit goals and actual states, and all the other things system dynamics is about. If we could get to that as elegantly as Loopy gets to CLDs, that would be something.