On Compounding

This is a brief techy note on compounding in models, prompted by some recent work on financial functions, i.e. compound interest. It’s something you probably know, but don’t think about much. That’s because it’s irrelevant most of the time, except once in a while when it decides to bite you.

Suppose you’re translating someone’s discrete time model, and you decide to translate it to continuous time, because Discrete Time Stinks. The original has:

bank balance(t+1) = bank balance(t) * (1+0.1)

So you translate as:

interest rate = 0.1 ~ fraction/year

earning = interest rate * bank balance ~ $/year

bank balance = INTEG( earning, initial bank balance ) ~ $

So far, so good. But then you discover numerical issues in other parts of the model, and decide to reduce the time step from 1 to 0.125 as a remedy. That introduces a small discrepancy in the rate of growth of the bank balance from compounding. The compounded interest after a year will now be:

(1 + interest rate * TIME STEP)^(one year/TIME STEP)

(1+0.1*0.125)^(1/0.125) = 1.104486

where the 1 in the numerator of the exponent represents 1 year, preserving dimensional consistency. In other words, where the discrete model had simple interest at 10%/year, you now have (roughly) continuously compounded interest that grows the balance by about 10.4%/year. To match the original model, you need to adjust the interest rate so that it still yields 10% simple interest at the finer time step. If the time step is small (approximately continuous time), this is just the force of interest:

.0953 = ln( 1 + 0.1 ) / 1.0

Otherwise, you can explicitly account for compounding, solving as above:

1.1 = (1 + interest rate * TIME STEP)^(one year/TIME STEP) 

(1+interest rate*0.125) = 1.1^0.125

interest rate = .09588
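
Here's a minimal Python sketch of the arithmetic above, in case you want to try other time steps. The function names are mine (not from any model), and it assumes Euler integration, so the balance is multiplied by (1 + rate*TIME STEP) at every step.

```python
import math

def annual_growth(rate, time_step):
    """Growth factor over one year when Euler integration compounds every time_step."""
    steps_per_year = 1.0 / time_step
    return (1.0 + rate * time_step) ** steps_per_year

def matched_rate(annual_simple_rate, time_step):
    """Rate that reproduces (1 + annual_simple_rate) per year at the given time_step."""
    return ((1.0 + annual_simple_rate) ** time_step - 1.0) / time_step

def force_of_interest(annual_simple_rate):
    """Continuous-time limit of matched_rate as time_step goes to 0."""
    return math.log(1.0 + annual_simple_rate)

for dt in (1.0, 0.5, 0.25, 0.125, 0.0078125):
    r = matched_rate(0.1, dt)
    print(f"dt={dt:<10} unadjusted growth={annual_growth(0.1, dt):.6f} "
          f"matched rate={r:.5f} check={annual_growth(r, dt):.6f}")
print(f"force of interest = {force_of_interest(0.1):.5f}")
```

The check column should come back to 1.1 for every time step, and the matched rate should approach the force of interest as the step shrinks.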

Another situation where this kind of issue crops up is calibration to data. Suppose you have data on the bank balance, and want to estimate the interest rate. The model is continuous, but the data is discrete, with a longish interval between points. If you calibrate to the bank balance, you’ll automatically get an interest rate that compensates for any compounding differences between the model and the data. But if you use an interest rate that’s estimated externally with a different time step, you could get some mismatch.

Most of the time, this doesn’t make a lot of difference, but you can always compute a sanity check. If the simple rate is i (in time units of the model), the ratio of continuous compounding to simple interest is:

R = e^(i*t) / (1+i)^t

ln(R) = (i - ln(1+i))*t

At 10%/yr interest, the difference is modest, as above:

.1 - ln(1.1) = .00469

No big deal – but if you have a credit card at 400%/yr, watch out!
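
A quick sketch of that sanity check in Python, for one time unit (t = 1). The function name and the list of rates are just illustrative choices.

```python
import math

def log_ratio_per_year(i):
    """ln(R) per unit time: continuous compounding at rate i vs. simple growth of (1+i)."""
    return i - math.log1p(i)

for i in (0.05, 0.10, 0.25, 4.00):
    ln_r = log_ratio_per_year(i)
    print(f"i={i:5.2f}  ln(R)={ln_r:.5f}  R={math.exp(ln_r):.3f}")
```

At small rates ln(R) is close to i^2/2, which is why the discrepancy is negligible at a few percent and grows quickly at credit-card-from-hell rates.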

A Grizzly-Pine-Nutcracker CLD Rework

I spent a little time working out what Clark’s Causal Calamity might look like as a well-formed causal loop diagram. Here’s an attempt, based on little more than spending a lot of time wandering around the Greater Yellowstone ecosystem:

The basic challenge is that there isn’t a single cycle that encompasses the whole system. Grizzlies, for example, are not involved in the central loop of pine-cone-seedling dispersal and growth (R1). They are to some extent free riders on the system – they raid squirrel middens and eat a lot of nuts, which can’t be good for the squirrels (dashed line, loop B5).

There are also a lot of “nuisance” loops that are essential for robustness of the real system, but aren’t really central to the basic point about ecosystem interconnectedness. B6 is one example – you get such a negative loop every time you have an outflow from a stock (more stuff in the stock -> faster outflow -> less stuff in the stock). R2 is another – the development of clearings from pines via fire and pests is offset by the destruction of pines via the same process.

I suspect that this CLD is still dramatically underspecified and erroneous, compared to the simplest stock-flow model that could encompass these concepts. It would also make a lousy poster for grocery store consumption.

The Nordhaus Nobel

Congratulations to William Nordhaus for winning a Nobel in Economics for work on climate. However … I find that this award leaves me conflicted. I’m happy to see the field proclaim that it’s optimal to do something about climate change. But if this is the best economics has to offer, it’s also an indication of just how far divorced the field is from reality. (Or perhaps not; not all economists agree that we have reached a Neoclassical nirvana.)

Nordhaus was probably the first big name in economics to tackle the problem, and has continued to refine the work over more than two decades. At the same time, Nordhaus’ work has never recommended more than a modest effort to solve the climate problem. In the original DICE model, the optimal policy reduced emissions about 10%, with a tiny carbon tax of $10-15/tonC – a lot less than a buck a gallon on gasoline, for example. (Contrast this perspective with Stopping Climate Change Is Hopeless. Let’s Do It.)

Nordhaus’ mild prescription for action emerges naturally from the model’s assumptions. Ask yourself if you agree with the following statements:

If you find yourself agreeing, congratulations – you’d make a successful economist! All of these and more were features of the original DICE and RICE models, and the ones that most influence the low optimal price of carbon survive to this day. That low price waters down real policies, like the US government’s social cost of carbon.

In any case, you’re not off the hook; even with these rosy assumptions Nordhaus finds that we still ought to have a real climate policy. Perhaps that is the greatest irony here – that even the most Neoclassical view of climate that economics has to offer still recommends action. The perspective that climate change doesn’t exist or doesn’t matter requires assumptions even more contorted than those above, in a mythical paradise where fairies and unicorns cavort with the invisible hand.

Confronting the dreaded blank canvas

Donella Meadows walks a class through the model conceptualization process in Creating Models from Scratch:

One interesting thing here is that she starts with a causal loop diagram. This is a case where there are some clear physical quantities of interest (people and mosquitoes), so I would probably have started with stocks and flows. (Or, I like to think I would.) But you never know how things are going to go – I can think of other situations where CLDs worked out better. The key is to stay flexible and switch methods as needed, and as the audience requires (see around 30:00 for the stock-flow implementation).

An application of model critique

This long post walks through a real-world critique of an interesting model – this year’s Dana Meadows Award winner.

Testing is the key to making a good model even better.

Model Quality: the High Road

Therefore, in the interest of continuous improvement, I’ll take a hard look at a very interesting model. To get in the spirit, you might want to take a look at How to Critique a Model and my video critique of World Dynamics. I’m taking this model apart not because it’s bad, but because it’s interesting and worth investigating. (Taking apart bad models is sometimes fun too, but the supply of them is overwhelming.)

Before digging in, let me point out that I have no particular expertise in this area, so my critique is purely technical.


The model (originally in STELLA, translated to Vensim here) passed my initial sniff test – no strange formulations, ugly spaghetti, cryptic variable names, or unit errors in the original.

However, it turns out that STELLA’s unit checking is not very strict. For example, it permits:

LOG10(Free Cortisol)/LOG10(Ref Free Cortisol)

with cortisol in units of nmol. This is a conceptual error – the argument of a logarithm should be dimensionless. Fortunately, it’s without consequence for model behavior – it just scales the input to a lookup.

Here, a better normalization would be:

LOG10(Free Cortisol/Ref Free Cortisol)

In my translation, I didn’t fix these issues; I suppressed them with a “DMNL LOG10” macro that hides the warning.

STELLA also permits unnormalized lookups without issuing a warning (maybe this is a buried preference somewhere). This is not necessarily an error, but it’s not best practice. It may conceal errors, and makes analysis difficult (more on this below).

One reviewer pointed out that a number of physical processes in the model are represented by goal seeking structures – essentially SMOOTHs. Here, the number of glucocorticoid receptors adjusts toward a level indicated by cortisol levels:

This is basically a shorthand for a real physical process that must involve inflows and outflows, something like this:

The physical representation is potentially better because it’s more operational. It invites more thinking along the lines of “where do these receptors come from?” It exposes one important possibility: asymmetry. The process that increases GR numbers might have a different time constant from the process that decreases GR numbers. However, absent detailed information about GR regulation, I have no idea how to implement such a thing. Maybe no one does: my experience with biological models is that there are always many layers of complexity surrounding the system of interest, and the literature often just scratches the surface.
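
To make the asymmetry idea concrete, here's a toy Python sketch. It is not the model's formulation: the target function, time constants, and initial value are hypothetical placeholders. It replaces the single goal-seeking adjustment with an explicit inflow and outflow, each with its own time constant. When the two time constants are equal, it collapses back to the ordinary first-order SMOOTH.

```python
def step_target(t):
    """Hypothetical indicated receptor level: elevated for 50 hours, then back to baseline."""
    return 2.0 if t < 50.0 else 1.0

def simulate(target, gr0=1.0, tau_up=10.0, tau_down=10.0, dt=0.125, hours=100.0):
    """Euler-integrate a receptor stock with separate inflow and outflow time constants."""
    gr = gr0
    path = [(0.0, gr)]
    for k in range(int(round(hours / dt))):
        t = k * dt
        gap = target(t) - gr
        inflow = max(gap, 0.0) / tau_up       # build-up when below the indicated level
        outflow = max(-gap, 0.0) / tau_down   # loss when above the indicated level
        gr += dt * (inflow - outflow)
        path.append((t + dt, gr))
    return path

symmetric = simulate(step_target)                   # equivalent to a first-order smooth
asymmetric = simulate(step_target, tau_down=40.0)   # slower decline than build-up
for hour in (0, 25, 50, 75, 100):
    idx = int(hour / 0.125)
    print(f"t={hour:3d}  symmetric={symmetric[idx][1]:.3f}  asymmetric={asymmetric[idx][1]:.3f}")
```

The two runs are identical while the stock is rising, and diverge only once the target drops, which is exactly the kind of behavior the goal-seeking shorthand can't express.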


The first thing I test in most models is to vary TIME STEP to check stability. The usual trick is to halve TIME STEP and see if you get the same answer, but this model already runs for 92,160 time steps (128 steps per hour for 720 hours). I don’t see any delays that are small enough to require that, but you can’t always see implicit time constants in a model. Still, I wonder, could you get away with a coarser step?

If you double the TIME STEP, you immediately see differences:

This suggests that TIME STEP needs to be small (perhaps even smaller than its already-tiny value). But where does this come from? It turns out to be due to the test input. In the original, the external stress consists of a series of IF THEN ELSE statements, like:

IF((TIME>1) AND (TIME<1+Stress_Stimulus_Duration)) THEN (1) ELSE(0) + ...

I implemented this in the Vensim version via the PULSE TRAIN function. But there’s a small problem here: if the stress stimulus duration is not a power of 2, the effective width of the pulse will vary a little bit as you change the TIME STEP (assuming that it is a power of 2, as is usual to minimize roundoff error). That in turn means that the area under the curve of the stress perception inflow to the model varies slightly with the duration of the pulse.

Often, that won’t matter, but because this model has a numerically sensitive threshold, it matters a lot.

It turns out that if you renormalize the external stress input to deliver constant area under the curve (see the updated model for details), you can get away with a TIME STEP of .03125 – 4x bigger. I think one might carry this idea even further, and switch to RK4 Auto integration and make the test input smooth, but I haven’t tried that. Fortunately, all of this concerns the test input to the model alone; the dynamics are nearly unaffected.
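
Here's a small Python sketch of the pulse-width problem and one way to renormalize. It is an illustration under my own assumptions, not Vensim's PULSE TRAIN implementation and not the exact fix in the updated model: the pulse is taken to be on whenever start <= time < start + duration at an Euler step, and the function names are hypothetical.

```python
def pulse_area(start, duration, dt, height=1.0, final_time=4.0):
    """Area delivered by a rectangular pulse sampled on an Euler grid with step dt."""
    n_steps = int(round(final_time / dt))
    area = 0.0
    for k in range(n_steps):
        t = k * dt  # exact when dt is a power of 2
        if start <= t < start + duration:
            area += height * dt
    return area

def renormalized_height(start, duration, dt, height=1.0):
    """Scale the height so the sampled pulse delivers exactly height*duration."""
    raw = pulse_area(start, duration, dt, height=1.0)
    return height * duration / raw

for dt in (2**-7, 2**-6, 2**-5, 2**-4):
    raw = pulse_area(1.0, 0.2, dt)
    fixed = pulse_area(1.0, 0.2, dt, height=renormalized_height(1.0, 0.2, dt))
    print(f"dt={dt:.7f}  raw area={raw:.6f}  renormalized area={fixed:.6f}")
```

With a duration of 0.2, the raw area drifts as the step changes because the pulse can only last a whole number of steps. The renormalized version trades a slightly different pulse height for a constant area.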

Lookup Bounds

Next, I check runtime warnings. This model generates quite a few, all concerning lookups that are out of bounds, like:

WARNING: Lookup out of bounds at 24.125 In -#ProCyt Eff on TRP#- computing -ProCyt Eff on TRP-.

It might be OK to run off the ends of a lookup table, if the slope at the endpoints is zero. But I prefer to suppress these warnings by adding points to the ends of the lookup so that the needed domain is explicitly covered. Most of these turn out to be OK, but I modified them anyway to suppress the warnings:
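
For anyone who wants to see the mechanics, here's a Python sketch of a piecewise-linear lookup that clamps and warns when it runs off the ends, plus a helper that adds flat end points. The table values are made up, and the clamp-and-warn behavior is my assumption about what a typical lookup does, not a description of any particular tool's internals.

```python
import warnings
from bisect import bisect_right

def lookup(points, x):
    """Piecewise-linear lookup; clamps to the end values and warns when out of bounds."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if x < xs[0] or x > xs[-1]:
        warnings.warn(f"Lookup out of bounds at {x}")
        return ys[0] if x < xs[0] else ys[-1]
    j = min(bisect_right(xs, x), len(xs) - 1)  # index of the segment's right end
    frac = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + frac * (ys[j] - ys[j - 1])

def extend_flat(points, x_min, x_max):
    """Add end points with zero slope so the domain [x_min, x_max] is explicitly covered."""
    pts = list(points)
    if x_min < pts[0][0]:
        pts.insert(0, (x_min, pts[0][1]))
    if x_max > pts[-1][0]:
        pts.append((x_max, pts[-1][1]))
    return pts

table = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.5)]   # hypothetical lookup
extended = extend_flat(table, 0.0, 30.0)
print(lookup(table, 24.125))      # warns, returns the clamped end value
print(lookup(extended, 24.125))   # same value, no warning
```

Extending with flat segments leaves the behavior unchanged, so it's only appropriate where a zero slope beyond the last point is really what you mean.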

A few cases are hard to reconcile without more knowledge than I have. For example:

Above, the GR function effect has a small discontinuity. Its input (GR function) seems to be bounded at one, but adding (1,1) to the lookup would cause a break in the slope. I prefer to leave such warnings in place for later review.

Extreme Conditions

My next probe of a model is generally random Synthesim overrides of key stocks and flows, to see whether the model is robust to extreme disturbances. Generally, I’d say that this model is unusually robust, in that it’s hard to get stocks to go negative or produce other undesirable behavior. That’s good. Part of the reason for this may be that many of the relationships in the model are sigmoids, and therefore bounded above and below.

Here’s one example of a test: I replicate the “probable depression” scenario from the paper, and increase the size of the one-time external stress to 100 (2x). Then I look at every stock in the model to see what happens (the stocks are the state of the system, so if you look at those, you know everything). This is easy in Vensim if you create an instance of the Strip Graph that shows all levels:

One thing jumps out at me: the initial response of serotonin is opposite the long term response:

This is not unusual, and it’s mentioned in the paper. However, I got curious about its origins, so I started causal tracing to identify the source of the behavior. The answer is … it’s complicated. But along the way, I do notice some fairly extreme nonlinear behaviors, like this:

Is this realistic, or is it a consequence of lookup table clipping and log(x)/log(y) normalizations? I can’t say for sure, but this tests the limits of what I perceive as reasonable behavior. But then, if systems don’t occasionally surprise you with weird (but real) behavior, you’re not paying attention. This is something I’d flag for further investigation.


Ultimately, what we want out of this model is to identify interventions that can help people with immune/hormone/mood problems. The obvious way to get that advice out of the model is to do a lot of sensitivity analysis to test alternatives. Generically, we’re interested in two things:

  • Can you change the system state directly in some beneficial way, e.g. by administering a drug that supplies a hormone, or lowering stress?
  • Can you restructure the system, by changing parameters that govern the strength of feedback loops or adding/deleting links?

In a sense, these are all the same thing – a parameter is just a constant state that isn’t in the model (yet), and adding a link is like giving an implicit 0 parameter a nonzero value.

For the answers to make sense, you need a way to (a) influence each state, and (b) change the gain of each loop, preferably independently. In this model, that’s a bit tricky. Consider the effects of ProInflammatory Cytokines:

There are effects on stress, cortisol, and other things. Each has its own sigmoid-shaped lookup table transforming the input to the output. All the inputs are normalized to the same constant, Ref ProCyt. The normalization is good practice, but insufficient for testing purposes, because there’s no independent way to vary the gain on these loops by varying the shapes of the lookups. Yes, it’s possible to simply edit the curves, but that’s impractical for comprehensive, automated experimentation. Two approaches might be helpful:

1. Replace the lookups with parametric curves. Then the parameters can be varied to shift and stretch the functions. This is attractive because you get smooth behavior and a lot of flexibility. However, it’s a lot of work to implement. The functional forms may be arcane, and you can’t easily visualize them until you run the model. Here are a few sigmoid options I’ve collected over the years:

:MACRO: SSHAPE3(x,slope,lowlim,uplim,x0)
SSHAPE3 = lowlim+(uplim-lowlim)*(1/(1+exp(-4*slope*xe))) ~ Dmnl ~ defaults: lowlim = 0; uplim = 1; slope = 1; x0 = 1. This gives a symmetric S-shape from lowlim to uplim, with x0 (default 1) being the inflection point; the derivative at this point = slope*(uplim-lowlim) |
xe = MAX(-ZIDZ(25,4*ABS(slope)),MIN(x-x0,ZIDZ(25,4*ABS(slope)))) ~ Dmnl ~ Clip to avoid floating point errors at extreme positive / negative values of x |
:END OF MACRO:

:MACRO: SSHAPE2(input)
SSHAPE2 = exp(MAX(-50,MIN(50,input)))/(1+exp(MAX(-50,MIN(50,input)))) ~ Dmnl ~ Exponential s-shaped curve; -infinity -> 0, 0 -> .5, infinity -> 1 |
:END OF MACRO:

:MACRO: SSHAPE(xin,profile)
SSHAPE = IF THEN ELSE( input>0.5, 1-(1-input)^profile*0.5/0.5^profile, input^profile*0.5/0.5^profile) ~ Dmnl ~ S-shaped response, from 0-1 for input from 0-1. Profile should normally be >=1 (1=linear; 2=quadratic). Always passes through (0.5, 0.5) |
input = MIN(1,MAX(0,xin)) ~ xin ~ |
:END OF MACRO:

2. Apply scaling parameters around the lookups. For example, if you’re starting with:

output = lookup( input/reference input )

you can add:

output = reference output*lookup( input/reference input )^scale

or

output = reference output*( 1-scale + scale*lookup( input/reference input ))

These don’t give you full control over the upper and lower bounds, slopes and asymptotes of the table, and they might not work when the lookup doesn’t pass through some obvious point like (1,1) or (0,0). So, in some cases you may need to be cleverer, or to choose approach #1 instead.
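
If you do go with approach #1, it helps to preview the curve outside the model. Here's a Python port of the SSHAPE3 macro as I read it, with a numerical check of the slope at the inflection point. The port and the helper names are mine, and I'm assuming ZIDZ returns 0 when the divisor is 0.

```python
import math

def sshape3(x, slope=1.0, lowlim=0.0, uplim=1.0, x0=1.0):
    """Python port of the SSHAPE3 macro: logistic curve from lowlim to uplim centered on x0."""
    # ZIDZ is assumed to return 0 on division by zero, so a zero slope clips xe to 0.
    clip = 25.0 / (4.0 * abs(slope)) if slope != 0 else 0.0
    xe = max(-clip, min(x - x0, clip))
    return lowlim + (uplim - lowlim) / (1.0 + math.exp(-4.0 * slope * xe))

def numerical_slope(f, x, h=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-10.0, 0.0, 0.5, 1.0, 1.5, 2.0, 10.0):
    print(f"x={x:6.1f}  sshape3={sshape3(x, slope=2.0, lowlim=0.5, uplim=1.5):.4f}")
print("slope at x0:", numerical_slope(lambda x: sshape3(x, 2.0, 0.5, 1.5), 1.0))
```

The slope at x0 should come out close to slope*(uplim-lowlim), which makes the slope parameter a convenient handle on loop gain.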

STELLA lookups, and Vensim’s WITH LOOKUP function, don’t really lend themselves to this treatment – you have to add an additional variable to transform the output downstream of the lookup. That’s why I tend to prefer the original Vensim lookup syntax. However it’s implemented, I think some level of parametric control over lookup usage is essential.

After some noodling, I settled on the following policy:

  • For dimensionless parameters with (apparent) 0-1 bounds, or centered around 1, apply a scaling exponent, so y = y0*lookup(x/x0)^s
  • For parameters bounded below at 0, apply a scaling multiplier, so y = s*lookup(x/x0)
  • For parameters with log inputs, apply a shift of the input, so y = lookup(x/x0+s)
  • Where I couldn’t figure out what to do, or a loop already contains other independent scaling parameters, skip the item
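
The wrappers in that policy are simple enough to sketch in Python. Everything here is illustrative: the sigmoid stands in for a real lookup table, and the function names are hypothetical. The last wrapper is the blend form, y0*(1-s+s*lookup(x/x0)), which isn't in the list above but fits the same pattern.

```python
import math

def hypothetical_lookup(u):
    """Stand-in for a normalized lookup table; passes through (1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * (u - 1.0)))

def scaled_exponent(lookup_fn, x, x0, s=1.0, y0=1.0):
    """y = y0*lookup(x/x0)^s, for effects bounded 0-1 or centered around 1."""
    return y0 * lookup_fn(x / x0) ** s

def scaled_multiplier(lookup_fn, x, x0, s=1.0):
    """y = s*lookup(x/x0), for effects bounded below at 0."""
    return s * lookup_fn(x / x0)

def shifted_input(lookup_fn, x, x0, s=0.0):
    """y = lookup(x/x0 + s), for lookups driven by log inputs."""
    return lookup_fn(x / x0 + s)

def blended(lookup_fn, x, x0, scale=1.0, y0=1.0):
    """y = y0*(1 - scale + scale*lookup(x/x0)); scale = 0 switches the effect off."""
    return y0 * (1.0 - scale + scale * lookup_fn(x / x0))

for s in (0.0, 0.5, 1.0, 2.0):
    print(f"s={s:3.1f}  exponent={scaled_exponent(hypothetical_lookup, 8.0, 5.0, s):.4f}  "
          f"blend={blended(hypothetical_lookup, 8.0, 5.0, s):.4f}")
```

Because the stand-in lookup passes through (1, 1), the exponent and blend forms leave the reference point unchanged for any value of the scale, so the parameter changes the gain without moving the baseline.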

With scaling parameters in place, I ran an all-constants sensitivity analysis on the model, testing the effect of 10% variations in each parameter against the integrated serotonin level over the simulation. I started from 2 cases: the “daily stress” scenario (repeated small events), and the “probable depression” scenario (one large stress event). I then sorted the results by rank of influence on serotonin:

These are interesting in several ways:

  • The model is only moderately sloppy – many parameters have a strong effect, especially in the Daily Stress scenario.
  • There are big differences in sensitivity between the two scenarios, even though they differ only in the test input. This suggests that policies might have to be tailored to the stressor, among other things.
  • Some of the scale parameters on lookups are near the top of the list, confirming that testing lookup tables matters.

From a policy standpoint, you have to know a little more to make sense of these. What matters is not so much the response to a 10% change, but the response to an X% change, where X is the amount you could plausibly move a parameter. For example, preventing degradation of glucocorticoid receptors is clearly important (Ref GR Deg Fraction, top of list). However, the corresponding Permanent Degeneration Time is at the bottom of the list, presumably because a 10% change from 10 hours has only a tiny effect on the time horizon of the simulation. One would have to be more ambitious than that, but it might still be important.
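
The mechanics of this kind of all-constants ranking are easy to sketch in Python. The model below is a toy stand-in (a single stock with production and decay), not the stress model, and all parameter names and values are hypothetical. Only the procedure mirrors what I did: perturb each constant by 10% either way, measure the change in an integrated outcome, and sort.

```python
def integrated_output(params, dt=0.125, hours=100.0):
    """Toy stand-in model: returns the time integral of a stock with production and decay."""
    stock = params["initial"]
    total = 0.0
    for _ in range(int(round(hours / dt))):
        total += stock * dt
        stock += dt * (params["production"] - stock / params["decay_time"])
    return total

def rank_sensitivities(model, base, variation=0.10):
    """Vary each constant by +/- variation and rank by largest relative change in output."""
    base_out = model(base)
    results = []
    for name in base:
        changes = []
        for factor in (1.0 - variation, 1.0 + variation):
            trial = dict(base)
            trial[name] = base[name] * factor
            changes.append(abs(model(trial) - base_out) / abs(base_out))
        results.append((name, max(changes)))
    return sorted(results, key=lambda r: r[1], reverse=True)

base = {"initial": 1.0, "production": 2.0, "decay_time": 10.0}
ranking = rank_sensitivities(integrated_output, base)
for name, effect in ranking:
    print(f"{name:12s} {effect:.4f}")
```

Swapping the fixed 10% for a per-parameter plausible range is a one-line change, and it's the version that matters for policy.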

Bottom Line

While there are a few features that could be reexamined, this model stands up to hard use well. It would also have to pass the face validity test with people who actually know something about the system, but given the paper’s citation list, I would anticipate some success on that front.

I think there might be a lot of interesting policy implications lurking in this model, waiting for an intrepid explorer with more subject matter expertise than I have. I think the crucial point here is that the structure identifies a mechanism by which patient outcomes can be strongly path dependent, where positive feedback preserves a bad state long after harmful stimuli are removed. Among other things, this might explain why it’s so hard to treat such patients. That in turn could be a basis for something I’ve observed in the health system – that a lot of doctors find autoimmune diseases mysterious and frustrating, and respond with a variation on the fundamental attribution error – attributing bad outcomes to patient motivation when delayed, nonlinear feedback is responsible.


Biological Dynamics of Stress Response

At ISDC 2018, we gave the Dana Meadows Award for best student paper to Gizem Aktas, for Modeling the Biological Mechanisms that Determine the Dynamics of Stress Response of the Human Body (with Yaman Barlas). This is a very interesting paper that elegantly synthesizes literature on stress, mood, and hormone interactions. I plan to write more about it later, but for the moment, here’s the model for your exploration.

The dynamic stress response of the human body to stressors is produced by nonlinear interactions among its physiological sub-systems. The evolutionary function of the response is to enable the body to cope with stress. However, depending on the intensity and frequency of the stressors, the mechanism may lose its function and the body can go into a pathological state. Three subsystems of the body play the most essential role in the stress response: endocrine, immune and neural systems. We constructed a simulation model of these three systems to imitate the stress response under different types of stress stimuli. Cortisol, glucocorticoid receptors, proinflammatory cytokines, serotonin, and serotonin receptors are the main variables of the model. Using both qualitative and quantitative physiological data, the model is structurally and behaviorally well-validated. In subsequent scenario runs, we have successfully replicated the development of major depression in the body. More interestingly, the model can present quantitative representation of some very well acknowledged qualitative hypotheses about the stress response of the body. This is a novel quantitative step towards the comprehension of stress response in relation with other disorders, and it provides us with a tool to design and test treatment methods.

The original is a STELLA model; here I’ve translated it to Vensim and made some convenience upgrades. I used the forthcoming XMILE translation in Vensim to open the model. You get an ugly diagram (due to platform differences and XMILE’s lack of support for flow-clouds), but it’s functional enough to browse. I cleaned up the diagrams and moved them into multiple views to take better advantage of Vensim’s visual approach.

The model ran right away, though I had to add one MAX statement to handle a uniflow (not supported in Vensim, and something I remain allergic to). There’s actually an important lesson on model replication and calibration in this.

When I first translated the model, I ran a few scenarios, using the comprehensive replication instructions in the supplemental material for the paper. I built up a Vensim command script to make it easy to replicate all the scenarios in the paper. To do that, I had to modify the equations a bit, so that manual equation editing (in STELLA) could be replaced by automatic parameter changes.

Then I ran my script and eyeballed a few graphs. Things looked pretty good:

The same, right? Not so fast! If you look closely, you’ll find that the Vensim version (bottom) has 9 peaks instead of 10, due to my replacement of a cascade of IF … ELSE test inputs with a simpler PULSE TRAIN. When you fix the count, there are still issues, because the duration parameter for each pulse (0.2) is not an integral multiple of the TIME STEP. (Incidentally, differences arising from PULSE implementations are tricky – see Yutaka Takahashi’s poster from ISDC 2018).

It took me several iterations to work out what was going wrong. I found that, to really verify that the translation (plus my initially erroneous upgrades) was OK, I had to export a run from STELLA, import it as a dataset in Vensim, and compare behavior hour by hour. That’s how I discovered the subtle but important uniflow difference.

The fact that tiny differences in test input implementations matter highlights the extreme numerical sensitivity of the model. This is a feature, not a bug. It arises from positive feedback that creates sensitive thresholds in stress response: 5% more episodic stress can be the difference between routine recovery and total collapse.

For example, here’s a sensitivity experiment with external stress at 10, 20, 30, 40, 50 & 60 units:

Notice that for external stress <= 40, recovery is quick – hours to days. But somewhere above 40 is a nonlinear threshold, beyond which recovery takes weeks.

This .zip archive contains:

  • An updated source model (.stmx) from the author, used for the translation.
  • The translated model (.mdl and .vpm). This version won’t work in PLE because it uses macros, but you can use the free Model Reader to run it.
  • Command scripts for replicating the paper’s scenarios, plus the vector of stress levels above.

StressResponseModel_converted 7.zip

Update: StressResponseModel_converted 7b.zip fixes a unit error in a test input (my mistake) – this version is closest to the original in the paper.

Update 2: StressResponseModel_converted 8.zip has an improved control panel and runs 4x faster. It departs from the original to improve sensitivity analysis capability and pulse test stability, but remains dynamically identical (as far as I can determine).

The original paper and supplementary material should be in the conference submission system.

Stay tuned for more on this topic! Here’s a detailed critique & analysis.

Not all models are wrong.

Box’s famous comment, that “all models are wrong,” gets repeated ad nauseam (even by me). I think it’s essential to be aware of this in the sloppy sciences, but it does a disservice to modeling and simulation in general.

As far as I’m concerned, a lot of models are basically right. I recently worked with some kids on an air track experiment in physics. We timed the acceleration of a sled released from various heights, and plotted the data. Then we used a quadratic fit, based on a simple dynamic model, to predict the next point. We were within a hundredth of a second, confirmed by video analysis.

Sure, we omitted lots of things, notably air resistance and relativity. But so what? There’s no useful sense in which the model was “wrong,” anywhere near the conditions of the experiment. (Not surprisingly, you can find a few cranks who contest Newton’s laws anyway.)

I think a lot of uncertain phenomena in social sciences operate on a backbone of the same kind of “physics.” The future behavior of the government is quite unpredictable, but there isn’t much uncertainty about accounting, e.g., that increasing the deficit increases the debt.

The domain of wrong but useful models remains large (within an even bigger sea of simple ignorance), but I think more and more things are falling into the category of models that are basically right. The trick is to be able to spot the difference. Some people clearly can’t:

A&G provide no formal method to distinguish between situations in which models yield useful or spurious forecasts. In an earlier paper, they claimed rather broadly,

‘To our knowledge, there is no empirical evidence to suggest that presenting opinions in mathematical terms rather than in words will contribute to forecast accuracy.’ (page 1002)

This statement may be true in some settings, but obviously not in general. There are many situations in which mathematical models have good predictive power and outperform informal judgments by a wide margin.

I wonder how well one could do with verbal predictions of a simple physical system? Score one for the models.

Dynamic Cohorts

This is the model library entry for my ISDC 2017 plenary paper with Larry Yeager on dynamic cohorts in Ventity:

Dynamic cohorts: a new approach to managing detail

While it is desirable to minimize the complexity of a model, some problems require the detailed representation of heterogeneous subgroups, where nonlinearities prevent aggregation or explicit chronological aging is needed. It is desirable to have a representation that avoids burdening the modeler or user computationally or cognitively. Eberlein & Thompson (2013) propose continuous cohorting, a novel solution to the cohort blending problem in population modeling, and test it against existing aging chain and cohort-shifting approaches. Continuous cohorting prevents blending of ages and other properties, at some cost in complexity.

We propose another new solution, dynamic cohorts, that prevents blending with a comparatively low computational burden. More importantly, the approach simplifies the representation of distinct age, period and cohort effects and representation of dynamics other than the aging process, like migration and attribute coflows. By encapsulating the lifecycle of a representative cohort in a single entity, rather than dispersing it across many states over time, it makes it easier to develop and explain the model structure.

Paper: Dynamic Cohorts P1363.pdf

Models: Dynamic Cohorts S1363.zip

Presentation slides: Dynamic Cohorts Fid Ventana v2b.pdf

I’ve previously written about this here.