The Mysterious TREND Function

This post is literally trending.

Most System Dynamics software includes a pair of TREND and FORECAST functions. For historic reasons, these are typically the simplest possible first-order structure, which is fine for continuous, deterministic models, but not the best for applications with noise or real data. The waters are further muddied by the fact that Excel has a TREND function that’s really FORECASTing, plus newer FORECAST functions with methods that may differ from typical SD practice. Business Dynamics describes a third-order TREND function that’s considerably better for real-world applications.

As a result of all this variety, I think trend measurement and forecasting remain unnecessarily mysterious, so I built the model below to compare several approaches.

Goal

The point of TREND and FORECAST functions is to model the formation of expectations in a way that closely matches what people in the model are really doing.

This could mean a wide variety of things. In many cases, people aren’t devoting formal thought to observing and predicting the phenomenon of interest. In that case, adaptive expectations may be a good model. The implementation in SD is the SMOOTH function. Using a SMOOTH to set expectations says that people expect the future to be like the past, and they perceive changes in conditions only gradually. This is great if the forecasted variable is in fact stationary, or at least if changes are slow compared to the perception time. On the other hand, for a fast-evolving situation like COVID19, delay can be fatal – literally.

For anything that is in fact changing (or that people perceive to be changing), it makes sense to project changes into the future with some kind of model. For a tiny fraction of reality, that might mean a sophisticated model: multiple regression, machine learning, or some kind of calibrated causal model, for example. However, most things are not subject to that kind of sophisticated scrutiny. Instead, expectations are likely to be formed by some kind of simple extrapolation of past trends into the future.

In some cases, things that are seemingly modeled in a sophisticated way may wind up looking a lot like extrapolation, due to human nature. The forecasters form a priori expectations of what “good” model projections look like, based on fairly naive adaptive-extrapolative expectations and social processes, and use those expectations to filter the results that are deemed acceptable. This makes the sophisticated results look a lot like extrapolation. However, the better the model, the harder it is for this to happen.

More Oil Price Forecasts

The goal, by the way, is generally not to use trend-like functions to make a forecast. Extrapolation may be perfectly reasonable in some cases, particularly where you don’t care too much about the outcome. But generally, you’re better off with a more sophisticated model – the whole point of SD and other methods is to address the feedback and nonlinearities that make extrapolation and other simpleminded methods go wrong. On the other hand, simple extrapolation may be great for creating a naive or null forecast to use as a benchmark for comparison with better approaches.

Basics

So, let’s suppose you want to model the expectations for something that people perceive to be (potentially) steadily increasing or decreasing. You can visit St. Louis FRED and find lots of economic series like this – GDP, prices, etc. Here’s the spot price of West Texas Intermediate crude oil:

Given this data, there are immediately lots of choices. Thinking about someone today making an investment conditional on future oil prices, should they extrapolate linearly (black and green lines) or exponentially (red line)? Should they use the whole series (black and red) or just the last few years (green)? Each of these implies a different forecast for the future.

Suppose we have some ideas about the forecast horizon, desired sensitivity to noise, etc. How do we actually establish a trend? One option is linear regression, which is just a formal way of eyeballing a straight line that fits some data. It works well, but has some drawbacks. First, it assigns equal weight to all the data throughout the interval, and zero weight to anything outside the interval. That may be a poor model for perceptual processes, where the most recent data has the greatest salience to the decision maker. Second, it’s computation- and storage-intensive: you have to do a lot of math, and keep track of every data point within the window of interest. That’s fine if it resides in a spreadsheet, but not if it resides in someone’s head.

Linear fit to a subset of the WTI spot price data.

The trend-like functions make an elegant simplification that addresses the drawbacks of regression. It’s based on the following observation:*

If, as above, you take a growing input (red line) and smooth it exponentially (using the SMOOTH function, or an equivalent first order goal-gap structure), you get the blue line: another ramp, that lags the input with a delay equal to the smoothing time. This means that, at month 400, we know two points: the current value of the input, and the current value of the smoothed input. But the smoothed value represents the past value of the input, in this case 60 months previous. So, we can use these two points to determine the slope of the red line:

(1) slope = (current - smoothed) / smoothing time

This is the slope in terms of input units per time. It’s often convenient to compute the fractional slope instead, expressing the growth as a fractional increase in the input per unit time:

(2) fractional slope = (current - smoothed) / smoothed / smoothing time

This is what the simple TREND functions in SD software typically report. Note that it blows up if the smoothed quantity reaches 0, while the linear method (1) does not.

If we think the growth is exponential, rather than a linear ramp, we can compute the growth rate in continuous time:

(3) fractional growth rate = LN( current / smoothed ) / smoothing time

This has pros and cons. Obviously, if a quantity is really growing exponentially, it should be measured that way. But if we’re modeling how people actually think, they may extrapolate linearly when the underlying behavior is exponential, thereby greatly underestimating future growth. Note that the very idea of forecasting exponentially assumes that the values involved are positive.

Once you know the slope of the (estimated) line, you can extrapolate it into the future via a method that corresponds with the measurement:

(1b) future value = current + slope * forecast horizon
(2b) future value = current * (1 + fractional slope * forecast horizon)
(3b) future value = current * EXP( fractional growth rate * forecast horizon )

Typical FORECAST functions use (2b).

*There’s a nice discussion of this in Appendix L of Industrial Dynamics, around figure L-3.

Refinements

The strategy above has the virtue of great simplicity: you only need to keep track of one extra stock, and the computation needed to extrapolate is minimal. It works great for continuous models. Unfortunately, it’s not very resistant to noise and discontinuities. Consider what happens if the input is not a smooth line, but a set of noisy points scattered around the line:

The SMOOTH function filters the data, so the past point (blue) may still be pretty close to the underlying input trend (red line). However, the extrapolation (orange line) relies only on the past point and the single current point. Any noise or discontinuity in the current point therefore can dramatically influence the slope estimate and future projections. This is not good.

Similar perverse behaviors happen if the input is a pulse or step function. For example:

Fortunately, simple functions can be saved. In Expectation Formation in Behavioral Simulation Models, John Sterman describes an alternative third-order TREND function that improves robustness and realism. The same structure can be found in the excellent discussion of expectations in Business Dynamics, Chapter 16.

I’ll leave the details to the article, but the basic procedure is:

  • Recognize that the input is not perceived instantaneously, but only after some delay (represented by smoothing). This might capture the fact that formal accounting procedures only report results with a lag, or that you only see the price of cheese at the supermarket intermittently.
  • Track a historic point (the Reference Condition), by smoothing, as in the simpler methods.
  • Measure the Indicated Trend as the fractional slope between the Perceived Present Condition and the Reference Condition.
  • Smooth the Indicated Trend again to form the final Perceived Trend. The smoothing prevents abrupt changes in the indicated trend from causing dramatic overshoots or undershoots in the trend estimate and extrapolations that use it.

There’s an intermediate case that’s actually what I’m most likely to reach for when I need something like this: second-order smoothing. There are actually several very similar approaches (see double exponential smoothing for example) in the statistical literature. You have to be a little cautious, because these are often expressed in discrete time and therefore require a little thought to adapt to continuous time and/or unequal data intervals.

The version I use does the following:

(4) smoothed input = SMOOTH( input, smoothing time )

(5) linear trend = (input-smoothed input) / smoothing time
(6) smoothed trend = SMOOTH( linear trend, trend smoothing time )
(7) forecast = smoothed input + smoothed trend*(smoothing time + forecast horizon)

This provides most of what you want in a simple extrapolation method. It largely ignores a PULSE disturbance. Overshoot is mild when presented with a STEP input (as long as the smoothing times are long enough). It largely rejects noise, but still tracks a real RAMP accurately.

Back to regression

SD models typically avoid linear regression, for a reasons that are partly legitimate (as mentioned above). But it’s also partly cultural, as a reaction to incredibly stupid regressions that passed for models in other fields around the time of SD’s inception. We shouldn’t throw the baby out with that bathwater.

Fortunately, while most software doesn’t make linear regression particularly accessible, it turns out to be easy to implement an online regression algorithm with stocks and flows with no storage of data vectors required. The basic insight is that the regression slope (typically denoted beta) is given by:

(8) slope = covar(x,y) / var(x)

where x is time and y is the input to be forecasted. But var() and covar() are just sums of squares and cross products. If we’re OK with having exponential weighting of the regression, favoring more recent data, we can track these as moving sums (analogous to SMOOTHs). As a further simplification, as long as the smoothing window is not changing, we can compute var(x) directly from the smoothing window, so we only need to track the mean and covariance, yielding another second-order smoothing approach.

If the real decision makers inspiring your model are actually using linear regression, this may be a useful way to implement it. The implementation can be extended to equal weighting over a finite interval if needed. I find the second-order smoothing approach more intuitive, and it performs just as well, so I tend to prefer that in most cases.

Extensions

Most of what I’ve described above is linear, i.e. it assumes linear growth or decline of the quantity of interest. For a lot of things, exponential growth will be a better representation. Equations (3) and (3b) assume that, but any of the other methods can be adapted to assume exponential behavior by operating on the logarithm of the input, and then inverting that with exp(…) to form the final output.

All the models described here share one weakness: cyclical inputs.

When presented with a sin wave, the simplest approach – smoothing – just bulldozes through. The higher the frequency, the less of the signal passes into the forecast. The TREND function can follow a wave if the period is longer than the smoothing time. If the dynamics are faster, it starts to miss the turning points and overshoot dramatically. The higher-order methods are better, but still not really satisfactory. The bottom line is that your projection method must use a model capable of representing the signal, and none of the methods above embodies anything about cyclical behavior.

There are lots of statistical approaches to detection of seasonality, which you can google. Many involve binning techniques, similar to those described in Appendix N of Industrial Dynamics, Self Generated Seasonal Cycles.

The model

The Vensim model, with changes (.cin) files implementing some different experiments:

trendy 4b.zip

I developed this in DSS and forgot to test PLE before uploading. If you have issues, please comment and I’ll adapt it to work.

A Ventity version is on the way.

Steady State Growth in SIR & SEIR Models

Beware of the interpretation of R0, and models that plug an R0 estimated in one context into a delay structure from another.

This started out as a techy post about infection models for SD practitioners interested in epidemiology. However, it has turned into something more important for coronavirus policy.

It began with a puzzle: I re-implemented my conceptual coronavirus model for multiple regions, tuning it for Italy and Switzerland. The goal was to use it to explore border closure policies. But calibration revealed a problem: using mainstream parameters for the incubation time, recovery time, and R0 yielded lukewarm growth in infections. Retuning to fit the data yields R0=5, which is outside the range of most estimates. It also makes control extremely difficult, because you have to reduce transmission by 1-1/R0 = 80% to stop the spread.

To understand why, I decided to solve the model analytically for the steady-state growth rate in the early infection period, when there are plenty of susceptible people, so the infection rate is unconstrained by availability of victims. That analysis is reproduced in the subsequent sections. It’s of general interest as a way of thinking about growth in SD models, not only for epidemics, but also in marketing (the Bass Diffusion model is essentially an epidemic model) and in growing economies and supply chains.

First, though, I’ll skip to the punch line.

The puzzle exists because R0 is not a complete description of the structure of an epidemic. It tells you some important things about how it will unfold, like how much you have to reduce transmission to stop it, but critically, not how fast it will go. That’s because the growth rate is entangled with the incubation and recovery times, or more generally the distribution of the generation time (the time between primary and secondary infections).

This means that an R0 value estimated with one set of assumptions about generation times (e.g., using the R package R0) can’t be plugged into an SEIR model with different delay structure assumptions, without changing the trajectory of the epidemic. Specifically, the growth rate is likely to be different. The growth rate is, unfortunately, pretty important, because it influences the time at which critical thresholds like ventilator capacity will be breached.

The mathematics of this are laid out clearly by Wallinga & Lipsitch. They approach the problem from generating functions, which give up simple closed-form solutions a little more readily than my steady-state growth calculations below. For example, for the SEIR model,

R0 = (1 + r/b1)(1 + r/b2)           (Eqn. 3.2)

Where r is the growth rate, b1 is the inverse of the incubation time, and b2 is the inverse of the recovery time. If you plug in r = 0.3/day, b1 = 1/(5 days), b2 = 1/(10 days), R0 = 10, which is not plausible for COVID-19. Similarly, if you plug in the frequently-seen R0=2.4 with the time constants above, you get growth at 8%/day, not the observed 30%/day.

There are actually more ways to get into trouble by using R0 as a shorthand for rich assumptions in models. Stochastic dynamics and network topology matter, for example. In The Failure of R0, Li Blakely & Smith write,

However, in almost every aspect that matters, R 0 is flawed. Diseases can persist with R 0 < 1, while diseases with R 0 > 1 can die out. We show that the same model of malaria gives many different values of R 0, depending on the method used, with the sole common property that they have a threshold at 1. We also survey estimated values of R 0 for a variety of diseases, and examine some of the alternatives that have been proposed. If R 0 is to be used, it must be accompanied by caveats about the method of calculation, underlying model assumptions and evidence that it is actually a threshold. Otherwise, the concept is meaningless.

Is this merely a theoretical problem? I don’t think so. Here’s how things stand in some online SEIR-type simulators:

Model R0 (dmnl) Incubation (days) Infectious (days) Growth Rate (%/day)
My original 3.3  5  7  17
Homer US 3.5  5.4  11  18
covidsim.eu 4  4 & 1  10  17
Epidemic Calculator 2.2  5.2  2.9  30*
Imperial College 2.4 5.1 ~3** 20***

*Observed in simulator; doesn’t match steady state calculation, so some feature is unknown.

**Est. from 6.5 day mean generation time, distributed around incubation time.

***Not reported; doubling time appears to be about 6 days.

I think this is certainly a Tower of Babel situation. It seems likely that the low-order age structure in the SEIR model is problematic for accurate representation of the dynamics. But it also seems like piecemeal parameter selection understates the true uncertainty in these values. We need to know the joint distribution of R0 and the generation time distribution in order to properly represent what is going on.

Steady State Growth – SIR

Continue reading “Steady State Growth in SIR & SEIR Models”

Misadventures with Little’s Law

I’ve been working on a vehicle fleet model, re-implementing a spreadsheet in Ventity, using dynamic cohorts.

The vehicle lifetime in the spreadsheet is 11 years, and it’s discrete. This means that every vehicle retires precisely 11 years after it’s put into service. This raised a red flag for me, because it represents a rather short vehicle lifetime. I know from work in other jurisdictions that the average life of a vehicle is more like 16-18 years typically (and getting longer as quality improves).

So, where does the 11 year figure come from? We’re not sure. Other published data for the region indicates an average vehicle age of 8.5 years, so it’s not that. A Ventana colleague pointed out that it might be a steady-state estimate from combining vehicle fleet data with new vehicle sales data:

 

Given the data (red), assume that the vehicle stock is in equilibrium (inflow=outflow). Then it follows from Little’s Law that the average lifetime of vehicles must be 11 years. Little’s Law works regardless of the delay distribution, i.e. regardless of the delay order, but if you were formulating the fleet as a first-order system, that’s precisely how you’d write the outflow equation: outflow = fleet/lifetime, with lifetime=11 years.

… the long-term average number L of customers in a stationary system is equal to the long-term average effective arrival rate λ multiplied by the average time W that a customer spends in the system. – Wikipedia

However, there’s a danger here. The system might not be in equilibrium. Then both the assumption of inflow=0utflow and the stationarity required in Little’s Law. Vehicle sales are, unfortunately, rather volatile, particularly around events like the 2008 recession:

It’s tempting to use the average age of vehicles as another data point, but that turns out to be a bad idea. The average age of vehicles is sensitive to both variations in the inflow and the assumed distribution of the discard process. The following Ventity model illustrates this problem, using some of the same machinery as last week’s Erlang model.

As before, there’s a population of entities (agents). Each has a cascade of N internal states, represented by a stock counter, and an age that increases continuously. An entity deletes itself when it’s too old, or its state count is too high.

For accounting purposes, when an entity “dies” it records the event by incrementing counter stocks in the Model entity:

In this way, we can keep track of how old the average entity was at the time it deleted itself. This should be the average residence time in Little’s Law. We can also track the average age of existing entities, to see whether it’s the same.

First, consider a very simple, very nonstationary special case, in which there’s no flow of entity turnover. There’s only an initial population of entities of age 0, who gradually leave the system. Here are three variants of that experiment:

Set Model.Delay tau = 50 and Model.Flow Start Time = 1000 to replicate this experiment.

The blue line is the stochastic population analog of the classic first-order delay. The probability of a given entity departing is constant over time, as for radioactive decay. Therefore we get exponential decay, with count = N0*exp(-time/Delay tau). The red line is the third-order equivalent, yielding an Erlang 3 distribution. The green line is the pipeline delay equivalent, in which all entities self-delete at a specified age, rather than with a random distribution. Therefore the population steps from 1000 to 0 at time 50.

The two lower panels compare the average age of surviving entities (middle) to the average age at which entities self-delete (bottom). At bottom, you can see that all variants eventually converge to (roughly) the expected 50-year entity lifespan. However, each trajectory initially indicates a shorter lifespan. This is due to a form of censoring bias – at a given point in time, the longest-lived entities have not yet been observed.

The middle panel indicates how average age can mislead. In this case, age=time for all entities, and therefore the average age increases linearly, even though the expected residence time is constant.

At the opposite extreme, here’s an experiment with a constant flow of new agents, so that the system is in equilibrium after a few time constants:

Set Model.Delay tau = 20 and Model.Flow Start Time = 0 to replicate this experiment.

After the initial transient has died out (by time 20 to 60), all 3 residence times (age at deletion) converge to the expected value of 20. But notice the ages. They converge, too, but the value is dependent on the distribution. For the 1st-order system (blue), the average age does equal the average residence time of 20 years. But the pipeline system (green) has an average age that’s half that, at 10 years. This makes sense, if you think about an equilibrium population composed of a uniform mix of ages between 0 and 20 years. The 3rd-order system is in between.

This uncertain relationship between age and residence time means that we can’t use the average age of the vehicle fleet to determine the rate of vehicle turnover. That’s too bad, because age is the one statistic that’s easy to compute from a database of vehicle registrations. To know more, we have to start making inferences about the inflows and outflows – but that’s tricky if data coverage varies with time. Unfortunately, this is a number that we care about, because the residence time of vehicles in the system is an important driver of future penetration of low-carbon technologies.

The model: AgentAge2.zip

The Delay Sandbox can be used to explore similar phenomena in a continuous, aggregate, deterministic setting.

Aging Chains and the Erlang Distribution

My Delay Sandbox model illustrates the correspondence between Nth-order delays and the Erlang distribution (among other things).

Delay Sandbox

This model provides some similar insights – this time in Ventity. It’s a hybrid of classic continuous SD and agent equivalents.

First, the Erlang3 entitytype compares the classic 3rd-order aging chain’s behavior to analytical equivalents, as in the Delay Sandbox. The analytic values are computed in a set of Ventity’s new macros:

Notice that the variances, which arise from Euler integration with a finite time step, are small enough to be uninteresting.

Second, the model compares the dynamics of discrete agent populations to the analytic Erlang results. To do this, the Model entity creates populations of agents at time 0, and (for comparison) computes the expected surviving population according to the Erlang distribution:

The agents live for a time, then self-delete according to two different strategies:

On the left, an agent tracks its own age, and has an age-specific probability of mortality (again, thanks to the hazard rate of the Erlang distribution). On the right, an agent has a state counter, and mortality occurs when the number of state transitions reaches 3.

We can then compare the surviving agent populations (blue) to the Erlang expectation (red):

When the population is small (above, 100), there’s some stochastic variation around the expected result. But for larger populations, the difference is negligible.

The model: Erlang3 4 (2).zip

A System Zoo

I just picked up a copy of Hartmut Bossel’s excellent System Zoo 1, which I’d seen years ago in German, but only recently discovered in English. This is the first of a series of books on modeling – it covers simple systems (integration, exponential growth and decay), logistic growth and variants, oscillations and chaos, and some interesting engineering systems (heat flow, gliders searching for thermals). These are high quality models, with units that balance, well-documented by the book. Every one I’ve tried runs in Vensim PLE so they’re great for teaching.

I haven’t had a chance to work my way through the System Zoo 2 (natural systems – climate, ecosystems, resources) and System Zoo 3 (economy, society, development), but I’m pretty confident that they’re equally interesting.

You can get the models for all three books, in English, from the Uni Kassel Center for Environmental Systems Research – it’s now easy to find a .zip archive of the zoo models for the whole series, in Vensim .mdl format, on CESR’s home page: www2.cesr.de/downloads.

To tantalize you, here are some images of model output from Zoo 1. First, a phase map of a bistable oscillator, which was so interesting that I built one with my kids, using legos and neodymium magnets:

Continue reading “A System Zoo”

Delay Sandbox

There’s a handy rule of thumb for estimating how much of the input to a first order delay has propagated through as output: after three time constants, 95%. (This is the same as the rule for estimating how much material has left a stock that is decaying exponentially – about a 2/3 after one lifetime, 85% after two, 95% after three, and 99% after five lifetimes.)

I recently wanted rules of thumb for other delay structures (third order or higher), so I built myself a simple model to facilitate playing with delays. It uses Vensim’s DELAY N function, to make it easy to change the delay order.

Here’s the structure:

Continue reading “Delay Sandbox”

The Rise and Fall of the Saturday Evening Post

Replicated by David Sirkin and Julio Gomez from Hall, R. I. 1976. A system pathology of an organization: The rise and fall of the old Saturday Evening Post. Administrative Science Quarterly 21(2): 185-211. (JSTOR link). Just updated for newer Vensim versions.

This is one of the classic models on the Desert Island Dynamics list.

There are some units issues, preserved from the original by David and Julio. As I update it, I also wonder if there are some inconsistencies in the accounting for the subscription pipeline. Please report back here if you find anything interesting.

satevepost2011b.mdl

satevepost2011b.vmf

Fibonacci Rabbits

This is a small, discrete time model that explores the physical interpretation of the Fibonacci sequence. See my blog post about this model for details.

Fibonacci2.vpm This runs with Vensim PLE, but users might want to use the Model Reader in order to load the included .cin file with non-growing eigenvector settings.

Oscillation from a purely positive loop

Replicated by Mohammad Mojtahedzadeh from Alan Graham’s thesis, or created anew with the same inspiration. He created these models in the course of his thesis work on structural analysis through pathway participation matrices.

Alan Graham, 1977. Principles on the Relationship Between Structure and Behavior of Dynamic Systems. MIT Thesis. Page 76+

These models are pure positive feedback loops that don’t exhibit exponential growth (under the right initial conditions). See my blog post for a discussion of the details.

These are generic models, and therefore don’t have units. All should run with Vensim PLE, except the generic gain matrix version which uses arrays and therefore requires an advanced version or the Model Reader.

The original 4th order model, replicated from Alan’s thesis: PurePosOscill4.vpm – note that this includes a .cin file with an alternate stable initialization.

My slightly modified version, permitting initialization with different gains at each level: PurePosOscill4alt.vpm

Loops of different orders: 3.vpm 6.vpm 8.vpm 12.vpm (I haven’t spent much time with these. It appears that the high-order versions transition to growth rather quickly – my guess is that this is an artifact of numerical precision, i.e. any tiny imprecision in the initialization introduces a bit of the growth eigenvector, which quickly swamps the oscillatory signal. It would be interesting to try these in double precision Vensim to see if I’m right.)

Stable initializations: 2stab.vpm 12stab.vpm

A generic version, representing a system as a generic gain matrix, so you can use it to explore any linear unforced variant: Generic.vpm