This is a great talk by Nate Osgood on the intersection of systems and complexity with data science and machine learning:
Obey these simple rules to avoid garbage-in->garbage-out.
There’s a lot of art to modeling, and more generally to managing complex systems. But there’s also some craft to it: simple, mechanical steps that must be followed, almost without exception. Woodworkers know that when you’re using a chisel or plane, you cut with the grain, not across it. Knowing that isn’t sufficient to make a nice-looking chair, but at least your funny-looking chair won’t have ugly tearout.
So what are the rules for classic System Dynamics? Here are a few, each listed by the sin to avoid:
- Unbalanced or missing units. It’s possible to build a correct model without units, but most people (including me) are unlikely to manage it. Even if the model is right in some sense, without units it’s still unintelligible to others.
- No FONFOO. Every physical stock needs First-Order Negative Feedback On the Outflows. This means the equations ensure that the outflow goes to 0 as the stock goes to 0 – not after a while, but now and forever. This ensures conservation of stuff: no inventory -> no sales. Nonphysical stocks often require this treatment as well, unless negative values are permitted by definition.
- Embedded parameters. A colleague just found an equation in a spreadsheet model reading something like =A2*EXP(-C4/C1) + 4. The “4” was just an arbitrary fudge factor on the answer. This should never happen; anything more complex than the 1 in 1/x should always be exposed as a distinct, named variable with appropriate units.
- Corollary: the embedded parameter often represents an implicit goal. For example, in inventory adjustment = (1000-inventory)/inventory adj time, the goal of 1000 units should be made explicit.
- Discrete time. Generally, your model should be independent of the TIME STEP and simulation method. Decision rules should integrate information smoothly, not at arbitrary point lags.
- Discrete logic. Sometimes I see equations that involve big cascades of logical statements: IF THEN ELSE( inventory < 100 :AND: price > 2, do x, IF THEN ELSE( inventory > 200 :AND: expected sales > inventory/desired coverage, do y, IF THEN ELSE( … Constructions like this are hard to read and hard to debug, and they often fail important reality checks. They might be appropriate in tactical cases where reality has discernible, discrete rules. But they’re seldom helpful in strategic models involving the aggregate behavior of many agents or objects.
- Overuse of delays. Every feedback loop must include a stock. This is a consequence of “time is what keeps everything from happening at once.” If there’s no integration in a loop, then feedback would run infinitely fast. Sometimes, confronted with an apparently simultaneous loop, modelers just insert a SMOOTHI or similar function that contains a stock. This may not be good enough; the stock in the loop can’t be arbitrary; it has to have real meaning.
- It’s also possible to commit the opposite sin: underuse of delays. Perceptions lag reality, and people often underestimate the extent to which this is true. Decision rules in your model should reflect this, but I think it’s more a matter of art than craft.
- Taking the cream out of the coffee. Suppose you have a stock of people, with a coflow of money used to keep track of the average wealth of people in the stock. It’s then tempting to handle a thought experiment like, “ok, what if all the rich people leave the country?” by siphoning off a greater-than-average share of the money alongside each departing person. This violates the assumption that a stock is the complete representation of system state. What if, for example, the rich people already left, so that the remainder are uniformly poor? If the distinction is important, you simply must disaggregate the people into classes.
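Two of the rules above (no embedded parameters, and independence from TIME STEP) are easy to illustrate outside Vensim. The following is a minimal Python sketch, not the author's model: the goal of 1000 units comes from the inventory adjustment example, while the 4-month adjustment time, the 12-month horizon, the starting inventory of zero and the function name are assumptions made for illustration.

```python
def simulate_inventory(target_inventory, inventory_adj_time,
                       initial_inventory, final_time, time_step):
    """Euler-integrate: inventory adjustment = (target - inventory) / adj time."""
    inventory = initial_inventory                      # widgets
    for _ in range(round(final_time / time_step)):
        # The goal is a named parameter, not a literal 1000 buried in the equation.
        adjustment = (target_inventory - inventory) / inventory_adj_time  # widgets/month
        inventory += time_step * adjustment
    return inventory

# Assumed values: 4-month adjustment time, 12-month run, empty initial inventory.
coarse = simulate_inventory(1000.0, 4.0, 0.0, 12.0, time_step=0.25)
fine = simulate_inventory(1000.0, 4.0, 0.0, 12.0, time_step=0.125)

# Halving the time step should barely move the answer.
relative_change = abs(coarse - fine) / fine
print(round(coarse, 1), round(fine, 1), relative_change)
```

If halving the time step changed the result materially, that would suggest the behavior is an artifact of the integration rather than of the structure.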
Like all rules, these are made to be broken, but exceptions are rare, and require that you really know what you’re doing. They are important because they ensure compliance with Reality Checks that should remain inviolate for strong reasons. If your population model isn’t conserving people, you have a problem.
Incidentally, at least half of these are mentioned in Appendix O of Industrial Dynamics, “Beginners’ Difficulties.” However, these are not just tricks for beginners: everyone can benefit from keeping them in mind, just as professional pilots rely on checklists.
I’m eager to hear your thoughts in the comments. What rules did I miss?
* Update: edited slightly for parallelism of the headers.
Every physical stock needs First Order Negative Feedback On Outflows.
I’ve been approached several times recently with questions about stocks behaving badly. All involved a construction something like the following:
This is a simple inventory control system, in which I’ve short-circuited the production start feedback by making Starting exogenous and equal to the desired sales rate. Therefore, there are really only two interesting equations:
Shipping=desired sales rate
Units: widgets/Month

Completing=DELAY3( Starting, production time )
Units: widgets/Month
Notice that there’s a violation of standard practice here, in that there’s a flow-to-flow connection from Starting to Completing. This is due to the DELAY3 function, which is shorthand for an explicit 3rd-order delay:
The 3rd-order delay is often a realistic compromise between a 1st-order system, in which the first completions arrive too quickly after Starting, and a pipeline delay or conveyor, which has too little dispersion to represent an aggregation of many items. (See the Delay Sandbox and Erlang models for examples.)
So, how can we break this model?
I always like to start with some tests in Synthesim. A good one is to stress the system with a step in the desired sales rate, here from 100 to 120. You can immediately spot a problem:
Inventory goes negative, because Shipping proceeds, even when inventory is exhausted. That can’t happen in reality, but it happens here because Shipping is not a function of Inventory. There’s a simple fix:
Shipping=MIN(desired sales rate, Inventory/min shipping time)
Units: widgets/Month
Above, min shipping time is a time constant representing the minimum time needed to deplete inventory. It’s common to set min shipping time = TIME STEP in situations where you want to prevent negative inventory, and the precise dynamics of inventory exhaustion are not central to the model. (If it matters, see Dynamics of the Last Twinkie.)
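For readers without Vensim at hand, here is a rough Python sketch of the same experiment, with Starting exogenous and equal to the desired sales rate. It is an illustration under assumptions rather than the original model: the initial inventory of 50 widgets, the 3-month production time, the step at month 1 and the Euler time step are all made up, and the DELAY3 is written out as an explicit three-stage chain.

```python
def min_inventory(use_fonfoo, final_time=24.0, dt=0.0625):
    """Return the lowest Inventory seen after a 100 -> 120 step in desired sales."""
    production_time = 3.0                 # months (assumed)
    stage_time = production_time / 3.0    # each DELAY3 stage holds 1/3 of the delay
    min_shipping_time = dt                # mirrors min shipping time = TIME STEP
    stages = [100.0 * stage_time] * 3     # WIP stages, initialized in equilibrium
    inventory = 50.0                      # widgets (assumed)
    lowest = inventory
    for i in range(round(final_time / dt)):
        t = i * dt
        desired_sales_rate = 100.0 if t < 1.0 else 120.0
        starting = desired_sales_rate     # exogenous Starting
        flows = [s / stage_time for s in stages]
        completing = flows[2]             # output of the 3rd-order delay
        if use_fonfoo:
            shipping = min(desired_sales_rate, inventory / min_shipping_time)
        else:
            shipping = desired_sales_rate
        inflows = [starting, flows[0], flows[1]]
        stages = [s + dt * (fin - fout)
                  for s, fin, fout in zip(stages, inflows, flows)]
        inventory += dt * (completing - shipping)
        lowest = min(lowest, inventory)
    return lowest

without_fonfoo = min_inventory(use_fonfoo=False)
with_fonfoo = min_inventory(use_fonfoo=True)
print(without_fonfoo, with_fonfoo)
```

With these assumed numbers, the unprotected version runs Inventory below zero, while the MIN formulation cannot ship more than is on hand in any one time step.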
This is FONFOO. The “first order negative feedback” refers to the balancing loop created by the Inventory/min shipping time term in the fixed equation:
The tricky thing about this situation is that if Starting had been endogenous, the negative inventory problem would have been much harder to spot. Here’s the same model with a simple decision rule for Starting that maintains Inventory and WIP at desired levels:
Now, a modest step in sales doesn’t cause negative inventory, as long as the production process can replenish it in time. It takes a huge step (from 100 to 400 widgets/month) to reveal the problem:
This means that experiments on a model as a whole may not reveal problems that lurk in the details of the model, unless they’re quite extreme. I recommend extreme tests, but prevention is more important. Simply make it a habit to implement FONFOO everywhere, and you won’t have problems. (Note that we could automate this in Vensim, but we don’t, because doing so can easily mask other formulation problems, fall short of the control that’s really needed, or impede situations in which nonphysical stocks are intentionally negative.)
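The masking effect can be reproduced in a few lines of Python. This is only a sketch under assumptions: the decision rule below (replace expected sales, plus corrections for Inventory and WIP gaps) is a generic stock-management rule standing in for the one in the figure, and the coverage, adjustment time and production time values are invented for illustration.

```python
def min_inventory_after_step(new_sales_rate, final_time=36.0, dt=0.0625):
    """Lowest Inventory after sales steps from 100 to new_sales_rate, no FONFOO."""
    production_time = 3.0      # months (assumed)
    desired_coverage = 2.0     # months of sales held as inventory (assumed)
    adjustment_time = 2.0      # months to close Inventory and WIP gaps (assumed)
    stage_time = production_time / 3.0
    stages = [100.0 * stage_time] * 3          # WIP as a 3-stage chain
    inventory = desired_coverage * 100.0       # start in equilibrium
    lowest = inventory
    for i in range(round(final_time / dt)):
        t = i * dt
        sales = 100.0 if t < 1.0 else new_sales_rate
        wip = sum(stages)
        # Hypothetical decision rule: replace sales and correct both stock gaps.
        starting = max(0.0, sales
                       + (desired_coverage * sales - inventory) / adjustment_time
                       + (production_time * sales - wip) / adjustment_time)
        flows = [s / stage_time for s in stages]
        completing = flows[2]
        shipping = sales                       # not a function of Inventory
        inflows = [starting, flows[0], flows[1]]
        stages = [s + dt * (fin - fout)
                  for s, fin, fout in zip(stages, inflows, flows)]
        inventory += dt * (completing - shipping)
        lowest = min(lowest, inventory)
    return lowest

modest = min_inventory_after_step(120.0)
extreme = min_inventory_after_step(400.0)
print(modest, extreme)
```

Under these assumptions the modest step never exhausts Inventory, so the missing feedback stays hidden; only the extreme step exposes it.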
Now let’s take a look at the 3rd-order production delay surrounding WIP. As presented above, it works fine – it’s mathematically equivalent to the explicit 3rd-order aging chain. However, there are consistency issues to be aware of. Consider the following augmentation of the structure, representing stock losses (the flow of Breaking) from WIP:
Completing=DELAY3( Starting, production time )
Units: widgets/Month

Breaking=DELAY3(Starting*loss fraction,production time)
Units: widgets/Month
Completing is still a delayed function of Starting. But Completing is not directly aware of WIP and therefore unaware of the consequences of Breaking. This is a violation of FONFOO because the DELAY3 function contains internal states that are independent of the WIP stock. Consider what happens if the loss fraction is nonzero. In equilibrium, the output of DELAY3 is equal to its input. So, the outflow from WIP would be Breaking+Completing, which equals Starting+Starting*loss fraction, which is of course greater than Starting for any nonzero loss fraction.
A step in the loss fraction from 0 to 0.2 causes WIP to go negative:
Again, the remedy is simple. In most cases, you can keep the DELAY function if you ensure that the inflows and outflows are conserved. For example, adding a term:
Completing=DELAY3( Starting*(1-loss fraction), production time )
Units: widgets/Month

Breaking=DELAY3(Starting*loss fraction,production time)
Units: widgets/Month
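A quick way to check the conservation argument is to write the DELAY3 as an explicit chain and compare the two formulations. The Python sketch below is an illustration, not the Vensim implementation: the `Delay3` class, the constant Starting of 100, the 3-month production time and the timing of the loss step are assumptions.

```python
class Delay3:
    """Explicit 3rd-order material delay, initialized in equilibrium."""

    def __init__(self, initial_input, delay_time):
        self.stage_time = delay_time / 3.0
        self.stages = [initial_input * self.stage_time] * 3

    def output(self):
        return self.stages[2] / self.stage_time

    def step(self, delay_input, dt):
        flows = [s / self.stage_time for s in self.stages]
        inflows = [delay_input, flows[0], flows[1]]
        self.stages = [s + dt * (fin - fout)
                       for s, fin, fout in zip(self.stages, inflows, flows)]


def min_wip(conserve, final_time=36.0, dt=0.0625):
    """Lowest WIP after loss fraction steps from 0 to 0.2 at month 1."""
    starting, production_time = 100.0, 3.0          # assumed values
    completing_delay = Delay3(starting, production_time)
    breaking_delay = Delay3(0.0, production_time)   # no losses initially
    wip = starting * production_time
    lowest = wip
    for i in range(round(final_time / dt)):
        loss_fraction = 0.0 if i * dt < 1.0 else 0.2
        completing = completing_delay.output()
        breaking = breaking_delay.output()
        wip += dt * (starting - completing - breaking)
        lowest = min(lowest, wip)
        # The fix: only the surviving share of Starting feeds Completing.
        share = (1.0 - loss_fraction) if conserve else 1.0
        completing_delay.step(starting * share, dt)
        breaking_delay.step(starting * loss_fraction, dt)
    return lowest

unconserved = min_wip(conserve=False)
conserved = min_wip(conserve=True)
print(unconserved, conserved)
```

Because the delay is linear, the two conserved outflows always sum to the delayed Starting, so WIP holds steady; in the unconserved version the outflows exceed the inflow indefinitely.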
In some situations, it may be desirable to switch to an explicit aging chain in order to handle an idiosyncratic distribution of losses across the WIP process, or other complexities. Often arrays are useful for such purposes.
You may encounter the DELAY1 function in similar circumstances. DELAY1 is just like DELAY3, except that it’s first order. So, the system:
inflow = 10 ~ widgets/month
stock = INTEG(inflow-outflow, inflow*tau) ~ widgets
outflow = stock/tau ~ widgets/month
tau = 6 ~ months
is identical to the system:
inflow = 10 ~ widgets/month
stock = INTEG(inflow-outflow, inflow*tau) ~ widgets
outflow = DELAY1(inflow,tau) ~ widgets/month
tau = 6 ~ months
In this case, there’s really no reason to use the DELAY1 – it just obfuscates the first-order stock dynamics. However, there’s still a potential pitfall, which also applies to DELAY3: the initialization is important. The DELAY functions generally initialize their internal stocks in equilibrium, as if the inflow had been at its initial level historically. Therefore the stock above needs to be initialized the same way, to inflow*tau. If you want to use some other value, like zero, you need to use DELAY1I or DELAY3I (or the equivalent) so that the stock and the delay function start from a consistent set of assumptions.
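The initialization pitfall is easier to see when the hidden stock inside DELAY1 is written out. The following Python sketch is an assumption-laden illustration, not Vensim's implementation: `delay_level` stands in for the internal state of DELAY1, and the second argument plays the role of the initial output you would pass to an initialized delay.

```python
def run(stock_init, delay_init_output, tau=6.0, dt=0.125, final_time=60.0):
    """Simulate stock = INTEG(inflow - outflow), outflow = DELAY1(inflow, tau).

    Returns (final visible stock, final hidden stock inside the delay).
    """
    inflow = 10.0                              # widgets/month
    stock = stock_init                         # the stock you see on the diagram
    delay_level = delay_init_output * tau      # the stock hidden in DELAY1
    for _ in range(round(final_time / dt)):
        outflow = delay_level / tau
        stock += dt * (inflow - outflow)
        delay_level += dt * (inflow - outflow)
    return stock, delay_level

# Consistent: both start in equilibrium at inflow*tau = 60.
consistent = run(stock_init=60.0, delay_init_output=10.0)
# Inconsistent: stock starts empty, delay starts in equilibrium.
inconsistent = run(stock_init=0.0, delay_init_output=10.0)
# Consistent again: both start empty (what an initialized delay would allow).
both_empty = run(stock_init=0.0, delay_init_output=0.0)
print(consistent, inconsistent, both_empty)
```

In the inconsistent case the visible stock sits at zero while the delay ships 10 widgets/month from contents the stock never held, and the gap never closes.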
In reviewing other models, you may also find hybrid approaches, like:
inflow = 10 ~ widgets/month
stock = INTEG(inflow-outflow, inflow*tau) ~ widgets
outflow = DELAY1(stock/tau,tau) ~ widgets/month
tau = 6 ~ months
This is another FONFOO violation. The outflow is indeed a function of the stock, which ensures that the outflow eventually goes to zero when the stock is exhausted. But this does not create a 1st-order negative feedback loop; the DELAY1 contains an additional stock. So, this is SONFOO (second order negative feedback on the outflow), which might be useful for creating an oscillator, but won’t solve your supply chain problems.
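To make the SONFOO point concrete, here is a small Python sketch comparing the plain first-order outflow with the hybrid DELAY1(stock/tau, tau) version when the inflow is shut off. It is an illustration only: cutting the inflow to zero at time zero, the run length and the time step are assumptions, and `delay_level` stands in for the stock hidden inside DELAY1.

```python
def min_stock(hybrid, tau=6.0, dt=0.125, final_time=48.0):
    """Lowest stock level after the inflow is cut from 10 to 0 at time zero."""
    stock = 10.0 * tau        # equilibrium with the original inflow of 10
    delay_level = stock       # hidden DELAY1 stock, in equilibrium (output = 10)
    lowest = stock
    inflow = 0.0              # assumed test: supply stops entirely
    for _ in range(round(final_time / dt)):
        if hybrid:
            # outflow = DELAY1(stock/tau, tau): a second stock sits in the loop
            outflow = delay_level / tau
            delay_level += dt * (stock / tau - outflow)
        else:
            # outflow = stock/tau: true first-order negative feedback
            outflow = stock / tau
        stock += dt * (inflow - outflow)
        lowest = min(lowest, stock)
    return lowest

first_order = min_stock(hybrid=False)
second_order = min_stock(hybrid=True)
print(first_order, second_order)
```

The first-order version decays toward zero without crossing it, whereas the hybrid keeps shipping on stale information and overshoots into negative territory before oscillating back.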
If you make FONFOO a habit, you’ll have one less thing to worry about when you start exploring the interesting, complex behaviors of your models.
Modeling projects usually start with the dreaded blank sheet of paper (or blank screen). How to get started? Just do it. Write stuff down, and see what organization emerges.
Here are some concrete approaches that I’ve often used:
- Start with the question. Inventory is unstable? OK, put inventory on the diagram. It’s a stock, so what are the flows? Put them on the diagram. Are the inflows and outflows unstable, or just one? Follow the unstable direction….
- Start with the data. We get this a lot in marketing science projects. There’s typically a big pile of Nielsen or IMS data on price, promotion and distribution. How does that drive sales? You can do a little data mining for insight, but typically the data describes less than half of what’s going on, so more importantly, what else drives sales? How do brand equity, supply chain performance, and other dynamics introduce feedback into the picture?
- Start with a spreadsheet. There’s always a spreadsheet. It’s probably open loop and static, but it captures features that someone thought were important. Audit the spreadsheet to discover its structure, then make it dynamic.
- Start with the goal. You want to maximize profit? Write down a P&L, then trace each item. Where does revenue come from? What drives costs? When you answer these questions, look for the key strategic stocks that govern the behavior – people, capital, perceptions, etc.
- Start with the physics. What are the key stocks of scarce resources in the system? Equipment, people, money, knowledge? What makes them change, and where are the decisions?
- Start with the stakeholders. What are the major constituencies in the problem domain? What do they want, and what stocks are they looking at to guide how they get it?
The key thing is to remember that modeling is an iterative process at every level. The data might be wrong. The equations will be wrong. The equations might be in the wrong structure. The structure might describe the wrong problem. This is normal. Don’t be afraid to back up and start over.
@SDWisdom Ken Cooper lists 6 good reasons to apply System Dynamics to medical research. I think there are more if you broaden the definition of ‘medical’:
7. Dose titration can be dynamically complex and subject to misperceptions of feedback; models make it easy.
8. Chronic autoimmune and mental health problems are embedded in a nest of feedback between the disease and the person’s environment.
9. ERs, hospitals and other delivery systems are loaded with delays, feedback and nonlinearity.
10. Smoking, diet, exercise, and other big health drivers are social phenomena.
11. Diet and exercise are entangled with other systems, like urban design and energy efficiency.
12. The health insurance system, especially in the US where it has evolved into a mess, can’t be redesigned without a systemic perspective.
I enjoyed the biomedical modeling plenary at #ISDC2019 more than most. I was struck by the continuum of behavior involved in the system:
- True biomedical modeling is a bit funny, because it’s not typical System Dynamics, in the sense that it’s nonlinear dynamic simulation, but it’s not behavioral, so it’s missing one of the cornerstones of SD. Nevertheless, I think the way we think about complex systems is a useful complement to other approaches coming more from biology and mathematics (nonlinear dynamics).
- Behavior enters one level “up”, in problems like Jim Rogers & Ed Gallaher’s work on dose titration in anemia. This is a classic case of smart people having trouble managing a system with fairly simple dynamics – essentially a single pipeline delay in the case of anemia. There may be many similar cases, where large performance improvements are available from simple models (but complicated people management).
- Next, there are problems that combine behavioral dynamics and misperceptions of feedback with an underlying system that is also quite complex. Gizem Aktas’ work on stress and hormonal regulation is an example, as are diabetes and mental health models.
- At the far end of the scale, there are health system models, like ReThink Health, which abstract away from the biomedical details of any particular disease. In their place, there’s an extremely complex network of human resources, incentives and decisions.
I think the opportunities are large in all of these areas. One challenge for the field is that each requires a different interface to other researchers, health practitioners and managers. That’s a lot for relatively few modelers to manage. How can we team up to be more effective?
Hybrid models are the solution to blending endogenous elegance with practicality.
My last post probably sounds like I disagree with Jack Homer’s recommendation to tolerate some exogenous drivers and consideration of policy feasibility. Actually I don’t. In fact, we at Ventana probably do more data-intensive SD than anyone. I build hybrid models all the time.
When philosophizing about the best way to change the world, it’s easy to lose sight of some practical considerations that influence choices:
- Cost. It’s expensive to develop an elegant, endogenous theory for things like interest rates that you might normally think of as exogenous to a firm. On the other hand, it’s also expensive to collect and use data – often 1/3 of project cost in our experience.
- Clarity. Exogenous variables complicate the analysis of a model, because you have driven behavior on top of the model’s endogenous dynamics. I think this makes it harder to understand the basic behavior, because you lose the insight you might gain from starting a model in equilibrium and perturbing it with policies.
- Calibration. On the other hand, using exogenous drivers increases your ability to gain insight from comparison of model behavior to data. This is not a definitive test, but you can definitely use it to estimate uncertain parameters and weed out certain dumb ideas.
- Client. You have to meet people where they are. If, historically, they think R^2 is the definitive measure of success, you’d better deliver. You can explain why that’s a bad metric and present a more endogenous view of the situation later, after you’ve established trust.
I think there’s no clear answer – the extent to which endogenous or exogenous elements are preferred has to be a situation-specific decision. In my own work, I often use a two-pronged approach, and two ways to structure that have emerged:
- Build a single, large, calibrated model with some exogenous drivers. Build endogenous submodels or metamodels for equilibrium experiments and to explain key features of the big model.
- Build a single, elegant endogenous model, with few drivers. Use smaller exogenous models, or statistical and machine learning tools, to understand local features of the data, incorporating those insights into the endogenous model without using the data directly.
How do we strike the right balance among closed loops revealing all possible leverage points, exogenous drivers for fidelity and economy, and practical focus on feasible policies?
On the SDwisdom blog, Jack Homer wonders whether we should be a little less endogenous:
The party line is that a model’s boundary should be broad enough so that the system’s main observed behaviors—such as S-shaped growth, oscillation, or overshoot and decline—are fully explained by the model’s endogenous structure. One should avoid the use of exogenous time series drivers, because they undermine the ability of the model to explain and to anticipate change.
I mostly agree with this view but want to offer a friendly amendment here. In my experience with real-world clients, I have often encountered situations in which it makes sense to employ exogenous time series for the sake of completeness and realism.
My experience suggests that we should be less doctrinaire about the endogenous perspective and understand that “endogenous” is a relative thing. No model can be all-encompassing and explain all observed behavior patterns. That’s why we define a model relative to some subset of behaviors also known as the dynamic problem. As long as the model adequately addresses the dynamic problem, it shouldn’t really matter if the model has some exogenous time series included to improve the model’s realism.
The counterpoint to this might be George Richardson’s excellent article on the value of an endogenous perspective:
But the foundation of systems thinking and system dynamics lies deeper than these and is often implicit or even ignored: it is the “endogenous point of view.” The paper will begin with historical background, clarify the endogenous point of view, illustrate with examples, and argue that the endogenous point of view is the sine qua non of systems approaches.
The dead buffalo model on the left in Richardson’s figure is perhaps too extreme. I think what Homer is really arguing for is a middle road that might look like this:
In other words, it’s a hybrid model, with an endogenous core, and a few exogenous drivers from beyond the model boundary. For a firm, this might mean that we model the interactions of marketing, production, human resources, etc. endogenously. But we leave the world crude price and GDP of France exogenous, because we have no plausible means to influence them.
Richardson points out, in the context of Urban Dynamics:
How might thinking exogenously have affected conclusions emerging from these urban models? Consider just one: suppose, as a number of critics of various system dynamics studies have suggested, we chose to use time series data for urban population projections. Suppose the data-based projections were very carefully developed by sophisticated statistical tools and econometric methods, and suppose those projections were fed into URBAN1 in place of the endogenous stock of population. To make the implications easiest to see, let’s suppose the base run of the model looks just as it did in Figure 5b. What would Figure 5c look like? Sadly, population would not show a bump as it did in 5c because the sophisticated exogenous time series data would not be influenced in the slightest by the changing conditions in the model. We would not see the compensating urban migration effects we see in Figure 5c, and we would miss the crucial conclusion that population and business construction dynamics would naturally compensate for the jobs program. We’d probably think it was a long term policy success, and we would be dramatically wrong.
Similarly, a key part of my dissertation critique of the DICE model was the idea that its exogenous representation of technology excluded a set of competitive dynamics between fossil fuels and renewables that are crucial to understanding climate policy. In both cases, leaving even a few loops exogenous might dramatically alter policy recommendations.
In 1977, my teacher, SD pioneer Ed Roberts, wrote an important paper called “Strategies for Effective Implementation of Complex Corporate Models”, based on his years of consulting experience. When it comes to policy recommendations, he said, “the organizations’ ability to absorb the associated change must be assessed…The model-builder and the organization both profit from implementation of moderate change proposals leading to some successful results; both lose from grandiose plans which fail to be moved ahead.”
Without too much extra effort beyond what we already do in modeling, we could assess policy feasibility by studying the political situation and its dynamics, looking at trends in public opinion surveys and other indicator data, and talking to experts. We could then calculate the “feasibility-adjusted value” of a policy by multiplying its potential impact (as determined by our SD model) by its likelihood of enactment.
In other words, one might ask, what good is it to know the nature of the best endogenous policy if the key leverage points are beyond your control? Or, as economists put it, what good are first-best policies in a second-best world?
I think Richardson might respond along the lines of Pascal’s wager:
To the extent that we accept some exogeneity in our models, we accept our fate in the lower left box. That choice presumes that our best strategy is to do a good job of predicting and preparing – making the best of a bad situation, or living to fight another day. Is that really a good idea – can we know, or at least make a good guess, whether the true state of affairs is exogenous or endogenous? Richardson argues that if we don’t know which box we’re in, it’s best to choose the endogenous approach:
If one were to recast Table 4 as a decision tree, the decision to take an endogenous point of view in all circumstances would have the highest net payoff, at least in terms of happy faces and the real feelings they represent. An endogenous point of view is potentially empowering, and that feels good to us.
If the right (endogenous) side of the diagram is better in some sense, we should ask a more important question: can we change the state of affairs, so that we can move from the left quadrants to the right? This is related to the idea that we might need to change paradigms to change the system.
This perspective is especially critical for problems like climate change, where the space of politically feasible policies currently does not include anything that actually solves the problem. Treating the infeasible loops as exogenous drivers won’t help. We need to ask, what loops that are now beyond our control can be activated in order to reach a more attractive solution?
Maybe there really is no good solution to climate and other Limits problems. Then perhaps it would be optimal in some sense to take catastrophe as exogenous and use models to plan a personal path through the eye of the storm. Personally, I’m not ready to accept that. My modeling objective remains to die with my boots on tackling the biggest problems.
Propaganda, for one thing.
A while back I made a video about spreadsheets that makes some points about open-loop models vs. real, closed-loop dynamic models:
The short version is that people tend to build this:
when reality works like this:
I think there are some understandable reasons to prefer the first, simpler view:
- Just understanding the dynamics of accumulation (here, the vehicle stock) may be mind-blowing enough without adding feedback complexity.
- It’s a start, and certainly better than no model or extrapolation!
Some of the reasons these models get built are a little less appetizing:
- In the short run, some loops really aren’t closed (though the short run is rarely as short as you think, and myopia gets you in trouble).
- They’re quicker and cheaper to build (if you don’t mind less insight).
- Dynamic modeling skills, both for construction and consumption, are not very widespread.
- Open-loop models are easier to calibrate (but a calibration neglecting accumulation and feedback is likely bogus and misleading).
- Open-loop models are easier to manipulate to produce a desired outcome.
I think the last point is key. At Ventana, we’ve discussed – only partly in jest – creating a “propaganda mode” for Vensim and Ventity. This would automate the discovery of a parameterization of a model that both fits history and makes a preferred policy optimal.
Perhaps the ultimate example of this is the RMSM model. 20 years ago, this was the World Bank’s preferred tool for country modeling. When Gerald Barney and Weishuang Qu replicated the model in Vensim, they discovered that it was full of disconnected trees of causality. That would permit creation of a scenario in which GDP growth marched along merrily without any water, for example. Politically, this was actually a feature, not a bug, because some users simply didn’t want to know that their pet project would displace a lot of people or destroy resources.
I think the solution here is to equip people to ask the right questions that close loops. Once there’s an appetite for dynamic, operational thinking, we can supply good modelers to provide the tools.
A slide at ISSS wonders what the big ideas about systems are:
Here’s my take:
- Stocks & flows – a.k.a. states and rates, levels and rates, integration, accumulation, delays – understanding these bathtub dynamics is absolutely central.
- Feedback – positive and negative feedback, leading to exponential growth and decay and other simple or complex behaviors.
- Emergence – including the idea that structure determines behavior, the iceberg, and more generally that complex, counter-intuitive patterns can emerge from simple structures.
- Relationships – ranging from simple connections, to networks, to John Muir’s insight, “When we try to pick out anything by itself, we find it hitched to everything else in the Universe.”
- Randomness, risk and uncertainty – this does a disservice by condensing a large domain in its own right into an aspect of systems, but it’s certainly critical for understanding the nature of evidence and decision making.
- Self-reference, e.g., autopoiesis and second-order cybernetics.
- Evolution – population selection and modification by recombination, mutation and imitation.**
- Models – recognizing that mental models, diagrams, archetypes and stories can only get you so far – eventually you need simulation and other formal tools.
- Paradigms – in the sense in which Dana Meadows meant, “The mindset or paradigm out of which the system — its goals, power structure, rules, its culture — arises.”
If pressed for simplification, I’ll take stocks, flows and feedback. If you don’t have those, you don’t have much.
* h/t Angelika Schanda for posting the slide above in the SD Society Facebook group.
** Added following a suggestion by Gustavo Collantes on LinkedIn, which also mentioned learning. That’s an interesting case, because elements of learning are present in ordinary feedback loops, in evolution (imitation), and in self-reference (system redesign).