6 more reasons to apply SD to medical research

@SDWisdom Ken Cooper lists 6 good reasons to apply System Dynamics to medical research. I think there are more if you broaden the definition of ‘medical’:

7. Dose titration can be dynamically complex and subject to misperceptions of feedback; models make it easy.

8. Chronic autoimmune and mental health problems are embedded in a nest of feedback between the disease and the person’s environment.

9. ERs, hospitals and other delivery systems are loaded with delays, feedback and nonlinearity.

10. Smoking, diet, exercise, and other big health drivers are social phenomena.

11. Diet and exercise are entangled with other systems, like urban design and energy efficiency.

12. The health insurance system, especially in the US where it has evolved into a mess, can’t be redesigned without a systemic perspective.

4 Faces of Medical Modeling

I enjoyed the biomedical modeling plenary at #ISDC2019 more than most. I was struck by the continuum of behavior involved in the system:

  • True biomedical modeling is a bit funny: it’s nonlinear dynamic simulation, but it’s not behavioral, so it’s missing one of the cornerstones of typical System Dynamics. Nevertheless, I think the way we think about complex systems is a useful complement to other approaches coming more from biology and mathematics (nonlinear dynamics).
  • Behavior enters one level “up”, in problems like Jim Rogers & Ed Gallaher’s work on dose titration in anemia. This is a classic case of smart people having trouble managing a system with fairly simple dynamics – essentially a single pipeline delay in the case of anemia. There may be many similar cases, where large performance improvements are available from simple models (but complicated people management).
  • Next, there are problems that combine behavioral dynamics and misperceptions of feedback with an underlying system that is also quite complex. Gizem Aktas’ work on stress and hormonal regulation is an example, as are diabetes and mental health models.
  • At the far end of the scale, there are health system models, like ReThink Health, which abstract away from the biomedical details of any particular disease. In their place, there’s an extremely complex network of human resources, incentives and decisions.
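
To make the dose titration point concrete, here is a minimal sketch of why a single pipeline delay is hard to manage by feel. It is not the Rogers & Gallaher model: the function name `titrate`, the 3-period delay, the gains and the normalized units are all assumptions chosen for illustration. A clinician nudges the dose in proportion to the gap between the target and the observed level, but the level only reflects the dose given a few periods earlier.

```python
def titrate(gain, delay=3, target=1.0, steps=80):
    """Observed level over time when the dose is adjusted each period by
    gain * (target - observed level). All numbers are illustrative assumptions."""
    doses = [0.0]   # doses[t] is the dose in effect during period t
    levels = []
    for t in range(steps):
        # Pipeline delay: the level seen now reflects the dose `delay` periods ago.
        level = doses[t - delay] if t >= delay else 0.0
        levels.append(level)
        # Decision rule: react to the current gap, ignoring doses still "in the pipe".
        doses.append(doses[t] + gain * (target - level))
    return levels


gentle = titrate(gain=0.1)       # small, patient adjustments
aggressive = titrate(gain=0.4)   # larger adjustments to "fix it faster"

print(f"gentle peak level:     {max(gentle):.2f}")
print(f"aggressive peak level: {max(aggressive):.2f}")
```

In this toy setup the gentle rule approaches the target smoothly, while the aggressive rule overshoots badly and oscillates, even though the system itself is about as simple as a dynamic system gets.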

I think the opportunities are large in all of these areas. One challenge for the field is that each requires a different interface to other researchers, health practitioners and managers. That’s a lot for relatively few modelers to manage. How can we team up to be more effective?

Closing loops – practicalities

Hybrid models are the solution to blending endogenous elegance with practicality.

My last post probably sounds like I disagree with Jack Homer’s recommendation to tolerate some exogenous drivers and to consider policy feasibility. Actually I don’t. In fact, we at Ventana probably do more data-intensive SD than anyone. I build hybrid models all the time.

When philosophizing about the best way to change the world, it’s easy to lose sight of some practical considerations that influence choices:

  • Cost. It’s expensive to develop an elegant, endogenous theory for things like interest rates that you might normally think of as exogenous to a firm. On the other hand, it’s also expensive to collect and use data – often 1/3 of project cost in our experience.
  • Clarity. Exogenous variables complicate the analysis of a model, because you have driven behavior on top of the model’s endogenous dynamics. I think this makes it harder to understand the basic behavior, because you lose the insight you might gain from starting a model in equilibrium and perturbing it with policies.
  • Calibration. On the other hand, using exogenous drivers increases your ability to gain insight from comparison of model behavior to data. This is not a definitive test, but you can definitely use it to estimate uncertain parameters and weed out certain dumb ideas.
  • Client. You have to meet people where they are. If, historically, they think R^2 is the definitive measure of success, you’d better deliver. You can explain why that’s a bad metric and present a more endogenous view of the situation later, after you’ve established trust.

I think there’s no clear answer – the extent to which endogenous or exogenous elements are preferred has to be a situation-specific decision. In my own work, I often use a two-pronged approach, and two ways to structure that have emerged:

  • Build a single, large, calibrated model with some exogenous drivers. Build endogenous submodels or metamodels for equilibrium experiments and to explain key features of the big model.
  • Build a single, elegant endogenous model, with few drivers. Use smaller exogenous models, or statistical and machine learning tools, to understand local features of the data, incorporating those insights into the endogenous model without using the data directly.

Closing the loops

How do we strike the right balance among closed loops revealing all possible leverage points, exogenous drivers for fidelity and economy, and practical focus on feasible policies?

On the SDwisdom blog, Jack Homer wonders whether we should be a little less endogenous:

The party line is that a model’s boundary should be broad enough so that the system’s main observed behaviors—such as S-shaped growth, oscillation, or overshoot and decline—are fully explained by the model’s endogenous structure.  One should avoid the use of exogenous time series drivers, because they undermine the ability of the model to explain and to anticipate change.

I mostly agree with this view but want to offer a friendly amendment here.  In my experience with real-world clients, I have often encountered situations in which it makes sense to employ exogenous time series for the sake of completeness and realism.

My experience suggests that we should be less doctrinaire about the endogenous perspective and understand that “endogenous” is a relative thing.  No model can be all-encompassing and explain all observed behavior patterns.  That’s why we define a model relative to some subset of behaviors also known as the dynamic problem.  As long as the model adequately addresses the dynamic problem, it shouldn’t really matter if the model has some exogenous time series included to improve the model’s realism.

The counterpoint to this might be George Richardson’s excellent article on the value of an endogenous perspective:

But the foundation of systems thinking and system dynamics lies deeper than these and is often implicit or even ignored: it is the “endogenous point of view.” The paper will begin with historical background, clarify the endogenous point of view, illustrate with examples, and argue that the endogenous point of view is the sine qua non of systems approaches.

The dead buffalo model on the left in Richardson’s figure is perhaps too extreme. I think what Homer is really arguing for is a middle road that might look like this:

In other words, it’s a hybrid model, with an endogenous core, and a few exogenous drivers from beyond the model boundary. For a firm, this might mean that we model the interactions of marketing, production, human resources, etc. endogenously. But we leave the world crude price and GDP of France exogenous, because we have no plausible means to influence them.

Richardson points out, in the context of Urban Dynamics:

How might thinking exogenously have affected conclusions emerging from these urban models? Consider just one: suppose, as a number of critics of various system dynamics studies have suggested, we chose to use time series data for urban population projections. Suppose the data-based projections were very carefully developed by sophisticated statistical tools and econometric methods, and suppose those projections were fed into URBAN1 in place of the endogenous stock of population. To make the implications easiest to see, let’s suppose the base run of the model looks just as it did in Figure 5b. What would Figure 5c look like? Sadly, population would not show a bump as it did in 5c because the sophisticated exogenous time series data would not be influenced in the slightest by the changing conditions in the model. We would not see the compensating urban migration effects we see in Figure 5c, and we would miss the crucial conclusion that population and business construction dynamics would naturally compensate for the jobs program. We’d probably think it was a long term policy success, and we would be dramatically wrong.

Similarly, a key part of my dissertation critique of the DICE model was the idea that its exogenous representation of technology excluded a set of competitive dynamics between fossil fuels and renewables that are crucial to understanding climate policy. In both cases, leaving even a few loops exogenous might dramatically alter policy recommendations.
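
A toy version of Richardson’s thought experiment shows the mechanism. This is not URBAN1: the function `jobs_per_person`, the 0.5 jobs-per-person norm, the 5-year migration adjustment time and the size of the jobs program are all assumed for illustration. A jobs program adds jobs in year 5. In the endogenous run, population migrates toward what the job base can support. In the exogenous run, population simply follows the base-run series (constant here).

```python
def jobs_per_person(endogenous, years=60, dt=0.25):
    """Final jobs-per-person ratio after a jobs program (toy model, assumed numbers)."""
    base_jobs, program_jobs = 1000.0, 100.0
    normal_ratio = 0.5   # jobs per person at which net migration is zero (assumed)
    adjust_time = 5.0    # years for migration to respond (assumed)
    pop = base_jobs / normal_ratio   # start in equilibrium
    ratio = normal_ratio
    for step in range(int(years / dt)):
        t = step * dt
        jobs = base_jobs + (program_jobs if t >= 5 else 0.0)
        ratio = jobs / pop
        if endogenous:
            # Closed loop: people move in until the ratio returns to normal.
            pop += dt * (jobs / normal_ratio - pop) / adjust_time
        # Open loop: pop stays on its exogenous (base-run) path.
    return ratio


print("exogenous population: ", round(jobs_per_person(endogenous=False), 3))
print("endogenous population:", round(jobs_per_person(endogenous=True), 3))
```

With population exogenous, the program looks like a permanent improvement in jobs per person. With the migration loop closed, the gain is competed away, which is the compensating effect Richardson describes.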

In another article, Homer rejoins:

In 1977, my teacher, SD pioneer Ed Roberts, wrote an important paper called “Strategies for Effective Implementation of Complex Corporate Models”, based on his years of consulting experience.  When it comes to policy recommendations, he said, “the organizations’ ability to absorb the associated change must be assessed…The model-builder and the organization both profit from implementation of moderate change proposals leading to some successful results; both lose from grandiose plans which fail to be moved ahead.”

Without too much extra effort beyond what we already do in modeling, we could assess policy feasibility by studying the political situation and its dynamics, looking at trends in public opinion surveys and other indicator data, and talking to experts. We could then calculate the “feasibility-adjusted value” of a policy by multiplying its potential impact (as determined by our SD model) by its likelihood of enactment.

In other words, one might ask, what good is it to know the nature of the best endogenous policy if the key leverage points are beyond your control? Or, as economists put it, what good are first-best policies in a second-best world?
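
Homer’s “feasibility-adjusted value” is simple arithmetic: potential impact times likelihood of enactment. The sketch below uses made-up policies and made-up numbers, purely to show how the ranking can flip; the names and values are hypothetical, not from any model.

```python
def feasibility_adjusted_value(impact, likelihood):
    """Potential impact (from a model) times the probability of enactment."""
    return impact * likelihood


# Hypothetical policies: (model-estimated impact, assumed likelihood of enactment)
policies = {
    "bold reform": (100.0, 0.05),
    "moderate reform": (30.0, 0.60),
    "minor tweak": (5.0, 0.95),
}

ranked = sorted(
    policies,
    key=lambda name: feasibility_adjusted_value(*policies[name]),
    reverse=True,
)
for name in ranked:
    print(name, feasibility_adjusted_value(*policies[name]))
```

With these assumed numbers, the moderate reform ranks first even though the bold reform has the largest potential impact.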

I think Richardson might respond along the lines of Pascal’s wager:

To the extent that we accept some exogeneity in our models, we accept our fate in the lower left box. That choice presumes that our best strategy is to do a good job of predicting and preparing – making the best of a bad situation, or living to fight another day. Is that really a good idea – can we know, or at least make a good guess, whether the true state of affairs is exogenous or endogenous? Richardson argues that if we don’t know which box we’re in, it’s best to choose the endogenous approach:

If one were to recast Table 4 as a decision tree, the decision to take an endogenous point of view in all circumstances would have the highest net payoff, at least in terms of happy faces and the real feelings they represent. An endogenous point of view is potentially empowering, and that feels good to us.

If the right (endogenous) side of the diagram is better in some sense, we should ask a more important question: can we change the state of affairs, so that we can move from the left quadrants to the right? This is related to the idea that we might need to change paradigms to change the system.

This perspective is especially critical for problems like climate change, where the space of politically feasible policies currently does not include anything that actually solves the problem. Treating the infeasible loops as exogenous drivers won’t help. We need to ask, what loops that are now beyond our control can be activated in order to reach a more attractive solution?

Maybe there really is no good solution to climate and other Limits problems. Then perhaps it would be optimal in some sense to take catastrophe as exogenous and use models to plan a personal path through the eye of the storm. Personally, I’m not ready to accept that. My modeling objective remains to die with my boots on tackling the biggest problems.

See also:

Closing loops – practicalities

Why the World Loves Open Loop Models

What kind of model should I use?

Why the World Loves Open Loop Models

Propaganda, for one thing.

A while back I made a video about spreadsheets that makes some points about open-loop models vs. real, closed-loop dynamic models:

The short version is that people tend to build this:

when reality works like this:

I think there are some understandable reasons to prefer the first, simpler view:

  • Just understanding the dynamics of accumulation (here, the vehicle stock) may be mind-blowing enough without adding feedback complexity.
  • It’s a start, and certainly better than no model or extrapolation!

Some of the reasons these models get built are a little less appetizing:

  • In the short run, some loops really aren’t closed (though the short run is rarely as short as you think, and myopia gets you in trouble).
  • They’re quicker and cheaper to build (if you don’t mind less insight).
  • Dynamic modeling skills, both for construction and consumption, are not very widespread.
  • Open-loop models are easier to calibrate (but a calibration neglecting accumulation and feedback is likely bogus and misleading).
  • Open-loop models are easier to manipulate to produce a desired outcome.

I think the last point is key. At Ventana, we’ve discussed – only partly in jest – creating a “propaganda mode” for Vensim and Ventity. This would automate the discovery of a parameterization of a model that both fits history and makes a preferred policy optimal.

Perhaps the ultimate example of this is the RMSM model. 20 years ago, this was the World Bank’s preferred tool for country modeling. When Gerald Barney and Weishuang Qu replicated the model in Vensim, they discovered that it was full of disconnected trees of causality. That would permit creation of a scenario in which GDP growth marched along merrily without any water, for example. Politically, this was actually a feature, not a bug, because some users simply didn’t want to know that their pet project would displace a lot of people or destroy resources.

I think the solution here is to equip people to ask the right questions that close loops. Once there’s an appetite for dynamic, operational thinking, we can supply good modelers to provide the tools.

A puzzling bias against experimentation

Objecting to experiments that compare two unobjectionable policies or treatments

Randomized experiments have enormous potential to improve human welfare in many domains, including healthcare, education, finance, and public policy. However, such “A/B tests” are often criticized on ethical grounds even as similar, untested interventions are implemented without objection. We find robust evidence across 16 studies of 5,873 participants from three diverse populations spanning nine domains—from healthcare to autonomous vehicle design to poverty reduction—that people frequently rate A/B tests designed to establish the comparative effectiveness of two policies or treatments as inappropriate even when universally implementing either A or B, untested, is seen as appropriate. This “A/B effect” is as strong among those with higher educational attainment and science literacy and among relevant professionals. It persists even when there is no reason to prefer A to B and even when recipients are treated unequally and randomly in all conditions (A, B, and A/B). Several remaining explanations for the effect—a belief that consent is required to impose a policy on half of a population but not on the entire population; an aversion to controlled but not to uncontrolled experiments; and a proxy form of the illusion of knowledge (according to which randomized evaluations are unnecessary because experts already do or should know “what works”)—appear to contribute to the effect, but none dominates or fully accounts for it. We conclude that rigorously evaluating policies or treatments via pragmatic randomized trials may provoke greater objection than simply implementing those same policies or treatments untested.

Complexity should be the default assumption

Whether or not we can prove that a system experiences trophic cascades and other nonlinear side-effects, we should manage as if it does, because we know that these dynamics are common.

There’s been a long-running debate over whether wolf reintroduction led to a trophic cascade in Yellowstone. There’s a nice summary here:

Do Wolves Change Rivers?

Yesterday, June initiated an in depth discussion on the benefit of wolves in Yellowstone, in the form of trophic cascade with the video: How Wolves Change the River:

This was predicted by some, and has been studied by William Ripple, Robert Beschta Trophic Cascades in Yellowstone: The first fifteen years after wolf reintroduction http://www.cof.orst.edu/leopold/papers/RippleBeschtaYellowstone_BioConserv.pdf

Shannon, Roger, and Mike, voiced caution that the verdict was still out.

I would like to caution that many of the reported “positive” impacts wolves have had on the environment after coming back to Yellowstone remain unproven or are at least controversial. This is still a hotly debated topic in science but in the popular media the idea that wolves can create a Utopian environment all too often appears to be readily accepted. If anyone is interested, I think Dave Mech wrote a very interesting article about this (attached). As he puts it “the wolf is neither a saint nor a sinner except to those who want to make it so”.

Mech: Is Science in Danger of Sanctifying Wolves

Roger added

I see 2 points of caution regarding reports of wolves having “positive” impacts in Yellowstone. One is that understanding cause and effect is always hard, nigh onto impossible, when faced with changes that occur in one place at one time. We know that conditions along rivers and streams have changed in Yellowstone but how much “cause” can be attributed to wolves is impossible to determine.

Perhaps even more important is that evaluations of whether changes are “positive” or “negative” are completely human value judgements and have no basis in science, in this case in the science of ecology.

-Ely Field Naturalists

Of course, in a forum discussion, this becomes:

Wolves changed rivers.

No they didn’t.

Yes they did.

(iterate ad nauseam)

Prove it.

… with “prove it” roughly understood to mean establishing that river = a + b*wolves, rejecting the null hypothesis that b=0 at some level of statistical significance.

I would submit that this is a poor framing of the problem. Given what we know about nonlinear dynamics in networks like an ecosystem, it’s almost inconceivable that there would not be trophic cascades. Moreover, it’s well known that simple correlation would not be able to detect such cascades in many cases anyway.
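
Here is a minimal illustration of how correlation can miss a real causal link. It is not an ecological model: the sinusoidal “driver” and the stock that accumulates it are assumptions picked to make the point as simply as possible. The stock is caused entirely by the driver, yet because accumulation shifts the phase, the correlation between the driver and the stock’s level is near zero over whole cycles.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


steps_per_cycle, cycles = 1000, 4
dt = 2 * math.pi / steps_per_cycle

driver, stock_levels, stock_changes = [], [], []
stock = 0.0
for k in range(steps_per_cycle * cycles):
    d = math.sin(k * dt)          # exogenous driver (illustrative)
    driver.append(d)
    stock_levels.append(stock)    # level observed before this step's flow
    new_stock = stock + d * dt    # the stock accumulates the driver
    stock_changes.append(new_stock - stock)
    stock = new_stock

r_level = pearson(driver, stock_levels)
r_change = pearson(driver, stock_changes)
print(f"driver vs. stock level:  r = {r_level:.3f}")
print(f"driver vs. stock change: r = {r_change:.3f}")
```

A regression of the level on the driver would find b close to zero here, while a structural view (the driver is a flow into the stock) recovers the relationship exactly.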

A “no effect” default in other situations seems equally naive. Is it really plausible that a disturbance to a project would not have any knock-on effects? That stressing a person’s endocrine system would not cause a path-dependent response? I don’t think so. Somehow we need ordinary conversations to employ more sophisticated notions about models and evidence in complex systems. I think at least two ideas are useful:

  • The idea that macro behavior emerges from micro structure. The appropriate level of description of an ecosystem, or a project, is not a few time series for key populations, but an operational, physical description of how species reproduce and interact with one another, or how tasks get done.
  • A Bayesian approach to model selection, in which our belief in a particular representation of a system is proportional to the degree to which it explains the evidence, relative to various alternative formulations, not just a naive null hypothesis.

In both cases, it’s important to recognize that the formal, numerical data is not the only data applicable to the system. It’s also crucial to respect conservation laws, units of measure, extreme conditions tests and other Reality Checks that essentially constitute free data points in parts of the parameter space that are otherwise unexplored.
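
The Bayesian weighting idea in the second bullet can be sketched in a few lines. Everything here is an assumption for illustration: the data are invented, the three candidate formulations have fixed, hand-picked parameters, the measurement noise is taken as known, and priors are equal. A fuller treatment would also account for parameter uncertainty within each model.

```python
import math

# Made-up observations of a response y at driver levels x
xs = [0, 1, 2, 3, 4, 5]
ys = [0.0, 0.9, 1.3, 1.5, 1.6, 1.65]
sigma = 0.2  # assumed known measurement noise (std. dev.)

# Hypothetical candidate formulations, not just a "no effect" null
models = {
    "no effect": lambda x: sum(ys) / len(ys),
    "linear": lambda x: 0.4 + 0.3 * x,
    "saturating": lambda x: 1.7 * (1 - math.exp(-0.7 * x)),
}


def log_likelihood(model):
    """Gaussian log-likelihood up to a constant shared by all models."""
    sse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    return -sse / (2 * sigma ** 2)


log_liks = {name: log_likelihood(m) for name, m in models.items()}

# Equal priors: posterior is proportional to likelihood (log-sum-exp for stability)
top = max(log_liks.values())
norm = sum(math.exp(ll - top) for ll in log_liks.values())
posterior = {name: math.exp(ll - top) / norm for name, ll in log_liks.items()}

for name, p in posterior.items():
    print(f"{name:>10}: {p:.4f}")
```

The point is the comparison across formulations: the linear model beats the null easily, but most of the belief goes to the alternative that best explains the evidence.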

The way we think and talk about these systems guides the way we act. Whether or not we can prove in specific instances that Yellowstone had a trophic cascade, or the Chunnel project had unintended consequences, we need to manage these systems as if they could. Complexity needs to be the default assumption.

Big Ideas About Systems

A slide at ISSS wonders what the big ideas about systems are:

*

Here’s my take:

  • Stocks & flows – a.k.a. states and rates, levels and rates, integration, accumulation, delays – understanding these bathtub dynamics is absolutely central.
  • Feedback – positive and negative feedback, leading to exponential growth and decay and other simple or complex behaviors.
  • Emergence – including the idea that structure determines behavior, the iceberg, and more generally that complex, counter-intuitive patterns can emerge from simple structures.
  • Relationships – ranging from simple connections, to networks, to John Muir’s insight, “When we try to pick out anything by itself, we find it hitched to everything else in the Universe.”
  • Randomness, risk and uncertainty – this does a disservice by condensing a large domain in its own right into an aspect of systems, but it’s certainly critical for understanding the nature of evidence and decision making.
  • Self-reference, e.g., autopoiesis and second-order cybernetics.
  • Evolution – population selection and modification by recombination, mutation and imitation.**
  • Models – recognizing that mental models, diagrams, archetypes and stories can only get you so far – eventually you need simulation and other formal tools.
  • Paradigms – in the sense in which Dana Meadows meant, “The mindset or paradigm out of which the system — its goals, power structure, rules, its culture — arises.”

If pressed for simplification, I’ll take stocks, flows and feedback. If you don’t have those, you don’t have much.

* h/t Angelika Schanda for posting the slide above in the SD Society Facebook group.

** Added following a suggestion by Gustavo Collantes on LinkedIn, which also mentioned learning. That’s an interesting case, because elements of learning are present in ordinary feedback loops, in evolution (imitation), and in self-reference (system redesign).