Coronavirus Curve-fitting OverConfidence

This is a follow-on to The Normal distribution is a bad COVID19 model.

I understand that the IHME model is now more or less the official tool of the Federal Government. Normally I’m happy to see models guiding policy. It’s better than the alternative: would you fly in a plane designed by lawyers? (Apparently we have been.)

However, there’s nothing magic about a model. Using flawed methods, bad data, the wrong boundary, etc. can make the results GIGO. When a bad model blows up, the consequences can be just as harmful as any other bad reasoning. In addition, the metaphorical shrapnel hits the rest of us modelers. Currently, I’m hiding in my foxhole.

On top of the issues I mentioned previously, I think there are two more problems with the IHME model:

First, they fit the Normal distribution to cumulative cases, rather than incremental cases. Even in a parallel universe where the nonphysical curve fit was optimal, this would lead to understatement of the uncertainty in the projections.
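A minimal sketch of why (synthetic data, all parameters illustrative): the noise in a cumulative series is the running sum of the incremental noise, i.e. a random walk, so its residuals are strongly autocorrelated. Naive least-squares intervals from such a fit will be far too tight.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(60)
true_incremental = 1000 * np.exp(-0.5 * ((t - 30) / 8) ** 2)  # bell-shaped epidemic curve
incremental = true_incremental + rng.normal(0, 50, t.size)    # independent measurement noise
cumulative = np.cumsum(incremental)

# Residuals against the true curves
resid_inc = incremental - true_incremental
resid_cum = cumulative - np.cumsum(true_incremental)

def lag1_autocorr(x):
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

# Near zero for the incremental series, near one for the cumulative series
print(lag1_autocorr(resid_inc))
print(lag1_autocorr(resid_cum))
```

Standard least-squares error estimates assume independent residuals, so fitting the cumulative series quietly violates that assumption and understates uncertainty.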

Second, because the model has no operational mapping of real-world concepts to equation structure, you have no hooks to use to inject policy changes and the uncertainty associated with them. You have to construct some kind of arbitrary index and translate that to changes in the size and timing of the peak in an unprincipled way. This defeats the purpose of having a model.

For example, from the methods paper:

A covariate of days with expected exponential growth in the cumulative death rate was created using information on the number of days after the death rate exceeded 0.31 per million to the day when different social distancing measures were mandated by local and national government: school closures, non-essential business closures including bars and restaurants, stay-at-home recommendations, and travel restrictions including public transport closures. Days with 1 measure were counted as 0.67 equivalents, days with 2 measures as 0.334 equivalents and with 3 or 4 measures as 0.

This postulates a relationship that has only the most notional grounding. There’s no concept of compliance, nor any sense of the effect of stringency and exceptions.

In the real world, there’s also no linear relationship between “# policies implemented” and “days of exponential growth.” In fact, I would expect this to be extremely nonlinear, with a threshold effect. Either your policies reduce R0 below 1 and the epidemic peaks and shrinks, or they don’t, and it continues to grow at some positive rate until a large part of the population is infected. I don’t think this structure captures that reality at all.

That’s why, in the IHME figure above (retrieved yesterday), you don’t see any scenarios in which the epidemic fizzles, because we get lucky and warm weather slows the virus, or there are many more mild cases than we thought. You also don’t see any runaway scenarios in which measures fail to bring R0 below 1, resulting in sustained growth. Nor is there any possibility of ending measures too soon, resulting in an echo.

For comparison, I ran some sensitivity runs of my model for North Dakota last night. I included uncertainty from fit to data (for example, R0 constrained to fit observations via MCMC) and some a priori uncertainty about effectiveness and duration of measures, and from the literature about fatality rates, seasonality, and unobserved asymptomatics.

I found that I couldn’t exclude the IHME projections from my confidence bounds, so they’re not completely crazy. However, they understate the uncertainty in the situation by a huge margin. They forecast the peak at a fairly definite time, plus or minus a factor of two. With my hybrid-SEIR model, the 95% bounds include variation by a factor of 10. The difference is that their bounds are derived only from curve fitting, and therefore omit a vast amount of structural uncertainty that is represented in my model.

Who is right? We could argue, but since the IHME model is statistically flawed and doesn’t include any direct effect of uncertainty in R0, prevalence of unobserved mild cases, temperature sensitivity of the virus, effectiveness of measures, compliance, travel, etc., I would not put any money on the future remaining within their confidence bounds.

Crusonia on COVID19

On Saturday I gave a quick introduction to modeling the coronavirus epidemic in a Crusonia livestream, followed by Q&A. We had some really smart people on the call, with lots of interesting ideas to follow up on. I talk a bit about economic tradeoffs, using a speculative extension of my earlier model. One thing that was clearer to me after the discussion than before is just how hamstrung we’ve been by the lack of testing. I think it would not be much of a stretch to say that this failure is costing us a billion dollars a day, due to inability to isolate the infected and lack of information for decision making.

The second panelist – Eugene Scarberry – was really interesting, both for his extensive experience in the trenches getting things approved at the FDA, and for his background on ventilators and alternatives, and how to get more of them.

The Normal distribution is a bad COVID19 model

Forecasting diffusion processes by fitting sigmoid curves has a long history of failure. Let’s not repeat those mistakes in the COVID19 epidemic.

I’ve seen several models explaining “flattening the curve” that use the Normal distribution as a model of the coronavirus epidemic. Now this model uses it to forecast peak hospital load:

We developed a curve-fitting tool to fit a nonlinear mixed effects model to the available admin 1 cumulative death data. The cumulative death rate for each location is assumed to follow a parametrized Gaussian error function … where the function is the Gaussian error function (written explicitly above), p controls the maximum death rate at each location, t is the time since death rate exceeded 1e-15, β (beta) is a location-specific inflection point (time at which rate of increase of the death rate is maximum), and α (alpha) is a location-specific growth parameter. Other sigmoidal functional forms … were considered but did not fit the data as well. Data were fit to the log of the death rate in the available data, using an optimization framework described in the appendix.

One bell-shaped curve is as good as another, right? No!

Like Young Frankenstein, epidemic curves are not Normal.

1. Fit to data is a weak test.

The graph below compares 3 possible models: the Normal distribution, the Logistic distribution (which has an equivalent differential equation interpretation), and the SEIR model. Consider what’s happening when you fit a sigmoid to the epidemic data so far (red box). The curves below are normalized to yield similar peaks, but imagine what would happen to the peaks if you fit all 3 to the same data series.

The problem is that this curve-fitting exercise expects data from a small portion of the behavior to tell you about the peak. But over that interval, there’s little behavior variation. Any exponential is going to fit reasonably well. Even worse, if there are any biases in the data, such as dramatic shifts in test coverage, the fit is likely to reflect those biases as much as it does the physics of the system. That’s largely why the history of fitting diffusion models to emerging trends in the forecasting literature is so dreadful.
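A sketch of the identification problem, using two logistic curves with illustrative parameters: with final sizes a factor of 10 apart, but tuned to share the same early exponential behavior, they are nearly indistinguishable over a 25-day fitting window.

```python
import numpy as np

r = 0.25  # growth rate, per day

def logistic(t, K, t0):
    return K / (1 + np.exp(-r * (t - t0)))

# Two logistics with 10x different final sizes; holding K*exp(-r*t0)
# constant makes their early exponential phases coincide.
K1, t01 = 1e5, 40.0
K2 = 1e6
t02 = t01 + np.log(K2 / K1) / r  # pushes the inflection ~9 days later

t_early = np.arange(0, 25)  # the "red box" of data available for fitting
c1 = logistic(t_early, K1, t01)
c2 = logistic(t_early, K2, t02)

print(np.max(np.abs(c1 - c2) / c1))  # fractional gap over the fit window: under 2%
print(K2 / K1)                       # yet the implied final sizes differ 10x
```

Within any realistic noise level, data from the early window cannot discriminate between these two futures.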

After the peak, the right tail of the SEIR model is also quite different, because the time constant of recovery is different from the time constant for the growth phase. This asymmetry may also have implications for planning.

2. The properties of the Normal distribution don’t match the observed behavior of coronavirus.

It’s easier to see what’s going on if you plot the curves above on a log-y scale:

The logistic and SEIR models have a linear left tail. That is to say that they have a constant growth rate in the early epidemic, until controls are imposed or you run out of susceptible people.

The Normal distribution (red) is a parabola, which means that the growth rate is steadily decreasing, long before you get near the peak. Similarly, if you go backwards in time, the Normal distribution predicts that the growth rate would have been higher back in November, when patient 0 emerged.
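The contrast is easy to check numerically (illustrative parameters): the log-slope of an exponential is constant, while the log-slope of a Normal-shaped curve declines linearly all the way to the peak.

```python
import numpy as np

t = np.arange(1, 61)
normal_pdf = np.exp(-0.5 * ((t - 60) / 15) ** 2)  # Normal-shaped epidemic, peak at t=60
exp_growth = np.exp(0.2 * t)                      # constant-growth epidemic, 20%/day

# Daily log-slope = instantaneous growth rate of each curve
g_exp = np.diff(np.log(exp_growth))
g_normal = np.diff(np.log(normal_pdf))

print(g_exp[0], g_exp[-1])        # constant: 0.2 throughout
print(g_normal[0], g_normal[-1])  # starts higher, declines toward zero at the peak
```

Fitting a Normal curve therefore forces the growth rate to decline from day one, which is not what the uncontrolled-growth data show.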

There is some reason to think that epidemics start faster due to social network topology, but also some reasons for slower emergence. In any case, that’s not what is observed for COVID19 – uncontrolled growth rates are pretty constant:

https://aatishb.com/covidtrends/

3. With weak data, you MUST have other quality checks

Mining data to extract relationships works great in many settings. But when you have sparse data with lots of known measurement problems, it’s treacherous. In that case, you need a model of the physics of the system and the lags and biases in the data generating process. Then you test that model against all available information, including

  • conservation laws,
  • operational correspondence with physical processes,
  • opinions from subject matter experts and measurements from other levels of aggregation,
  • dimensional consistency,
  • robustness in extreme conditions, and finally
  • fit to data.

Fortunately, a good starting point has existed for almost a century: the SEIR model. It’s not without pitfalls, and needs some disaggregation and a complementary model of policies and the case reporting process, but if you want simple projections, it’s a good place to start.

Once you have triangulation from all of these sources, you have some hope of getting the peak right. But your confidence bounds should still be derived not only from the fit itself, but also priors on parameters that were not part of the estimation process.

Update: Coronavirus Curve-fitting OverConfidence

Steady State Growth in SIR & SEIR Models

Beware of the interpretation of R0, and models that plug an R0 estimated in one context into a delay structure from another.

This started out as a techy post about infection models for SD practitioners interested in epidemiology. However, it has turned into something more important for coronavirus policy.

It began with a puzzle: I re-implemented my conceptual coronavirus model for multiple regions, tuning it for Italy and Switzerland. The goal was to use it to explore border closure policies. But calibration revealed a problem: using mainstream parameters for the incubation time, recovery time, and R0 yielded lukewarm growth in infections. Retuning to fit the data yields R0=5, which is outside the range of most estimates. It also makes control extremely difficult, because you have to reduce transmission by 1-1/R0 = 80% to stop the spread.

To understand why, I decided to solve the model analytically for the steady-state growth rate in the early infection period, when there are plenty of susceptible people, so the infection rate is unconstrained by availability of victims. That analysis is reproduced in the subsequent sections. It’s of general interest as a way of thinking about growth in SD models, not only for epidemics, but also in marketing (the Bass Diffusion model is essentially an epidemic model) and in growing economies and supply chains.

First, though, I’ll skip to the punch line.

The puzzle exists because R0 is not a complete description of the structure of an epidemic. It tells you some important things about how it will unfold, like how much you have to reduce transmission to stop it, but critically, not how fast it will go. That’s because the growth rate is entangled with the incubation and recovery times, or more generally the distribution of the generation time (the time between primary and secondary infections).

This means that an R0 value estimated with one set of assumptions about generation times (e.g., using the R package R0) can’t be plugged into an SEIR model with different delay structure assumptions, without changing the trajectory of the epidemic. Specifically, the growth rate is likely to be different. The growth rate is, unfortunately, pretty important, because it influences the time at which critical thresholds like ventilator capacity will be breached.

The mathematics of this are laid out clearly by Wallinga & Lipsitch. They approach the problem from generating functions, which give up simple closed-form solutions a little more readily than my steady-state growth calculations below. For example, for the SEIR model,

R0 = (1 + r/b1)(1 + r/b2)           (Eqn. 3.2)

Where r is the growth rate, b1 is the inverse of the incubation time, and b2 is the inverse of the recovery time. If you plug in r = 0.3/day, b1 = 1/(5 days), and b2 = 1/(10 days), you get R0 = 10, which is not plausible for COVID-19. Similarly, if you plug in the frequently-seen R0 = 2.4 with the time constants above, you get growth at 8%/day, not the observed 30%/day.
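To make the arithmetic concrete, here is Eqn. 3.2 and its inversion for r, using only the parameter values quoted above:

```python
# Wallinga & Lipsitch relation for the SEIR model (Eqn. 3.2 above):
# R0 = (1 + r/b1)(1 + r/b2), with b1 and b2 the inverse incubation
# and recovery times.
def r0_seir(r, incubation, recovery):
    b1, b2 = 1.0 / incubation, 1.0 / recovery
    return (1 + r / b1) * (1 + r / b2)

print(r0_seir(0.3, 5, 10))  # 10.0 -- implausibly high for COVID-19

# Inverting for the growth rate implied by R0 = 2.4 (a quadratic in r):
b1, b2 = 1 / 5, 1 / 10
a, b, c = 1 / (b1 * b2), 1 / b1 + 1 / b2, 1 - 2.4
r = (-b + (b * b - 4 * a * c) ** 0.5) / (2 * a)
print(r)  # ~0.075/day, i.e. the ~8%/day in the text, far below the observed ~30%/day
```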

There are actually more ways to get into trouble by using R0 as a shorthand for rich assumptions in models. Stochastic dynamics and network topology matter, for example. In The Failure of R0, Li, Blakely & Smith write,

However, in almost every aspect that matters, R0 is flawed. Diseases can persist with R0 < 1, while diseases with R0 > 1 can die out. We show that the same model of malaria gives many different values of R0, depending on the method used, with the sole common property that they have a threshold at 1. We also survey estimated values of R0 for a variety of diseases, and examine some of the alternatives that have been proposed. If R0 is to be used, it must be accompanied by caveats about the method of calculation, underlying model assumptions and evidence that it is actually a threshold. Otherwise, the concept is meaningless.

Is this merely a theoretical problem? I don’t think so. Here’s how things stand in some online SEIR-type simulators:

| Model | R0 (dmnl) | Incubation (days) | Infectious (days) | Growth Rate (%/day) |
|---|---|---|---|---|
| My original | 3.3 | 5 | 7 | 17 |
| Homer US | 3.5 | 5.4 | 11 | 18 |
| covidsim.eu | 4 | 4 & 1 | 10 | 17 |
| Epidemic Calculator | 2.2 | 5.2 | 2.9 | 30* |
| Imperial College | 2.4 | 5.1 | ~3** | 20*** |

*Observed in simulator; doesn’t match steady state calculation, so some feature is unknown.

**Est. from 6.5 day mean generation time, distributed around incubation time.

***Not reported; doubling time appears to be about 6 days.

This is certainly a Tower of Babel situation. It seems likely that the low-order age structure in the SEIR model is problematic for accurate representation of the dynamics. But it also seems like piecemeal parameter selection understates the true uncertainty in these values. We need to know the joint distribution of R0 and the generation time distribution in order to properly represent what is going on.

Steady State Growth – SIR


A model of COVID-19 in the US with endogenous testing, containment measures, and social distancing

Here’s another COVID-19 model. This one’s from Jack Homer of Homer Consulting. Jack is a very creative modeler, the author of some SD classics like the worker burnout model, an SD blogger, and plays a central role in important projects like Rethink Health.

The core of the model is an SEIR chain, similar to my model. This adds some nice features, including endogenous testing and a feedback decision rule for control measures. It’s parameterized for the US.

I haven’t spent significant time with the model yet, so I can’t really comment. An alarming feature of this disease is that doublings occur on the same time scale as thinking through an iteration of a model, especially if coronavirus is not your day job. I hope to add some further thoughts when I’ve thinned my backlog a bit.

From the slide deck:

Conclusions

  • The results here come from a model with several key numerical assumptions, especially around behavioral responses. As the 4 runs illustrate, if the assumptions are modified, the overall results change over some range of possibility.
  • My assumptions about the behavioral responses were informed by what we’ve been seeing recently in the US: a good response, even in regions not yet hard-hit. The message is out, and it is having an effect.
  • Because of the responses, and despite the absence of a vaccine, I conclude this epidemic will not infect a third or half of the population as some have predicted. Rather, we are likely to see 6m-28m cases in the US in total, resulting in 100k-500k deaths. This projection assumes a vaccine available by next April.
  • I also conclude that our hospital system overall has enough bed capacity to handle the peak load late April/early May; and enough ventilator capacity except during those 3 weeks in the more pessimistic Slowboth scenario. We would need 180k ventilators (rather than the assumed 120k) to avoid this shortage in the pessimistic scenario.
  • I have not addressed here the impact of containment measures and social distancing on the economy, including the supply of food and other necessities. This supply is important, affecting our ability to maintain strong containment and distancing.

This archive contains the Vensim model in mdl and vpmx format, a custom graph set (already loaded in the model), and some runs:

homerCOVID19v2.zip

A nice slide deck documenting the results:

Covid19US model jh v2.pdf

This uses data via GET XLS so it won’t work with PLE; the Model Reader will work.

Update, 3/24/2020: This version refines the model. I’ve added a copy with the data variables deleted, that should work with PLE.

Covid19US-model-jh-v2c.zip

The Chartjunk Pandemic

So much junk, so little time.

The ‘net is awash with questionable coronavirus memes. The most egregiously flawed offender I’ve seen is this one from visualcapitalist:

It’s interesting data, but the visualization really fails to put COVID19 in a proper perspective.

 

Exponential Growth

The biggest problem is obvious: the bottom of the curve is nothing like the peak for a quantity that grows exponentially.

Comparing the current death toll from COVID19, a few months old, to the final values from other epidemics over years to decades, is just spectacularly misleading. It beggars belief that anyone could produce such a comparison.

Perspective

Speaking of perspective, charts like this are rarely a good idea. This one gives the impression that 5M < 3M:

Reliance on our brains to map 2D to 3D is even more problematic when you consider the next problem.

2D or 3D?

Measuring the fur-blob sizes shows that the mapping of the data to the blobs is two-dimensional: the area of the blob on the page represents the magnitude. But the blobs are clearly rendered in 3D. That means the visual impression of the relationship between the Black Death (200M) and Japanese Smallpox (1M) is off by a factor of 15. The distortion is even more spectacular for COVID19.
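The distortion is easy to compute (the 200M and 1M figures are taken from the chart):

```python
import math

# Data are mapped to blob *area*, so radius scales as sqrt(value) --
# but the blobs are rendered as 3D objects, so the eye reads volume.
black_death, smallpox = 200e6, 1e6
true_ratio = black_death / smallpox                 # 200x
radius_ratio = math.sqrt(true_ratio)                # ~14.1 from the area encoding
apparent_volume_ratio = radius_ratio ** 3           # ~2800x as perceived in 3D
distortion = apparent_volume_ratio / true_ratio

print(distortion)  # ~14x -- the "factor of 15" in the text
```

In general the 2D-area-rendered-as-3D distortion is sqrt(ratio), so it grows with the disparity being displayed, which is exactly when accuracy matters most.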

You either have to go all the way with 3D, in which case COVID19 looks bigger, even with the other distortions unaddressed, or you need to make a less-sexy but more-informative flat 2D chart.

Stocks vs. Flows

The fourth problem here is that the chart neglects time. The disruption from an epidemic is not simply a matter of its cumulative death toll. The time distribution also matters: a large impact concentrated in a brief time frame has much greater ripple effects, as we are now experiencing.

Open Letter on Coronavirus

For all friends in the Bozone … this is a letter I sent to a community list earlier today:

I appreciate the timely information from Chris Mehl. I’d like to emphasize a few points. I’m largely drawing on my thinking from an epidemic model that I posted here a few days ago, https://youtu.be/mxUsBI0Wr2M

It’s unfortunate that we now have our first confirmed COVID19 case, but we knew it was coming. One case likely means several more exposed asymptomatic are already here. Things could develop quickly: Italy went from 3 cases to 3858 in two weeks, although some of that is illusory due to the expansion of testing. However, that is not necessarily our fate – we may be luckier, and we definitely can change some things.

Two key points about this disease:
– The Covid-19 virus is transmissible before people become symptomatic. We therefore cannot wait to take action until there are confirmed or presumptive cases, or even people who have potentially been exposed, in our social networks, faith groups, communities, workplace, towns, or cities.
– This is not just about us individually. The disease disproportionately attacks older people, and people with compromised immune systems. The life you save through personal behavior may be your grandmother’s.

The response has a lot of moving parts.
1. The end of the line is treatment for the severely ill. The flu peak is typically 5-6% of ER admissions in Montana, so something just as widespread and 10x more serious is an obvious threat. It’s critical to avoid overwhelming (and infecting) our precious and limited health care workers, because if we don’t, fatality rates go up, as happened in Italy and Wuhan.
2. We can reduce contacts with the infected through monitoring, quarantine and isolation. This is the public health bailiwick, but I would expect that a large caseload could overwhelm them as well. I hope they’ll reach out to the community for help if needed. Independent of that, perhaps we can make it easy on people who are self-isolating, by organizing delivery of essential services? For employers, I hope that generous sick leave is a matter of enlightened self-interest: do you want 1 employee out for a week, or all of them out two weeks later?
We have two options that scale up well:
3. We can reduce the risk of each social contact by avoiding proximity and touching (elbow bumps instead of handshakes) and cleaning everything – handwashing, hard surfaces, etc. Lots of good info out there on this. (Sadly, hoarding sanitizers doesn’t contribute.)
4. Finally, we can reduce the number of contacts per person, aka social distancing. Cancelling nonessential group events, or moving them online, is very influential. One Biogen company meeting spawned 100 infections in Boston. The Atlantic has a nice discussion of the nuances: https://www.theatlantic.com/…/coronavirus-what-does…/607927/
If half the infected are isolated, and we have half as many contacts, and those contacts are made 50% safer, then we’ve reduced the transmission of infection by 87.5%. That’s enough to slow the infection rate to less than the recovery rate, so the disease will die out.
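The arithmetic compounds multiplicatively; with an illustrative baseline R0 = 3 (my assumption here, not a figure from the letter), an 87.5% reduction is comfortably past the control threshold:

```python
# Each measure multiplies transmission by a residual factor.
isolated, contacts, safety = 0.5, 0.5, 0.5
residual_transmission = isolated * contacts * safety
print(1 - residual_transmission)  # 0.875 -> transmission cut by 87.5%

# With an assumed baseline R0 of 3, the effective reproduction number
# falls well below the critical threshold of 1:
R0 = 3.0
print(residual_transmission * R0)  # 0.375 -> the epidemic shrinks
```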

If we do that, we’re not quite out of the woods. Social distancing is going to be hard to sustain. But if you stop too soon, the disease comes back. So we’ll need a plan for transitioning from more disruptive social distancing measures to things we can sustain.

When we have to close schools, which I think is likely, we will need to find ways to provide good nutrition and safe spaces for kids, without risk of infection. We can help.

Social distancing is also disruptive to the economy. Our tourism industry and performing arts, and surely other sectors I haven’t thought of, are going to have a rough time. We need to mitigate that.

It’s hard on our social fabric, so things like the Renewal Network’s recent links are important. We need to figure out how to support, comfort and play with each other … six feet apart, like the Italians singing from their balconies in the darkness of locked-down Siena.

Fortunately, some bad outcomes are very unlikely. There’s no reason for the food system to break down, for example. Inventories are large and the working-age population won’t have high mortality. So keeping a decent stock of food in case you’re sick is prudent, but panic buying of massive quantities is unproductive. This is not the zombie apocalypse.

There is a Cassandra’s curse here. If we succeed, the worst-case scenarios won’t come true, and some will accuse us of overreacting. That takes courage.

Finally, a personal note. David Brooks had a gloomy piece in the New York Times a day or two back, detailing social breakdown during historic plagues. I think that is not our fate. We have new tools at our disposal, i.e. the internet, that did not exist in previous pandemics. Incredibly, I know – by just one hop on social media – people involved in several of the defining events of this epidemic, including the Siena singers. We now have a powerful and non-infectious way to stay coordinated; we just have to be sure that we find ways to reach out to people who are not so digitally connected.

There’s a huge trove of digital resources here:
https://coronavirustechhandbook.com/

Tom

Vensim SIR modeling primer

I’ve added an SIR modeling primer video to the Vensim coronavirus page, where you can download the models and the software.

This illustrates most of the foundations of the community coronavirus model. Feel free to adapt any of these tools for education or other purposes (but please respect the free Vensim PLE educational license and buy a paid copy if you’re doing commercial work).

 

Is coronavirus different in the UK and Italy?

BBC UK writes, Coronavirus: Three reasons why the UK might not look like Italy. They point to three observations about the epidemic so far:

  1. Different early transmission – the UK lags the epidemic in Italy
  2. Italy’s epidemic is more concentrated
  3. More of Italy’s confirmed cases are fatal

I think these speculations are misguided, and give a potentially-disastrous impression that the UK might somehow dodge a bullet without really trying. That’s only slightly mitigated by the warning at the end,

Don’t relax quite yet

Even though our epidemic may not follow Italy’s exactly, that doesn’t mean the UK will escape serious changes to its way of life.

Epidemiologist Adam Kucharski warns against simple comparisons of case numbers and that “without efforts to control the virus we could still see a situation evolve like that in Italy”, even if not necessarily in the next four weeks.

… which should be in red 72-point text right at the top.

Will the epidemic play out differently in the UK? Surely. But will it really look qualitatively different? I doubt it, unless the reaction is different.

The fundamental problem is that the structure of a system determines its behavior. A slinky will bounce if you jiggle it, but more fundamentally it bounces because it’s a spring. You can jiggle a brick all you want, but you won’t get much bouncing.

The system of a virus spreading through a population is the same. The structure of the system says that, as long as the virus can infect people faster than they recover, it grows exponentially. It’s inevitable; it’s what a virus does. The only way to change that is to change the structure of the system by slowing the reproduction. That happens when there’s no one left to infect, or when we artificially reduce the infection rate through social distancing, sterilization and quarantine.
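A minimal SIR sketch (all parameters illustrative) shows the threshold: with the recovery rate fixed, transmission above it produces a large epidemic, while transmission below it just fizzles.

```python
# Minimal discrete-time SIR model (Euler steps, assumed parameters).
# Infections grow whenever beta*S/N exceeds the recovery rate gamma.
def sir_peak(beta, gamma=0.1, days=200, N=1e6, I0=10):
    S, I, R = N - I0, float(I0), 0.0
    peak = I
    for _ in range(days):
        new_inf = beta * S * I / N
        new_rec = gamma * I
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        peak = max(peak, I)
    return peak

print(sir_peak(beta=0.3))   # beta > gamma: a large epidemic peak (~30% of N)
print(sir_peak(beta=0.08))  # beta < gamma: infections never grow past I0
```

The behavior is binary in character: changing the initial timing moves the curve left or right, but only pushing beta below gamma changes its shape.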

A change to the initial timing or distribution of the epidemic doesn’t change the structure at all. The slinky is still a slinky, and the epidemic will still unfold exponentially. Our job, therefore, is to make ourselves into bricks.

The third point, that fatality rates are lower, may also be a consequence of the UK starting from a different state today. In Italy, infections have grown high enough to overwhelm the health care system, which increases the fatality rate. The UK may not be there yet. However, a few doublings of the number of infected will quickly close the gap. This may also be an artifact of incomplete testing and lags in reporting.

Here’s a more detailed explanation:

More Interactive Coronavirus Models

Jeroen Struben has a nice interactive web simulation running online at Forio. It’s multiregion, with diffusion of infection across borders, and includes some of the interesting structures I excluded from my simple model, including explicit quarantine and vaccination, and testing and reporting lags.

The NYT has a very simple interactive simulation embedded in:

How Much Worse the Coronavirus Could Get, in Charts

As always, I’m eager to know of more – please comment!