Exponential Epi Pens

Mylan Pharmaceuticals is in the news for taking the price of EpiPens, which contain about $1 of active ingredient, to stratospheric levels. I think Bloomberg broke the story, and the NY Times has the latest.

Here’s the price trajectory:


epi pen data.xlsx

The rate of increase is not that far from the health care inflation rate in general, except that in this case, there’s no obvious underlying cost driver, hence the allegations of gouging.

Here’s a first cut at the structure of the problem:


Econ 101 says that high profits should attract competition, putting downward pressure on prices (loop B 101). However, that’s not happening, because the FDA is the gatekeeper on product approval. It’s not clear to me whether the FDA just makes the approval delay systematically long and uncertain, or whether it’s actually captive to Mylan lobbyists and holding new entrants to higher standards, as some hint (that would be a reinforcing loop, R2). Either way, the only loop that’s functioning is Mylan’s reinvestment in marketing and lobbying to create demand (R1).

This reminds me of California’s electricity market deregulation debacle, which created a wholesale power market without corresponding retail price elasticity. Utilities were stranded between hammer (floating generation prices) and anvil (fixed demand). The resulting mess was worse than might have occurred in either a more or less deregulated market.

Similarly, to bring this market under control, you’d either have to get the FDA out of the way, restoring the balancing loop, or regulate the price side of the market, constraining the reinforcing loop. In this case, it may be the court of public opinion that puts the brakes on, adding a balancing loop of bad press that has so far cost Mylan dearly in investor confidence, if nothing else.

Mylan responds to gouging allegations rather unconvincingly, I think. Their CEO argues that the problem is multiple markups in the supply chain, subsidization of Europe, and R&D. It’s hard to square those external-cause arguments with Mylan’s financials.

Blood pressure regulation

The Tech Review Arxiv blog has a neat summary of new research on high blood pressure. It turns out that the culprit may be a feedback mechanism that can’t adequately respond to stiffening of the arteries with age:

The human body has a well understood mechanism for monitoring blood pressure changes, consisting of sensors embedded in the major arterial walls that monitor changes in pressure and then trigger other changes in the body to increase or reduce the pressure as necessary, such as the regulation of the volume of fluid in the blood vessels. This is known as the baroreceptor reflex.

So an interesting question is why this system does not respond appropriately as the body ages. Why, for example, does this system not reduce the volume of fluid in the blood to decrease the pressure when it senses a high systolic pressure in an elderly person?

The theory that Pettersen and co have tested is that the sensors in the arterial walls do not directly measure pressure but instead measure strain, that is the deformation of the arterial walls.

As these walls stiffen due to the natural ageing process, the sensors become less able to monitor changes in pressure and therefore less able to compensate.
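
To make the strain-versus-pressure distinction concrete, here’s a toy sketch (not the model from the paper). It assumes a linearly elastic arterial wall, so that strain = pressure / stiffness, and a simple integral controller that adjusts pressure until the sensed strain matches a setpoint. The function name, units, and parameter values are all hypothetical.

```python
def regulated_pressure(stiffness, strain_setpoint=1.0, gain=20.0,
                       steps=2000, dt=0.1):
    """Euler-integrate a toy baroreflex that senses wall strain, not pressure.

    Assumption: strain = pressure / stiffness (linear elastic wall).
    """
    pressure = 120.0  # arbitrary starting pressure, hypothetical units
    for _ in range(steps):
        strain = pressure / stiffness          # what the sensor actually sees
        error = strain - strain_setpoint       # positive means "too stretched"
        pressure -= dt * gain * error          # reflex lowers pressure if strain is high
    return pressure


young = regulated_pressure(stiffness=100.0)    # compliant artery
old = regulated_pressure(stiffness=140.0)      # 40% stiffer artery
print(f"young: {young:.1f}, old: {old:.1f}")
```

Under these assumptions the loop settles where strain equals the setpoint, i.e. at pressure = setpoint × stiffness. The stiffer artery ends up regulated to a proportionally higher pressure even though the reflex is doing exactly what it was built to do.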

Spot the health care smokescreen

A Tea Party presentation on health care making the rounds in Montana claims that life expectancy is a smoke screen, and it’s death rates we should be looking at. The implication is that we shouldn’t envy Japan’s longer life expectancy, because the US has lower death rates, indicating superior performance of our health care system.

Which metric really makes the most sense from a systems perspective?

Here’s a simple, 2nd order model of life and death:

From the structure, you can immediately observe something important: life expectancy is a function only of parameters, while the death rate also includes the system states. In other words, life expectancy reflects the expected life trajectory of a person, given structure and parameters, while the aggregate death rate weights parameters (cohort death rates) by the system state (the distribution of population between old and young).

In the long run, the two metrics tell you the same thing, because the system comes into equilibrium such that the death rate is the inverse of the life expectancy. But people live a long time, so it might take decades or even centuries to achieve that equilibrium. In the meantime, the death rate can take on any value between the death rates of the young and old cohorts, which is not really helpful for understanding what a new person can expect out of life.

So, to the extent that health care performance is visible in the system trajectory at all, and not confounded by lifestyle choices, life expectancy is the metric that tells you about performance, and the aggregate death rate is the smokescreen.

Here’s the model: LifeExpectancyDeathRate.mdl or LifeExpectancyDeathRate.vpm

It’s initialized in equilibrium. You can explore disequilibrium situations by varying the initial population distribution (Init Young People & Init Old People), or testing step changes in the death rates.
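
For readers without Vensim, here’s a rough Python sketch of a two-stock aging chain of the kind described above. The structure (constant births, first-order maturation from young to old, a fractional death rate for each cohort) and every parameter value are assumptions for illustration, not a transcription of the .mdl file.

```python
def life_expectancy(young_death_rate, old_death_rate, maturation_time):
    """Expected lifetime of a newborn; depends on parameters only."""
    exit_rate = young_death_rate + 1.0 / maturation_time
    time_young = 1.0 / exit_rate                       # mean time spent young
    p_reach_old = (1.0 / maturation_time) / exit_rate  # chance of surviving to old
    return time_young + p_reach_old / old_death_rate


def death_rate_path(young, old, births, young_death_rate, old_death_rate,
                    maturation_time, years, dt=0.25):
    """Aggregate death rate over time; depends on the states too."""
    path = []
    for _ in range(int(years / dt)):
        deaths = young_death_rate * young + old_death_rate * old
        path.append(deaths / (young + old))
        maturing = young / maturation_time
        young += dt * (births - young_death_rate * young - maturing)
        old += dt * (maturing - old_death_rate * old)
    return path


# Hypothetical parameters (rates per year, times in years)
params = dict(young_death_rate=0.002, old_death_rate=0.05, maturation_time=50.0)
births = 1000.0
le = life_expectancy(**params)

# Equilibrium stocks implied by constant births
eq_young = births / (params["young_death_rate"] + 1.0 / params["maturation_time"])
eq_old = eq_young / params["maturation_time"] / params["old_death_rate"]

at_equilibrium = death_rate_path(eq_young, eq_old, births, years=100, **params)
young_skewed = death_rate_path(1.3 * eq_young, 0.2 * eq_old, births, years=500, **params)

print(f"1 / life expectancy:        {1.0 / le:.5f}")
print(f"death rate, equilibrium:    {at_equilibrium[-1]:.5f}")
print(f"death rate, young-skewed t0: {young_skewed[0]:.5f}")
```

In this sketch the equilibrium run holds the death rate at 1 / life expectancy, while the young-skewed population starts with a much lower death rate despite identical cohort parameters, and only drifts back to the equilibrium value over a very long horizon.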

False positives, publication bias and systems models

A PLOS Medicine paper asserts that most published results are false.

It can be proven that most claimed research findings are false

Corollary 1: The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.

Corollary 2: The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.

Corollary 3: The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.

Corollary 4: The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.

Corollary 5: The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.

Corollary 6: The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.

This somewhat alarming result arises from fairly simple statistics of false positives, publication selection bias, and causation vs. correlation problems. While the math is incontrovertible, some of the assumptions have been challenged:

… calculating the unreliability of the medical research literature, in whole or in part, requires more empirical evidence and different inferential models than were used. The claim that “most research findings are false for most research designs and for most fields” must be considered as yet unproven.

Still, the argument seems to be a matter of how much rather than whether publication bias influences findings:

We agree with the paper’s conclusions and recommendations that many medical research findings are less definitive than readers suspect, that P-values are widely misinterpreted, that bias of various forms is widespread, that multiple approaches are needed to prevent the literature from being systematically biased and the need for more data on the prevalence of false claims.

(Others propose similar challenges. There’s conflicting literature about whether (weak) observational studies hold up with (strong) randomized follow-up trials.)
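
The false-positive arithmetic behind the paper’s headline claim is compact enough to sketch. In the paper’s no-bias case, the probability that a claimed finding is true (the positive predictive value) is (1 − β)R / (R − βR + α), where R is the pre-study odds that a tested relationship is real, α is the significance threshold and 1 − β is the power. The function and argument names below are mine, and the example inputs are illustrative rather than estimates for any real field.

```python
def ppv(prior_odds, alpha=0.05, power=0.8):
    """Post-study probability that a 'significant' finding is true.

    prior_odds: R, ratio of true to false relationships among those tested
    alpha:      false positive rate per test
    power:      1 - beta, chance of detecting a true relationship
    """
    true_positives = power * prior_odds   # per unit of false relationships tested
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


# Well-powered confirmatory study with even prior odds
print(f"{ppv(prior_odds=1.0):.3f}")

# Underpowered exploratory search where 1 in 100 candidates is real
print(f"{ppv(prior_odds=0.01, power=0.2):.3f}")
```

With even odds and good power, most positives are real (about 0.94). With long-shot hypotheses and low power, the same significance threshold yields a literature in which only about 4% of positives are true. Corollaries 1 through 3 are essentially statements about how power and R move this ratio.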

This is obviously a big problem from a control perspective, because the kind of information provided by the studies in question is key to managing many systems, as in Nancy Leveson’s pharma safety example:

It also leads me to a rather pointed self-question. To what extent is typical system dynamics modeling practice subject to the same kinds of biases? Can we say not only that all models are wrong, but that most are useless?

First the good news.

  • SD doesn’t usually operate in the data mining space, where large observational studies seek effects absent any a priori causal theory. That means we’re not operating where false positives are most likely to arise.
  • Often, SD practitioners are not testing our own pet theories, but those of some decision makers – perhaps even theories of competing interests in an organization.
  • SD models play a “knowledge integration” role that’s somewhat analogous to meta-analysis. A meta-analysis pools the statistics from a number of replications of some observation, which improves the signal to noise ratio, making it easier to see whether there’s any baby in the bathwater. An SD model instead pools the effect sizes of inputs (studies or anecdotes) and puts them to a functional test: do the individual components, assembled into a system, yield the observed behavior of the macro system?
  • Similarly, good SD modelers tend to supplement purely statistical inputs with Reality Checks that effectively provide additional data verification by testing extreme conditions where outcomes are known (though this is not helpful if you don’t know anything about relationships to begin with).
  • Including physics (using the term loosely to include things like conservation of people) in models also greatly constrains the space of plausible hypotheses a priori.

Now the bad news.

  • Models are often used in one-off, non-replicable strategic decision making situations, so we’ll never know. Refereed forecasting helps, but success can still be due to luck rather than skill.
  • We often have to formalize soft variable concepts for which definitions are uncertain and measurements are lacking.
  • SD models are often reliant on thin literature bases, small studies, or subject matter expertise to establish relationships. Studies with randomized control are a rarity.
  • Available data for model verification is often of low quality and short duration.
  • Data can provide a weak check on the model – if a system exhibits exponential growth, for example, one positive feedback loop in the dynamic hypothesis is as good as another (though of course good a priori explanations of the structure of the system help).

My suspicion is that savvy modelers are already well aware of just how messy and uncertain their problem domains are. Decisions will be taken, with or without a model, so the real objective is to use the model to add value by rejecting ideas that don’t work. The problem then is not that wrong models make decisions worse, but that we could probably do a lot better if we could be smarter about the possible biases in models and thinking in general.

Alex Tabarrok at Marginal Revolution has a nice take on remedies:

What can be done about these problems? (Some cribbed straight from Ioannidis and some my own suggestions.)

1) In evaluating any study try to take into account the amount of background noise. That is, remember that the more hypotheses which are tested and the less selection which goes into choosing hypotheses the more likely it is that you are looking at noise.

2) Bigger samples are better. (But note that even big samples won’t help to solve the problems of observational studies which is a whole other problem).

3) Small effects are to be distrusted.

4) Multiple sources and types of evidence are desirable.

5) Evaluate literatures not individual papers.

6) Trust empirical papers which test other people’s theories more than empirical papers which test the author’s theory.

7) As an editor or referee, don’t reject papers that fail to reject the null.

For SD modeling, I’d add a few more:

8) Reserve time for exploration of uncertainty (lots of Monte Carlo simulation).

9) Calibrate your confidence bounds.

10) Help clients to appreciate the extent and implications of uncertainty.

11) Pay attention to the language used to describe statistical concepts. Words like “expectation” and “significance” that have specific mathematical interpretations don’t mean the same thing to managers.

12) Look for robust policies that work irrespective of uncertain relationships.

13) Explicitly seek out and test alternative hypotheses (This sounds like it’s at odds with Corollary 3 above, but I think it’s the right thing to do. Testing multiple hypotheses in the context of the model is not the same thing as mining data for multiple relationships.).

14) If you can’t estimate something directly from data, or back it up with literature (more than a single paper), at least articulate some bounds on the effect, perhaps through experiments with a submodel.
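
As a minimal illustration of the Monte Carlo suggestion, the sketch below runs a small logistic growth model many times with parameters drawn from bounds rather than point estimates, then reports percentiles of the outcome. The model, the uniform distributions, and the bounds are all placeholders; in practice you’d use your own model and whatever ranges you can defend.

```python
import random


def final_stock(growth_rate, capacity, initial=10.0, years=20, dt=0.25):
    """Euler-integrate logistic growth and return the stock at the end."""
    stock = initial
    for _ in range(int(years / dt)):
        stock += dt * growth_rate * stock * (1.0 - stock / capacity)
    return stock


def monte_carlo(runs=2000, seed=1):
    """Sample uncertain parameters and return 5th, 50th, 95th percentiles."""
    rng = random.Random(seed)
    outcomes = sorted(
        final_stock(growth_rate=rng.uniform(0.1, 0.5),   # hypothetical bounds
                    capacity=rng.uniform(500.0, 1500.0))  # hypothetical bounds
        for _ in range(runs)
    )
    return outcomes[int(0.05 * runs)], outcomes[runs // 2], outcomes[int(0.95 * runs)]


low, median, high = monte_carlo()
print(f"5%: {low:.0f}   50%: {median:.0f}   95%: {high:.0f}")
```

Even with only two uncertain inputs, the 5% to 95% range spans more than an order of magnitude, which is the kind of result that’s worth showing a client before they anchor on a single base run.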

What do you think? When is modeling and statistical analysis helpful, and when is it risky business?



Fat taxes & modeling

NPR covers a Danish move to tax saturated fat:

So when the tiny Scandinavian country announced it would be imposing a 16 Kroner (about $3 U.S.) tax on every kilogram of saturated fat as a way to discourage poor eating habits and raise revenue, we were left scratching our heads.

How’s that going to work?

Ole Linnet Juul, food director at Denmark’s Confederation of Industries, tells The Washington Post that the tax will increase the price of a burger by around $0.15 and raise the price of a small package of butter by around $0.40.

Our pals over at Planet Money took a stab last year at explaining the economics of our version of the fat tax — the soda tax. They conclude that price increases do drive down demand somewhat.

But couldn’t Danes just easily sneak over to neighboring Sweden for butter and oil and simply avoid paying the tax, throwing all revenue calculations off?

Meanwhile, some health studies indicate a soda tax doesn’t work to curb obesity anyways.

First a few obvious problems: oil is typically not saturated and therefore presumably wouldn’t fall under the tax. And sneaking over the border for butter? Seriously? You’d better bring back a heckuva lot, because there’s the little matter of the Øresund Strait, which now has a handy bridge, and a 36 EUR toll to go with it.
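
Just how much is a heckuva lot? Here’s a back-of-envelope sketch. The tax rate and toll come from the text above; the exchange rate and the assumption that butter is roughly half saturated fat by weight are mine, and the calculation ignores fuel, time, and any difference in shelf prices between the two countries.

```python
TAX_DKK_PER_KG_SAT_FAT = 16.0   # from the NPR quote
TOLL_EUR = 36.0                 # bridge toll quoted above
DKK_PER_EUR = 7.45              # assumed exchange rate
BUTTER_SAT_FAT_FRACTION = 0.5   # assumed: about half of butter is saturated fat


def tax_per_kg_butter():
    """Fat tax embedded in one kilogram of butter, in DKK."""
    return TAX_DKK_PER_KG_SAT_FAT * BUTTER_SAT_FAT_FRACTION


def break_even_butter_kg(crossings=1):
    """Kilograms of butter needed for the avoided tax to cover the toll."""
    toll_dkk = crossings * TOLL_EUR * DKK_PER_EUR
    return toll_dkk / tax_per_kg_butter()


print(f"Tax on a 250 g package: {0.25 * tax_per_kg_butter():.2f} DKK")
print(f"Break-even, one toll:   {break_even_butter_kg():.1f} kg")
print(f"Break-even, two tolls:  {break_even_butter_kg(crossings=2):.1f} kg")
```

On these assumptions a 250 g package carries 2 DKK of tax, which is consistent with the roughly $0.40 quoted above, and you’d need to haul back a bit over 33 kg of butter just to cover a single toll.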

More interesting is the use of models in the linked studies. From the second (“doesn’t work”):

But new research from Northwestern University suggests that soda taxes don’t actually help obese people lose weight, largely because people with weight problems already tend to drink diet soda rather than the sugary kind. So taxing full-calorie sodas may not help many Americans make better dietary choices.

Patel ran computer simulations designed to track how soda prices would affect obesity rates. The findings demonstrated that a sugar tax would cause a negligible drop in obesity, about 1.4%, and that obese people would not lose much weight. “For people going from [body mass indexes] of over 30 to below that…most people are not having massive swings,” Patel said.

For the study, Patel’s team collected data on people with “all ranges of BMI” from the Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System, which has tracked health conditions in the U.S. for nearly three decades. They also collected a data set of soda prices and sales to estimate consumer practices, which they used to predict what people would purchase before and after the implementation of a soda tax. Based on the resulting change in total calories consumed per day over a set time period, the team modeled long-term changes in weight using existing nutrition literature.

Kelly Brownell, the director of the Rudd Center for Food Policy and Obesity at Yale University, has doubts about the accuracy of studies such as Patel’s. Simulations of the potential impact of public health actions such as a soda tax are based on a huge number of assumptions — about consumption, spending behavior, weight change — that are, in reality, difficult to make accurately, he explains.

“All of those changes are unknown,” he said. “So it’s not hard to allow those assumptions to create the results you want.”

Patel counters that assumptions are inevitable in research, and that previous studies that have produced results in favor of soda taxes have also made assumptions, typically about consumer preferences. “I’m trying to see if there are any critical assumptions here that really change the results, but so far I haven’t had anything like that,” he said. “It’s a somewhat valid criticism, but the paper is still being fleshed out, and there are a variety of robustness checks.”

But Patel acknowledges that his study could not predict whether a soda tax would help prevent people from consuming sweetened drinks in the first place and becoming fat later on — another point raised by Brownell. “The question of whether a soda tax could prevent people from becoming obese in the future…that’s still kind of an open question because there are some issues on how you model weight change that to my knowledge haven’t been addressed,” he said. “It’s possible that a soda tax could prevent people from becoming obese in the future, but for people already obese it’s not really going to do anything.”

As press coverage of models goes, this is actually pretty good, and Patel is nicely circumspect about the limitations of the work. The last paragraph hints at one thing that strikes me as extremely important though: the study model is essentially open loop, with price->choice->calories->body mass causality. The real world is closed loop, with important feedbacks between health and future choices of diet and exercise, and social interactions involved in choices. I suspect that the net result is that the long term effect of pricing, or any other measure, on health is substantially greater than the open loop analysis indicates, especially if you’re clever about exploiting the various positive loops that create obesity traps.

Brownell’s complaint – that we know nothing, so we can just plug in assumptions to get whatever answer we want – irks me. It betrays an ignorance of models (especially nonlinear dynamic ones), which are typically more constrained than unstated mental models, not less.

There seems to be a flowering of health and obesity models in system dynamics lately, with some interesting posters and papers at the last few conferences. There’s hope for closing those loops yet.

More power of personal feedback

Now that I’ve dumped on emerging behavioral feedback technologies, perhaps I should share a personal success story, in which measurement technology played a key role.

Ten years ago, a routine test revealed that my cholesterol was 280 mg/dl, and even higher in a confirmation test. That’s not instant death, but it’s bad. NIH calls <200 desirable, and many argue for even lower levels.

This was a surprise, because I was getting a fair amount of exercise and eating healthier than the typical American diet. I suspect that there must be some genetic component.

Without any discussion, my doctor handed me a prescription for Lipitor. Now, I liked that doctor, and I know he was smart because we’d just had an interesting conversation about wavelet analysis of time series data in biomedical research. But I think he was operating under the assumption that there was no potential for improvement from behavior change. This idea seems to grip much of the medical profession, and creates nasty self-fulfilling prophecy and eroding goals dynamics.

I decided that I didn’t want to take statins for the rest of my (hopefully long) life, so with the aid of spousal prodding and planning, I eliminated all cholesterol and saturated fats (essentially all animal products) from my diet. I was quickly below 200, and then made more gradual progress to a range of about 160 to 180.

Interestingly, since then I’ve also cut out a lot of carbohydrates, because the rest of my family is gluten intolerant, which takes the fun out of bread and pasta. My cholesterol is now lower than ever, 149 at last check, in spite of adding eggs, a big dietary cholesterol source, back into my diet.

While my wife deserves most of the credit for my success, I think technology played a key role as well. Early on, I bought a home cholesterol test meter (a Bioscanner 2000, predecessor to the CardioChek that I now have). The meter allowed me to close the loop between behavior and outcome without the long delay and expense involved with a trip to the doctor. That obviously had a practical benefit, but it was also very motivating.


The Ryan health care proposal

The Ryan budget proposal achieves the bulk of its savings by cutting health care outlays, particularly Medicare and Medicaid. The mechanism sounds a lot like a firm’s transition from a defined benefits pension plan to a defined contribution scheme. Medicaid becomes a system of block grants to states, and Medicare becomes a system of flat-rate vouchers. Along the way, it has some useful aspirations: to separate health insurance from employment and eliminate health’s favored tax status.

Reading some of the finer print, though, I don’t think it really fixes the fundamental flaws of the current system. It’s billed as “universal access” but that’s a misnomer. It guarantees universal access to a tax credit or voucher that can be used to purchase coverage, but not universal access to coverage. That’s because it doesn’t solve the adverse selection problem. As a result, any provider that doesn’t play the usual game of excluding anyone who’s really sick from coverage (using preexisting conditions and rotating plan changes) will suffer a variant of the utility death spiral: increasing costs drive the healthy out of the plan, leaving it to serve a diminishing set of members who had the misfortune to get sick, at an escalating cost.
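
The death spiral is easy to reproduce in a toy model. The sketch below assumes a single plan that charges every member the average expected cost of its current membership, and members who stay only if the premium is no more than 1.5 times their own expected cost (a crude stand-in for risk aversion). The cost distribution, the 1.5 factor, and the function name are all hypothetical.

```python
def death_spiral(expected_costs, willingness=1.5, max_rounds=20):
    """Iterate community-rated pricing with voluntary exit.

    Returns a list of (members, premium) tuples, one per pricing round.
    """
    members = sorted(expected_costs)
    history = []
    for _ in range(max_rounds):
        premium = sum(members) / len(members)   # community rate = average cost
        history.append((len(members), premium))
        # Members stay only if the premium is worth it relative to their own risk
        stayers = [cost for cost in members if willingness * cost >= premium]
        if len(stayers) == len(members):
            break                               # nobody left, so pricing is stable
        members = stayers
    return history


# Ten members with skewed expected annual costs (hypothetical)
costs = [100.0 * 1.5 ** i for i in range(10)]
spiral = death_spiral(costs)
for count, premium in spiral:
    print(f"{count:2d} members, premium {premium:8.2f}")
```

In this example the plan shrinks from ten members to two while the premium nearly triples. Nothing about a voucher changes that logic; it only changes who is holding the money when the healthy walk away.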

Universal access to coverage is left to the states, who can create assigned risk pools or other methods to cover the uncoverable. Leaving things to the states strikes me as a reasonable strategy, because the health system is so complex that evolutionary learning is likely to beat the kind of deliberate design we’ll get out of Congress. But it’s not clear to me that the proposal creates any real authority to raise money to support these assigned risk pools; without money, the state mechanisms will be rather perfunctory.

The real challenge seems to me to be to address three features of health:

  • Prevention beats cure by a long shot, in terms of both cost and quality of life. In the current system, patient churn through providers eliminates most of the provider-side incentive to address this. Patients have contributed by abdicating responsibility for their own health, and insurance exacerbates the problem by obscuring the costs of the quadruple bypass that follows from a life of Big Macs.
  • Health care expenditures are extremely skewed over one’s lifetime and within age cohorts. Good behavior can’t mitigate all risk, particularly the risk of getting old. (See below for a peek at the data.)
  • In some circumstances, the health care system is capable of expending an extremely large amount of resources on a person – sometimes for a miraculous outcome, and sometimes for rather marginal end-of-life extension.

What’s needed is a distributed way to share risk (which is why it’s called insurance), while preserving incentives for good behavior and matching total expenditures to resources. That’s a tall order. It’s not clear to me that the Ryan proposal tackles it in any serious way; it just extends the flaws of the current system to Medicare patients.

healthExpendAgeIncomeMEPS: Per capita annual medical expenditures from the MEPS panel, by age and income. There’s surprisingly little variation by income, but a lot by age. The bill terminates the agency that collects this data.

healthExpendAgeDecileMEPS: Health expenditures by age and decile of cohort, showing the extreme concentration of expenditures at all ages.

The really fine print, the text of the bill itself, is daunting – 629 pages. This strikes me as simply unmanageable (like the deceased cap and trade legislation). There are simply too many opportunities for unintended consequences, and hidden agendas, in such a multifaceted approach, especially with the opaque analytic support available. Surely this could be tackled in a series of smaller bites – health, revenue, other expenditures. It calls to mind the criticism of the FAA’s repeated failure to redesign the air traffic control system, “you can’t design a system that evolved.” Well, maybe you can, but not with the kind of tools and discourse that now prevail.

A System Zoo

I just picked up a copy of Hartmut Bossel’s excellent System Zoo 1, which I’d seen years ago in German, but only recently discovered in English. This is the first of a series of books on modeling – it covers simple systems (integration, exponential growth and decay), logistic growth and variants, oscillations and chaos, and some interesting engineering systems (heat flow, gliders searching for thermals). These are high quality models, with units that balance, well-documented by the book. Every one I’ve tried runs in Vensim PLE so they’re great for teaching.

I haven’t had a chance to work my way through the System Zoo 2 (natural systems – climate, ecosystems, resources) and System Zoo 3 (economy, society, development), but I’m pretty confident that they’re equally interesting.

You can get the models for all three books, in English, from the Uni Kassel Center for Environmental Systems Research, http://www.usf.uni-kassel.de/cesr/. Follow the Download link and choose the Software category to obtain a .zip archive of the zoo models for the whole series, in Vensim .mdl format.

To tantalize you, here are some images of model output from Zoo 1. First, a phase map of a bistable oscillator, which was so interesting that I built one with my kids, using legos and neodymium magnets:


Cheese is Murder

Needlessly provocative title notwithstanding, the dairy industry has to be one of the most spectacular illustrations of the battle for control of system leverage points. In yesterday’s NYT:

Domino’s Pizza was hurting early last year. Domestic sales had fallen, and a survey of big pizza chain customers left the company tied for the worst tasting pies.

Then help arrived from an organization called Dairy Management. It teamed up with Domino’s to develop a new line of pizzas with 40 percent more cheese, and proceeded to devise and pay for a $12 million marketing campaign.

Consumers devoured the cheesier pizza, and sales soared by double digits. “This partnership is clearly working,” Brandon Solano, the Domino’s vice president for brand innovation, said in a statement to The New York Times.

But as healthy as this pizza has been for Domino’s, one slice contains as much as two-thirds of a day’s maximum recommended amount of saturated fat, which has been linked to heart disease and is high in calories.

And Dairy Management, which has made cheese its cause, is not a private business consultant. It is a marketing creation of the United States Department of Agriculture — the same agency at the center of a federal anti-obesity drive that discourages over-consumption of some of the very foods Dairy Management is vigorously promoting.

Urged on by government warnings about saturated fat, Americans have been moving toward low-fat milk for decades, leaving a surplus of whole milk and milk fat. Yet the government, through Dairy Management, is engaged in an effort to find ways to get dairy back into Americans’ diets, primarily through cheese.

Now recall Donella Meadows’ list of system leverage points:

Leverage points to intervene in a system (in increasing order of effectiveness)
12. Constants, parameters, numbers (such as subsidies, taxes, standards)
11. The size of buffers and other stabilizing stocks, relative to their flows
10. The structure of material stocks and flows (such as transport network, population age structures)
9. The length of delays, relative to the rate of system changes
8. The strength of negative feedback loops, relative to the effect they are trying to correct against
7. The gain around driving positive feedback loops
6. The structure of information flow (who does and does not have access to what kinds of information)
5. The rules of the system (such as incentives, punishment, constraints)
4. The power to add, change, evolve, or self-organize system structure
3. The goal of the system
2. The mindset or paradigm that the system – its goals, structure, rules, delays, parameters – arises out of
1. The power to transcend paradigms

The dairy industry has become a master at exercising these points, in particular using #4 and #5 to influence #6, resulting in interesting conflicts about #3.

Specifically, Dairy Management is funded by a “checkoff” (effectively a tax) on dairy output. That money basically goes to marketing of dairy products. A fair amount of that is done in stealth mode, through programs and information that appear to be generic nutrition advice, but happen to be funded by the NDC, CNFI, or other arms of Dairy Management. For example, there’s http://www.nutritionexplorations.org/ – for kids, they serve up pizza:


That slice of “combination food” doesn’t look very nutritious to me, especially if it’s from the new Domino’s line DM helped create. Notice that it’s cheese pizza, devoid of toppings. And what’s the gratuitous bowl of mac & cheese doing there? Elsewhere, their graphics reweight the food pyramid (already a grotesque product of lobbying science), to give all components equal visual weight. This systematic slanting of nutrition information is a nice example of my first deadly sin of complex system management.

A conspicuous target of dubious dairy information is school nutrition programs. Consider this, from GotMilk:

Flavored milk contributes only small amounts of added sugars to children’s diets. Sodas and fruit drinks are the number one source of added sugars in the diets of U.S. children and adolescents, while flavored milk provides only a small fraction (< 2%) of the total added sugars consumed.

It’s tough to fact-check this, because the citation doesn’t match the journal. But it seems likely that the statement that flavored milk provides only a small fraction of sugars is a red herring, i.e. that it arises because flavored milk is a small share of intake, rather than because the marginal contribution of sugar per unit flavored milk is small. Much of the rest of the information provided is a similar riot of conflated correlation and causation and dairy-sponsored research. I have to wonder whether innovations like flavored milk are helpful, because they displace sugary soda, or just one more trip around a big eroding goals loop that results in kids who won’t eat anything without sugar in it.
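
The share-versus-marginal distinction is simple arithmetic. In the sketch below, every number is made up purely to show the mechanics: a drink can carry half as much sugar per serving as soda and still account for under 2% of total added sugar, simply because few servings of it are consumed.

```python
def sugar_shares(servings_per_week, grams_per_serving):
    """Each source's share of total added sugar: servings x sugar per serving."""
    grams = {name: servings_per_week[name] * grams_per_serving[name]
             for name in servings_per_week}
    total = sum(grams.values())
    return {name: amount / total for name, amount in grams.items()}


# Hypothetical diet, for illustration only (not survey data)
servings = {"soda": 10.0, "flavored milk": 0.5, "everything else": 20.0}
sugar = {"soda": 30.0, "flavored milk": 15.0, "everything else": 10.0}

shares = sugar_shares(servings, sugar)
for name, share in shares.items():
    print(f"{name:16s} {share:6.1%} of added sugar, {sugar[name]:.0f} g per serving")
```

The small share says nothing about what happens at the margin: in this made-up diet, each additional serving of flavored milk still adds half a soda’s worth of sugar.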

Elsewhere in the dairy system, there are price supports for producers at one end of the supply chain. At the consumer end, there are price ceilings, meant to preserve the affordability of dairy products. It’s unclear what this bizarre system of incentives at cross-purposes really delivers, other than confusion.

The fundamental problem, I think, is that there’s no transparency: no immediate feedback from eating patterns to health outcomes, and little visibility of the convoluted system of rules and subsidies. That leaves marketers and politicians free to push whatever they want.

So, how to close the loop? Unfortunately, many eaters appear to be uninterested in closing the loop themselves by actively seeking unbiased information, or even actively resist information contrary to their current patterns as the product of some kind of conspiracy. That leaves only natural selection to close the loop. Not wanting to experience that personally, I implemented my own negative feedback loop. I bought a cholesterol meter and modified my diet until I routinely tested OK. Sadly, that meant no more dairy.

Ultradian Oscillations of Insulin and Glucose

Citation: Jeppe Sturis, Kenneth S. Polonsky, Erik Mosekilde, and Eve van Cauter. Computer Model for Mechanisms Underlying Ultradian Oscillations of Insulin and Glucose. Am. J. Physiol. 260 (Endocrinol. Metab. 23): E801-E809, 1991.

Source: Replicated by Hank Taylor

Units: No

Format: Vensim

Ultradian Oscillations of Insulin and Glucose (Vensim .vpm)