## Linear regression bathtub FAIL

I seldom run across an example of so many things that can go wrong with linear regression in one place, but one just crossed my reader.

A new paper examines the relationship between CO2 concentration and flooding in the US, and finds no significant impact:

Has the magnitude of floods across the USA changed with global CO2 levels?

R. M. Hirsch & K. R. Ryberg

Abstract

Statistical relationships between annual floods at 200 long-term (85–127 years of record) streamgauges in the coterminous United States and the global mean carbon dioxide concentration (GMCO2) record are explored. The streamgauge locations are limited to those with little or no regulation or urban development. The coterminous US is divided into four large regions and stationary bootstrapping is used to evaluate if the patterns of these statistical associations are significantly different from what would be expected under the null hypothesis that flood magnitudes are independent of GMCO2. In none of the four regions defined in this study is there strong statistical evidence for flood magnitudes increasing with increasing GMCO2. One region, the southwest, showed a statistically significant negative relationship between GMCO2 and flood magnitudes. The statistical methods applied compensate both for the inter-site correlation of flood magnitudes and the shorter-term (up to a few decades) serial correlation of floods.

There are several serious problems here.

First, it ignores bathtub dynamics. The authors describe causality from CO2 -> energy balance -> temperature & precipitation -> flooding. But they regress:

ln(peak streamflow) = beta0 + beta1 × global mean CO2 + error

That alone is a fatal gaffe, because temperature and precipitation depend on the integration of the global energy balance. Integration renders simple pattern matching of cause and effect invalid. For example, if A influences B, with B as the integral of A, and A grows linearly with time, B will grow quadratically with time. The situation is actually worse than that for climate, because the system is not first order; you need at least a second-order model to do a decent job of approximating the global dynamics, and much higher order models to even think about simulating regional effects. At the very least, the authors might have explored the usual approach of taking first differences to undo the integration, though it seems likely that the data are too noisy for this to reveal much.

Second, it ignores a lot of other influences. The global energy balance, temperature and precipitation are influenced by a lot of natural and anthropogenic forcings in addition to CO2. Aerosols are particularly problematic since they offset the warming effect of CO2 and influence cloud formation directly. Since data for total GHG loads (CO2eq), total forcing and temperature, which are more proximate in the causal chain to precipitation, are readily available, using CO2 alone seems like willful ignorance. The authors also discuss issues “downstream” in the causal chain, with difficult-to-assess changes due to human disturbance of watersheds; while these seem plausible (not my area), they are not a good argument for the use of CO2. The authors also test other factors by including oscillatory climate indices, the AMO, PDO and ENSO, but these don’t address the problem either.

Third, the hypothesis that streamflow depends on global mean CO2 is a strawman. Climate models don’t predict that the hydrologic cycle will accelerate uniformly everywhere. Rising global mean temperature and precipitation are merely aggregate indicators of a more complex regional fingerprint. If one wants to evaluate the hypothesis that CO2 affects streamflow, one ought to compare observed streamflow trends with something like the model-predicted spatial pattern of precipitation anomalies. Here’s North America in AR4 WG1 Fig. 11.12, with late-21st-century precipitation anomalies, for example:

The pattern looks suspiciously like the paper’s spatial distribution of regression coefficients:

The eyeball correlation in itself doesn’t prove anything, but it’s suggestive that something has been missed.

Fourth, the treatment of nonlinearity and distributions is a bit fishy. The relationship between CO2 and forcing is logarithmic, which is captured in the regression equation, but I’m surprised that there aren’t other important nonlinearities or nonnormalities. Isn’t flooding heavy-tailed, for example? I’d like to see just a bit more physics in the model to handle such issues.

Fifth, I question the approach of estimating each watershed individually, then examining the distribution of results. The signal to noise ratio on any individual watershed is probably pretty horrible, so one ought to be able to do a lot better with some spatial pooling of the betas (which would also help with issue three above).

I think that it’s actually interesting to hold your nose and use linear regression as a simple screening tool, in spite of violated assumptions. If a relationship is strong, you may still find it. If you don’t find it, that may not tell you much, other than that you need better methods. The authors seem to hold to this philosophy in the conclusion, though it doesn’t come across that way in the abstract. Not everyone is as careful though; Roger Pielke Jr. picked up this paper and read it as,

A new paper out today in the Hydrological Sciences Journal shows that flooding has not increased in the United States over records of 85 to 127 years. This adds to a pile of research that shows similar results around the world. This result is of course consistent with our work that shows that increasing damage related to weather extremes can be entirely explained by societal changes, such as more property in harm’s way. In fact, in the US flood damage has decreased dramatically as a fraction of GDP, which is exactly whet you get if GDP goes up and flooding does not.

Actually, the paper doesn’t even address whether floods are increasing or decreasing. It evaluates CO2 correlations, not temporal trends. To the extent that CO2 has increased monotonically, the regression will capture some trend in the betas on CO2, but it’s not the same thing.

## The BEST of times, the worst of times

Climate skeptics’ opinions about global temperatures and the BEST project are a moving target:

August 27, 2010 (D’Aleo & Watts), there is no warming:

SUMMARY FOR POLICY MAKERS
1. Instrumental temperature data for the pre-satellite era (1850-1980) have been so widely, systematically, and uni-directionally tampered with that it cannot be credibly asserted there has been any significant “global warming” in the 20th century.

February 11, 2011 (Watts), an initial lovefest with the Berkeley Earth Surface Temperature (BEST) project:

Good news travels fast. I’m a bit surprised to see this get some early coverage, as the project isn’t ready yet. However since it has been announced by press, I can tell you that this project is partly a reaction and result of what we’ve learned in the surfacesations project, but mostly, this project is a reaction to many of the things we have been saying time and again, only to have NOAA and NASA ignore our concerns, or create responses designed to protect their ideas, rather than consider if their ideas were valid in the first place. …Note: since there’s been some concern in comments, I’m adding this: Here’s the thing, the final output isn’t known yet. There’s been no “peeking” at the answer, mainly due to a desire not to let preliminary results bias the method. It may very well turn out to agree with the NOAA surface temperature record, or it may diverge positive or negative. We just don’t know yet.

The Berkeley Earth Surface Temperature (BEST) Project aims to do what needs to be done: That is, to develop an independent analysis of the data from land stations, which would include many more stations than had been considered by the Global Historic Climatology Network. The Project is in the hands of a group of recognized scientists, who are not at all “climate skeptics” — which should enhance their credibility….

I applaud and support what is being done by the Project — a very difficult but important undertaking. I personally have little faith in the quality of the surface data, having been exposed to the revealing work by Anthony Watts and others. However, I have an open mind on the issue and look forward to seeing the results of the Project in their forthcoming publications.

… The approaches that I’ve seen during my visit give me far more confidence than the “homogenization solves all” claims from NOAA and NASA GISS, and that the BEST result will be closer to the ground truth that anything we’ve seen.

… I think, based on what I’ve seen, that BEST has a superior method. Of course that is just my opinion, with all of it’s baggage; it remains to be seen how the rest of the scientific community will react when they publish.

In the meantime, never mind the yipping from climate chihuahuas like Joe Romm over at Climate Progress who are trying to destroy the credibility of the project before it even produces a result (hmmm, where have we seen that before?) , it is simply the modus operandi of the fearful, who don’t want anything to compete with the “certainty” of climate change they have been pushing courtesy NOAA and GISS results.

But here’s the thing: I have no certainty nor expectations in the results. Like them, I have no idea whether it will show more warming, about the same, no change, or cooling in the land surface temperature record they are analyzing. Neither do they, as they have not run the full data set, only small test runs on certain areas to evaluate the code. However, I can say that having examined the method, on the surface it seems to be a novel approach that handles many of the issues that have been raised.

As a reflection of my increased confidence, I have provided them with my surfacestations.org dataset to allow them to use it to run a comparisons against their data. The only caveat being that they won’t release my data publicly until our upcoming paper and the supplemental info (SI) has been published. Unlike NCDC and Menne et al, they respect my right to first publication of my own data and have agreed.

And, I’m prepared to accept whatever result they produce, even if it proves my premise wrong. I’m taking this bold step because the method has promise. So let’s not pay attention to the little yippers who want to tear it down before they even see the results. I haven’t seen the global result, nobody has, not even the home team, but the method isn’t the madness that we’ve seen from NOAA, NCDC, GISS, and CRU, and, there aren’t any monetary strings attached to the result that I can tell. If the project was terminated tomorrow, nobody loses jobs, no large government programs get shut down, and no dependent programs crash either.  That lack of strings attached to funding, plus the broad mix of people involved especially those who have previous experience in handling large data sets gives me greater confidence in the result being closer to a bona fide ground truth than anything we’ve seen yet. Dr. Fred Singer also gives a tentative endorsement of the methods.

My gut feeling? The possibility that we may get the elusive “grand unified temperature” for the planet is higher than ever before. Let’s give it a chance.

I still believe that BEST represents a very good effort, and that all parties on both sides of the debate should look at it carefully when it is finally released, and avail themselves to the data and code that is promised to allow for replication.

March 31, 2011 (Watts), beginning to grumble when the results don’t look favorable to the no-warming point of view:

There seems a bit of a rush here, as BEST hasn’t completed all of their promised data techniques that would be able to remove the different kinds of data biases we’ve noted. That was the promise, that is why I signed on (to share my data and collaborate with them). Yet somehow, much of that has been thrown out the window, and they are presenting some results today without the full set of techniques applied. Based on my current understanding, they don’t even have some of them fully working and debugged yet. Knowing that, today’s hearing presenting preliminary results seems rather topsy turvy. But, post normal science political theater is like that.

… I’ll point out that on the front page of the BEST project, they tout openness and replicability, but none of that is available in this instance, even to Dr. Pielke and I. They’ve had a couple of weeks with the surfacestations data, and now without fully completing the main theme of data cleaning, are releasing early conclusions based on that data, without providing the ability to replicate. I’ve seen some graphical output, but that’s it. What I really want to see is a paper and methods. Our upcoming paper was shared with BEST in confidence.

The Berkeley Earth Surface Temperature project puts PR before peer review

… [Lots of ranting, primarily about the use of a 60 year interval] …

So now (pending peer-review and publication) we have the interesting situation of a Koch institution, a left-wing boogy-man, funding an unbiased study that confirms the previous temperature estimates, “consistent with global land-surface warming results previously reported, but with reduced uncertainty.

Oct. 21, 2011 (Keenan @ wattsupwiththat), an extended discussion of smoothing, AR(1) noise and other statistical issues, much of which appears to be founded on misconceptions*:

This problem seems to invalidate much of the statistical analysis in your paper.

Oct. 22, 2011 (Eschenbach @ wattsupwiththat), preceded by a lot of nonsense based on the fact that he’s too lazy to run BEST’s Matlab code:

PS—The world is warming. It has been for centuries.

* Update: or maybe not. Still, the paper has nothing to do with the validity of the BEST version of the observational record.

## Forest Cover Tipping Points

There’s an interesting discussion of forest tipping points in a new paper in Science:

Global Resilience of Tropical Forest and Savanna to Critical Transitions

Marina Hirota, Milena Holmgren, Egbert H. Van Nes, Marten Scheffer

It has been suggested that tropical forest and savanna could represent alternative stable states, implying critical transitions at tipping points in response to altered climate or other drivers. So far, evidence for this idea has remained elusive, and integrated climate models assume smooth vegetation responses. We analyzed data on the distribution of tree cover in Africa, Australia, and South America to reveal strong evidence for the existence of three distinct attractors: forest, savanna, and a treeless state. Empirical reconstruction of the basins of attraction indicates that the resilience of the states varies in a universal way with precipitation. These results allow the identification of regions where forest or savanna may most easily tip into an alternative state, and they pave the way to a new generation of coupled climate models.

Science 14 October 2011

The paper is worth a read. It doesn’t present an explicit simulation model, but it does describe the concept nicely. The basic observation is that there’s clustering in the distribution of forest cover vs. precipitation:

Hirota et al., Science 14 October 2011

In the normal regression mindset, you’d observe that some places with 2m rainfall are savannas, and others are forests, and go looking for other explanatory variables (soil, latitude, …) that explain the difference. You might learn something, or you might get into trouble if forest cover is not-only nonlinear in various inputs, but state-dependent. The authors pursue the latter thought: that there may be multiple stable states for forest cover at a given level of precipitation.

They use the precipitation-forest cover distribution and the observation that, in a first-order system subject to noise, the distribution of observed forest cover reveals something about the potential function for forest cover. Using kernel smoothing, they reconstruct the forest potential functions for various levels of precipitation:

Hirota et al., Science 14 October 2011

I thought that looked fun to play with, so I built a little model that qualitatively captures the dynamics:

The tricky part was reconstructing the potential function without the data. It turned out to be easier to write the rate equation for forest cover change at medium precipitation (“change function” in the model), and then tilt it with an added term when precipitation is high or low. Then the potential function is reconstructed from its relationship to the derivative, dz/dt = f(z) = -dV/dz, where z is forest cover and V is the potential.

That yields the following potentials and vector fields (rates of change) at low, medium and high precipitation:

If you start this system at different levels of forest cover, for medium precipitation, you can see the three stable attractors at zero trees, savanna (20% tree cover) and forest (90% tree cover).

If you start with a stable forest, and a bit of noise, then gradually reduce precipitation, you can see that the forest response is not smooth.

The forest is stable until about year 8, then transitions abruptly to savanna. Finally, around year 14, the savanna disappears and is replaced by a treeless state. The forest doesn’t transition to savanna until the precipitation index reaches about .3, even though savanna becomes the more stable of the two states much sooner, at precipitation of about .55. And, while the savanna state doesn’t become entirely unstable at low precipitation, noise carries the system over the threshold to the lower-potential treeless state.

The net result is that thinking about such a system from a static, linear perspective will get you into trouble. And, if you live around such a system, subject to a changing climate, transitions could be abrupt and surprising (fire might be one tipping mechanism).

The model is in my library.

## Forest Cover Tipping Points

This is a model of forest stability and transitions, inspired by:

Global Resilience of Tropical Forest and Savanna to Critical Transitions

Marina Hirota, Milena Holmgren, Egbert H. Van Nes, Marten Scheffer

It has been suggested that tropical forest and savanna could represent alternative stable states, implying critical transitions at tipping points in response to altered climate or other drivers. So far, evidence for this idea has remained elusive, and integrated climate models assume smooth vegetation responses. We analyzed data on the distribution of tree cover in Africa, Australia, and South America to reveal strong evidence for the existence of three distinct attractors: forest, savanna, and a treeless state. Empirical reconstruction of the basins of attraction indicates that the resilience of the states varies in a universal way with precipitation. These results allow the identification of regions where forest or savanna may most easily tip into an alternative state, and they pave the way to a new generation of coupled climate models.

The paper is worth a read. It doesn’t present an explicit simulation model, but it does describe the concept nicely. I built the following toy model as a loose interpretation of the dynamics.

Some things to try:

Use a Synthesim override to replace Forest Cover with a ramp from 0 to 1 to see potentials and vector fields (rates of change), then vary the precipitation index to see how the stability of the forest, savanna and treeless states changes:

Start the system at different levels of forest cover (varying init forest cover), with default precipitation, to see the three stable attractors at zero trees, savanna (20% tree cover) and forest (90% tree cover):

Start with a stable forest, and a bit of noise (noise sd = .2 to .3), then gradually reduce precipitation (override the precipitation index with a ramp from 1 to 0) to see abrupt transitions in state:

forest savanna treeless 1f.mdl (requires an advanced version of Vensim, or the free Model Reader)

forest savanna treeless 1f.vpm (ditto; includes a sensitivity file for varying the initial forest cover)

## Stochastic Processes

This model replicates a number of the stochastic processes from Dixit & Pindyck’s Investment Under Uncertainty. It includes Brownian motion (Wiener process), geometric Brownian motion, mean-reverting and jump processes, plus forecast confidence bounds for some variations.

Units balance, but after updating this model I’ve decided that there may be a conceptual issue, related to the interpretation of units in parameters of the Brownian process variants. This arises due to the fact that the parameter sigma represents the standard deviation at unit time, and that some of the derivations gloss over units associated with substitutions of dz=epsilon*SQRT(dt). I don’t think these are of practical importance, but will revisit the question in the future. This is what happens when you let economists get hold of engineers’ math. 🙂

These structures would be handy if made into :MACRO:s for reuse.

stochastic processes 3.mdl (requires an advanced version of Vensim)

stochastic processes 3.vpm (published package; includes a sensitivity setup for varying NOISE SEED)

stochastic processes 3 PLE.mdl (Runs in PLE, omits only one equation of low importance)

## Vensim Model Documentation Tool

Ignacio Martinez (U Chicago/Argonne, Vensim distributor, and all around nice guy) has developed a nifty tool that exploits Vensim’s open text file format and .dll to make very thorough, browsable model documentation.

It’s incredibly simple to use. Just unzip the archive, fire up the .exe, and point it at a model (.mdl format; it’ll also read some information out of an accompanying published .vpm, if there is one, but that’s not needed):