Independence of models and errors


Roger Pielke’s blog has an interesting guest post by Ryan Meyer, reporting on a paper that questions the meaning of claims about the robustness of conclusions from multiple models. From the abstract:

Climate modelers often use agreement among multiple general circulation models (GCMs) as a source of confidence in the accuracy of model projections. However, the significance of model agreement depends on how independent the models are from one another. The climate science literature does not address this. GCMs are independent of, and interdependent on one another, in different ways and degrees. Addressing the issue of model independence is crucial in explaining why agreement between models should boost confidence that their results have basis in reality.

Later in the paper, they outline the philosophy as follows:

In a rough survey of the contents of six leading climate journals since 1990, we found 118 articles in which the authors relied on the concept of agreement between models to inspire confidence in their results. The implied logic seems intuitive: if multiple models agree on a projection, the result is more likely to be correct than if the result comes from only one model, or if many models disagree. … this logic only holds if the models under consideration are independent from one another. … using multiple models to analyze the same system is a ‘‘robustness’’ strategy. Every model has its own assumptions and simplifications that make it literally false in the sense that the modeler knows that his or her mathematics do not describe the world with strict accuracy. When multiple independent models agree, however, their shared conclusion is more likely to be true.

I think they’re barking up the right tree, but one important clarification is in order. We don’t actually care about the independence of models per se. In fact, if we had an ensemble of perfect models, they’d necessarily be identical. What we really want is for the models to be right. To the extent that we can’t be right, we’d at least like to have independent systematic errors, so that (a) there’s some chance that mistakes average out and (b) there’s an opportunity to diagnose the differences.

For example, consider three models of gravity, of the form F=G*m1*m2/r^b. We’d prefer an ensemble of models with b = {1.9,2.0,2.1} to one with b = {1,2,3}, even though some metrics of independence (such as the state space divergence cited in the paper) would indicate that the first ensemble is less independent than the second. This means that there’s a tradeoff: if b is a hidden parameter, it’s harder to discover problems with the narrow ensemble, but harder to get good answers out of the dispersed ensemble, because its members are more wrong.
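
A minimal numerical sketch of that tradeoff, assuming unit masses, G = 1, and a true exponent b = 2 (all values illustrative):

# Sketch of the ensemble tradeoff, with G = m1 = m2 = 1 and true b = 2.
# "spread" is ensemble disagreement (our chance to detect a problem);
# "error" is how far the ensemble mean is from the truth.

import numpy as np

def force(r, b):
    return 1.0 / r**b   # F = G*m1*m2/r^b with G = m1 = m2 = 1

b_true = 2.0
ensembles = {"narrow": [1.9, 2.0, 2.1], "dispersed": [1.0, 2.0, 3.0]}

for r in [1.1, 2.0, 10.0]:   # increasingly large perturbations from r = 1
    truth = force(r, b_true)
    for name, bs in ensembles.items():
        preds = np.array([force(r, b) for b in bs])
        spread = preds.max() - preds.min()
        error = abs(preds.mean() - truth)
        print(f"r={r:5.1f}  {name:9s}  spread={spread:7.4f}  error={error:7.4f}")

Near r = 1 both ensembles agree closely with the truth, because r^b barely depends on b there. As r grows, the dispersed ensemble’s disagreement balloons relative to the true force, making its hidden parameter easy to diagnose, while its ensemble mean is far less accurate than the narrow ensemble’s – the tradeoff in a nutshell.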

For climate models, ensembles provide some opportunity to discover systematic errors from numerical schemes, parameterization of poorly understood sub-grid-scale phenomena, and program bugs, to the extent that models rely on different codes and approaches. As in my gravity example, differences would be revealed more readily by large perturbations, but I’ve never seen extreme-conditions tests on GCMs (although I understand that they at least share a lot with models used to simulate other planets). I’d like to see more of that, plus an inventory of the major subsystems of GCMs and the extent to which they use different codes.

While GCMs are essentially the only source of regional predictions, which are a focus of the paper, it’s important to realize that GCMs are not the only evidence for the notion that climate sensitivity is nontrivial. For that, there are also simple energy balance models and paleoclimate data. That means that there are at least three lines of evidence, much more independent than GCM ensembles, backing up the idea that greenhouse gases matter.

It’s interesting that this critique comes up with reference to GCMs, because it’s actually not GCMs we should worry most about. For climate models, there are vague worries about systematic errors in cloud parameterization and other phenomena, but there’s no strong a priori reason, other than Murphy’s Law, to think that they are a problem. Economic models in the climate policy space, on the other hand, nearly all embody notions of economic equilibrium and foresight which we can be pretty certain are wrong, perhaps spectacularly so. That’s what we should be worrying about.

Green labeling is just a waypoint

Alan Atkisson wonders, “Can a Glass of Orange Juice in Sweden be ‘Climate Smart’?” He concludes, “Maybe consumer items like this could be labeled, ‘Relatively less climate-stupid.’” I agree.

For green labeling to actually work, there must be a “green information” system parallel to the money economy, and people must pay attention to it. That’s a booming business right now.

[Image: US $20 bill]

Optimistically assuming that all end users have the insight and altruism needed to make the correct environment/money tradeoff, labeling creates tremendous evolutionary pressure on the production system to evade its intent by substituting cheaper, not-so-green alternatives in hidden upstream locations. To paraphrase Groucho, greenness is the key to business success – if you can fake it, you’ve got it made. The evasion need not be so cynical; it simply requires incomplete information, for example sourcing products from places where measurement systems are incomplete. I rather doubt that we’ll ever have life cycle analysis for every product performed with the stringency now enforced by money auditing systems.

The optimistic assumptions above are probably misplaced. Altruism is great, but I hate to rely on it, as it’s not clear to me that it’s an ESS (evolutionarily stable strategy). But insight is probably the real constraint. Life cycle analysis is good stuff, but even if it were practical to pass many attributes through the supply chain, with firm-level attribution, the result is complex information about tradeoffs that’s better suited for engineers than for consumers. Add to that the challenges people already face, like making good decisions about saving for retirement and educating children, and I think it’s hard to do much more than muddle minds.

Just as marketers associate cars with love, green labels foster the paradoxical conclusion that some consumption benefits the environment. That may be true for a few goods, but for the most part, it’s not. We should be using green information to examine our broad patterns of consumption, more than to choose what to put in the shopping cart. That might mean non-consumptive tradeoffs, like having more leisure time and less stuff.

Green labeling is great in many cases today, where prices and other incentives are blatantly misaligned with public goods, but ultimately fixing the incentives will get us a lot farther than labeling. That means pricing resources we value upstream, so that value percolates through supply chains as a price signal. In my ideal world, the price tag itself would be a green label.
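
To sketch what I mean, here’s a stylized supply chain with a hypothetical $50/tCO2 price levied where emissions occur. Every stage, product, and number below is made up for illustration; each stage adds value and passes its costs, including the carbon charge, downstream:

# Stylized supply chain under an upstream carbon price (all data made up).
# Each stage buys the previous stage's output, adds value, and passes its
# costs along, so the emissions price percolates into the final price.

carbon_price = 50.0   # $/tCO2, hypothetical, levied where emissions occur
stages = [            # (name, value added $, direct emissions tCO2/unit)
    ("coal & power", 10.0, 0.50),
    ("materials",    20.0, 0.10),
    ("assembly",     30.0, 0.02),
    ("retail",       15.0, 0.00),
]

cost = 0.0
for name, value_added, emissions in stages:
    cost += value_added + carbon_price * emissions
    print(f"{name:12s} cumulative price = ${cost:6.2f}")

total_carbon = sum(e for _, _, e in stages)
print(f"embodied carbon = {total_carbon:.2f} t; carbon share of price = ${carbon_price * total_carbon:.2f}")

No auditor has to track embodied carbon through the chain; the price signal does the accounting automatically.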


Other bathtubs – capital

China is rapidly eliminating old coal generating capacity, according to Technology Review.

[Image: Draining Bathtub]

Coal still meets 70 percent of China’s energy needs, but the country claims to have shut down 60 gigawatts’ worth of inefficient coal-fired plants since 2005. Among them is the one shown above, which was demolished in Henan province last year. China is also poised to take the lead in deploying carbon capture and storage (CCS) technology on a large scale. The gasifiers that China uses to turn coal into chemicals and fuel emit a pure stream of carbon dioxide that is cheap to capture, providing “an excellent opportunity to move CCS forward globally,” says Sarah Forbes of the World Resources Institute in Washington, DC.

That’s laudable. However, the inflow of new coal capacity must be even greater. Here’s the latest on China’s coal output:

[Image: China coal output graph]

China Statistical Yearbook 2009 & 2009 main statistical data update

That’s just a hair short of 3 billion tons in 2009, with 8%/yr growth from ’07 to ’09, in spite of the recession. On a per capita basis, US output and consumption are still higher, but at those staggering growth rates, it won’t take China long to catch up.
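
A rough back-of-envelope check, using round 2009 numbers (China ~3 billion tons and ~1.33 billion people; US ~1 billion tons and ~307 million people; treat all of these as approximate):

# Rough catch-up arithmetic with approximate round numbers.

import math

china_pc = 3.0e9 / 1.33e9   # tons of coal per capita, China
us_pc = 1.0e9 / 3.07e8      # tons of coal per capita, US
years = math.log(us_pc / china_pc) / math.log(1.08)  # 8%/yr growth, US flat
print(f"China: {china_pc:.1f} t/cap, US: {us_pc:.1f} t/cap, parity in ~{years:.0f} yr")

At 8%/yr growth against a flat US, per capita parity arrives in roughly five years.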

A simple model of capital turnover involves two parallel bathtubs, a “coflow” in SD lingo:

[Image: capital turnover coflow stock-flow diagram]

Every time you build some capital, you also commit to the energy needed to run it (unless you don’t run it, in which case why build it?). If you get fancy, you can consider 3rd order vintaging and retrofits, as here:

[Image: third-order capital turnover with retrofits]

To get fancier still, see the structure in John Sterman’s thesis, which provides for limited retrofit potential (that Gremlin just isn’t going to be a Prius, no matter what you do to the carburetor).

The basic challenge is that, while it helps to retire old dirty capital quickly (increasing the outflow from the energy requirements bathtub), energy requirements will go up as long as the inflow of new requirements is larger, which is likely when capital itself is growing and the energy intensity of new capital is well above zero. In addition, when capital is growing rapidly, there just isn’t much old stuff around (proportionally) to throw away, because the age structure of capital will be biased toward new vintages.
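
Here’s a minimal numerical sketch of the first-order coflow above, with illustrative parameters only: a 30-year capital lifetime, investment growing 8%/yr (echoing the coal growth rate above), and new capital 20% less energy-intensive than the current stock average:

# First-order capital/energy coflow sketch. All parameters illustrative.

capital = 100.0                   # capital stock, arbitrary units
energy_req = 100.0                # committed energy requirements of that capital
lifetime = 30.0                   # years; first-order retirement
build = 2.0 * capital / lifetime  # gross investment, initially 2x retirement
intensity_new = 0.8               # energy intensity of new capital (stock avg = 1.0)

for year in range(2010, 2031):
    retire = capital / lifetime           # capital outflow
    avg_intensity = energy_req / capital  # coflow: requirements retire at stock-average intensity
    capital += build - retire
    energy_req += build * intensity_new - retire * avg_intensity
    build *= 1.08                         # investment keeps growing 8%/yr
    if year % 5 == 0:
        print(year, round(capital), round(energy_req), round(energy_req / capital, 3))

Even though every new vintage is cleaner than the stock average, committed energy requirements climb steadily, because the inflow of new requirements swamps retirement.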

Hat tip: Travis Franck

EPA gets the bathtub

Eli Rabett has been posting the comment/response section of the EPA endangerment finding. For the most part the comments are a quagmire of tinfoil-hat pseudoscience; I’m astonished that the EPA could find some real scientists who could stomach wading through and debunking it all – an important but thankless job.

Today’s installment tackles the atmospheric half life of CO2:

A common analogy used for CO2 concentrations is water in a bathtub. If the drain and the spigot are both large and perfectly balanced, then the time that any individual water molecule spends in the bathtub is short. But if a cup of water is added to the bathtub, the change in volume in the bathtub will persist even when all the water molecules originally from that cup have flowed out the drain. This is not a perfect analogy: in the case of CO2, there are several linked bathtubs, and the increased pressure of water in one bathtub from an extra cup will actually lead to a small increase in flow through the drain, so eventually the cup of water will be spread throughout the bathtubs leading to a small increase in each, but the point remains that the “residence time” of a molecule of water will be very different from the “adjustment time” of the bathtub as a whole.

Having tested a lot of low-order carbon cycle models, including I think all possible linear variants up to 3rd order, I agree with EPA – anyone who claims that the effective half life or time constant of CO2 uptake is 10 or 20 or even 50 years is bonkers.
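
The distinction is easy to reproduce in a toy model. Here’s a minimal three-box linear sketch (atmosphere, buffered ocean mixed layer, deep ocean); the stocks, gross fluxes, and buffer factor are illustrative round numbers, not a calibrated carbon cycle:

# Three-box linear carbon cycle sketch. Round-number assumptions, not a
# calibrated model: a, m, d are perturbations from equilibrium after a
# 100 GtC pulse to the atmosphere.

import math

A0, M0, D0 = 600.0, 900.0, 37000.0   # equilibrium stocks, GtC
k1, k2 = 90.0, 45.0                  # gross exchange fluxes, GtC/yr
R = 10.0                             # buffer factor: seawater resists added CO2

print(f"residence time of a molecule: ~{A0 / k1:.0f} yr")  # gross turnover

a, m, d = 100.0, 0.0, 0.0
t, dt = 0.0, 0.1
while a > 100.0 / math.e:   # e-folding of the airborne perturbation
    f_am = k1 * (a / A0 - R * m / M0)  # air-sea exchange, limited by buffering
    f_md = k2 * (m / M0 - d / D0)      # overturning carries carbon with the water
    a -= f_am * dt
    m += (f_am - f_md) * dt
    d += f_md * dt
    t += dt

print(f"adjustment time of the pulse: ~{t:.0f} yr")

With these numbers, the gross turnover of the atmosphere is well under a decade, but the airborne pulse takes on the order of 150 years to decay to 1/e, and a residual persists far longer still. Residence time and adjustment time differ by more than an order of magnitude.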

States' role in climate policy

Jack Dirmann passed along an interesting paper arguing for a bigger role for states in setting federal climate policy.

This article explains why states and localities need to be full partners in a national climate change effort based on federal legislation or the existing Clean Air Act. A large share of reductions with the lowest cost and the greatest co-benefits (e.g., job creation, technology development, reduction of other pollutants) are in areas that a federal cap-and-trade program or other purely federal measures will not easily reach. These are also areas where the states have traditionally exercised their powers – including land use, building construction, transportation, and recycling. The economic recovery and expansion will require direct state and local management of climate and energy actions to reach full potential and efficiency.

This article also describes in detail a proposed state climate action planning process that would help make the states full partners. This state planning process – based on a proven template from actions taken by many states – provides an opportunity to achieve cheaper, faster, and greater emissions reductions than federal legislation or regulation alone would achieve. It would also realize macroeconomic benefits and non-economic co-benefits, and would mean that the national program is more economically and environmentally sustainable.


Climate Science, Climate Policy and Montana

Last night I gave a climate talk at the Museum of the Rockies here in Bozeman, organized by Cherilyn DeVries and sponsored by United Methodist. It was a lot of fun – we had a terrific discussion at the end, and the museum’s monster projector was addictive for running C-LEARN live. Thanks to everyone who helped to make it happen. My next challenge is to do this for young kids.

[Image: Montana climate schematic]

My slides are here as a PowerPoint show: Climate Science, Climate Policy & Montana (better because it includes some animated builds) or PDF: Climate Science, Climate Policy & Montana (PDF)

Some related resources:

Climate Interactive & the online C-LEARN model

Montana Climate Change Advisory Committee

Montana Climate Office

Montana emissions inventory & forecast visualization


Would you like fries with that?

Education is a mess, and well-motivated policy changes are making it worse.

I was just reading this and this, and the juices started flowing, so my wife and I brainstormed this picture:

[Image: education causal loop diagram]


Yep, it’s spaghetti, like a lot of causal brainstorming efforts. The underlying problem space is very messy and hard to articulate quickly, but I think the essence is simple. Educational outcomes are substandard, creating pressure to improve. In at least some areas, outcomes slipped a lot because the response to pressure was to erode learning goals rather than to improve (blue loop through the green goal). One benefit of No Child Left Behind testing is to offset that loop, by making actual performance salient and restoring the pressure to improve. Other intuitive responses (red loops) also have some benefit: increasing school hours provides more time for learning; standardization yields economies of scale in materials and may improve teaching of low-skill teachers; core curriculum focus aligns learning with measured goals.

The problem is that these measures have devastating side effects, especially in the long run. Measurement obsession eats up time for reflection and learning. Core curriculum focus cuts out art and exercise, so that lower student engagement and health diminish learning productivity. Low engagement means more sit-down-and-shut-up, which eats up teacher time. Increased hours lead to burnout of both students and teachers, and long hours and standardization make teaching unattractive; degrading the attractiveness of teaching makes it hard to attract quality teachers. Students aren’t mindless blank slates; they know when they’re being fed rubbish, and check out. When a bad situation persists, an anti-intellectual culture of resistance to education evolves.

The nest of reinforcing feedbacks within education meshes with one in broader society. Poor education diminishes future educational opportunity, and thus the money and knowledge available to provide future schooling. Economic distress drives crime, and prison budgets eat up resources that could otherwise go to schools. Dysfunction reinforces the perception that government is incompetent, leading to reduced willingness to fund schools, ensuring future dysfunction. This is augmented by flight of the rich and smart to private schools.

I’m far from having all the answers here, but it seems that standard SD advice on the counter-intuitive behavior of social systems applies. First, any single policy will fail, because it gets defeated by other feedbacks in the system. Perhaps that’s why technology-led efforts haven’t lived up to expectations; high tech by itself doesn’t help if teachers have no time to reflect on and refine its use. Therefore intervention has to be multifaceted and targeted to activate key loops. Second, things get worse before they get better. Making progress requires more resources, or a redirection of resources away from things that produce the short-term measured benefits that people are watching.

I think there are reasons to be optimistic. All of the reinforcing feedback loops that currently act as vicious cycles can run the other way, if we can just get over the hump of the various delays and irreversibilities to start the process. There’s enormous slack in the system, in a variety of forms: time wasted on discipline and memorization, burned-out teachers who could be re-energized, and students with unmet thirst for knowledge.

The key is, how to get started. I suspect that the conservative approach of privatization half-works: it successfully exploits reinforcing feedback to provide high quality for those who opt out of the public system. However, I don’t want to live in a two class society, and there’s evidence that high inequality slows economic growth. Instead, my half-baked personal prescription (which we pursue as homeschooling parents) is to make schools more open, connecting students to real-world trades and research. Forget about standardized pathways through the curriculum, because children develop at different rates and have varied interests. Replace quantity of hours with quality, freeing teachers’ time for process improvement and guidance of self-directed learning. Suck it up, and spend the dough to hire better teachers. Recover some of that money, and avoid lengthy review, by using schools year ’round. I’m not sure how realistic all of this is as long as schools function as day care, so maybe we need some reform of work and parental attitudes to go along.

[Update: There are of course many good efforts that can be emulated, by people who’ve thought about this more deeply than I. Pegasus describes some here. Two of note are the Waters Foundation and Creative Learning Exchange. Reorganizing education around systems is a great way to improve productivity through learner-directed learning, make learning exciting and relevant to the real world, and convey skills that are crucial for society to confront its biggest problems.]

The real Kerry-Lieberman APA stands up, with two big surprises

The official discussion draft of the Kerry-Lieberman American Power Act is out. My heart sank when I saw the page count – 987. I won’t be able to review this in any detail soon. Based on a quick look, I see two potentially huge items: the “hard price collar” has a soft ceiling, and transport fuels are in the market, despite claims to the contrary.

Hard is soft

First, the summary states that there’s a “hard price collar which binds carbon prices and creates a predictable system for carbon prices to rise at a fixed rate over inflation.” That’s not quite right. There is indeed a floor, set by an auction reserve price in Section 790. However, I can’t find a ceiling as such. Instead, Section 726 establishes a “Cost Containment Reserve” that is somewhat like the Waxman-Markey strategic reserve, without the roach motel moving average price (offsets check in, but they don’t check out). Rather, reserve allowances are available at the escalating ceiling price ($25 + 5%/yr). There’s a much larger initial reserve (4 gigatons) and, I think, a more generous topping off (1.5% of allowances each year initially; 5% after 2030). However, there appears to be no mechanism to provide allowances beyond the set-aside, which means that the economy-wide target is in fact binding.

If demand eats up the reserve allowance buffer, prices will have to rise above the ceiling in order to clear the market. So the market actually faces a hard target, with the reserve/ceiling mechanism merely creating a temporary respite from price spikes: the price ceiling is soft whenever allowance demand at the ceiling price is sufficient to exhaust the buffer. The mental model behind this design must be that estimated future emission prices are about right, so that one need only protect against short-term volatility. But if those estimates are systematically wrong, and the marginal cost of mitigation persistently exceeds the ceiling, the reserve provides no protection against price escalation.
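
To see the implications, here’s a toy annual simulation of the reserve logic. Everything here is made up except the rough shape of the mechanism: a linear marginal abatement cost, growing business-as-usual emissions, a declining cap, and reserve quantities loosely patterned on the bill’s 4 Gt and 1.5% figures:

# Toy annual simulation of the reserve logic. Made-up assumptions: linear
# marginal abatement cost of $30/t per Gt abated, BAU emissions growing
# 2%/yr from 6 Gt, cap declining 2%/yr from 5.5 Gt.

def demand(price, bau):
    # allowance demand: BAU emissions minus price-induced abatement
    return max(bau - price / 30.0, 0.0)

reserve = 4.0     # cost containment reserve, Gt (per the bill)
ceiling = 25.0    # $/t trigger price, escalating 5%/yr (per the bill)
cap = 5.5         # annual allowance budget, Gt (made up)

for year in range(2013, 2031):
    bau = 6.0 * 1.02 ** (year - 2013)
    reserve += 0.015 * cap                  # annual reserve top-off
    shortfall = demand(ceiling, bau) - cap  # excess demand at the ceiling price
    if shortfall <= 0:
        price = 30.0 * (bau - cap)          # market clears below the ceiling
    elif shortfall <= reserve:
        reserve -= shortfall                # reserve sales hold the ceiling
        price = ceiling
    else:
        price = 30.0 * (bau - cap - reserve)  # buffer gone: "ceiling" breached
        reserve = 0.0
    print(year, f"price=${price:6.2f}", f"reserve={reserve:4.2f} Gt")
    cap *= 0.98
    ceiling *= 1.05

The pattern: early years clear below the ceiling, reserve sales then pin the price at the ceiling for a while, and once the buffer is exhausted the clearing price jumps above the “ceiling” and stays there.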

Transport is in the market

The short transport summary asserts:

Since a robust domestic refining industry is critical to our national security, we needed to make a change. We took fuel providers out of the market. Instead of every refinery participating in the market for allowances, we made sure the price of carbon was constant across the industry. That means all fuel providers see the same price of carbon in a given quarter. The system is simple. First, the EPA and EIA Administrators look to historic product sales to estimate how many allowances will be necessary to cover emissions for the quarter, and they set that number of allowances aside at the market price. Then refineries and fuel providers sell fuel, competing as they have always done to offer the best product at the best price. Finally, at the end of the quarter, the refiners and fuel providers purchase the allowances that have been set aside for them. If there are too many or too few allowances set aside, that difference is made up by adjusting the projection for the following quarter. These allowances cannot be banked or traded, and can only be used for compliance purposes.

In fact, transport is in the market, just via a different mechanism. Instead of buying allowances in real time, with banking and borrowing, refiners are price takers and get allowances via a set-aside mechanism. Since there’s nothing about the mechanism that creates allowances, the market still has to clear; the set-aside simply introduces a one-quarter delay into the market clearing process. I don’t see how this additional complication is any better for refiners, and introducing a delay into the negative feedback loops that clear the market could be destabilizing. This is so enticing, I’ll have to simulate it.
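
As a first, very rough pass at that simulation, here’s a toy quarterly market. The demand curves, supply, and starting values are all made up; refiners sell fuel at last quarter’s price, and the set-aside projection is corrected by last quarter’s error, per the mechanism described in the summary:

# Toy quarterly simulation of the set-aside mechanism. All parameters are
# hypothetical: transport demand 45 - 1.5*price, other covered demand
# 80 - 1.2*price (Mt/quarter), fixed quarterly allowance supply.

price = 9.0        # $/t carbon price entering the first quarter
setaside = 30.0    # initial transport set-aside, Mt/quarter
supply = 100.0     # total allowances issued per quarter, Mt

for q in range(12):
    t_emis = 45.0 - 1.5 * price     # refiners take last quarter's price
    rest = supply - setaside        # allowances left for the open market
    price = (80.0 - rest) / 1.2     # open market clears: 80 - 1.2*price = rest
    error = t_emis - setaside       # this quarter's projection error
    setaside += error               # EPA/EIA adjust next quarter's set-aside
    print(f"q{q:2d}: price=${price:5.2f}  set-aside={setaside:5.2f}  error={error:+5.2f}")

With these slopes, the deviation from equilibrium grows about 25% per two-quarter cycle – a growing oscillation. Whether the real system would be damped or divergent depends on the relative price responsiveness of transport and the other covered sectors, which is exactly what a proper simulation should check.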

My analysis is a bit hasty here, so I could be wrong, but if I’m right, these two issues have huge implications for the performance of the bill.