Real estate appraisal – learning the wrong lesson from failure

I just had my house appraised for a refinance. The appraisal came in at least 20% below my worst-case expectation of market value. The basis of the judgment was comps, about which the best one could say is that they’re in the same county.

I could be wrong. But I think it more likely that the appraisal was rubbish. Why did this happen? I think it’s because real estate appraisal uses unscientific methods that would not pass muster in any decent journal, enabling selection bias and fudge factors to dominate any given appraisal.

When the real estate bubble was on the way up, the fudge factors all provided biased confirmation of unrealistically high prices. In the bust, appraisers got burned. They didn’t learn that their methods were flawed; rather they concluded that the fudge factors should point down, rather than up.

Here’s how appraisals work:

A lender commissions an appraisal. Often the appraiser knows the loan amount or prospective sale price (experimenters used to double-blind trials should be cringing in horror).

The appraiser eyeballs the subject property, and then looks for comparable sales of similar properties within a certain neighborhood in space and time (the “market window”). There are typically 4 to 6 of these, because that’s all that will fit on the standard appraisal form.

The appraiser then adjusts each comp for structure and lot size differences, condition, and other amenities. The scale of adjustments is based on nothing more than gut feeling. There are generally no adjustments for location or timing of sales, because that’s supposed to be handled by the neighborhood and market window criteria.

There’s enormous opportunity for bias, both in the selection of the comp sample and in the adjustments. By cherry-picking the comps and fiddling with adjustments, you can get almost any answer you want. There’s also substantial variance in the answer, but a single point estimate is all that’s ever reported.

Here’s how they should work:

The lender commissions an appraisal. The appraiser never knows the price or loan amount (though in practice this may be difficult to enforce).

The appraiser fires up a database that selects lots of comps from a wide neighborhood in time and space. Software automatically corrects for timing and location by computing spatial and temporal gradients. It also automatically computes adjustments for lot size, sq ft, bathrooms, etc. by hedonic regression against attributes coded in the database. It connects to utility and city information to determine operating costs – energy and taxes – to adjust for those.

The appraiser reviews the comps, but only to weed out obvious coding errors or properties that are obviously non-comparable for reasons that can’t be adjusted automatically, and visits the property to be sure it’s still there.

The answer that pops out has confidence bounds and other measures of statistical quality attached. As a reality check, the process is repeated for the rental market, to establish whether rent/price ratios indicate an asset bubble.

If those tests look OK, and the answer passes the sniff test, the appraiser reports a plausible range of values. Only if the process fails to converge does some additional judgment come into play.
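
To make the statistical part concrete, here's a minimal sketch of the hedonic-regression step, assuming a hypothetical comps table exported from an MLS or assessor database; the file name, column names, and subject-property attributes are made up for the example:

```python
# A minimal sketch of the statistical appraisal described above, not a
# production system. It assumes a hypothetical comps table ("comps.csv") with
# made-up column names; a real implementation would also handle the spatial
# and temporal gradients more carefully.
import numpy as np
import pandas as pd
import statsmodels.api as sm

comps = pd.read_csv("comps.csv")   # hypothetical export from an MLS/assessor database

# Hedonic regression: log price vs. attributes, with sale year and coordinates
# standing in for the temporal and spatial adjustments.
y = np.log(comps["sale_price"])
X = sm.add_constant(comps[["sqft", "lot_acres", "bathrooms",
                           "sale_year", "latitude", "longitude"]])
fit = sm.OLS(y, X).fit()

# Predict the subject property and report a range, not a point estimate.
subject = pd.DataFrame([{"const": 1.0, "sqft": 2400, "lot_acres": 0.5,
                         "bathrooms": 2.5, "sale_year": 2012,
                         "latitude": 45.68, "longitude": -111.04}])
pred = fit.get_prediction(subject).summary_frame(alpha=0.2)   # 80% interval
print(np.exp(pred[["mean", "obs_ci_lower", "obs_ci_upper"]]))
```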

There are several patents on such a process, but no widespread implementation. Most of the time, it would probably be cheaper to do things this way, because less appraiser time would be needed for ultimately futile judgment calls. Perhaps it would exceed the skillset of the existing population of appraisers though.

It’s bizarre that lenders don’t expect something better from the appraisal industry. They lose money from current practices on both ends of market cycles. In booms, they (later) suffer excess defaults. In busts, they unnecessarily forgo viable business.

To be fair, fully automatic mass appraisal like Zillow and Trulia doesn’t do very well in my area. I think that’s mostly lack of data access, because they seem to cover only a small subset of the market. Perhaps some human intervention is still needed, but that human intervention would be a lot more effective if it were informed by even the slightest whiff of statistical reasoning and leveraged with some data and computing power.

Update: on appeal, the appraiser raised our valuation 27.5%. Case closed.

A small victory for scientific gobbledygook, arithmetic and Nate Silver

Nate Silver of 538 deserves praise for calling the election in all 50 states, using a fairly simple statistical model and lots of due diligence on the polling data. When the dust settles, I’ll be interested to see a more detailed objective evaluation of the forecast (e.g., some measure of skill, like likelihoods).

Many have noted that his approach stands in stark contrast to big-ego punditry.

Another impressive model-based forecasting performance occurred just days before the election, with successful prediction of Hurricane Sandy’s turn to landfall on the East Coast, almost a week in advance.

On October 22, you blogged that there was a possibility it could hit the East Coast. How did you know that?

There are a few rather reliable global models. They’re models that run all the time, all year long, so they don’t focus on any one storm. They run for the entire globe, not just for North America. There are two types of runs these models can be configured to do. One is called a deterministic run and that’s where you get one forecast scenario. Then the other mode, and I think this is much more useful, especially at longer ranges where things become much more uncertain, is ensemble—where 20 or 40 or 50 runs can be done. They are not run at as high of a resolution as the deterministic run, otherwise it would take forever, but it’s still incredibly helpful to look at 20 runs.

Because you have variation? Do the ensemble runs include different winds, currents, and temperatures?

You can tweak all sorts of things to initialize the various ensemble members: the initial conditions, the inner-workings of the model itself, etc. The idea is to account for observational error, model error, and other sources of uncertainty. So you come up with 20-plus different ways to initialize the model and then let it run out in time. And then, given the very realistic spread of options, 15 of those ensemble members all recurve the storm back to the west when it reaches the East coast, and only five of them take it northeast. That certainly has some information content. And then, one run after the next, you can watch those. If all of the ensemble members start taking the same track, it doesn’t necessarily make them right, but it does mean it’s more likely to be right. You have much more confidence forecasting a track if the model guidance is in good agreement. If it’s a 50/50 split, that’s a tough call.

– Outside
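
For the curious, here's a toy illustration of the ensemble idea, using the Lorenz-63 system as a stand-in for a real weather model; the perturbation size, member count, and "regime" test are all invented for the example:

```python
# A toy ensemble: integrate many slightly-perturbed copies of the Lorenz-63
# system and count how many end up in each regime, by analogy with counting
# how many hurricane tracks recurve toward the coast vs. head out to sea.
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return state + dt * np.array([sigma * (y - x),
                                  x * (rho - z) - y,
                                  x * y - beta * z])

def run_member(x0, steps=2000):
    state = np.array(x0, dtype=float)
    for _ in range(steps):
        state = lorenz_step(state)
    return state

rng = np.random.default_rng(0)
base = np.array([1.0, 1.0, 1.0])
members = [run_member(base + 0.01 * rng.standard_normal(3)) for _ in range(40)]

# Count how many members end in each "regime" (sign of x).
west = sum(1 for m in members if m[0] < 0)
print(f"{west} of {len(members)} members end in the 'west' regime")
```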


The Capen Quiz at the System Dynamics Conference

I ran my updated Capen quiz at the beginning of my Vensim mini-course on optimization and uncertainty at the System Dynamics conference. The results were pretty typical – people expressed confidence bounds that were too narrow compared to their actual knowledge of the questions. Thus their effective confidence was at the 40% level rather than the 80% level desired. Here’s the distribution of actual scores from about 30 people, compared to a Binomial(10, 0.8) distribution:

(I’m going from memory here on the actual distribution, because I forgot to grab the flipchart of results. Did anyone take a picture? I won’t trouble you with my confidence bounds on the confidence bounds.)
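
For reference, here's a quick sketch of the arithmetic behind the comparison: the score distribution you'd expect from well-calibrated 80% intervals on a 10-question quiz, and the most likely score if effective coverage is only about 40%:

```python
# Expected score distribution for calibrated 80% intervals vs. the most
# likely score at roughly 40% effective coverage.
from scipy.stats import binom

n = 10
print("Expected distribution if calibrated at 80%:")
for k in range(n + 1):
    print(f"  {k:2d} right: {binom.pmf(k, n, 0.8):.3f}")

mode_40 = max(range(n + 1), key=lambda k: binom.pmf(k, n, 0.4))
print("Most likely score at 40% effective coverage:", mode_40)   # -> 4
```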

My take on this is that it’s simply very hard to be well-calibrated intuitively, unless you dedicate time for explicit contemplation of uncertainty. But it is a learnable skill – my kids, who had taken the original Capen quiz, managed to score 7 out of 10.

Even if you can get calibrated on a set of independent questions, real-world problems where dimensions covary are really tough to handle intuitively. This is yet another example of why you need a model.

Calibrate your confidence bounds: an updated Capen Quiz

Forecasters are notoriously overconfident. This applies to nearly everyone who predicts anything, not just stock analysts. A few fields, like meteorology, have gotten a handle on the uncertainty in their forecasts, but this remains the exception rather than the rule.

Having no good quantitative idea of uncertainty, there is an almost universal tendency for people to understate it. Thus, they overestimate the precision of their own knowledge and contribute to decisions that later become subject to unwelcome surprises.

A solution to this problem involves some better understanding of how to treat uncertainties and a realization that our desire for preciseness in such an unpredictable world may be leading us astray.

E.C. Capen illustrated the problem in 1976 with a quiz that asks takers to state 90% confidence intervals for a variety of things – the length of the Golden Gate bridge, the number of cars in California, etc. A winning score is 9 out of 10 right. 10 out of 10 indicates that the taker was underconfident, choosing ranges that are too wide.

Ventana colleague Bill Arthur has been giving the quiz to clients for years. In fact, it turns out that the vast majority of takers are overconfident in their knowledge – they choose ranges that are too narrow, and get only three or four questions right. CEOs are the worst – if you score zero out of 10, you’re c-suite material.

My kids and I took the test last year. Using what we learned, we expanded the variance on our guesses of the weight of a giant pumpkin at the local co-op – and as a result, brought the monster home.

Now that I’ve taken the test a few times, it spoils the fun, so last time I was in a room for the event, I doodled an updated quiz. Here’s your chance to calibrate your confidence intervals:


For each question, specify a range (minimum and maximum value) within which you are 80% certain that the true answer lies. In other words, in an ideal set of responses, 8 out of 10 answers will contain the truth within your range.

Example*:

The question is, “what was the winning time in the first Tour de France bicycle race, in 1903?”

Your answer is, “between 1 hour and 1 day.”

Your answer is wrong, because the truth (94 hours, 33 minutes, 14 seconds) does not lie within your range.

Note that it doesn’t help to know a lot about the subject matter – precise knowledge merely requires you to narrow your intervals in order to be correct 80% of the time.
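
If you want to score yourself afterward, a tiny helper like the one below works; the function and the way answers are recorded are just a sketch, assuming each answer is written down as a (low, high) pair:

```python
# A small scoring helper for the quiz; names and data format are hypothetical.
def score_quiz(answers, truths):
    """Return the number of (low, high) ranges that contain the true value."""
    hits = sum(1 for (lo, hi), t in zip(answers, truths) if lo <= t <= hi)
    print(f"{hits} of {len(truths)} ranges contain the truth "
          f"(aim for about {0.8 * len(truths):.0f} with 80% intervals)")
    return hits

# Example with the Tour de France question above (truth: 94h 33m 14s):
score_quiz([(1.0, 24.0)], [94.55])   # hours; the range misses, so the score is 0
```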

Now the questions:

  1. What is the wingspan of an Airbus A380-800 superjumbo jet?
  2. What is the mean distance from the earth to the moon?
  3. In what year did the Russians launch Sputnik?
  4. In what year did Alaric lead the Visigoths in the Sack of Rome?
  5. How many career home runs did baseball giant Babe Ruth hit?
  6. How many iPhones did Apple sell in FY 2007, its year of introduction?
  7. How many transistors were on a 1993 Intel Pentium CPU chip?
  8. How many sheep were in New Zealand on 30 June 2006?
  9. What is the USGA-regulated minimum diameter of a golf ball?
  10. How tall is Victoria Falls on the Zambezi River?

Be sure to write down your answers (otherwise it’s too easy to rationalize ex post). No googling!

Answers at the end of next week.

*Update: edited slightly for greater clarity.

EIA projections – peak oil or snake oil?

Econbrowser has a nice post from Steven Kopits, documenting big changes in EIA oil forecasts. This graphic summarizes what’s happened:

[Figure: EIA world oil supply forecasts from successive IEO editions (Kopits)]
Click through for the original article.

As recently as 2007, the EIA saw a rosy future of oil supplies increasing with demand. It predicted oil consumption would rise by 15 mbpd to 2020, an ample amount to cover most eventualities. By 2030, the oil supply would reach nearly 118 mbpd, or 23 mbpd more than in 2006. But over time, this optimism has faded, with each succeeding year forecast lower than the year before. For 2030, the oil supply forecast has declined by 14 mbpd in only the last three years. This drop is as much as the combined output of Saudi Arabia and China.

In its forecast, the EIA, normally the cheerleader for production growth, has become amongst the most pessimistic forecasters around. For example, its forecasts to 2020 are 2-3 mbpd lower than that of traditionally dour Total, the French oil major. And they are below our own forecasts at Douglas-Westwood through 2020. As we are normally considered to be in the peak oil camp, the EIA’s forecast is nothing short of remarkable, and grim.

Is it right? In the last decade or so, the EIA’s forecast has inevitably proved too rosy by a margin. While SEC-approved prospectuses still routinely cite the EIA, those who deal with oil forecasts on a daily basis have come to discount the EIA as simply unreliable and inappropriate as a basis for investments or decision-making. But the EIA appears to have drawn a line in the sand with its new IEO and placed its fortunes firmly with the peak oil crowd. At least to 2020.

Since production is still rising, I think you’d have to call this “inflection point oil,” but as a commenter points out, it does imply peak conventional oil:

It’s also worth note that most of the liquids production increase from now to 2020 is projected to be unconventional in the IEO. Most of this is biofuels and oil sands. They REALLY ARE projecting flat oil production.

Since I’d looked at earlier AEO projections in the past, I wondered what early IEO projections looked like. Unfortunately I don’t have time to replicate the chart above and overlay the earlier projections, but here’s the 1995 projection:

Oil - IEO 1995

The 1995 projections put 2010 oil consumption at 87 to 95 million barrels per day. That’s a bit high, but not terribly inconsistent with reality and the new predictions (especially if the financial bubble hadn’t burst). Consumption growth is 1.5%/year.

And here’s 2002:

Oil - IEO 2002

In the 2002 projection, consumption is at 96 million barrels per day in 2010 and 119 million barrels per day in 2020 (waaay above reality and the 2007-2010 projections), a 2.2%/year growth rate.
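
As a quick check on that growth rate, using the numbers quoted above:

```python
# Quick arithmetic check on the implied growth rate from the 2002 IEO numbers.
c_2010, c_2020 = 96.0, 119.0                      # million barrels per day
cagr = (c_2020 / c_2010) ** (1 / 10) - 1
print(f"Implied growth, 2010-2020: {cagr:.1%} per year")   # ~2.2%/year
```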

I haven’t looked at all the interim versions, but somewhere along the way a lot of optimism crept in (and recently, crept out). In 2002 the IEO oil trajectory was generated by a model called WEPS, so I downloaded WEPS2002 to take a look. Unfortunately, it’s a typical open-loop spreadsheet horror show. My enthusiasm for a detailed audit is low, but it looks like oil demand is purely a function of GDP extrapolation and GDP-energy relationships, with no hint of supply-side dynamics (not even prices, unless they emerge from other models in a sneakernet portfolio approach). There’s no evidence of resources, not even synchronized drilling. No wonder users came to “discount the EIA as simply unreliable and inappropriate as a basis for investments or decision-making.”

Newer projections come from a new version, WEPS+. Hopefully it’s more internally consistent than the 2002 spreadsheet, and it does capture stock/flow dynamics and even includes resources. EIA appears to be getting better. But it appears that there’s still a fundamental problem with the paradigm: too much detail. There just isn’t any point in producing projections for dozens of countries, sectors and commodities two decades out, when uncertainty about basic dynamics renders the detail meaningless. It would be far better to work with simple models, capable of exploring the implications of structural uncertainty, in particular relaxing assumptions of equilibrium and idealized behavior.

Update: Michael Levi at the CFR blog points out that much of the difference in recent forecasts can be attributed to changes in GDP projections. Perhaps so. But I think this reinforces my point about detail, uncertainty, and transparency. If the model structure is basically consumption = f(GDP, price, elasticity) and those inputs have high variance, what’s the point of all that detail? It seems to me that the detail merely obscures the fundamentals of what’s going on, which is why there’s no simple discussion of reasons for the change in forecast.
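
To illustrate, here's a toy version of that consumption function, with price held constant and every parameter value invented, showing how uncertainty in the GDP input alone swamps any sectoral detail:

```python
# Constant-elasticity consumption = base * (GDP ratio)^elasticity, with price
# held constant. Monte Carlo over an uncertain GDP growth rate; all numbers
# are illustrative guesses.
import numpy as np

rng = np.random.default_rng(1)
horizon = 20                                     # years
gdp_growth = rng.normal(0.03, 0.01, size=10000)  # uncertain GDP growth rate
income_elasticity = 0.5
base_consumption = 85.0                          # mbpd, order-of-magnitude guess

gdp_ratio = (1.0 + gdp_growth) ** horizon
consumption = base_consumption * gdp_ratio ** income_elasticity

lo, hi = np.percentile(consumption, [5, 95])
print(f"90% range for consumption in {horizon} years: {lo:.0f}-{hi:.0f} mbpd")
```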

The model that ate Europe

arXiv covers modeling on an epic scale in Europe’s Plan to Simulate the Entire Earth: a billion dollar plan to build a huge infrastructure for global multiagent models. The core is a massive exaflop “Living Earth Simulator” – essentially the socioeconomic version of the Earth Simulator.

FuturICT

I admire the audacity of this proposal, and there are many good ideas captured in one place:

  • The goal is to take on emergent phenomena like financial crises (getting away from the paradigm of incremental optimization of stable systems).
  • It embraces uncertainty and robustness through scenario analysis and Monte Carlo simulation.
  • It mixes modeling with data mining and visualization.
  • The general emphasis is on networks and multiagent simulations.

I have no doubt that there might be many interesting spinoffs from such a project. However, I suspect that the core goal of creating a realistic global model will be an epic failure, for three reasons.

The 2009 World Energy Outlook

Following up on Carlos Ferreira’s comment, I looked up the new IEA WEO, unveiled today.  A few excerpts from the executive summary:

  • The financial crisis has cast a shadow over whether all the energy investment needed to meet growing energy needs can be mobilised.
  • Continuing on today’s energy path, without any change in government policy, would mean rapidly increasing dependence on fossil fuels, with alarming consequences for climate change and energy security.
  • Non-OECD countries account for all of the projected growth in energy-related CO2 emissions to 2030.
  • The reductions in energy-related CO2 emissions required in the 450 Scenario (relative to the Reference Scenario) by 2020 — just a decade away — are formidable, but the financial crisis offers what may be a unique opportunity to take the necessary steps as the political mood shifts.
  • With a new international climate policy agreement, a comprehensive and rapid transformation in the way we produce, transport and use energy — a veritable low-carbon revolution — could put the world onto this 450-ppm trajectory.
  • Energy efficiency offers the biggest scope for cutting emissions
  • The 450 Scenario entails $10.5 trillion more investment in energy infrastructure and energy-related capital stock globally than in the Reference Scenario through to the end of the projection period.
  • The cost of the additional investments needed to put the world onto a 450-ppm path is at least partly offset by economic, health and energy-security benefits.
  • In the 450 Scenario, world primary gas demand grows by 17% between 2007 and 2030, but is 17% lower in 2030 compared with the Reference Scenario.
  • The world’s remaining resources of natural gas are easily large enough to cover any conceivable rate of increase in demand through to 2030 and well beyond, though the cost of developing new resources is set to rise over the long term.
  • A glut of gas is looming

This is pretty striking language, especially if you recall the much more business-as-usual tone of WEOs in the 90s.

Marginal Damage has (or will have) more.

Unprincipled Forecast Evaluation

I hadn’t noticed until I heard it here, but Armstrong & Green are back at it, with various claims that climate forecasts are worthless. In the Financial Post, they criticize the MIT Joint Program model,

… No more than 30% of forecasting principles were properly applied by the MIT modellers and 49 principles were violated. For an important problem such as this, we do not think it is defensible to violate a single principle.

As I wrote in some detail here, the Forecasting Principles are a useful seat-of-the-pants guide to good practices, but there’s no evidence that following them all is necessary or sufficient for a good outcome. Some are likely to be counterproductive in many situations, and key elements of good modeling practice are missing (for example, balancing units of measure).

It’s not clear to me that A&G really understand models and modeling. They seem to view everything through the lens of purely statistical methods like linear regression. Green recently wrote,

Another important principle is that the forecasting method should provide a realistic representation of the situation (Principle 7.2). An interesting statement in the MIT report that implies (as one would expect given the state of knowledge and omitted relationships) that the modelers have no idea to what extent their models provide a realistic representation of reality is as follows:

‘Changes in global surface average temperature result from a combination of emissions and climate parameters, and therefore two runs that look similar in terms of temperature may be very different in detail.’ (MIT Report p. 28)

While the modelers have sufficient latitude in their parameters to crudely reproduce a brief period of climate history, there is no reason to believe the models can provide useful forecasts.

What the MIT authors are saying, in essence, is that

T = f(E,P)

and that it is possible to achieve the same future temperature T with different combinations of emissions E and parameters P. Green seems to be taking a leap, to assume that historic T does not provide much constraint on P. First, that’s not necessarily true, given that historic E cannot be chosen freely. It could still be the case that the structure of f(E,P) means that historic T provides a weak constraint on P given E. But if that’s true (as it basically is), the problem is self-diagnosing: estimates of P will have broad confidence bounds, as will forecasts of T. Green completely ignores the MIT authors’ explicit characterization of this uncertainty. He also ignores the fact that the output of the model is not just T, and that we have priors for many elements of P (from more granular models or experiments, for example). Thus we have additional lines of evidence with which to constrain forecasts. Green also neglects to consider the implications of uncertainties in P that are jointly distributed in an offsetting manner (as is likely for climate sensitivity, ocean circulation, and aerosol forcing).
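
To see how offsetting parameter uncertainties play out, here's a toy zero-dimensional energy balance model (emphatically not the MIT IGSM, and with made-up numbers). Two parameter sets, one with high sensitivity masked by strong aerosol forcing and one with low sensitivity and little masking, match essentially the same history but diverge sharply once aerosols are removed:

```python
# Toy energy balance model C*dT/dt = F - lambda*T, used only to illustrate
# the identifiability point; every number below is invented for the example.
import numpy as np

years = np.arange(1900, 2101)
f_ghg = 0.02 * (years - 1900)              # assumed GHG forcing ramp, W/m^2

def simulate(lam, aerosol_frac, heat_cap=8.0):
    """Euler-integrate C*dT/dt = F_ghg + F_aer - lam*T with 1-year steps."""
    # Aerosols assumed to offset a fixed fraction of GHG forcing, then
    # disappear after 2020 as emissions are cleaned up.
    f_aer = -aerosol_frac * f_ghg * (years < 2020)
    temp = np.zeros(len(years))
    for i in range(1, len(years)):
        forcing = f_ghg[i - 1] + f_aer[i - 1]
        temp[i] = temp[i - 1] + (forcing - lam * temp[i - 1]) / heat_cap
    return temp

# Set A: high sensitivity, heavily masked. Set B: low sensitivity, little
# masking. Equilibrium responses to historic net forcing match, since
# (1 - 0.55)/0.8 == (1 - 0.10)/1.6, so the hindcasts nearly coincide.
t_a = simulate(lam=0.8, aerosol_frac=0.55)
t_b = simulate(lam=1.6, aerosol_frac=0.10)

hist = years <= 2010
print("RMS difference over history: %.2f C" % np.sqrt(np.mean((t_a[hist] - t_b[hist]) ** 2)))
print("Difference in 2100:          %.2f C" % abs(t_a[-1] - t_b[-1]))
```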

A&G provide no formal method to distinguish between situations in which models yield useful or spurious forecasts. In an earlier paper, they claimed rather broadly,

‘To our knowledge, there is no empirical evidence to suggest that presenting opinions in mathematical terms rather than in words will contribute to forecast accuracy.’ (page 1002)

This statement may be true in some settings, but obviously not in general. There are many situations in which mathematical models have good predictive power and outperform informal judgments by a wide margin.

A&G’s latest paper with Willie Soon, Validity of Climate Change Forecasting for Public Policy Decision Making, apparently forthcoming in IJF, is an attempt to make the distinction, i.e. to determine whether climate models have any utility as predictive tools. An excerpt from the abstract summarizes their argument:

Policymakers need to know whether prediction is possible and if so whether any proposed forecasting method will provide forecasts that are substantively more accurate than those from the relevant benchmark method. Inspection of global temperature data suggests that it is subject to irregular variations on all relevant time scales and that variations during the late 1900s were not unusual. In such a situation, a ‘no change’ extrapolation is an appropriate benchmark forecasting method. … The accuracy of forecasts from the benchmark is such that even perfect forecasts would be unlikely to help policymakers. … We nevertheless demonstrate the use of benchmarking with the example of the Intergovernmental Panel on Climate Change’s 1992 linear projection of long-term warming at a rate of 0.03°C-per-year. The small sample of errors from ex ante projections at 0.03°C-per-year for 1992 through 2008 was practically indistinguishable from the benchmark errors. … Again using the IPCC warming rate for our demonstration, we projected the rate successively over a period analogous to that envisaged in their scenario of exponential CO2 growth – the years 1851 to 1975. The errors from the projections were more than seven times greater than the errors from the benchmark method. Relative errors were larger for longer forecast horizons. Our validation exercise illustrates the importance of determining whether it is possible to obtain forecasts that are more useful than those from a simple benchmark before making expensive policy decisions.

There are many things wrong here:

  1. Demonstrating that unforced variability (history) can be adequately forecasted by a naive benchmark has no bearing on whether future forced variability will continue to be well-represented, or whether models can predict future emergence of a signal from noise. AG&S’ procedure is like watching an airplane taxi, concluding that aerodynamics knowledge is of no advantage, and predicting that the plane will remain on the ground forever.
  2. Comparing a naive forecast for global mean temperature against models amounts to a rejection of a vast amount of information. What is the naive forecast for the joint behavior of temperature, precipitation, lapse rates, sea level, and their spatial and seasonal patterns? These have been evaluated for models, but AG&S do not suggest benchmarks.
  3. A no-change forecast is not necessarily the best naive forecast for a series with unknown variability, if that series has some momentum or structure which can be exploited to do better. The particular no-change forecast selected by AG&S is suboptimal, because it uses a single year as a forecast, unnecessarily projecting annual variation into the future. In general, a stronger naive forecast (e.g., a smoothed value of a few recent years) would strengthen AG&S’ case (see the sketch after this list), so it’s unclear why they’ve chosen an excessively naive benchmark. Fortunately, their base year, 1991, was rather “average”.
  4. The first exhibit presented is the EPICA ice core temperature. Roughly 85% of the data shown has a time interval too long to show century-scale temperature variations, and none of it could be expected to fully reveal decadal-scale variations, so it’s mostly irrelevant with respect to the kind of forecasts they seek to evaluate.
  5. The mere fact that a series has unknown historic variability does not mean that it cannot be forecast [corrected 8/18/09]. The EPICA and Vostok CO2 records look qualitatively much like the temperature record, yet CO2 accumulation in the atmosphere is quite predictable over decadal time scales, and models could handily beat a naive forecast.
  6. AG&S’ method of forecast evaluation unduly weights the short term, like the A&G sucker bet does. This is not strictly a problem, but it does make interpretation of the bounds on AG&S’ alternate forecast (“The benchmark forecast is that the global mean temperature for each year for the rest of this century will be within 0.5°C of the 2008 figure.”) a little tricky.
  7. The retrospective evaluation of the 1990/1992 IPCC projection of 0.3°C/decade ignores many factors. First, 0.3°C/decade over a century does not imply a smooth trend over short time scales; models and reality have substantial unforced variability which must be taken into account. The paragraph cited by AG&S includes the statement, “The rise will not be steady because of the influence of other factors.” Second, the 1992 report (in the very paragraph AG&S cite) notes that projections do not account for aerosols, so 0.3°C/decade can’t be taken as a point prediction for the future, even if contingency on GHG emissions is resolved. Third, the IPCC projection stated approximate bounds – 0.2 to 0.5°C/decade – that should be accounted for in the evaluation, but are not. Still, the IPCC projection beats the naive benchmark.
  8. AG&S’ evaluation of the 0.3°C/decade future BAU projection as a backcast over 1851-1975 is absurd. They write, “It is not unreasonable, then, to suppose for the purposes of our validation illustration that scientists in 1850 had noticed that the increasing industrialization of the world was resulting in exponential growth in ‘greenhouse gases’ and to project that this would lead to global warming of 0.03°C per year.” Actually, it’s completely unreasonable. Many figures in the 1990 FAR clearly indicate that the 0.3°C/decade projection was not valid on [-infinity,infinity]. For example, figures 6, 8, and 9 from the SPM – just a few pages from material cited by AG&S – clearly show a gentle trend <0.05°C/decade through 1950. Furthermore, even the most rudimentary understanding of the dynamics of GHG and heat accumulation is sufficient to realize that one would not expect a linear historic temperature trend to emerge from the emissions signal.
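
As a footnote to point 3, here's a quick synthetic demonstration that a smoothed benchmark beats the single-year version; the series and all parameters are invented purely for illustration:

```python
# Synthetic series: small trend plus independent annual noise. A smoothed
# "no change" benchmark (mean of the last five years) has lower error at a
# 10-year horizon than a single-year no-change forecast.
import numpy as np

rng = np.random.default_rng(42)
n_series, n_years, horizon, window = 500, 20, 10, 5
t0 = window - 1                                          # forecast origin
series = 0.005 * np.arange(n_years) + rng.normal(0.0, 0.1, (n_series, n_years))

truth = series[:, t0 + horizon]
single = series[:, t0]                                   # last observed year only
smooth = series[:, t0 - window + 1:t0 + 1].mean(axis=1)  # five-year mean

print(f"MAE, single-year benchmark: {np.mean(np.abs(truth - single)):.3f}")
print(f"MAE, smoothed benchmark:    {np.mean(np.abs(truth - smooth)):.3f}")
```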

How do AG&S arrive at this sorry state? Their article embodies a “sh!t happens” epistemology. They write, “The belief that ‘things have changed’ and the future cannot be judged by the past is common, but invalid.” The problem is, one can say with equal confidence that, “the belief that ‘things never change’ and the past reveals the future is common, but invalid.” In reality, there are predictable phenomena (the orbits of the planets) and unpredictable ones (the fall of the Berlin wall). AG&S have failed to establish that climate is unpredictable or to provide us with an appropriate method for deciding whether it is predictable or not. Nor have they given us any insight into how to know or what to do if we can’t decide. Doing nothing because we think we don’t know anything is probably better than sacrificing virgins to the gods, but it doesn’t strike me as a robust strategy.

Another Look at Limits to Growth

I was just trying to decide whether I believed what I said recently, that the current economic crisis is difficult to attribute to environmental unsustainability. While I was pondering, I ran across this article by Graham Turner on the LtG wiki entry, which formally compares the original Limits runs to history over the last 30+ years. A sample:

Industrial output in Limits to Growth runs vs. history

The report basically finds what I’ve argued before: that history does not discredit Limits.