Are Project Overruns a Statistical Artifact?

Erik Bernhardsson explores this possibility:

Anyone who built software for a while knows that estimating how long something is going to take is hard. It’s hard to come up with an unbiased estimate of how long something will take, when fundamentally the work in itself is about solving something. One pet theory I’ve had for a really long time, is that some of this is really just a statistical artifact.

Let’s say you estimate a project to take 1 week. Let’s say there are three equally likely outcomes: either it takes 1/2 week, or 1 week, or 2 weeks. The median outcome is actually the same as the estimate: 1 week, but the mean (aka average, aka expected value) is 7/6 = 1.17 weeks. The estimate is actually calibrated (unbiased) for the median (which is 1), but not for the mean.

The full article is worth a read, both for its content and the elegant presentation. There are some useful insights, particularly that tasks with the greatest uncertainty rather than the greatest size are likely to dominate a project’s outcome. Interestingly, most lists of reasons for project failure neglect uncertainty just as they neglect dynamics.
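Both points are easy to reproduce numerically. Here’s a quick sketch (mine, not from the article) with five tasks whose estimates are median-calibrated, one of which is far more uncertain than the rest:

```python
import numpy as np

rng = np.random.default_rng(42)

# Each task's actual/estimated time is lognormal with median 1 (so estimates are
# median-calibrated), but the tasks differ in uncertainty (sigma of the log blowup).
sigmas = np.array([0.3, 0.3, 0.3, 0.3, 2.0])   # one task is far more uncertain
estimates = np.ones(5)                          # 1 week each
blowups = rng.lognormal(mean=0.0, sigma=sigmas, size=(100_000, 5))
actuals = estimates * blowups

print("median blowup per task:", np.round(np.median(blowups, axis=0), 2))  # all ~1.0
print("mean blowup per task:  ", np.round(blowups.mean(axis=0), 2))        # sigma=2 task ≈ exp(2) ≈ 7.4

totals = actuals.sum(axis=1)
print("estimated total: 5.0, median actual: %.1f, mean actual: %.1f"
      % (np.median(totals), totals.mean()))
print("average share of the total from the most uncertain task: %.0f%%"
      % (100 * (actuals[:, -1] / totals).mean()))
```

Each task’s median blowup is 1 – the estimates are “right” in that sense – yet the expected total is more than double the estimate, and nearly all of the excess comes from the single high-uncertainty task.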

However, I think the statistical explanation is only part of the story. There’s an important connection to project structure and dynamics.

First, if you accept that the distribution of task overruns is lognormal, you have to wonder where that heavy-tailed distribution is coming from in the first place. I think the answer is, positive feedbacks. Projects are chock full of reinforcing feedback, from rework cycles, Brooks’ Law, schedule pressure driving overtime leading to errors and burnout, site congestion and other effects. These amplify the right tail response to any disturbance.

Second, I think there’s some reason to think that the positive feedbacks operate primarily at a high level in projects. Schedule pressure, for example, doesn’t kick in when one little subtask goes wrong; it only becomes important when the whole project is off track. But if that’s the case, Bernhardsson’s heavy-tailed estimation errors will provide a continuous source of disturbances that stress the project, triggering the multitude of vicious cycles that lie in wait. In that case, a series of potentially modest misperceptions of uncertainty can be amplified by project structure into a catastrophic failure.
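Here’s a toy illustration of that amplification (my own sketch, not a calibrated project model): give a project a modest scope underestimate, then let schedule pressure raise the error fraction, so defective work recycles into the backlog.

```python
def project_duration(true_scope=120.0, estimated_scope=100.0, capacity=10.0,
                     base_error=0.10, pressure_sensitivity=0.15,
                     dt=0.25, max_time=100.0):
    """Toy rework cycle: schedule pressure raises the error fraction, and
    errors feed work back into the backlog (a reinforcing loop)."""
    deadline = estimated_scope / capacity        # naive schedule from the estimate
    backlog, t = true_scope, 0.0                 # actual work, unknown to the planner
    while backlog > 0.5 and t < max_time:
        time_left = max(deadline - t, 1.0)
        required_rate = backlog / time_left
        pressure = max(required_rate / capacity - 1.0, 0.0)    # > 0 when behind
        error_fraction = min(base_error + pressure_sensitivity * pressure, 0.9)
        completed = capacity * dt
        rework = completed * error_fraction      # defective work returns to the backlog
        backlog -= completed - rework
        t += dt
    return t

print("planned duration (weeks):      ", 100.0 / 10.0)
print("20% scope miss, no feedback:   ", project_duration(pressure_sensitivity=0.0))
print("same miss, with pressure loop: ", project_duration())
```

Without the feedback, the 20% scope miss produces roughly a 30% schedule overrun; with the pressure-to-error loop switched on, the same miss costs considerably more, because falling behind makes the work itself less productive.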

An interesting question is why people and organizations don’t simply adapt, adding a systematic fudge factor to estimates to account for overruns. Are large overruns too rare to perceive easily? Or do organizational pressures to set stretch goals and outcompete other projects favor naive optimism?

 

Emissions Pricing vs. Standards

You need an emissions price in your portfolio to balance effort across all tradeoffs in the economy.

The energy economy consists of many tradeoffs. Some of these are captured in the IPAT framework:

Emissions = Population × GDP per Capita × Energy per GDP × Emissions per Energy

IPAT shows that, to reduce emissions, there are multiple points of intervention. One could, for example, promote lower energy intensity, or reduce the carbon intensity of energy, or both.
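For concreteness, here are the terms with round, illustrative numbers (not actual statistics or a forecast). The structure is multiplicative, so a 10% cut in any single factor delivers the same 10% cut in emissions:

```python
# Illustrative round numbers – not actual statistics
population = 8e9               # people
gdp_per_capita = 12_000        # $ per person per year
energy_per_gdp = 5.0           # MJ per $
emissions_per_energy = 0.07    # kg CO2 per MJ

emissions = population * gdp_per_capita * energy_per_gdp * emissions_per_energy
print(f"baseline: {emissions / 1e12:.1f} Gt CO2/yr")   # kg -> Gt

# Cutting energy intensity or carbon intensity by 10% has identical leverage:
lower_energy_intensity = population * gdp_per_capita * (0.9 * energy_per_gdp) * emissions_per_energy
lower_carbon_intensity = population * gdp_per_capita * energy_per_gdp * (0.9 * emissions_per_energy)
print(f"10% lower energy/GDP:       {lower_energy_intensity / 1e12:.1f} Gt CO2/yr")
print(f"10% lower emissions/energy: {lower_carbon_intensity / 1e12:.1f} Gt CO2/yr")
```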

An ideal policy, or portfolio of policies, would:

  • Cover all the bases – ensure that no major opportunity is left unaddressed.
  • Balance the effort – an economist might express this as leveling the shadow prices across areas.

We have a lot of different ways to address each tradeoff: tradeable permits, taxes, subsidies, quantity standards, performance standards, command-and-control, voluntary limits, education, etc. So far, in the US, we have basically decided that taxes are a non-starter, and instead pursued subsidies and tax incentives, portfolio and performance standards, with limited use of tradeable permits.

Here’s the problem with that approach. You can decompose the economy a lot more than IPAT does, into thousands of decisions that have energy consequences. I’ve sampled a tiny fraction below.

Is there an incentive?

Decision | Standards | Emissions Price
Should I move to the city or the suburbs? | No | Yes
Should I telecommute? | No | Yes
Drive, bike, bus or metro today? | No | Yes
Car, truck or SUV? | No (CAFE gets this wrong) | Yes
Big SUV or small SUV? | CAFE (again) | Yes
Gasoline, diesel, hybrid or electric? | ZEV, tax credits | Yes
Regular or biofuel? | LCFS, CAFE credits | Yes
Detached house or condo? | No | Yes
Big house or small? | No | Yes
Gas or heat pump? | No | Yes
High performance building envelope or granite countertops? | Building codes (lowest common denominator) | Yes
Incandescent or LED lighting? | Bulb Ban | Yes
LEDs are cheap – use more? | No | Yes
Get up to turn out an unused light? | No | Yes
Fridge: top freezer, bottom freezer or side by side? | No | Yes
Efficient appliances? | Energy Star (badly) | Yes
Solar panels? | Building codes, net metering, tax credits, cap & trade | Yes
Green electricity? | Portfolio standards | Yes
2 kids or 8? | No | Yes

The beauty of an emissions price – preferably charged at the minemouth and wellhead – is that it permeates every economic aspect of life. The extent to which it does so depends on the emissions intensity of the subject activity – when it’s high, there’s a strong price signal, and when it’s low, there’s a weak signal, leaving users free to decide on other criteria. But the signal is always there. Importantly, the signal can’t be cheated: you can fake your EPA mileage rating – for a while – but it’s hard to evade costs that arrive packaged with your inputs, be they fuel, capital, services or food.
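Here’s a sketch of the pass-through logic with rough, illustrative numbers (not a policy proposal): a charge per ton of CO2, collected where fossil carbon enters the economy, reappears downstream in proportion to each purchase’s embodied emissions.

```python
carbon_price = 50.0   # $/tCO2 – illustrative, not a recommended level

# Rough, illustrative embodied emissions per unit of each purchase (tCO2/unit)
purchases = {
    "gasoline (per gallon)":      0.009,
    "coal electricity (per MWh)": 1.0,
    "wind electricity (per MWh)": 0.01,
    "cement (per ton)":           0.9,
    "a haircut":                  0.001,
}

for item, tco2 in purchases.items():
    print(f"{item:28s} embodied carbon cost: ${carbon_price * tco2:7.2f}")
```

A strong signal where emissions intensity is high (coal power, cement), and a negligible one where it isn’t (the haircut) – which is exactly the point.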

The rules and standards we have, on the other hand, form a rather moth-eaten patchwork. They cover a few of the biggest energy decisions with policies like renewable portfolio standards for electricity. Some of those have been pretty successful at lowering emissions. But others, like CAFE and Energy Star, are deficient or perverse in a variety of ways. As a group, they leave out a number of decisions that are extremely consequential. Effort is by no means uniform – what is the marginal cost of a ton of carbon avoided by CAFE, relative to a state’s renewable energy portfolio? No one knows.

So, how is the patchwork working? Not too well, I’d say. Some, like the CAFE standard, have been diluted by loopholes and stalled due to lack of political will:

(Chart source: BTS)

Others are making some local progress. The California LCFS, for example, has reduced the carbon intensity of fuels by 3.5% since its authorization under AB32 in 2006:

(Chart source: ARB)

But the LCFS’ progress has been substantially undone by rising vehicle miles traveled (VMT). The only thing that put a real dent in driving was the financial crisis:

(Chart sources: AFDC; Caltrans)


In spite of this, the California patchwork has worked – it has reached its GHG reduction target:
(Chart source: SF Chronicle)

This is almost entirely due to success in the electric power sector. Hopefully, there’s more to come, as renewables continue to ride down their learning curves. But how long can the power sector carry the full burden? Not long, I think.

The problem is that the electricity supply side is the “easy” part of the problem. There are relatively few technologies and actors to worry about. There’s a confluence of federal and state incentives. The technology landscape is favorable, with cost-effective emerging technologies.

The technology landscape for clean fuels is not easy. That’s why LCFS credits are trading at $195/ton while electricity cap & trade allowances are at $16/ton. The demand side has more flexibility, but it is technically diverse and organizationally fragmented (like the questions in my table above), making it harder to regulate. Problems are coupled: getting people out of their cars isn’t just a car problem; it’s a land use problem. Rebound effects abound: every LED light bulb is just begging to be left on all the time, because it’s so cheap to do so, and electricity subsidies make it even cheaper.

Command-and-control regulators face an unpleasant choice. They can push harder and harder in a few major areas, widening the performance gap – and the shadow price gap – between regulated and unregulated decisions. Or, they can proliferate regulations to cover more and more things, increasing administrative costs and making innovation harder.

As long as economic incentives scream that the price of carbon is zero, every performance standard, subsidy, or limit is fighting an uphill battle. People want to comply, but evolution selects for those who can figure out how to comply the least. Every idea that’s not covered by a standard faces a deep “valley of death” when it attempts to enter the market.

At present, we can’t let go of this patchwork of standards (wingwalker’s rule – don’t let go of one thing until you have hold of another). But in the long run, we need to start activating every possible tradeoff that improves emissions. That requires a uniform price signal that pervades the economy. Then rules and standards can backfill the remaining market failures, resulting in a system of regulation that’s more effective and less intrusive.

The end of the world is free!

Why should emitting greenhouse gases be free, when it contributes to the destruction of so much we care about?

In the New York Times, David Leonhardt ponders,

The Problem With Putting a Price on the End of the World

Economists have workable policy ideas for addressing climate change. But what if they’re politically impossible?

I wrote about this exact situation nearly ten years ago, when the Breakthrough Institute (and others) proposed energy R&D as an alternative to politically infeasible carbon taxes. What has R&D accomplished since then? All kinds of wonderful things, but the implications for climate are … diddly squat.

The emerging climate technology delusion

Leonhardt observes that emissions pricing programs have already failed to win approval several times, which is true. However, I think the diagnosis is partly incorrect. Cap and trade programs like Waxman-Markey failed not because they imposed prices, but because they were incredibly complex and involved big property-rights giveaways. Anyone who understands the details of such a program – no small feat – is right to wonder whether anyone other than traders will profit from it.

In other cases, like the Washington carbon tax initiatives, I think the problem may be that potential backers required that it solve not only climate, but also environmental justice and income inequality more broadly. That’s an impossible task for a single policy.

Leonhardt proposes performance standards and a variety of other economically “second best” measures as alternatives.

The better bet seems to be an “all of the above” approach: Organize a climate movement around meaningful policies with a reasonable chance of near-term success, but don’t abandon the hope of carbon pricing.

At first blush, this seems reasonable to me. Performance standards and information policies have accomplished a lot over the years. Energy R&D is a good investment.

On second thought, these alternatives have already failed. The sum total of all such policies over the last few decades has been to reduce CO2 emissions intensity by 2% per year.

That’s slower than GDP growth, so emissions have actually risen. That’s far short of what we need to accomplish, and it’s not all attributable to policy. Even with twice the political will, and twice the progress, it wouldn’t be nearly enough.
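The arithmetic is simple but unforgiving. A back-of-envelope sketch (illustrative growth rates, not a forecast):

```python
# Back-of-envelope: emissions = GDP x (emissions / GDP)
gdp_growth = 0.03          # ~3%/yr GDP growth, illustrative
intensity_decline = 0.02   # ~2%/yr improvement in emissions intensity

emissions_growth = (1 + gdp_growth) * (1 - intensity_decline) - 1
print(f"emissions growth: {emissions_growth:+.2%} per year")        # about +1%/yr
print(f"emissions after 30 years: {(1 + emissions_growth) ** 30:.2f}x today's")
```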

All of the above have some role to play, but without prices as a keystone economic signal, they’re fighting the tide. Moreover, together they carry a large cost in administrative complexity, which gives opponents a legitimate reason to whine about bureaucracy and promotes regulatory capture. That makes it hard to innovate, favors large incumbents, and contributes to worsening inequality.

(Figure adapted from Tax Time)

So, I think we need to do a lot more than not “abandon the hope” of carbon pricing. Every time we push a stopgap, second-best policy, we must also be building the basis for implementation of emissions prices. This means we have to get smarter about carbon pricing, and address the cognitive and educational gaps that explain failure so far. Leonhardt identifies one key point:

‘If we’re going to succeed on climate policy, it will be by giving people a vision of what’s in it for them.’

I think that vision has several parts.

  • One is multisolving – recognizing that clever climate policy can improve welfare now as well as in the future through health and equity cobenefits. This is tricky, because a practical policy can’t do everything directly; it just has to be compatible with doing everything.
  • Another is decentralization. The climate-economy system is too big to permit monolithic solution designs. We have to preserve diversity and put signals in place that allow it to evolve in beneficial directions.

Finally, emissions pricing has to be more than a vision – it has to be designed so that it’s actually good for the median voter:

As Nordhaus acknowledged in his speech, curbing dirty energy by raising its price “may be good for nature, but it’s not actually all that attractive to voters to reduce their income.”

Emissions pricing doesn’t have to be harmful to most voters, even neglecting cobenefits, as long as green taxes include equitable rebates, revenue finances good projects, and green sectors have high labor intensity. (The median voter has to understand this as well.)
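Here’s why that can work for the median voter, in a sketch with made-up numbers: household carbon footprints are skewed toward the top, so the mean footprint – which sets the size of an equal per-capita rebate – is higher than what the typical household actually pays.

```python
import numpy as np

rng = np.random.default_rng(0)
carbon_price = 50.0                                   # $/tCO2, hypothetical

# Hypothetical right-skewed distribution of household footprints (tCO2/yr)
footprints = rng.lognormal(mean=np.log(12), sigma=0.7, size=100_000)

payments = carbon_price * footprints
rebate = payments.mean()                              # revenue returned in equal shares
net = rebate - payments                               # positive = household comes out ahead

print(f"median net benefit: ${np.median(net):.0f}/yr")
print(f"share of households better off: {(net > 0).mean():.0%}")
```

In this made-up distribution, roughly two-thirds of households come out ahead before counting any climate, health or employment benefits.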

Personally, I’m frustrated by decades of excuses for ineffective, complicated, inequitable policies. I don’t know how to put it in terms that don’t trigger cognitive dissonance, but I think there’s a question that needs to be asked over and over, until it sinks in:

Why should emitting greenhouse gases be free, when it contributes to the destruction of so much we care about?

Why should emitting greenhouse gases be free, when it contributes to the destruction of so much we care about?

Why should emitting greenhouse gases be free, when it contributes to the destruction of so much we care about?

Why should emitting greenhouse gases be free, when it contributes to the destruction of so much we care about?

Why should emitting greenhouse gases be free, when it contributes to the destruction of so much we care about?

Breakthrough Optimism

From Models of Doom, the Sussex critique of the Limits to Growth:

Real challenges will no doubt arise if world energy consumption continues to grow in the long-term at the current rate, but limited reserves of non-renewable energy resources are unlikely to represent a serious threat on reasonable assumptions about the ultimate size of the reserves and technical progress. …

It is not unreasonable to expect that within 30 years a breakthrough with fusion power will provide virtually inexhaustible cheap energy supplies, but should this breakthrough take considerably longer, pessimism would still be unjustified. There are untapped reserves of non-conventional hydrocarbons which will become economic after further technical development and if prices of conventional fossil fuels continue to rise.

At AAAS in 2005, a fusion researcher pointed out that 1950s predictions of working fusion 50 years out had expired … with fusion prospects still 50 years out.

This MIT Project Says Nuclear Fusion Is 15 Years Away (No, Really, This Time)

Expert: “I’m 100 Percent Confident” Fusion Power Will Be Practical
Companies chasing after the elusive technology hope to build reactors by 2030.

Is fusion finally just around the corner? I wouldn’t count on it. Even if we do get a breakthrough in 10 to 15 years, or tomorrow, it’s still a long way from proof of concept to deployment on a scale that’s helpful for mitigating CO2 emissions and avoiding use of destructive resources like tar sands.

Forest Tipping in the Rockies

Research shows that some forests in the Rockies aren’t recovering from wildfires.

Evidence for declining forest resilience to wildfires under climate change

Abstract
Forest resilience to climate change is a global concern given the potential effects of increased disturbance activity, warming temperatures and increased moisture stress on plants. We used a multi‐regional dataset of 1485 sites across 52 wildfires from the US Rocky Mountains to ask if and how changing climate over the last several decades impacted post‐fire tree regeneration, a key indicator of forest resilience. Results highlight significant decreases in tree regeneration in the 21st century. Annual moisture deficits were significantly greater from 2000 to 2015 as compared to 1985–1999, suggesting increasingly unfavourable post‐fire growing conditions, corresponding to significantly lower seedling densities and increased regeneration failure. Dry forests that already occur at the edge of their climatic tolerance are most prone to conversion to non‐forests after wildfires. Major climate‐induced reduction in forest density and extent has important consequences for a myriad of ecosystem services now and in the future.

I think this is a simple example of a tipping point in action.

Forest Cover Tipping Points

Using an example from Hirota et al., in my toy model article above, here’s what happens:

At high precipitation, a fire (red arrow, top) takes the forest down to zero tree cover, but regrowth (green arrow, top) restores the forest. At lower precipitation, due to climate change, the forest remains stable, until fire destroys it (lower red arrow). Then regrowth can’t get past the newly-stable savanna state (lower green arrow). No amount of waiting will take the trees from 30% cover to the original 90% tree cover. (The driving forces might be more complex than precipitation and fire; things like insects, temperature, snowpack and evaporation also matter.)
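Here’s a minimal numerical sketch of that story, using a generic bistable form with made-up parameters (not the actual model from Hirota et al. or from my toy model post):

```python
def tree_cover_rate(cover, precip, k=5.0):
    """Toy bistable tree-cover dynamics (made-up functional form and parameters).
    When precip < 1.0, a stable savanna state (~30% cover) and an unstable
    threshold (~60%) appear below the forest state (90% cover)."""
    forest, mid = 0.9, 0.45
    b = 0.11 * (precip - 1.0)          # b > 0: forest is the only attractor
    return -k * (cover - forest) * ((cover - mid) ** 2 + b)

def regrow(cover0, precip, years=60, dt=0.1):
    cover = cover0
    for _ in range(int(years / dt)):   # simple Euler integration
        cover += tree_cover_rate(cover, precip) * dt
    return cover

print("post-fire recovery, wet climate:", round(regrow(0.05, precip=1.2), 2))  # ~0.9
print("post-fire recovery, dry climate:", round(regrow(0.05, precip=0.8), 2))  # stalls ~0.3
```

Same fire, same starting point near zero cover – the only difference is the precipitation parameter, which determines whether a savanna state exists to trap the recovery.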

The insidious thing about this is that you can’t tell that the forest state has become destabilized until the tipping event happens. That means the complexity of the system defeats any simple heuristic for managing the trees. The existence of healthy, full tree cover doesn’t imply that the forest will grow back to the same state after a catastrophe or clearcut.

Limits to Big Data

I’m skeptical of the idea that machine learning and big data will automatically lead to some kind of technological nirvana, a Star Trek future in which machines quickly learn all the physics needed for us to live happily ever after.

First, every other human technology has been a mixed bag, with improvements in welfare coming along with some collateral damage. It just seems naive to think that this one will be different.


(Image caption: These are not the primary problem.)

Second, I think there are some good reasons to think that problems will get harder at the same rate that machines get smarter. The big successes I’ve seen are localized point prediction problems, not integrated systems with a lot of feedback. As soon as cause and effect are separated in time and space by complex mechanisms, you’re into sloppy systems territory, where data may constrain only a few parameters at a time. Making progress in such systems will increasingly require integration of multiple theories and data from multiple sources.
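To make “constrain only a few parameters at a time” concrete, here’s a toy example of my own (not drawn from any of the big-data domains): fit a sum of two exponential decays – about the simplest system where causes play out over separated time scales – and inspect the curvature of the least-squares surface.

```python
import numpy as np

# Toy "sloppy" model: a sum of two exponential decays with similar rates.
t = np.linspace(0, 5, 50)

def model(p):
    a1, k1, a2, k2 = p
    return a1 * np.exp(-k1 * t) + a2 * np.exp(-k2 * t)

p0 = np.array([1.0, 1.0, 1.0, 1.3])   # true parameters (decay rates nearly equal)

# Sensitivity (Jacobian) of the predictions to each parameter, by finite differences
eps = 1e-6
J = np.column_stack([(model(p0 + eps * np.eye(4)[i]) - model(p0)) / eps
                     for i in range(4)])

# Curvature of the least-squares surface ~ J^T J
eigvals = np.linalg.eigvalsh(J.T @ J)
print("curvature eigenvalues:", eigvals)
print("stiff/sloppy ratio: %.1e" % (eigvals[-1] / eigvals[0]))
```

The eigenvalues typically span several orders of magnitude: the data pin down one or two stiff parameter combinations and leave the sloppy directions nearly free, which is exactly the predicament of big but shallow data confronting a feedback-rich system.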

People in domains that have made heavy use of big data increasingly recognize this.

Why is national modeling hard?

If you’ve followed the work on the System Dynamics National Model, you know that it came to an end uncompleted. Yet, there is a vast amount of interesting structure in the model, and there have been many productive spinoffs from the work. How can this be?

I think there are several explanations. One is that the problem is intrinsically hard. Economies are big, and they operate at many scales. There are micro processes (firms investing in capacity and choosing technologies) but also evolutionary processes (firms getting it wrong die).

This means there’s no one to ask when you want to understand how things work. You can’t ask someone about their car fuel purchase habits and aggregate up to national energy intensity, because their understanding encompasses Ford vs. Chevy, not all the untried and future contingencies in the economic network beyond their limited sphere of influence.

You can’t ask the data. Data must always be interpreted through the lens of a model, and model structure is what we lack. If we had a lot more data, we might be able to infer more about the constraints on plausible structures, but economic data is pretty sparse compared to the number of constructs we need to understand.

In spite of this, dynamic general equilibrium models have managed to model whole economies anyway. Why have they succeeded? I think there are two answers. First, they cheat. They reduce all behavior to an optimization algorithm. That’s guaranteed to yield an answer, but whether that answer has any relevance to the real world is debatable. Second, they give answers that people who fund economic models like: the world is just fine as it is, externalities don’t exist, and all policy interventions are costly.

All this is not to say that we’ll never have useful national models; indeed we already have many models (including the DGEs) that are useful for some purposes. But we still have a long way to go before we have solid macrobehavior from microfoundations to inform policy broadly.

 

Opioid Epidemic Dynamics

I ran across an interesting dynamic model of the opioid epidemic that makes a good target for replication and critique:

Prevention of Prescription Opioid Misuse and Projected Overdose Deaths in the United States

Qiushi Chen; Marc R. Larochelle; Davis T. Weaver; et al.

Importance  Deaths due to opioid overdose have tripled in the last decade. Efforts to curb this trend have focused on restricting the prescription opioid supply; however, the near-term effects of such efforts are unknown.

Objective  To project effects of interventions to lower prescription opioid misuse on opioid overdose deaths from 2016 to 2025.

Design, Setting, and Participants  This system dynamics (mathematical) model of the US opioid epidemic projected outcomes of simulated individuals who engage in nonmedical prescription or illicit opioid use from 2016 to 2025. The analysis was performed in 2018 by retrospectively calibrating the model from 2002 to 2015 data from the National Survey on Drug Use and Health and the Centers for Disease Control and Prevention.

Conclusions and Relevance  This study’s findings suggest that interventions targeting prescription opioid misuse such as prescription monitoring programs may have a modest effect, at best, on the number of opioid overdose deaths in the near future. Additional policy interventions are urgently needed to change the course of the epidemic.

The model is fully described in the supplementary content, but unfortunately it’s implemented in R and described in Greek letters, so it can’t be run directly.

That’s actually OK with me, because I think I learn more from implementing the equations myself than I do if someone hands me a working model.

While R gives you access to tremendous tools, I think it’s not a good environment for designing and testing dynamic models of significant size. You can’t easily inspect everything that’s going on, and there’s no easy facility for interactive testing. So, I was curious whether that would prove problematic in this case, because the model is small.

Here’s what it looks like, replicated in Vensim:

It looks complicated, but it’s not complex. It’s basically a cascade of first-order delay processes: the outflow from each stock is simply a constant fraction of the stock per unit time. There are no large-scale feedback loops.
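For readers who want to see the basic mechanism, here’s a schematic first-order cascade in a few lines of Python (made-up stocks and fractional rates, not the calibrated parameters from the paper):

```python
import numpy as np

# Schematic 3-stage cascade with made-up values (NOT the calibrated model from Chen et al.)
stocks = np.array([1000.0, 200.0, 0.0])   # e.g. misuse, use disorder, cumulative deaths
frac_out = np.array([0.15, 0.05, 0.0])    # fraction of each stock flowing out per year
inflow0 = 100.0                           # new entrants to the first stock per year

dt = 0.25
for _ in range(int(10 / dt)):             # 10 simulated years, Euler integration
    outflow = frac_out * stocks           # first-order delay: outflow = fraction * stock
    stocks[0] += (inflow0    - outflow[0]) * dt
    stocks[1] += (outflow[0] - outflow[1]) * dt
    stocks[2] += (outflow[1]             ) * dt

print(np.round(stocks, 1))
```

Each outflow is a constant fraction of its stock, so each stage behaves as an exponential (first-order) delay; chaining them produces the cascade, with nothing flowing back upstream.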