The bubble regulator’s dilemma

More from Galbraith on the crash of ’29:

Some of those in positions of authority wanted the boom to continue. They were making money out of it, and they may have had an intimation of the personal disaster which awaited them when the boom came to an end. But there were also some who saw, however dimly, that a wild speculation was in progress, and that something should be done. For these people, however, every proposal to act raised the same intractable problem. The consequences of successful action seemed almost as terrible as the consequences of inaction, and they could be more horrible for those who took the action.

A bubble can easily be punctured. But to incise it with a needle so that it subsides gradually is a task of no small delicacy. Among those who sensed what was happening in early 1929, there was some hope but no confidence that the boom could be made to subside. The real choice was between an immediate and deliberately engineered collapse and a more serious disaster later on. Someone would certainly be blamed for the ultimate collapse when it came. There was no question whatever who would be blamed should the boom be deliberately deflated.

This presents an evolutionary problem: it prevents the emergence of wise regulators, even absent “power corrupts” dynamics. The solution may be to incise the bubble in a distributed fashion, by inoculating the individuals who create the bubble with more wisdom and memory of past boom-bust cycles.

Defense Against the Black Box

Baseline Scenario has a nice account of the role of Excel in the London Whale (aka Voldemort) blowup.

… To summarize: JPMorgan’s Chief Investment Office needed a new value-at-risk (VaR) model for the synthetic credit portfolio (the one that blew up) and assigned a quantitative whiz (“a London-based quantitative expert, mathematician and model developer” who previously worked at a company that built analytical models) to create it. The new model “operated through a series of Excel spreadsheets, which had to be completed manually, by a process of copying and pasting data from one spreadsheet to another.” The internal Model Review Group identified this problem as well as a few others, but approved the model, while saying that it should be automated and another significant flaw should be fixed. After the London Whale trade blew up, the Model Review Group discovered that the model had not been automated and found several other errors. Most spectacularly,

“After subtracting the old rate from the new rate, the spreadsheet divided by their sum instead of their average, as the modeler had intended. This error likely had the effect of muting volatility by a factor of two and of lowering the VaR . . .”

Microsoft Excel is one of the greatest, most powerful, most important software applications of all time. …

As a consequence, Excel is everywhere you look in the business world—especially in areas where people are adding up numbers a lot, like marketing, business development, sales, and, yes, finance. …

But while Excel the program is reasonably robust, the spreadsheets that people create with Excel are incredibly fragile. There is no way to trace where your data come from, there’s no audit trail (so you can overtype numbers and not know it), and there’s no easy way to test spreadsheets, for starters. The biggest problem is that anyone can create Excel spreadsheets—badly. Because it’s so easy to use, the creation of even important spreadsheets is not restricted to people who understand programming and do it in a methodical, well-documented way.

This is why the JPMorgan VaR model is the rule, not the exception: manual data entry, manual copy-and-paste, and formula errors. This is another important reason why you should pause whenever you hear that banks’ quantitative experts are smarter than Einstein, or that sophisticated risk management technology can protect banks from blowing up. …
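The arithmetic of the fatal cell formula is easy to reproduce. A minimal sketch with hypothetical rates (not JPMorgan's actual model, which we can't see):

```python
# Relative rate change, as the modeler intended: difference over the average.
def rate_change_intended(old, new):
    return (new - old) / ((old + new) / 2)

# What the spreadsheet actually did: difference over the sum.
# Since sum = 2 * average, every computed change is exactly halved.
def rate_change_buggy(old, new):
    return (new - old) / (old + new)

old, new = 0.040, 0.050  # hypothetical rates
print(rate_change_intended(old, new))  # ≈ 0.222
print(rate_change_buggy(old, new))     # ≈ 0.111 -- volatility muted by a factor of two
```

The halving propagates straight into the volatility estimate, which is why the review found the error "likely had the effect of muting volatility by a factor of two."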

System Dynamics has a strong tradition of model quality control, dating all the way back to its origins in Industrial Dynamics. Some of it is embodied in software, while other bits are merely habits and traditions. If the London Whale model had been an SD model, would the crucial VaR error have occurred? Since the model might not have involved much feedback, perhaps the better question is: had it been built with SD software like Vensim, would the error have occurred?

There are multiple lines of defense against model errors:

  • Seeing the numbers. This is Excel’s strong suit. It apparently didn’t help in this case though.
  • Separation of model and data. A model is a structure that one can populate with different sets of parameters and data. In Excel, the structure and the data are intermingled, so it’s tough to avoid accidental replacement of structure (an equation) by data (a number), and tough to compare versions of models or model runs to recover differences. Vensim is pretty good at that. But it’s not clear that such comparisons would have revealed the VaR structure error.
  • Checking units of measure. When I was a TA for the MIT SD course, I graded a LOT of student models. I think units checking would have caught about a third of conceptual errors. In this case though, the sum and average of a variable have the same units, so it wouldn’t have helped.
  • Fit to data. Generally, people rely far too much on R^2, and too little on other quality checks, but the VaR error is exactly the kind of problem that might be revealed by comparison to history. However, if the trade was novel, there might not be any relevant data to use. In any case, there’s no real obstacle to evaluating fit in Excel, though the general difficulties of building time series models are an issue where time is relevant.
  • Conservation laws. SD practitioners are generally encouraged to observe conservation of people, money, material, etc. Software supports this with the graphical stock-flow convention, though it ought to be possible to do more. Excel doesn’t provide any help in this department, though it’s not clear whether it would have mattered to the Whale trade model.
  • Extreme conditions tests. “Kicking the tires” of models has been a good idea since the beginning. This is an ingrained SD habit, and Vensim provides Reality Check™ to automate it. It’s not clear that this would have revealed the VaR sum vs. average error, because that’s a matter of numerical sensitivity that might not reveal itself as a noticeable change in behavior. But I bet it would reveal lots of other problems with the model boundary and limitations to validity of relationships.
  • Abstraction. System Dynamics focuses on variables as containers for time series, and distinguishes stocks (state variables) from flows and other auxiliary conversions. Most SD languages also include some kind of array facility, like subscripts in Vensim, for declarative processing of detail complexity. Excel basically lacks such conventions, except for named ranges that are infrequently used. Time and other dimensions exist spatially as row-column layout. This means that an Excel model is full of a lot of extraneous material for handling dynamics, is stuck in discrete time, can’t be checked for DT stability, and requires a lot of manual row-column fill operations to express temporal phenomena that are trivial in SD and many other languages. With less busywork needed, it might have been much easier for auditors to discover the VaR error.
  • Readable equations. It’s not uncommon to encounter =E1*EXP($D$3)*SUM(B32:K32)^2/(1+COUNT(A32:K32)) in Excel. While it’s possible to create such gobbledygook in Vensim, it’s rare to actually encounter it, because SD software and habits encourage meaningful variable names and “chunking” equations into comprehensible components. Again, this might have made it much easier for auditors to discover the VaR error.
  • Graphical representation of structure. JPMorgan should get some credit for having a model audit process at all, even though it failed to prevent the error. Auditors’ work is much easier when they can see what the heck is going on in the model. SD software provides useful graphical conventions for revealing model structure. Excel has no graphics. There’s an audit tool, but it’s hampered by the lack of a variable concept, and it’s slower to use than Vensim’s Causal Tracing™.
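The "stuck in discrete time, can't be checked for DT stability" point deserves a concrete illustration. A first-order stock adjustment, integrated Euler-style one row per period as a spreadsheet would do it, silently blows up when the time step exceeds twice the adjustment time; SD tools make DT an explicit, testable parameter. A minimal sketch with hypothetical parameters:

```python
def simulate(dt, adj_time=1.0, goal=100.0, stock0=0.0, horizon=10.0):
    """Euler integration of d(stock)/dt = (goal - stock) / adj_time."""
    stock, t = stock0, 0.0
    while t < horizon:
        stock += dt * (goal - stock) / adj_time  # one spreadsheet row per step
        t += dt
    return stock

print(simulate(dt=0.25))  # converges to ~100, as intended
print(simulate(dt=3.0))   # dt > 2 * adj_time: oscillates with growing amplitude
```

In a spreadsheet, the dt=3 case is just another column of plausible-looking numbers; there is no declared time step to halve and re-test.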

I think the score’s Forrester 8, Gates 1. Excel is great for light data processing and presentation, but it’s way down my list of tools to choose for serious modeling. The secret to its success, cell-level processing that’s easy to learn and adaptable to many problems, is also its Achilles heel. Add in some agency problems and confirmation bias, and it’s a deadly combination:

There’s another factor at work here. What if the error had gone the wrong way, and the model had incorrectly doubled its estimate of volatility? Then VaR would have been higher, the CIO wouldn’t have been allowed to place such large bets, and the quants would have inspected the model to see what was going on. That kind of error would have been caught. Errors that lower VaR, allowing traders to increase their bets, are the ones that slip through the cracks. That one-sided incentive structure means that we should expect VaR to be systematically underestimated—but since we don’t know the frequency or the size of the errors, we have no idea of how much.
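The one-sided filter is easy to see in a toy Monte Carlo. Suppose model errors scale VaR up or down at random, and only errors that raise VaR (and thus constrain trading) attract enough attention to get fixed. The models that survive then understate risk on average. A hedged sketch with made-up numbers:

```python
import random

random.seed(1)
true_var = 100.0
surviving = []
for _ in range(10_000):
    # Hypothetical multiplicative model error, anywhere from 0.5x to 2x.
    error = random.uniform(0.5, 2.0)
    reported = true_var * error
    if reported > true_var:
        continue  # error inflates VaR -> bets look too risky -> quants investigate
    surviving.append(reported)  # error deflates VaR -> slips through the cracks

# The mean reported VaR among surviving models sits well below the true 100.
print(sum(surviving) / len(surviving))
```

The size of the bias depends entirely on the (unknown) error distribution, which is the point of the quote: we know the sign of the bias, but not its magnitude.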

Sadly, the loss on this single trade would probably just about pay for all the commercial SD that’s ever been done.


The Trouble with Spreadsheets


The crisis was not predicted because crises aren't predictable?

There’s a terrific essay on economics by John Kay on the INET blog. Some juicy excerpts follow, but it’s really worth the trip to read the whole thing. They’ve invited some other economists to respond, which should be interesting.

The Map is Not the Territory: An Essay on the State of Economics


The reputation of economics and economists, never high, has been a victim of the crash of 2008. The Queen was hardly alone in asking why no one had predicted it. An even more serious criticism is that the economic policy debate that followed seems only to replay the similar debate after 1929. The issue is budgetary austerity versus fiscal stimulus, and the positions of the protagonists are entirely predictable from their previous political allegiances.

The doyen of modern macroeconomics, Robert Lucas, responded to the Queen’s question in a guest article in The Economist in August 2009.[1] The crisis was not predicted, he explained, because economic theory predicts that such events cannot be predicted. Faced with such a response, a wise sovereign will seek counsel elsewhere.

[…]All science uses unrealistic simplifying assumptions. Physicists describe motion on frictionless planes, gravity in a world without air resistance. Not because anyone believes that the world is frictionless and airless, but because it is too difficult to study everything at once. A simplifying model eliminates confounding factors and focuses on a particular issue of interest. To put such models to practical use, you must be willing to bring back the excluded factors. You will probably find that this modification will be important for some problems, and not others – air resistance makes a big difference to a falling feather but not to a falling cannonball.

But Lucas and those who follow him were plainly engaged in a very different exercise, as the philosopher Nancy Cartwright has explained.[4] The distinguishing characteristic of their approach is that the list of unrealistic simplifying assumptions is extremely long. Lucas was explicit about his objective[5] – ‘the construction of a mechanical artificial world populated by interacting robots that economics typically studies’. An economic theory, he explains, is something that ‘can be put on a computer and run’. Lucas has called structures like these ‘analogue economies’, because they are, in a sense, complete economic systems. They loosely resemble the world, but a world so pared down that everything about them is either known, or can be made up. Such models are akin to Tolkien’s Middle Earth, or a computer game like Grand Theft Auto.

[… interesting discussion of the fiscal crisis as a debate over Ricardian equivalence …]
But another approach would discard altogether the idea that the economic world can be described by a universally applicable model in which all key relationships are predetermined. Economic behaviour is influenced by technologies and cultures, which evolve in ways that are certainly not random but which cannot be described fully, or perhaps at all, by the kinds of variables and equations with which economists are familiar. Models, when employed, must therefore be context specific, in the manner suggested in a recent book by Roman Frydman and Michael Goldberg.[8]


But you would not nowadays be able to publish similar articles in a good economics journal. You would be told that your model was theoretically inadequate – it lacked rigour, failed to demonstrate consistency. You might be accused of the cardinal sin of being ‘ad hoc’. Rigour and consistency are the two most powerful words in economics today.


Consistency and rigour are features of a deductive approach, which draws conclusions from a group of axioms – and whose empirical relevance depends entirely on the universal validity of the axioms. The only descriptions that fully meet the requirements of consistency and rigour are complete artificial worlds, like those of Grand Theft Auto, which can ‘be put on a computer and run’.

For many people, deductive reasoning is the mark of science, while induction – in which the argument is derived from the subject matter – is the characteristic method of history or literary criticism. But this is an artificial, exaggerated distinction. ‘The first siren of beauty’, says Cochrane, ‘is logical consistency’. It seems impossible that anyone acquainted with great human achievements – whether in the arts, the humanities or the sciences – could really believe that the first siren of beauty is consistency. This is not how Shakespeare, Mozart or Picasso – or Newton or Darwin – approached their task.

[…] Economists who assert that the only valid prescriptions in economic policy are logical deductions from complete axiomatic systems take prescriptions from doctors who often know little more about these medicines than that they appear to treat the disease. Such physicians are unashamedly ad hoc; perhaps pragmatic is a better word. With exquisite irony, Lucas holds a chair named for John Dewey, the theorist of American pragmatism.

[…] The modern economist is the clinician with no patients, the engineer with no projects. And since these economists do not appear to engage with the issues that confront real businesses and actual households, the clients do not come. There are, nevertheless, many well paid jobs for economists outside academia. Not, any more, in industrial and commercial companies, which have mostly decided economists are of no use to them. Business economists work in financial institutions, which principally use them to entertain their clients at lunch or advertise their banks in fillers on CNBC. Economic consulting employs economists who write lobbying documents addressed to other economists in government or regulatory agencies.

[…]A review of economics education two decades ago concluded that students should be taught ‘to think like economists’. But ‘thinking like an economist’ has come to be interpreted as the application of deductive reasoning based on a particular set of axioms. Another Chicago Nobel Prize winner, Gary Becker, offered the following definition: ‘the combined assumptions of maximising behaviour, market equilibrium, and stable preferences, used relentlessly and consistently form the heart of the economic approach’.[13] Becker’s Nobel citation rewards him for ‘having extended the domain of microeconomic analysis to a wide range of economic behavior.’ But such extension is not an end in itself: its value can lie only in new insights into that behaviour.

‘The economic approach’ as described by Becker is not, in itself, absurd. What is absurd is the claim to exclusivity he makes for it: a priori deduction from a particular set of unrealistic simplifying assumptions is not just a tool but ‘the heart of the economic approach’. A demand for universality is added to the requirements of consistency and rigour. Believing that economics is like they suppose physics to be – not necessarily correctly – economists like Becker regard a valid scientific theory as a representation of the truth – a description of the world that is independent of time, place, context, or the observer. […]

The further demand for universality with the consistency assumption leads to the hypothesis of rational expectations and a range of arguments grouped under the rubric of ‘the Lucas critique’. If there were to be such a universal model of the economic world, economic agents would have to behave as if they had knowledge of it, or at least as much knowledge of it as was available, otherwise their optimising behaviour would be inconsistent with the predictions of the model. This is a reductio ad absurdum argument, which demonstrates the impossibility of any universal model – since the implications of the conclusion for everyday behaviour are preposterous, the assumption of model universality is false.

[…]Economic models are no more, or less, than potentially illuminating abstractions. Another philosopher, Alfred Korzybski, puts the issue more briefly: ‘the map is not the territory’.[15] Economics is not a technique in search of problems but a set of problems in need of solution. Such problems are varied and the solutions will inevitably be eclectic.

This is true for analysis of the financial market crisis of 2008. Lucas’s assertion that ‘no one could have predicted it’ contains an important, though partial, insight. There can be no objective basis for a prediction of the kind ‘Lehman Bros will go into liquidation on September 15’, because if there were, people would act on that expectation and, most likely, Lehman would go into liquidation straight away. The economic world, far more than the physical world, is influenced by our beliefs about it.

Such thinking leads, as Lucas explains, directly to the efficient market hypothesis – available knowledge is already incorporated in the price of securities. […]

In his Economist response, Lucas acknowledges that ‘exceptions and anomalies’ to the efficient market hypothesis have been discovered, ‘but for the purposes of macroeconomic analyses and forecasts they are too small to matter’. But how could anyone know, in advance not just of this crisis but also of any future crisis, that exceptions and anomalies to the efficient market hypothesis are ‘too small to matter’?

[…]The claim that most profit opportunities in business or in securities markets have been taken is justified. But it is the search for the profit opportunities that have not been taken that drives business forward, the belief that profit opportunities that have not been arbitraged away still exist that explains why there is so much trade in securities. Far from being ‘too small to matter’, these deviations from efficient market assumptions, not necessarily large, are the dynamic of the capitalist economy.


The preposterous claim that deviations from market efficiency were not only irrelevant to the recent crisis but could never be relevant is the product of an environment in which deduction has driven out induction and ideology has taken over from observation. The belief that models are not just useful tools but also are capable of yielding comprehensive and universal descriptions of the world has blinded its proponents to realities that have been staring them in the face. That blindness was an element in our present crisis, and conditions our still ineffectual responses. Economists – in government agencies as well as universities – were obsessively playing Grand Theft Auto while the world around them was falling apart.

The first response, from Paul Davidson, is already in.

Debt crisis in the European Minifigure Union

A clever visualization from a 9-year-old:

Click through to the original .pdf for the numbered legend.

This isn’t quite a causal loop diagram; arrows indicate “where each entity would shift the burden of bailout costs,” but the network of relationships implies a lot of interesting dynamics.

Via 4D Pie Charts.

Exploring stimulus policy

To celebrate the debt ceiling deal, I updated my copy of Nathan Forrester’s model, A Dynamic Synthesis of Basic Macroeconomic Theory.

Now, to celebrate the bad economic news and increasing speculation of a double-dip depression replay, here are some reflections on policy, using that model.

The model combines a number of macro standards: the multiplier-accelerator, inventory adjustment, capital accumulation, the IS-LM model, aggregate supply/aggregate demand dynamics, the permanent income hypothesis and the Phillips curve.
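Forrester's model is far richer, but the multiplier-accelerator at its core can be sketched in a few lines. Samuelson's classic discrete-time version (hypothetical parameter values, not Forrester's) already produces a damped business cycle after a one-period negative shock to final sales:

```python
def samuelson(c=0.6, v=1.0, g=20.0, shock=-5.0, periods=30):
    """Y[t] = G[t] + C[t] + I[t], with C[t] = c*Y[t-1] (multiplier)
    and I[t] = v*(C[t] - C[t-1]) (accelerator)."""
    Y = [g / (1 - c)] * 2  # start in equilibrium: Y* = G / (1 - c) = 50
    for t in range(2, periods):
        gov = g + (shock if t == 2 else 0.0)  # one-period hit to final sales
        consumption = c * Y[t - 1]
        investment = v * (consumption - c * Y[t - 2])
        Y.append(gov + consumption + investment)
    return Y

path = samuelson()
print(path[:8])  # dip below 50, overshoot, damped oscillation back to equilibrium
```

With these parameters the characteristic roots are complex with modulus below one, so the shock sets off an oscillation that decays back toward equilibrium; Forrester's synthesis adds the inventory, capital, IS-LM, and expectation structures that stretch and layer these cycles.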

Forrester experimented with the model to identify the effects of five policies intended to stabilize fluctuations: countercyclical government transfers and spending, graduated income taxes, and money supply growth or targets. He used simulation experiments and linear system analysis (frequency response and eigenvalue elasticity) to identify the contribution of policies to stability.

Interestingly, the countercyclical policies tend to destabilize the business cycle. However, they prove to be stabilizing for a long-term cycle associated with the multiplier-accelerator and involving capital stock and long-term expectations.

I got curious about the effect of these policies through a simulated recession like the one we’re now in. So, I started from equilibrium and created a recession by imposing a negative shock to final sales, which passes immediately into aggregate demand. Here’s what happens:

There’s a lot of fine detail, so you may want to head over to Vimeo to view in full screen HD.

This is part of a couple of experiments I’ve tried with screencasting models, as practice for creating some online Vensim training materials. My preliminary observation is that even a perfunctory exploration of a simple model is time consuming to create and places high demands on audience attention. It’s no wonder you never see any real data or math on the Discovery Channel. I’d be interested to hear of examples of this sort of thing done well.

Downgrade causality confusion

A sort of causal loop diagram made the cover of the WSJ today:

Source: WSJ h/t Drew Jones

Is it useful, or chartjunk? When I started to look at it from the perspective of good SD diagramming practice, I realized that it’s the latter.

First off, this isn’t really a structural diagram at all. It depicts a sequence of events mixed up with icons depicting some entities involved in those events. From the chain of events, one might infer that there is causality, but that would be hazardous, particularly in this case where there is no operational description of what’s happening. Did money rush to havens because stocks fell, or did stocks fall because money was rushing to havens? How could we tell, without articulating the mechanics of stocks and flows of money in price formation?

A good diagram ought to include quantifiable elements that can vary, with clear directionality. Traders and bars of gold are clearly not helpful variables. Nor are events particularly useful; mental accounting for a “decrease in Stocks Fell” is difficult, for example.

A good diagram should also distinguish key states that describe a system, and distinguish actual states from desired states. Presumably the magnitude and direction of the “money rush to havens” is a function of desired and actual positions in various securities, but we won’t learn much about that from this picture.

Finally, a good diagram ought to give some indication of the polarity of relationships. But what exactly is happening at the top of this diagram, where blue seems to pass to red through the treasury building? Is the diagram arguing that rising treasuries caused falling stocks, so that this is a runaway positive feedback loop? (stock value down, flight to havens, treasuries up, stocks down…). Or are we to be reassured that rising treasuries lower yields, reversing the fall in stocks?

Personally, I preferred the old black & white all-text WSJ.

So far, the comedy coverage of the downgrade is more illuminating than some serious efforts:

Stimulus regret revisited

A year ago I wrote,

Stimulus regret seems to be pretty widespread now. The undercurrent seems to be that, because unemployment is still 10% etc., the stimulus didn’t work …. This conclusion is based on pattern matching thinking. Pattern matching assumes simple A->B correlation: Stimulus->Unemployment. Working backwards from that assumption, one concludes from ongoing high unemployment and the fact that stimulus did occur that the correlation between stimulus and unemployment is low.

There are two problems with this logic. First, there are many confounding factors in the A->B relationship that could be responsible for ongoing problems. Second, there’s feedback between A and B, which also means that there are (possibly large) intervening stocks (integrations, accumulations). Stocks decouple the temporal relationship between A and B, so that pattern matching doesn’t work.

Today, Paul Krugman decries similar thinking, and identifies a third misperception (that an effect may be small either because of weak causal links, or because the cause was small),

It’s kind of annoying when people claim that I said the stimulus would work; how much noisier could I have been in warning both that it was grossly inadequate, and that by claiming that a far-too-small stimulus was just right, Obama would discredit the whole idea?

Krugman points out that evaluating suites of predictions, not just a single outcome, provides a way to discriminate between competing mental models:

Of course, the WSJ also said that the stimulus wouldn’t work. The difference was in how it was supposed to fail.

The WSJ view was that federal borrowing would crowd out private spending by driving interest rates sky-high, that the bond vigilantes would destroy the economy. …

My view was that government borrowing in a liquidity trap does not drive up rates, and indeed that rates would stay low as long as the economy stayed depressed.

How it turned out.

That’s a pretty clear test; among other things, you would have lost a lot of money if you believed the WSJ view.

The problem remains that there is relatively little of such thoughtful evaluation going on in the public discourse.

For a politician evaluated by people who ignore system structure, this is a no-win situation. As long as things get worse, blame follows, regardless of what policy is chosen.

Delayed negative feedback on the financial crisis

The wheels of justice grind slowly, but they grind exceedingly fine:

Too Big to Fail or Too Big to Change

While the SEC has reached several settlements in connection with misconduct related to the financial meltdown, those settlements have been characterized as “cheap,” “hollow,” “bloodless,” and merely “cosmetic,” as noted by Columbia University law professor John C. Coffee in a recent article. Moreover, one of the SEC’s own Commissioners, Luis Aguilar, has recently admitted that the SEC’s penalty guidelines are “seriously flawed” and have “adversely impact[ed]” civil enforcement actions.

For example, Judge Jed Rakoff castigated the SEC for its attempted settlement of charges that Bank of America failed to disclose key information to investors in connection with its acquisition of Merrill Lynch (“Merrill”), including that Merrill was on the brink of insolvency (necessitating a massive taxpayer bailout), and that Bank of America had entered into a secret agreement to allow Merrill to pay its executives billions of dollars in bonuses prior to the close of the merger regardless of Merrill’s financial condition. The SEC agreed to settle its action against Bank of America for $33 million in August 2009, even though its acquisition of Merrill resulted in what The New York Times characterized as “one of the greatest destructions of shareholder value in financial history.” In rejecting the deal, Judge Rakoff declared that the proposed settlement was “misguided,” “inadequate” and failed to “comport with the most elementary notions of justice and morality.” …

It has increasingly fallen to institutional investors to hold mortgage lenders, investment banks and other large financial institutions accountable for their role in the mortgage crisis by seeking redress for shareholders injured by corporate misconduct and sending a powerful message to executives that corporate malfeasance is unacceptable. For example, sophisticated public pension funds are currently prosecuting actions involving billions of dollars of losses against Bank of America, Goldman Sachs, JPMorgan Chase, Lehman Brothers, Bear Stearns, Wachovia, Merrill Lynch, Washington Mutual, Countrywide, Morgan Stanley and Citigroup, among many others. In some instances, litigations have already resulted in significant recoveries for defrauded investors.

Historically, institutional investors have achieved impressive results on behalf of shareholders when compared to government-led suits. Indeed, since 1995, SEC settlements comprise only 5 percent of the monetary recoveries arising from securities frauds, with the remaining 95 percent obtained through private litigation ….

I think the problem here is that litigation works slowly. It’s not clear that punitive legal outcomes occur on a relevant time scale. Once bonuses have been paid and leaders have moved on, there are no heads left to roll, so organizations may only learn that they’d better have good lawyers.

A billion prices

Econbrowser has an interesting article on the Billion Prices Project, which looks for daily price movements on items across the web. This yields a price index that’s free of quality change assumptions, unlike hedonic CPI measures, but introduces some additional issues due to the lack of control over the changing portfolio of measured items.

A couple of years ago we built the analytics behind the RPX index of residential real estate prices, and grappled with many of the same problems. The competition was the CSI – the Case-Shiller index, which uses the repeat-sales method. With that approach, every house serves as its own control, so changes in neighborhoods or other quality aspects wash out. However, the clever statistical control introduces some additional problems. First, it reduces the sample of viable data points, necessitating a 3x longer reporting lag. Second, the processing steps reduce transparency. Third, one step in particular involves downweighting of homes with (possibly implausibly) large price movements, which may have the side effect of reducing sensitivity to real extreme events. Fourth, users may want to see effects of a changing sales portfolio.

For the RPX, we chose instead a triple power law estimate, ignoring quality and mix issues entirely. The TPL is basically a robust measure of the central tendency of prices. It’s not too different from the median, except that it provides some diagnostics of data quality issues from the distribution of the tails. The payoff is a much more responsive index, which can be reported daily with a short lag. We spent a lot of time comparing the RPX to the CSI, and found that, while changes in quality and mix of sales could matter in principle, in practice the two approaches yield essentially the same answer, even over periods of years. My (biased) inclination, therefore, is to prefer the RPX approach. Your mileage may vary.
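The difference between the two families of index is mostly about which sales enter the estimate. A toy comparison on invented data (the actual TPL fit is more elaborate than the plain median used here as a stand-in for a robust central-tendency measure):

```python
import statistics

# Invented sales: (house_id, period, price). Only houses that sell in both
# periods contribute to a repeat-sales index; a median-style index uses all.
sales = [
    ("a", 0, 200), ("a", 1, 210),   # repeat pair: +5%
    ("b", 0, 300), ("b", 1, 312),   # repeat pair: +4%
    ("c", 1, 150),                  # single sale: invisible to repeat-sales
    ("d", 0, 250),                  # single sale: invisible to repeat-sales
]

# Repeat-sales style: mean price relative across houses sold in both periods.
pairs = {}
for house, period, price in sales:
    pairs.setdefault(house, {})[period] = price
relatives = [p[1] / p[0] for p in pairs.values() if 0 in p and 1 in p]
print(sum(relatives) / len(relatives))  # 1.045: each house is its own control

# Median style: ratio of per-period central tendencies, using every sale.
p0 = statistics.median(price for _, t, price in sales if t == 0)
p1 = statistics.median(price for _, t, price in sales if t == 1)
print(p1 / p0)  # 0.84: larger sample and shorter lag, but mix-sensitive
```

The toy data exaggerate the mix effect; the point of the RPX/CSI comparison above is that, on real sales volumes, the two approaches tracked each other closely.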

One interesting lesson for me from the RPX project was that traders don’t want models. We went in thinking that sophisticated dynamics coupled to data would be a winner. Maybe it is a winner, but people want their own sophisticated dynamics. They wanted us to provide only a datastream that maximized robustness and transparency, and minimized lag. Those are sensible design principles. But I have to wonder whether a little dynamic insight would have been useful as well since, after all, many data consumers evidently did not have an adequate model of the housing market.