Sources of Information for Modeling

The traditional picture of information sources for modeling is a funnel. For example, in Some Basic Concepts in System Dynamics (2009), Forrester showed the mental database dwarfing the written database, which in turn dwarfs the numerical database.

I think the diagram, or at least the concept, is much older than that.

However, I think the landscape has changed a lot, with more change to come. The mental database hasn’t evolved much, but the numerical database has grown enormously. The funnel isn’t one-dimensional, so the relationships have shifted on some axes, but not so much on others.

Notionally, I’d propose that the situation is something like this:

The mental database is still king for variety of concepts and for immediacy or salience of information (especially to the owner of the brain involved). And it still has weaknesses, like the difficulty of observing, agreeing on, and quantifying the constructs it contains. In the last few decades, meanwhile, the numerical database has extended its reach tremendously.

The proper shape of the plot is probably very domain specific. When I drew this, I had in mind the typical corporate or policy setting, where information systems contain only a fraction of the information necessary to understand the organizations involved. But in some areas, the reverse may be true. For example, in earth systems, datasets are vast and include measurements that human senses can’t even make, whereas personal experience – and therefore mental models – is limited and treacherous.

I think I’ve understated the importance of the written database in the diagram above – perhaps I’m missing a dimension characterizing its cumulative nature (compared to the transience of mental databases). There’s also an interesting evolution underway, as tools for text analysis and large language models (like ChatGPT) are making the written database more numerical in nature.

Finally, I think there’s a missing database in the traditional framework, one of growing importance: the database of models themselves. Models have been around for a long time – especially in the physical sciences, but also in corporate spreadsheets and the like. Increasingly, though, reasonably sophisticated models of organizational components are available as inputs to higher-level strategic modeling efforts.

Scientific Revolutions in Ventity

I’ve long wanted to translate the Sterman-Wittenberg model of Kuhnian paradigm revolutions to Ventity. The original was in DYNAMO, and I translated that to Vensim, but neither is really satisfactory, because both require provisioning array space for new paradigms statically, before it’s needed. This means simulating lots of useless zeroes, and even worse, looking at them in the output.

The model is about the lifecycle of scientific paradigms, so a central feature is the occasional introduction and evolution of new paradigms, which eventually accumulate enough anomalies to erode confidence, making them vulnerable to the next great idea. So ideally, you’d like to introduce new paradigms dynamically and delete them when they no longer have many adherents. Dynamic creation and deletion of entities is of course a core feature of Ventity – it’s the tool this model has been waiting for all those years.
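Dynamic entity creation is easy to caricature in code. Here’s a minimal Python sketch (my own toy with made-up dynamics, not the actual Sterman-Wittenberg structure): paradigms exist as objects only while they retain adherents, so no array space is provisioned for paradigms that don’t exist yet.

```python
import random

class Paradigm:
    """Toy paradigm carrying only an adherent count; the real model has much more."""
    def __init__(self, name):
        self.name = name
        self.adherents = 10.0  # arbitrary initial following

    def step(self):
        # Random multiplicative drift stands in for the confidence/anomaly dynamics.
        self.adherents *= random.uniform(0.8, 1.25)

paradigms = []
for t in range(100):
    if random.random() < 0.05:  # occasionally a new paradigm is launched
        paradigms.append(Paradigm(f"P{t}"))
    for p in paradigms:
        p.step()
    # Delete paradigms that have run out of adherents, as Ventity can do natively.
    paradigms = [p for p in paradigms if p.adherents >= 1.0]

print([(p.name, round(p.adherents)) for p in paradigms])
```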

I finally got around to translating my Vensim version to Ventity recently. It works beautifully:

Above, paradigm confidence, showing eight dominant paradigms as well as many smaller paradigms that never rise to dominance. They disappear when they run out of adherents. Below, puzzles under attack for the same paradigms.

Links to the source papers and more notes on the model are in the Vensim library entry. I think the dynamics are generalizable to other aspects of thinking in paradigms, like filter bubbles. The model is also a bit ‘meta’: Ventity as a distinct modeling paradigm that’s neither in the classical array-based world nor the code-based discrete agent world has struggled to win mindshare.

A minor note on use: the Run Config includes two setups: “replicate” and “random”. The “replicate” setup, which is inactive by default, launches paradigms at fixed times given by initialization data from a run of the Vensim version. This makes it possible to compare the simulations without divergence from randomness. However, the randomized run will normally be the more interesting way to work with this model.

The model (requires Ventity, which has a free trial license):

SciRev 15.zip

Computer Collates Climate Contrarian Claims

Coan et al. in Nature have an interesting text analysis of climate skeptics’ claims.

I’ve been at this long enough to notice that a few perennial favorites are missing, perhaps because they date from the 90s, prior to the dataset.

The big one is “temperature isn’t rising” or “the temperature record is wrong.” This has lots of moving parts. Back in the 90s, a key idea was that satellite MSU records showed falling temperatures, implying that the surface station record was contaminated by Urban Heat Island (UHI) effects. That didn’t end well: it turned out that the UAH code had errors, and the trend reversed when they were fixed.

Later, UHI made a comeback when the SurfaceStations project crowdsourced an assessment of temperature station quality. Some stations turned out to be pretty bad. But again, when the dust settled, it turned out that the temperature trend was bigger, not smaller, when poor sites were excluded and time-of-day (TOD) biases were corrected. This shouldn’t have been a surprise, because windy-day analyses and a dozen other things had already ruled out UHI, but …

I consider this a reminder that the credibility of mainstream climate science arises in part not from the models being so good, but from so many alternatives having been tried and proved so bad, only to rise again and again.

Spreadsheets Strike Again

In this BBC podcast, stand-up mathematician Matt Parker explains the latest big spreadsheet screwup: overstating European productivity growth.

There are a bunch of killers lurking in spreadsheets, but in this case the culprit was the lack of a time-axis concept, making it easy to misalign times for the GDP and labor variables. The interesting thing is that a spreadsheet’s strong suit – visibility of the numbers – didn’t help. Someone should have seen 22% productivity growth and thought, “that’s bonkers” – but perhaps expectations of a COVID-19 rebound short-circuited the mental reality check.
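To make the failure mode concrete, here’s a minimal pandas sketch with invented numbers (not the actual Eurostat data): when series share an explicit time index, misalignment is hard to do by accident, whereas raw cell ranges invite an off-by-one.

```python
import pandas as pd

# Hypothetical quarterly indices; the values are made up for illustration.
quarters = pd.PeriodIndex(["2020Q1", "2020Q2", "2020Q3", "2020Q4"], freq="Q")
gdp   = pd.Series([100.0, 88.0, 95.0, 98.0], index=quarters)  # output
hours = pd.Series([100.0, 85.0, 94.0, 97.0], index=quarters)  # labor input

# Spreadsheet-style mistake: dividing GDP by the *previous* quarter's hours.
misaligned = gdp.values / hours.shift(1).values  # wrong: time axes don't match
aligned = gdp / hours                            # right: shared index enforces alignment

print(pd.DataFrame({"misaligned": misaligned, "aligned": aligned}, index=quarters))
```

In the misaligned column, 2020Q3 “productivity” jumps about 12% purely from the off-by-one, which is roughly the flavor of the 22% error.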

ChatGPT struggles with pandemics

I decided to try out a trickier problem on ChatGPT: epidemiology.

This is tougher, because it requires some domain knowledge about terminology as well as some math. R0 itself is a slippery concept. It appears that ChatGPT is essentially equating R0 and the transmission rate; perhaps the result would be different had I used a different concept like force of infection.

Notice how ChatGPT is partly responding to my prodding, but stubbornly refuses to give up on the idea that the transmission rate needs to be less than R0, even though the two are not comparable.

Well, we got there in the end.
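For the record, here’s why the two aren’t comparable, in a minimal SIR sketch (my own notation, with arbitrary parameter values): the transmission rate β has units of 1/time, while R0 = β/γ is dimensionless, and the epidemic turns over on a susceptible threshold, not on any comparison of β with R0.

```python
# Minimal SIR model; parameter values are arbitrary illustrations, not estimates.
beta  = 0.3            # transmission rate, 1/day
gamma = 0.1            # recovery rate, 1/day
R0 = beta / gamma      # dimensionless basic reproduction ratio = 3.0

S, I, R = 0.999, 0.001, 0.0  # population fractions
dt = 0.1
for _ in range(int(200 / dt)):
    infections = beta * S * I * dt
    recoveries = gamma * I * dt
    S -= infections
    I += infections - recoveries
    R += recoveries

# Infections peak when effective reproduction R0*S falls below 1,
# i.e. when S < 1/R0: a threshold on susceptibles, not on beta vs. R0.
print(f"R0 = {R0:.1f}, final susceptible fraction = {S:.3f}")
```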

ChatGPT and the Department Store Problem

Continuing with the theme, I tried the department store problem out on ChatGPT. This is a common test of stock-flow reasoning, in which participants assess the peak stock of people in a store from data on the inflow and outflow.

I posed a simplified version of the problem:

Interestingly, I had intended to have 6 people enter at 8am, but I made a typo. ChatGPT did a remarkable job of organizing my data into exactly the form I’d doodled in my notebook, but then happily integrated to wind up with -2 people in the store at the end.

This is pretty cool, but it’s interesting that ChatGPT was happy to correct the number of people in the room, without making the corresponding correction to people leaving. That makes the table inconsistent.
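The bookkeeping is simple enough to check mechanically. Here’s a hedged Python sketch with invented hourly counts (not the numbers from my notebook), showing the two reality checks that would have caught both the negative stock and the inconsistent table:

```python
# Hypothetical entries and exits per hour; not the actual problem data.
inflow  = [6, 4, 8, 2, 0, 0]
outflow = [0, 2, 3, 6, 5, 4]

stock, stocks = 0, []
for i, o in zip(inflow, outflow):
    stock += i - o      # people in store = accumulated inflow minus outflow
    stocks.append(stock)

# Reality checks a modeler (or chatbot) should apply:
assert min(stocks) >= 0, "negative people in the store: data inconsistent"
assert sum(inflow) - sum(outflow) == stocks[-1], "flows don't balance the stock"

print("peak occupancy:", max(stocks), "at hour", stocks.index(max(stocks)) + 1)
```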

We got there in the end, but I think ChatGPT’s enthusiasm for reality checks may be a little weak. Overall, though, I’d still say this is a pretty good demonstration of stock-flow reasoning. I’d be curious how humans would perform on the same problem.

Can ChatGPT generalize Bathtub Dynamics?

Research indicates that insights about stock-flow management don’t necessarily generalize from one situation to another. People can fill their bathtubs without comprehending the federal debt or COVID prevalence.

ChatGPT struggles a bit with the climate bathtub, so I wondered if it could reason successfully about real bathtubs.

The last sentence is a little tricky, but I think ChatGPT is assuming that the drain might not be at the bottom of the tub. Overall, I’d say the AI nailed this one.
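For completeness, the structure ChatGPT had to reason about is just first-order stock accumulation. A sketch, assuming the common simplification that outflow is proportional to the water level:

```python
# Toy bathtub: level accumulates inflow minus outflow; parameters are arbitrary.
inflow = 2.0          # liters/minute from the tap
k = 0.1               # drain coefficient, 1/minute
level, dt = 0.0, 0.1
for _ in range(int(120 / dt)):
    outflow = k * level
    level += (inflow - outflow) * dt

# The tub equilibrates where inflow == outflow, i.e. level -> inflow/k = 20 liters.
print(f"level after 2 hours: {level:.1f} liters (equilibrium = {inflow / k:.0f})")
```

If the drain sits partway up the tub, outflow only kicks in above that level, which may be the case ChatGPT was hedging against.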