Scientific Revolutions in Ventity

I’ve long wanted to translate the Sterman-Wittenberg model of Kuhnian paradigm revolutions to Ventity. The original was in Dynamo, and I translated that to Vensim, but neither is really satisfactory, because both require provisioning array space for new paradigms statically, before it’s needed. This means simulating lots of useless 0s, and even worse, looking at them in the output.

The model is about the lifecycle of scientific paradigms, so a central feature is the occasional introduction and evolution of new paradigms, which eventually accumulate enough anomalies to erode confidence, making them vulnerable to the next great idea. So ideally, you’d like to introduce new paradigms dynamically and delete them when they no longer have many adherents. Dynamic creation and deletion of entities is of course a core feature of Ventity – it’s the tool this model has been waiting for all those years.
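
To make the contrast with static arrays concrete, here is a minimal Python sketch of dynamic creation and deletion. It is not Ventity syntax and it does not use the Sterman-Wittenberg equations: the `Paradigm` class, the halving of adherents each step, the launch times, and the deletion threshold are all hypothetical stand-ins chosen only to show entities appearing and disappearing.

```python
from dataclasses import dataclass


@dataclass
class Paradigm:
    name: str
    adherents: float


def step(paradigms, t, launch_times, decay=0.5, threshold=1.0, initial=10.0):
    """Advance one step: shrink every paradigm, delete depleted ones, create new ones."""
    for p in paradigms:
        p.adherents *= decay  # toy dynamics (assumption), not the model's equations
    # Deletion: only paradigms that still have enough adherents are kept.
    survivors = [p for p in paradigms if p.adherents >= threshold]
    # Creation: a new paradigm exists only from its launch time onward.
    if t in launch_times:
        survivors.append(Paradigm(f"P{t}", initial))
    return survivors


paradigms = []
history = []  # names of the paradigms alive after each step
for t in range(8):
    paradigms = step(paradigms, t, launch_times={0, 3})
    history.append([p.name for p in paradigms])
```

The collection only ever holds paradigms that are alive, so there are no placeholder zeros to simulate or to plot.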

I finally got around to translating my Vensim version to Ventity recently. It works beautifully:

Above, paradigm confidence, showing eight dominant paradigms as well as many smaller paradigms that never rise to dominance. They disappear when they run out of adherents. Below, puzzles under attack for the same paradigms.

Links to the source papers and more notes on the model are in the Vensim library entry. I think the dynamics are generalizable to other aspects of thinking in paradigms, like filter bubbles. The model is also a bit ‘meta’: Ventity, as a distinct modeling paradigm that’s neither in the classical array-based world nor the code-based discrete agent world, has struggled to win mindshare.

A minor note on use: the Run Config includes two setups: “replicate” and “random”. The “replicate” setup, which is inactive by default, launches paradigms at fixed times given by initialization data from a run of the Vensim version. This makes it possible to compare the simulations without divergence from randomness. However, the randomized run will normally be the more interesting way to work with this model.

The model (requires Ventity, which has a free trial license):

SciRev 15.zip

Computer Collates Climate Contrarian Claims

Coan et al. in Nature have an interesting text analysis of climate skeptics’ claims.

I’ve been at this long enough to notice that a few perennial favorites are missing, perhaps because they date from the 90s, prior to the dataset.

The big one is “temperature isn’t rising” or “the temperature record is wrong.” This has lots of moving parts. Back in the 90s, a key idea was that satellite MSU records showed falling temperatures, implying that the surface station record was contaminated by Urban Heat Island (UHI) effects. That didn’t end well, when it turned out that the UAH code had errors and the trend reversed when they were fixed.

Later, UHI made a comeback when the SurfaceStations project crowdsourced an assessment of temperature station quality. Some turned out to be pretty bad. But again, when the dust settled, it turned out that the temperature trend was bigger, not smaller, when poor sites were excluded and TOD was corrected. This shouldn’t have been a surprise, because windy day analyses and a dozen other things already ruled out UHI, but …

I consider this a reminder that part of the credibility of mainstream climate science arises not because the models are so good, but because so many alternatives have been tried and proved so bad, only to rise again and again.

Spreadsheets Strike Again

In this BBC podcast, stand-up mathematician Matt Parker explains the latest big spreadsheet screwup: overstating European productivity growth.

There are a bunch of killers in spreadsheets, but in this case the culprit was lack of a time axis concept, making it easy to misalign times for the GDP and labor variables. The interesting thing is that a spreadsheet’s strong suit – visibility of the numbers – didn’t help. Someone should have seen 22% productivity growth and thought, “that’s bonkers” – but perhaps expectations of a COVID-19 rebound short-circuited the mental reality check.
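
To show how a time misalignment can manufacture growth, here is a small Python sketch. The GDP and hours figures are hypothetical index numbers invented for illustration, not the data behind the actual error, and the `hours_lag` parameter is my own device for mimicking a column shifted by one row. Keying each series by year makes alignment explicit, which is exactly what a bare spreadsheet column lacks.

```python
# Hypothetical index values, chosen only to illustrate the failure mode.
gdp = {2019: 100.0, 2020: 94.0, 2021: 99.0}    # output
hours = {2019: 100.0, 2020: 92.0, 2021: 97.0}  # labor input


def productivity_growth(gdp, hours, year, hours_lag=0):
    """Growth in GDP per hour from year-1 to year.

    hours_lag=0 pairs each GDP value with hours from the same year.
    hours_lag=1 pairs it with the previous year's hours (a shifted column).
    """
    def productivity(y):
        return gdp[y] / hours[y - hours_lag]

    return productivity(year) / productivity(year - 1) - 1.0


aligned = productivity_growth(gdp, hours, 2021)
misaligned = productivity_growth(gdp, hours, 2021, hours_lag=1)
print(f"aligned: {aligned:.1%}, misaligned: {misaligned:.1%}")
```

With these made-up numbers, the aligned series shows essentially flat productivity, while the one-year shift produces double-digit growth out of nothing but a dip and rebound.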

ChatGPT struggles with pandemics

I decided to try out a trickier problem on ChatGPT: epidemiology.

This is tougher, because it requires some domain knowledge about terminology as well as some math. R0 itself is a slippery concept. It appears that ChatGPT is essentially equating R0 and the transmission rate; perhaps the result would be different had I used a different concept like force of infection.

Notice how ChatGPT is partly responding to my prodding, but stubbornly refuses to give up on the idea that the transmission rate needs to be less than R0, even though the two are not comparable.

Well, we got there in the end.
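
For reference, here is why the two quantities are not comparable, as a Python sketch for the simplest SIR case. In that model, R0 in a fully susceptible population is the transmission rate (new infections per infected person per unit time) multiplied by the duration of infectiousness, so R0 is dimensionless while the transmission rate carries units of 1/time. The parameter values are illustrative assumptions, not estimates for any real disease.

```python
def basic_reproduction_number(transmission_rate, infectious_duration):
    """R0 for a simple SIR model: (infections per person per unit time) * (time infectious)."""
    return transmission_rate * infectious_duration


# Assumed values: 0.5 infections per infected person per day, infectious for 5 days.
r0_slow = basic_reproduction_number(0.5, 5.0)

# A different assumed disease: 5 infections per day, infectious for half a day.
r0_fast = basic_reproduction_number(5.0, 0.5)

print(r0_slow, r0_fast)  # both products equal 2.5
```

The same R0 arises from a transmission rate numerically below it and from one numerically above it, so a rule like "the transmission rate must be less than R0" has no meaning.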

ChatGPT and the Department Store Problem

Continuing with the theme, I tried the department store problem out on ChatGPT. This is a common test of stock-flow reasoning, in which participants assess the peak stock of people in a store from data on the inflow and outflow.

I posed a simplified version of the problem:

Interestingly, I had intended to have 6 people enter at 8am, but I made a typo. ChatGPT did a remarkable job of organizing my data into exactly the form I’d doodled in my notebook, but then happily integrated to wind up with -2 people in the store at the end.

This is pretty cool, but it’s interesting that ChatGPT was happy to correct the number of people in the room, without making the corresponding correction to people leaving. That makes the table inconsistent.

We got there in the end, but I think ChatGPT’s enthusiasm for reality checks may be a little weak. Overall though I’d still say this is a pretty good demonstration of stock-flow reasoning. I’d be curious how humans would perform on the same problem.
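
The integration and the missing reality check can be sketched in a few lines of Python. The hourly counts below are hypothetical (they are not the numbers from my conversation), picked so that the outflow total exceeds the inflow total and the stock goes negative at the end. The helper names are mine.

```python
from itertools import accumulate

# Hypothetical hourly counts, for illustration only.
entering = [2, 10, 8, 4, 0]
leaving = [0, 3, 9, 8, 6]


def stock_path(inflow, outflow, initial=0):
    """People in the store at the end of each period: running sum of (in - out)."""
    net = [i - o for i, o in zip(inflow, outflow)]
    return list(accumulate(net, initial=initial))[1:]


def negative_periods(stock):
    """Reality check: indices of periods where the stock is physically impossible."""
    return [k for k, s in enumerate(stock) if s < 0]


stock = stock_path(entering, leaving)
peak = max(stock)
bad = negative_periods(stock)
print(stock, peak, bad)
```

A nonempty result from `negative_periods` says the inflow or outflow data must be wrong, and that both columns, not just the stock, need a second look.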

Can ChatGPT generalize Bathtub Dynamics?

Research indicates that insights about stock-flow management don’t necessarily generalize from one situation to another. People can fill their bathtubs without comprehending the federal debt or COVID prevalence.

ChatGPT struggles a bit with the climate bathtub, so I wondered if it could reason successfully about real bathtubs.

The last sentence is a little tricky, but I think ChatGPT is assuming that the drain might not be at the bottom of the tub. Overall, I’d say the AI nailed this one.

ChatGPT does the Climate Bathtub

Following up on our earlier foray into AI conversations about dynamics, I decided to probe ChatGPT’s understanding of bathtub dynamics. First I repeated our earlier question about climate:

This is close, but note that it’s suggesting that a decrease in emissions corresponds with a decrease in concentration. This is not necessarily true in general, because the concentration falls only if emissions drop below removals. ChatGPT seems to recognize the issue, but fails to account for it completely in its answer. My parameter choice turned out to be a little unfortunate, because a 50% reduction in CO2 emissions is fairly close to the boundary between rising and falling CO2 concentrations in the future.

I asked again with a smaller reduction in emissions. This should have an unambiguous effect: emissions would remain above removals, so the CO2 concentration would continue to rise, but at a slower rate.

This time the answer is a little better, but it’s not clear whether “lead to a reduction in the concentration of CO2 in the atmosphere” means a reduction relative to what would have happened otherwise, or relative to today’s concentration. Interestingly, ChatGPT does get that the emissions reduction doesn’t reduce temperature directly; it just slows the rate of increase.
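
The behavior I was looking for can be sketched with a one-stock Python model. The numbers are round, illustrative assumptions (a stock of 850, emissions of 10 per year, removals of 5 per year), and removals are held constant for simplicity, which a real carbon cycle would not do.

```python
def simulate_stock(initial, emissions, removals, years):
    """Euler integration of d(stock)/dt = emissions - removals with a 1-year step."""
    stock = initial
    path = [stock]
    for _ in range(years):
        stock += emissions - removals
        path.append(stock)
    return path


# Illustrative values only; removals fixed at 5 per year (a simplifying assumption).
baseline = simulate_stock(850.0, emissions=10.0, removals=5.0, years=10)
cut_20 = simulate_stock(850.0, emissions=8.0, removals=5.0, years=10)  # 20% cut
cut_50 = simulate_stock(850.0, emissions=5.0, removals=5.0, years=10)  # 50% cut

print(baseline[-1], cut_20[-1], cut_50[-1])
```

Under these assumptions, the 20% cut leaves emissions above removals, so the stock keeps rising, just more slowly than the baseline. The 50% cut sits right at the boundary, where the stock stops rising but does not fall.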