What is accumulation?

The SD Society posted a definition of accumulation on Facebook, and it caught my eye.

This is from the SD Glossary, by David Ford.

accumulation (integration) : a gradual, non-instantaneous increase or decrease of a quantity over time. An accumulator is also referred to as a stock or level and represents the state of a system. To accumulate is the act of increasing and decreasing the size of a state variable (a stock) over time.

I wrote,

I’m not a fan of this definition. Accumulation is not necessarily gradual or non-instantaneous. In fact, it’s quite common to accumulate a flow pulse to produce an abrupt step in a stock. The key feature of accumulation is that it’s, well, cumulative. I’m at a loss for a way to express that without mentioning integration, which won’t help most people. Maybe someone can do better?

I think it’s telling that we don’t have ready words to describe accumulation. That might be a symptom, or a cause, of our problematic mental models about bathtub dynamics and bathtub statistics.

Resorting to “integration” isn’t really helpful, except to the mathematically inclined, who are not, I think, the audience for this kind of description.

The dictionary definition of “cumulative” turns out to be helpful:

increasing by successive additions

With that in mind, I’d propose something like:

  • accumulation : increasing by successive additions, or decreasing by successive subtractions.
  • stock (level) : A variable representing a persistent state in a system, which can be considered the memory of the system. Stocks change by accumulation of flows.
  • flow (rate): A variable that contributes to cumulative change in a stock over time. Flows represent activity or change in a system. A flow may represent the movement of physical quantities between stocks within a system boundary or across the model boundary and thereby into or out of the system (sinks and sources), or the rate of change of a nonphysical or intangible state.

Note that it’s hard to discuss accumulation without also discussing stocks and flows, so I’ve modified all three glossary entries.
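
To illustrate the pulse point above: integrating a brief flow pulse produces an abrupt step in the stock, not a gradual change. Here’s a minimal sketch in Python (not Vensim syntax), with made-up parameter values and simple Euler integration:

```python
# Minimal sketch: Euler integration of a flow into a stock.
# A brief pulse in the flow produces an abrupt step in the stock,
# showing that accumulation is cumulative, not necessarily gradual.
# All parameter values are illustrative, not from any particular model.

dt = 0.125          # time step
t_final = 10.0
pulse_start, pulse_width, pulse_height = 3.0, dt, 1.0 / dt  # unit-area pulse

stock = 0.0
t = 0.0
while t < t_final:
    flow = pulse_height if pulse_start <= t < pulse_start + pulse_width else 0.0
    stock += flow * dt   # accumulation: successive additions of flow * dt
    t += dt
    # the stock jumps from 0 to 1 at t = pulse_start and stays there

print(f"stock at t={t:.2f}: {stock:.3f}")  # ~1.0 after the pulse
```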

What is SD?

Asmeret Naugle, Saeed Langarudi, and Timothy Clancy propose to define System Dynamics in a new paper.

The defining characteristics are: (1) models are based on causal feedback structure, (2) accumulations and delays are foundational, (3) models are equation-based, (4) concept of time is continuous, and (5) analysis focuses on feedback dynamics.

I like the paper, but … not so fast. I think more, and more flexible, criteria are needed. I would use the term “characterize” rather than “define.” The purpose should be to aid recognition of SD, and hopefully good SD, without drawing too tight a box around the field.

I particularly disagree with the inclusion of continuous time. Even though discrete time stinks, I think continuous time is a common but inessential feature, like continuous flows. Many models include occasional discrete events, and sometimes they’re important. Ventity’s actions are explicit discrete events between time steps, and they may modify model structure in ways that are key to an operational representation of reality.
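
To make the discrete-event point concrete, here’s a rough sketch in Python (not Ventity code) of a continuous stock-and-flow model with an occasional discrete event applied between time steps; the logistic structure, the harvest rule, and the parameters are all invented for illustration:

```python
# Sketch: a continuous stock-and-flow model with an occasional discrete event
# applied between time steps (loosely analogous to Ventity actions).
# Structure and parameters are hypothetical.

dt = 0.25
capacity = 100.0
growth_rate = 0.3
stock = 10.0

for step in range(200):
    # continuous part: logistic growth, Euler-integrated
    flow = growth_rate * stock * (1 - stock / capacity)
    stock += flow * dt

    # discrete event between time steps: a "harvest" that removes half the
    # stock whenever it exceeds a threshold
    if stock > 80.0:
        stock *= 0.5

print(f"final stock: {stock:.1f}")
```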

My top-of-mind alternative framework looks like:

I think it’s also helpful to describe things that are not SD:

  • Intertemporal optimization or rational expectations representing behavior
  • Computable general equilibrium
  • Linear regression
  • Linear programming
  • Mixed integer programming
  • Social Network Analysis (static)
  • Discrete ABM
  • Discrete event simulation
  • Equilibrium
  • Simultaneity

Sometimes it’s easier to see the negative space, but there are exceptions to these rules.

I think it’s notable that both frameworks exclude a variety of qualitative systems thinking approaches, like group model building or elicitation methods that create CLDs rather than simulatable models. I’m a big tent fan, and certainly some of the exceptions are common at the SD conference, but does that make them SD?

I think behavior is another challenging feature to describe. In my mind, System Dynamics is almost synonymous with behavioral dynamics. If you’re building an economic model in which agents explicitly know the future (e.g., via intertemporal optimization), it’s not an SD model (though you might be using it as a comparison case for some SD purpose). Yet there’s a strong tradition of prize-winning biomedical models that lack behavior because they lack human agency. These are not easily distinguishable from what other fields might call ODEs or nonlinear dynamics. I would not want to eject those from the field, but neither would I want this to become our focus.

I’ll be interested to see how the conversation evolves on this.

Mental Models vs. Models in the Loop

Timothy Clancy, Saeed P. Langarudi and Raafat Zaini have an interesting new commentary in the SDR.

Never the strongest: reconciling the four schools of thought in system dynamics in the debate on quality

With the passing of Jay Forrester, the field of system dynamics exists at a similar crossroads. Debates of implicit, if not explicit, inheritance and future direction are already breaking out among competing generals. Who owns Forrester’s legacy? Will we proceed down the reference mode of the Macedonian and Mughal Empires—or will we instead seek an alternative reference mode of Alexandria: integration, reconciliation, and mutually recognized coexistence of different schools within the broader field of system dynamics?

We suggest the latter path—and that begins by recognizing at least four, if not more, distinct schools of thought on how to approach system dynamics and the study of complex systems. We believe these schools arise from differing mental models in the field and the consequences that arise in practice from these differences.

I haven’t really absorbed it yet, so I’ll refrain from direct comment, but it did spur me to finish off a draft of some similar thoughts on these questions.

I personally lean very much toward the hard science, data-driven side of the field: what the authors call the Empirical school of thought. But as a policy, I lean toward a big tent view of the field that includes work with low model content (which I don’t equate with low quality).

I think the central tension in the debate has already been posed by JWF and others long ago – all the way back to Industrial Dynamics really. In Some Basic Concepts in System Dynamics (2009), Forrester summarized,

The basic feedback loop in Figure 4 is too simple to represent real-world situations. But simple loops have more serious shortcomings—they are misleading and teach the wrong lessons. Most of our intuitive learning comes from very simple systems. The truths learned from simple systems are often completely opposite from the behavior of more complex systems. A person understands filling a water glass, as in Figure 3. But, if we go to a system that is only five times as complicated, as in Figure 5, intuition fails. A person cannot look at Figure 5 and anticipate the behavior of the pictured system.

Figure 5 from World Dynamics is five times more complicated than Figure 4 in the sense that it has five stocks—the rectangles in the figure. The figure shows how rapidly apparent complexity increases as more system stocks are added.

Mathematicians would describe Figure 5 as a fifth-order, nonlinear, dynamic system. No one can predict the behavior by studying the diagram or its underlying equations. Only by using computer simulation can the implied behavior be revealed.

I think the message is pretty clear here. To solve complex problems, you must formally simulate the system because mental simulations are treacherous. I’d go even one step further, and argue that it’s not sufficient to simulate the system once, figure out where the leverage point is, implement the solution, and toss out the model when you’re done. The simulation needs to become an ongoing part of the loop for model predictive control.
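
Here’s a rough sketch of what “model in the loop” means, in Python rather than any particular SD tool: at each decision point, project the model forward, pick the policy the model says will work best, apply it to the real system, observe the result, and repeat. The plant, the (deliberately imperfect) model, and the objective are all toy stand-ins:

```python
# Sketch of model-predictive control: re-simulate at every decision point
# rather than solving the model once and discarding it.
# The "plant" (real system), the model, and the objective are toy stand-ins.

import random

def plant_step(state, action):
    """The real system: a first-order stock with noise (hypothetical)."""
    return state + (action - 0.2 * state) + random.gauss(0, 0.05)

def model_step(state, action):
    """Our imperfect model of the plant (no noise, slightly wrong drain)."""
    return state + (action - 0.25 * state)

def simulate_cost(state, action, target, horizon=10):
    """Project the model forward under a constant action and score it."""
    cost = 0.0
    for _ in range(horizon):
        state = model_step(state, action)
        cost += (state - target) ** 2
    return cost

target, state = 5.0, 0.0
candidate_actions = [0.0, 0.5, 1.0, 1.5, 2.0]

for t in range(30):
    # model in the loop: choose the action the model says will work best...
    action = min(candidate_actions, key=lambda a: simulate_cost(state, a, target))
    # ...apply it to the real system, observe the new state, and repeat
    state = plant_step(state, action)

print(f"state after 30 periods: {state:.2f} (target {target})")
```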

If that’s the ideal, why settle for anything less? I think there are a number of possible answers.

  1. Even in a perfect world where it’s easy to construct the model-in-the-loop, you need buy-in from the participants in the system to implement the model, and that requires a skillset that’s quite distinct.
  2. While it’s true that no one can intuit the behavior of a 10th-order system, there might be a lot of value in managing low-order components of the system that are amenable to mental simulation or simple decision rules. This might work for two reasons:
    • The complex system is dominated by a few key parameters (as in sloppy systems).
    • Risking global suboptimization by improving locally is better than optimizing nothing (though this might be a matter of luck).
  3. Often no single stakeholder in the system has the resources or authority to implement needed changes. But exposing the connectivity of the system, even if you can’t predict exactly how it works, is sometimes enough to catalyze creation of higher-level structures that enable change in the future.
  4. Not everyone is, or wants to be, a modeler. Moreover some participants in the system may reject models, data, and pretty much everything else since the Enlightenment, but you still have to include them.
  5. The non-modelers, as participants in the system, hold key knowledge that the modelers need.
  6. A qualitative map of a system is a good start towards an eventual quantitative model.
  7. Not every problem is big enough to model.

I’m sure you can think of more. I think these are good reasons to embrace non-model-based work on systems, as long as one refrains from making strong predictions about behavior from incomplete descriptions of behavior. Fortunately that leaves a lot of interesting things to think about.

I think the opposite perspective, that nothing is worth doing without a model and data, requires some counterexamples. Are there instances in which a group mapping exercise, playing a dynamic game, or engaging in cross-functional dialog led to reduced performance? I’m not aware of good examples of this, and certainly not of good diagnoses of the outcome. Attribution in complex systems is notoriously difficult. I think what this suggests is that we need stronger links to the evaluation research community, because we don’t really know what works and what doesn’t. We already have some strength in this area from the dynamic decision-making experiment thread of SD, but … physician, heal thyself.

There is one thing that troubles me though, just beyond the boundaries of our field. It’s climate policy (and related global issues). Most climate policy advocates are in some sense systems thinkers. Many build nice diagrams or use other systemic tools. If you don’t care about systems, it’s hard to see why you’d care about climate to begin with.

Yet … it seems that a substantial fraction of people who are pro-climate policy favor policies that are counterproductive or insufficient. They like low-carbon fuel standards that are unstable, inefficient, and can even increase emissions. They like standards that allocate more property rights to bigger polluters, or simply make it harder to change. They like to impose constraints on new fossil supply that work exactly like OPEC to increase prices and profits for incumbent producers. They subsidize EVs and solar, increasing the incentive to consume energy and congest roads, with benefits accruing to the rich who can afford the capital outlay.

What this means is that my reason #2, “Risking global suboptimization by improving locally is better than optimizing nothing,” isn’t working out too well. I think this is exactly the kind of counterintuitive behavior of social systems that JWF was referring to. I don’t believe you can sort these things out with CLDs or other qualitative methods, except perhaps when they are used as explanatory tools for underlying formal models.

I think the bottom line is that, inside the big tent, the tall pole must remain construction and validation of robust behavioral dynamic models.

Feedback is Interdisciplinary

Quite a while ago, I wrote about modeling the STEM workforce:

An integrated model needs three things: what, how, and why. The “what” is the state of the system – stocks of students, workers, teachers, etc. in each part of the system. Typically this is readily available – Census, NSF and AAAS do a good job of curating such data. The “how” is the flows that change the state. There’s not as much data on this, but at least there’s good tracking of graduation rates in various fields, and the flows actually integrate to the stocks. Outside the educational system, it’s tough to understand the matrix of flows among fields and economic sectors, and surprisingly difficult even to get decent measurements of attrition from a single organization’s personnel records. The glaring omission is the “why” – the decision points that govern the aggregate flows. Why do kids drop out of science? What attracts engineers to government service, or the finance sector, or leads them to retire at a given age? I’m sure there are lots of researchers who know a lot about these questions in small spheres, but there’s almost nothing about the “why” questions that’s usable in an integrated model.

I think the current situation is a result of practicality rather than a fundamental philosophical preference for analysis over synthesis. It’s just easier to create, fund and execute standalone micro research than it is to build integrated models.

According to Jay Forrester, Gordon Brown said it much more succinctly:

The message is in the feedback, and the feedback is inherently interdisciplinary.

AI doesn’t help modelers

Large language model AI doesn’t help with modeling. At least, that’s my experience so far.


DALL-E images from Bing image creator.

On the ACM blog, Bertrand Meyer argues that AI doesn’t help programmers either. I think his reasons are very much compatible with what I found attempting to get ChatGPT to discuss dynamics:

Here is my experience so far. As a programmer, I know where to go to solve a problem. But I am fallible; I would love to have an assistant who keeps me in check, alerting me to pitfalls and correcting me when I err. An effective pair-programmer. But that is not what I get. Instead, I have the equivalent of a cocky graduate student, smart and widely read, also polite and quick to apologize, but thoroughly, invariably, sloppy and unreliable. I have little use for such supposed help.

He goes on to illustrate by coding a binary search. The conversation is strongly reminiscent of our attempt to get ChatGPT to model jumping through the moon.

And then I stopped.

Not that I had succumbed to the flattery. In fact, I would have no idea where to go next. What use do I have for a sloppy assistant? I can be sloppy just by myself, thanks, and an assistant who is even more sloppy than I is not welcome. The basic quality that I would expect from a supposedly intelligent assistant—any other is insignificant in comparison—is to be right.

It is also the only quality that the ChatGPT class of automated assistants cannot promise.

I think the fundamental problem is that LLMs aren’t “reasoning” about dynamics per se (though I used the word in my previous posts). What they know is derived from the training corpus, and there’s no reason to think that it reflects a solid understanding of dynamic systems. In fact there are presumably lots of examples in the corpus of failures to reason correctly about dynamic causality, even in the scientific literature.

This is similar to the reason AI image creators hallucinate legs and fingers: they know what the parts look like, but they don’t know how the parts work together to make the whole.

To paraphrase Meyer, LLM AI is the equivalent of a polite, well-read assistant who lacks an appreciation for complex systems, and aggressively indulges in laundry-list, dead-buffalo thinking about all but the simplest problems. I have no use for that until the situation improves (and there’s certainly hope for that). Worse, the tools are very articulate and confident in their clueless pronouncements, which is a deadly mix of attributes.

Related: On scientific understanding with artificial intelligence | Nature Reviews Physics

Sources of Information for Modeling

The traditional picture of information sources for modeling is a funnel. For example, in Some Basic Concepts in System Dynamics (2009), Forrester showed:

I think the diagram, or at least the concept, is much older than that.

However, I think the landscape has changed a lot, with more to come. Generally, the mental database hasn’t changed too much, but the numerical database has grown a lot. The funnel isn’t 1-dimensional, so the relationships have changed on some axes, but not so much on others.

Notionally, I’d propose that the situation is something like this:

The mental database is still king for variety of concepts and immediacy or salience of information (especially to the owner of the brain involved). And, it still has some weaknesses, like the inability to easily observe, agree on and quantify the constructs included in it. In the last few decades, the numerical database has extended its reach tremendously.

The proper shape of the plot is probably very domain specific. When I drew this, I had in mind the typical corporate or policy setting, where information systems contain only a fraction of the information necessary to understand the organizations involved. But in some areas, the reverse may be true. For example, in earth systems, datasets are vast and include measurements that human senses can’t even make, whereas personal experience – and therefore mental models – is limited and treacherous.

I think I’ve understated the importance of the written database in the diagram above – perhaps I’m missing a dimension characterizing its cumulative nature (compared to the transience of mental databases). There’s also an interesting evolution underway, as tools for text analysis and large language models (ChatGPT) are making the written database more numerical in nature.
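
A trivial sketch of what I mean by the written database becoming more numerical: even simple text analysis reduces a document to numbers that can feed quantitative work. Real pipelines would use embeddings or an LLM, but the idea is the same; the example document and counts below are illustrative only.

```python
# Tiny illustration of "the written database becoming numerical":
# a document reduced to a term-frequency vector for quantitative analysis.

from collections import Counter

doc = "stocks accumulate flows; flows change stocks over time"
tokens = [w.strip(";.,").lower() for w in doc.split()]
term_frequencies = Counter(tokens)

print(term_frequencies)   # Counter({'stocks': 2, 'flows': 2, ...})
```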

Finally, I think there’s a missing database in the traditional framework, which has growing importance. That’s the database of models themselves. They’ve been around for a long time – especially in physical sciences, but also corporate spreadsheets and the like. But increasingly, reasonably sophisticated models of organizational components are available as inputs to higher-level modeling efforts for strategic problem solving.

Scientific Revolutions in Ventity

I’ve long wanted to translate the Sterman-Wittenberg model of Kuhnian paradigm revolutions to Ventity. The original was in Dynamo, and I translated that to Vensim, but neither is really satisfactory, because both require provisioning array space for new paradigms statically, before it’s needed. This means simulating lots of useless 0s, and even worse, looking at them in the output.

The model is about the lifecycle of scientific paradigms, so a central feature is the occasional introduction and evolution of new paradigms, which eventually accumulate enough anomalies to erode confidence, making them vulnerable to the next great idea. So ideally, you’d like to introduce new paradigms dynamically and delete them when they no longer have many adherents. Dynamic creation and deletion of entities is of course a core feature of Ventity – it’s the tool this model has been waiting for all those years.
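
For readers who haven’t used Ventity, the difference is roughly that between pre-allocating a fixed array of paradigm slots (mostly zeros) and maintaining a collection that grows and shrinks at runtime. A loose Python analogy (not Ventity code, and not the actual Sterman-Wittenberg equations; the dynamics here are toy placeholders):

```python
# Loose analogy for dynamic entity creation/deletion (not Ventity code,
# and not the Sterman-Wittenberg equations). Paradigms are created when they
# appear and deleted when adherents run out, so we never simulate or store
# arrays of useless zeros.

import random

class Paradigm:
    def __init__(self, name):
        self.name = name
        self.adherents = 10.0        # illustrative initial value

    def step(self, dt):
        # toy dynamics: random growth or decline in adherents
        self.adherents += self.adherents * random.uniform(-0.3, 0.2) * dt

paradigms = []
for t in range(100):
    # occasionally launch a new paradigm (dynamic creation)
    if random.random() < 0.1:
        paradigms.append(Paradigm(f"paradigm_{t}"))
    for p in paradigms:
        p.step(dt=1.0)
    # drop paradigms that have run out of adherents (dynamic deletion)
    paradigms = [p for p in paradigms if p.adherents > 1.0]

print(f"{len(paradigms)} paradigms survive at the end of the run")
```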

I finally got around to translating my Vensim version to Ventity recently. It works beautifully:

Above, paradigm confidence, showing eight dominant paradigms as well as many smaller paradigms that never rise to dominance. They disappear when they run out of adherents. Below, puzzles under attack for the same paradigms.

Links to the source papers and more notes on the model are in the Vensim library entry. I think the dynamics are generalizable to other aspects of thinking in paradigms, like filter bubbles. The model is also a bit ‘meta’: Ventity, as a distinct modeling paradigm that’s neither in the classical array-based world nor the code-based discrete-agent world, has struggled to win mindshare.

A minor note on use: the Run Config includes two setups: “replicate” and “random”. The “replicate” setup, which is inactive by default, launches paradigms at fixed times given by initialization data from a run of the Vensim version. This makes it possible to compare the simulations without divergence from randomness. However, the randomized run will normally be the more interesting way to work with this model.

The model (requires Ventity, which has a free trial license):

SciRev 15.zip

Computer Collates Climate Contrarian Claims

Coan et al. in Nature have an interesting text analysis of climate skeptics’ claims.

I’ve been at this long enough to notice that a few perennial favorites are missing, perhaps because they date from the 90s, prior to the dataset.

The big one is “temperature isn’t rising” or “the temperature record is wrong.” This has lots of moving parts. Back in the 90s, a key idea was that satellite MSU records showed falling temperatures, implying that the surface station record was contaminated by Urban Heat Island (UHI) effects. That didn’t end well, when it turned out that the UAH code had errors and the trend reversed when they were fixed.

Later UHI made a comeback when the SurfaceStations project crowdsourced an assessment of temperature station quality. Some turned out to be pretty bad. But again, when the dust settled, it turned out that the temperature trend was bigger, not smaller, when poor sites were excluded and TOD was corrected. This shouldn’t have been a surprise, because windy-day analyses and a dozen other things already ruled out UHI, but …

I consider this a reminder of the fact that part of the credibility of mainstream climate science arises not from the fact that models are so good, but because so many alternatives have been tried, and proved so bad, only to rise again and again.