Talking with COVID conspiracy theorists

Technology Review has a nice article on how to talk to conspiracy theorists.

What are they hiding in the woods?

I think some of the insights here are also applicable to talking about models, which is turning out to be a real challenge in the COVID era, with high rates of belief in conspiracies. My experience in social media settings is very negative. If I mention anything indicating that I might actually know something about the problem, that triggers immediate suspicion – oh, so you work for the government, eh? Somehow non sequiturs and hearsay beat models every time.

h/t Chris Soderquist for an interesting resource:

“Any assertion of expertise from an actual expert, meanwhile, produces an explosion of anger from certain quarters of the American public, who immediately complain that such claims are nothing more than fallacious “appeals to authority,” sure signs of dreadful “elitism,” and an obvious effort to use credentials to stifle the dialogue required by a “real” democracy. Americans now believe that having equal rights in a political system also means that each person’s opinion about anything must be accepted as equal to anyone else’s. This is the credo of a fair number of people despite being obvious nonsense. It is a flat assertion of actual equality that is always illogical, sometimes funny, and often dangerous.”

Tom Nichols, The Death of Expertise.

I’m finding the Tech Review article’s points 3 & 6 to be most productive: test the waters first, and use the Socratic method (careful questioning to reveal gaps in thinking). But the best advice is proving to be: don’t look, and don’t take on the trolls directly. It’s more productive to help people who are curious and receptive to modeling than to battle people who basically resist everything since the Enlightenment.


Dynamics of Hoarding

“I’m not hoarding, I’m just stocking up before the hoarders get here.”
Behavioral causes of phantom ordering in supply chains
John D. Sterman
Gokhan Dogan

When suppliers are unable to fill orders, delivery delays increase and customers receive less than they desire. Customers often respond by seeking larger safety stocks (hoarding) and by ordering more than they need to meet demand (phantom ordering). Such actions cause still longer delivery times, creating positive feedbacks that intensify scarcity and destabilize supply chains. Hoarding and phantom ordering can be rational when customers compete for limited supply in the presence of uncertainty or capacity constraints. But they may also be behavioral and emotional responses to scarcity. To address this question we extend Croson et al.’s (2014) experimental study with the Beer Distribution Game. Hoarding and phantom ordering are never rational in the experiment because there is no horizontal competition, randomness, or capacity constraint; further, customer demand is constant and participants have common knowledge of that fact. Nevertheless 22% of participants place orders more than 25 times greater than the known, constant demand. We generalize the ordering heuristic used in prior research to include the possibility of endogenous hoarding and phantom ordering. Estimation results strongly support the hypothesis, with hoarding and phantom ordering particularly strong for the outliers who placed extremely large orders. We discuss psychiatric and neuroanatomical evidence showing that environmental stressors can trigger the impulse to hoard, overwhelming rational decision‐making. We speculate that stressors such as large orders, backlogs or late deliveries trigger hoarding and phantom ordering for some participants even though these behaviors are irrational. We discuss implications for supply chain design and behavioral operations research.
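To make the mechanism concrete, here’s a minimal sketch of the kind of anchoring-and-adjustment ordering rule the paper generalizes. The hoarding term, parameter values, and names are my own illustration, not the authors’ estimated model:

```python
def order_decision(expected_demand, inventory, backlog, supply_line,
                   desired_stock, alpha_s=0.3, beta=0.5, hoard=0.0):
    """Anchoring-and-adjustment ordering rule (after Sterman 1989), with a
    crude hoarding term: scarcity, proxied by the backlog, inflates the
    stock the customer wants to hold. All parameters are illustrative."""
    desired = desired_stock * (1.0 + hoard * backlog)
    net_inventory = inventory - backlog
    # anchor on expected demand; adjust toward the desired stock, partially
    # crediting orders already in the supply line
    adjustment = alpha_s * (desired - net_inventory - beta * supply_line)
    return max(0.0, expected_demand + adjustment)

# hoard=0 is the classic heuristic; hoard>0 closes the positive loop in which
# a backlog inflates the desired stock, so the same shortage nearly doubles
# the order placed
print(order_decision(4, inventory=0, backlog=6, supply_line=8,
                     desired_stock=12, hoard=0.0))  # ~8 units
print(order_decision(4, inventory=0, backlog=6, supply_line=8,
                     desired_stock=12, hoard=0.3))  # ~15 units
```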

Eroding Environmental Goals

In System Dynamics we typically refer to this as the eroding goals archetype, or the boiled frog syndrome:

Shifting baseline syndrome: causes, consequences, and implications

With ongoing environmental degradation at local, regional, and global scales, people’s accepted thresholds for environmental conditions are continually being lowered. In the absence of past information or experience with historical conditions, members of each new generation accept the situation in which they were raised as being normal. This psychological and sociological phenomenon is termed shifting baseline syndrome (SBS), which is increasingly recognized as one of the fundamental obstacles to addressing a wide range of today’s global environmental issues. Yet our understanding of this phenomenon remains incomplete. We provide an overview of the nature and extent of SBS and propose a conceptual framework for understanding its causes, consequences, and implications. We suggest that there are several self‐reinforcing feedback loops that allow the consequences of SBS to further accelerate SBS through progressive environmental degradation. Such negative implications highlight the urgent need to dedicate considerable effort to preventing and ultimately reversing SBS.
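The underlying archetype takes only a few lines to demonstrate. In this sketch (all parameters invented for illustration), corrective action closes part of the gap between goal and condition, but the goal itself drifts toward the degraded condition, so both erode steadily while the gap never grows large enough to signal a crisis:

```python
# Eroding goals / shifting baseline: the standard drifts toward the condition.
# Euler integration, dt = 1 year; parameters illustrative, not from the paper.
condition, goal = 100.0, 100.0
goal_adjust_time, response_time, degradation = 20.0, 5.0, 1.0

for year in range(101):
    gap = goal - condition
    if year % 25 == 0:
        print(f"year {year:3d}: condition={condition:5.1f}  goal={goal:5.1f}  gap={gap:4.1f}")
    condition += gap / response_time - degradation  # corrective action vs. decay
    goal += (condition - goal) / goal_adjust_time   # the baseline shifts down
```

The gap settles at a small constant, so each generation sees only a minor problem, while a century of cumulative drift goes unnoticed.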

Eugenics rebooted – what could go wrong?

Does DNA IQ testing create a meritocracy, or merely reinforce existing biases?

Technology Review covers new efforts to use associations between DNA and IQ.

… Intelligence is highly heritable and predicts important educational, occupational and health outcomes better than any other trait. Recent genome-wide association studies have successfully identified inherited genome sequence differences that account for 20% of the 50% heritability of intelligence. These findings open new avenues for research into the causes and consequences of intelligence using genome-wide polygenic scores that aggregate the effects of thousands of genetic variants.

The new genetics of intelligence

Robert Plomin and Sophie von Stumm

I have no doubt that there’s much to be learned here. However, research is not all they’re proposing:

IQ GPSs will be used to predict individuals’ genetic propensity to learn, reason and solve problems, not only in research but also in society, as direct-to-consumer genomic services provide GPS information that goes beyond single-gene and ancestry information. We predict that IQ GPSs will become routinely available from direct-to-consumer companies along with hundreds of other medical and psychological GPSs that can be extracted from genome-wide genotyping on SNP chips. The use of GPSs to predict individuals’ genetic propensities requires clear warnings about the probabilistic nature of these predictions and the limitations of their effect sizes (BOX 7).

Although simple curiosity will drive consumers’ interests, GPSs for intelligence are more than idle fortune telling. Because intelligence is one of the best predictors of educational and occupational outcomes, IQ GPSs will be used for prediction from early in life before intelligence or educational achievement can be assessed. In the school years, IQ GPSs could be used to assess discrepancies between GPSs and educational achievement (that is, GPS-based overachievement and underachievement). The reliability, stability and lack of bias of GPSs make them ideal for prediction, which is essential for the prevention of problems before they occur. A ‘precision education’ based on GPSs could be used to customize education, analogous to ‘precision medicine’.

There are two ways “precision education” might be implemented. An egalitarian model would use information from DNA IQ measurements to customize resource allocations, so that all students could perform up to some common standard.

An efficiency model, by contrast, would use IQ measurements to set achievement expectations for each student, and customize resources to ensure that students who are underperforming relative to their DNA get a boost.

This latter approach is essentially a form of tracking, in which DNA is used to get an early read on who’s destined to flip bonds, and who’s destined to flip burgers.

One problem with this scheme is noise (as the authors note, seemingly contradicting their own abstract’s claim of reliability and stability). Consider the effect of a student receiving a spuriously low DNA IQ score. Under the egalitarian scheme, they receive more educational resources (enabling them to overperform), while under the efficiency scheme, resources would be lowered, leading to self-fulfillment of the predicted low performance. The authors seem to regard this as benign and self-correcting:

By contrast, GPSs are ‘less dangerous’ because they are intrinsically probabilistic, not hardwired and deterministic like single-gene disorders. It is important to recall here that although all complex traits are heritable, none is 100% heritable. A similar logic can be applied to IQ scores: although they have great predictive validity for key life outcomes, IQ is not deterministic but probabilistic. In short, an individual is always more than the sum of their genes or their IQ scores.

I think this might be true when you consider the local effects on the negative loops governing resource allocation. But I don’t think that remains true when you put it in context. Education is a nest of positive feedbacks. This creates path dependence that amplifies errors in resource allocation, whether they come from subjective teacher impressions or DNA measurements.

In a perfect world, DNA-IQ provides an independent measurement that’s free of those positive feedbacks. In that sense, it’s perfectly meritocratic.

But how do you decide what to measure? Are the measurements good, or just another way to institutionalize bias? This is hotly contested. Let’s suppose that problems of gender and race/ethnicity bias have been, or can be, solved. There are still questions about what measurements correlate with better individual or societal outcomes. At some point, implicit or explicit choices have to be made, and these are not value-free. They create reinforcing feedbacks.

I think it’s inevitable that, like any other instrument, DNA IQ scores are going to reflect the interests of dominant groups in society. (At a minimum, I’d be willing to bet that IQ tests don’t measure things that would result in low scores for IQ test designers.) If that means more Einsteins, Bachs and Gandhis, maybe it’s OK. But I don’t think that’s guaranteed to lead to a good outcome. First, there’s no guarantee that a society composed of apparently high-performing individuals is in itself high-performing. Second, the dominant group may be dominant, not by virtue of faster CPUs in their heads, but something less appetizing.

I think there’s no guarantee that DNA IQ won’t reflect attributes that are dysfunctional for society. We would hate to produce more Stalins and Mengeles by virtue of inadvertent correlations with high achievement of less virtuous origin. And certainly, like any instrument used for high-stakes decisions, the pressure to distort and manipulate results will increase with use.

Note that if education is really egalitarian, the link between Measured IQ and Educational Resources Allocated reverses polarity, becoming negative. Then the positive loops become negative loops, and a lot of these problems go away. But that’s not often a choice societies make, presumably because egalitarian education is in itself contrary to the interests of dominant groups.
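A toy model makes the polarity argument concrete. This is my sketch of the loop structure described above, not anything from the paper; flipping `allocation_sign` turns the same measurement error from self-amplifying to self-correcting:

```python
def simulate(allocation_sign, years=12):
    """Two students of equal true ability; B draws one spuriously low score.
    allocation_sign=+1 is the 'efficiency' model (resources track the score);
    allocation_sign=-1 is the 'egalitarian' model (resources offset it).
    Functional forms and parameters are illustrative only."""
    score = {"A": 1.0, "B": 0.7}       # B's initial measurement is pure noise
    achievement = {"A": 1.0, "B": 1.0}
    for _ in range(years):
        mean = sum(score.values()) / 2
        for s in score:
            resources = 1.0 + allocation_sign * 2.0 * (score[s] - mean)
            achievement[s] *= 1.0 + 0.1 * resources           # resources compound
            score[s] = 0.9 * score[s] + 0.1 * achievement[s]  # later scores are accurate
    return achievement

for sign, label in [(+1, "efficiency"), (-1, "egalitarian")]:
    a = simulate(sign)
    print(f"{label:11s}: A={a['A']:.2f}  B={a['B']:.2f}")
```

Under the efficiency rule the noise-induced gap becomes real and keeps widening; under the egalitarian rule the loop is negative and the gap closes (and can even overshoot).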

I understand researchers’ optimism for this technology in the long run. But for now, I remain wary, due to the decided lack of systems thinking about the possible side effects. In similar circumstances, society has made poor choices about teacher value-added modeling, easily negating any benefits it might have had. I’m expecting a similar outcome here.

Vi Hart on positive feedback driving polarization

Vi Hart’s interesting comments on the dynamics of political polarization, following the release of an innocuous video:

I wonder what made those commenters think we have opposite views; surely it couldn’t just be that I suggest people consider the consequences of their words and actions. My working theory is that other markers have placed me on the opposite side of a cultural divide that they feel exists, and they are in the habit of demonizing the people they’ve put on this side of their imaginary divide with whatever moral outrage sounds irreproachable to them. It’s a rather common tool in the rhetorical toolset, because it’s easy to make the perceived good outweigh the perceived harm if you add fear to the equation.

Many groups have grown their numbers through this feedback loop: have a charismatic leader convince people there’s a big risk that group x will do y, therefore it seems worth the cost of being divisive with those who think that risk is not worth acting on, and that divisiveness cuts out those who think that risk is lower, which then increases the perceived risk, which lowers the cost of being increasingly divisive, and so on.

The above feedback loop works great when the divide cuts off a trust of the institutions of science, or glorifies a distrust of data. It breaks the feedback loop if you act on science’s best knowledge of the risk, which trends towards staying constant, rather than perceived risk, which can easily grow exponentially, especially when someone is stoking your fear and distrust.

If a group believes that there’s too much risk in trusting outsiders about where the real risk and harm are, then, well, of course I’ll get distrustful people afraid that my mathematical views on risk/benefit are in danger of creating a fascist state. The risk/benefit calculation demands it be so.
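Hart’s loop is easy to caricature in simulation terms. In this sketch (my construction with invented parameters, not Hart’s model), perceived risk drives divisiveness, divisiveness cuts off moderating evidence, and perceived risk eventually takes off while actual risk stays flat:

```python
# Reinforcing loop: perceived risk -> divisiveness -> insularity -> more fear.
# Evidence-based risk stays constant; all coefficients are invented.
actual_risk, perceived_risk = 0.1, 0.1
for step in range(17):
    divisiveness = min(1.0, perceived_risk)      # fear lowers the cost of division
    insularity = divisiveness                    # division cuts out dissenting views
    stoking = insularity * perceived_risk        # the echo chamber amplifies fear
    evidence = 0.2 * (actual_risk - perceived_risk) * (1 - insularity)
    if step % 4 == 0:
        print(f"step {step:2d}: perceived risk {perceived_risk:5.2f} (actual {actual_risk})")
    perceived_risk += stoking + evidence         # anchoring fades as insularity rises
```

Acting on the evidence-based estimate, as Hart suggests, amounts to keeping the `evidence` term dominant, which breaks the loop.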

AI is killing us now

I’ve been watching the debate over AI with some amusement, as if it were some other planet at risk. The Musk-Zuckerberg kerfuffle is the latest installment. Ars Technica thinks they’re both wrong:

At this point, these debates are largely semantic.

I don’t see how anyone could live through the last few years and fail to notice that networking and automation have enabled an explosion of fake news, filter bubbles and other information pathologies. These are absolutely policy relevant, and smarter AI is poised to deliver more of what we need least. The problem is here now, not from some impending future singularity.

Ars gets one point sort of right:

Plus, computer scientists have demonstrated repeatedly that AI is no better than its datasets, and the datasets that humans produce are full of errors and biases. Whatever AI we produce will be as flawed and confused as humans are.

I don’t think the data is really the problem; it’s the assumptions with which the data is treated, and the context in which that occurs, that are really problematic. In any case, automating flawed aspects of ourselves is not benign!

Here’s what I think is going on:

AI, and more generally computing and networks, are doing some good things. More data and computing power accelerate the discovery of truth. But truth is still elusive and expensive. On the other hand, AI is making bullsh!t really cheap (pardon the technical jargon). There are many mechanisms by which this occurs:

These amplifiers of disinformation serve increasingly concentrated wealth and power elites that are isolated from their negative consequences, and benefit from fueling the process. We wind up wallowing in a sea of information pollution (the deadliest among the sins of managing complex systems).

As BS becomes more prevalent, various reinforcing mechanisms start kicking in. Accepted falsehoods erode critical thinking abilities, and promote the rejection of ideas like empiricism that were the foundation of the Enlightenment. The proliferation of BS requires more debunking, taking time away from discovery. A general erosion of trust makes it harder to solve problems, opening the door for opportunistic rent-seeking non-solutions.
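The crowding-out dynamic is easy to see in stock-and-flow terms. In this back-of-envelope sketch (all numbers invented), cheap BS pulls a fixed attention budget away from discovery, so the more BS accumulates, the slower truth grows:

```python
# Stocks of credible knowledge ('truth') and misinformation ('bs').
# Debunking and discovery share one attention budget; numbers are illustrative.
truth, bs = 100.0, 10.0
attention = 10.0   # honest community's fixed attention budget
bs_inflow = 8.0    # fabrication is cheap, so the BS inflow is large
for year in range(21):
    debunk_share = bs / (bs + truth)                  # BS pulls attention into cleanup
    discovery = (1 - debunk_share) * attention * 0.5  # truth is expensive to produce
    debunked = debunk_share * attention * 2.0         # debunking is faster, but it
    truth += discovery                                # still displaces discovery
    bs += bs_inflow - debunked
    if year % 5 == 0:
        print(f"year {year:2d}: truth={truth:6.1f}  bs={bs:5.1f}  "
              f"attention lost to debunking: {debunk_share:.0%}")
```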

I think it’s a matter of survival for us to do better at critical thinking, so we can shift the balance between truth and BS. That might be one area where AI could safely assist. We have other assets as well, like the explosion of online learning opportunities. But I think we also need some cultural solutions, like better management of trust and anonymity, brakes on concentration, sanctions for lying, rewards for prediction, and more time for reflection.

The survival value of wrong beliefs


NPR has an alarming piece on school science.

She tells teachers — like Nick Gurol, whose middle-schoolers believe the Earth is flat — that, as hard as they try, science teachers aren’t likely to change a student’s misconceptions just by correcting them. Gurol says his students got the idea of a flat planet from basketball star Kyrie Irving, who said as much on a podcast.

“And immediately I start to panic. How have I failed these kids so badly they think the Earth is flat just because a basketball player says it?” He says he tried reasoning with the students and showed them a video. Nothing worked.

“They think that I’m part of this larger conspiracy of being a round-Earther. That’s definitely hard for me because it feels like science isn’t real to them.”

For cases like this, Yoon suggests teachers give students the tools to think like a scientist. Teach them to gather evidence, check sources, deduce, hypothesize and synthesize results. Hopefully, then, they will come to the truth on their own.

This called to mind a post from way back, in which I considered reasons for the survival of antiscientific views.

It’s basically a matter of evolution. When crazy ideas negatively affect survival, they die out. But evolutionary forces are vastly diminished under some conditions, or even point the wrong way:

  1. Non-experimental science (reliance on observations of natural experiments; no controls or randomized assignment)
  2. Infrequent replication (few examples within the experience of an individual or community)
  3. High noise (more specifically, low signal-to-noise ratio)
  4. Complexity (nonlinearity, integrations or long delays between cause and effect, multiple agents, emergent phenomena)
  5. “Unsalience” (you can’t touch, taste, see, hear, or smell the variables in question)
  6. Cost (there’s some social or economic penalty imposed by the policy implications of the theory)
  7. Commons (the risk of being wrong accrues to society more than the individual)

These are, incidentally, some of the same circumstances that make medical trials difficult, such that most published research findings are false.

Consider the flat earth idea. What cost accrues to students who hold this belief? None whatsoever, I think. A flat earth model will make terrible predictions of all kinds of things, but students are not making or relying on such predictions. The roundness of the earth is obviously not salient. So really, the only survival value that matters to students is the benefit of tribal allegiance.

If there are intertemporal dynamics, the situation is even worse. For any resource or capability investment problem, there’s worse-before-better behavior. Recovering depleted fish stocks requires diminished effort, and less to eat, in the near term. If a correct belief implies good long-run stock management, adherents of the incorrect belief will have an advantage in the short run. You can’t count on selection weeding out the “dumb tribes” for planetary-scale problems, because we’re all in one.
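A quick fishery sketch makes the selection problem concrete (parameters invented for illustration): the prudent policy loses to the greedy one for years before it wins, so short-run selection favors the wrong belief:

```python
def harvest_path(effort, years=30):
    """Logistic fish stock with catch proportional to effort and stock.
    All parameters are invented for illustration."""
    stock, r, capacity, q = 30.0, 0.3, 100.0, 0.02
    catches = []
    for _ in range(years):
        catch = q * effort * stock
        stock = max(0.0, stock + r * stock * (1 - stock / capacity) - catch)
        catches.append(catch)
    return catches

greedy = harvest_path(effort=20)   # belief: stocks are fine, keep fishing
prudent = harvest_path(effort=8)   # belief: stocks are depleted, ease off
for horizon in (5, 15, 30):
    print(f"{horizon:2d}-year catch: greedy={sum(greedy[:horizon]):6.1f}  "
          f"prudent={sum(prudent[:horizon]):6.1f}")
```

Over five years the greedy tribe eats better; over thirty, the prudent one does. And if everyone fishes the same ocean, even the long-run penalty is shared, so it never selects against the greedy belief.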

This seems like a pretty intractable problem. If there’s a way out, it has to be cultural. If there were a bit more recognition of the value of making correct predictions, the halo of that would spill over to diminish the attractiveness of silly theories. That’s a case that ought to be compelling for basketball fans. Who wants to play on a team that can’t predict what the opponents will do, or how the ball will bounce?

System 3 thinking

There was lots of talk of dual process theory at the 2017 System Dynamics Conference. Nelson Repenning discussed it in his plenary presentation. The Donella Meadows Award paper investigated the effects on stock-flow task performance of priming subjects to think in System 2:

The dual-process theory and understanding of stocks and flows

Arash Baghaei Lakeh and Navid Ghaffarzadegan

Recent evidence suggests that using the analytic mode of thinking (System 2) can improve people’s performance in stock–flow (SF) tasks. In this paper, we further investigate the effects by implementing several different interventions in two studies. First, we replicate a previous finding that answering analytical questions before the SF task approximately doubles the likelihood of answering the stock questions correctly. We also investigate effects of three other interventions that can potentially prime participants to use their System 2. Specifically, the first group is asked to justify their response to the SF task; the second group is warned about the difficulty of the SF task; and the third group is offered information about cognitive biases and the role of the analytic mode of thinking. We find that the second group showed a statistically significant improvement in their performance. We claim that there are simple interventions that can modestly improve people’s response in SF tasks.

Dual process theory refers to the idea that there are two systems of thinking at work in our minds. System 1 is fast, automatic intuition. System 2 is slow, rational reasoning.

I’ve lost track of the conversation, but some wag at the conference (not me; possibly Arash) coined the term “System 3” for model-assisted thinking.

In a sense, any reasoning is “model-assisted,” but I think there’s an important distinction between purely mental reasoning and reasoning with a formal (usually computerized) modeling method like a dynamic simulation or even a spreadsheet.

When we reason in our heads, we have to simultaneously (A) describe the structure of the problem, (B) predict the behavior implied by the structure, and (C) test the structure against available information. Mentally, we’re pretty good at A, but pretty bad at B and C. No one can reliably simulate even a low-order dynamic system in their head, and there are too many model checks against data and thought experiments (like extreme conditions) to “run” without help.
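Even a first-order stock defeats step B for most people, while a few lines of formal model handle both B and C. A minimal sketch (my example, with made-up parameters):

```python
def simulate(inflow, residence_time, stock0=0.0, dt=0.25, horizon=20.0):
    """First-order stock: d(stock)/dt = inflow - stock / residence_time."""
    stock, path = stock0, []
    for _ in range(int(horizon / dt)):
        stock += dt * (inflow - stock / residence_time)
        path.append(stock)
    return path

# (B) predict behavior: the stock settles at inflow * residence_time,
# which few people can simulate correctly in their heads
print(f"equilibrium: {simulate(10, 5)[-1]:.1f}")  # ~50, not 10

# (C) test structure with an extreme condition: with zero inflow the stock
# must drain toward zero and never go negative
drained = simulate(0, 5, stock0=100)
assert min(drained) >= 0, "stock went negative: structural error"
print(f"stock after 4 residence times with no inflow: {drained[-1]:.1f}")
```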

System 3’s great weakness is that it takes still more time than using System 2. But it makes up for that in three ways. First, reliable predictions and tests of behavior reveal misconceptions about the problem/system structure that are otherwise inaccessible, so the result is higher quality. Second, the model is shareable, so it’s easier to convey insights to other stakeholders who need to be involved in a solution. Third, formal models can be reused, which lowers the effective cost of an application.

But how do you manage that “still more time” problem? Consider this advice:

I discovered a simple solution to making challenging choices more efficiently at an offsite last week with the CEO and senior leadership team of a high tech company. They were facing a number of unique, one-off decisions, the outcomes of which couldn’t be accurately predicted.

These are precisely the kinds of decisions which can linger for weeks, months, or even years, stalling the progress of entire organizations. …

But what if we could use the fact that there is no clear answer to make a faster decision?

“It’s 3:15pm,” he [the CEO] said. “We need to make a decision in the next 15 minutes.”

“Hold on,” the CFO responded, “this is a complex decision. Maybe we should continue the conversation at dinner, or at the next offsite.”

“No.” The CEO was resolute. “We will make a decision within the next 15 minutes.”

And you know what? We did.

Which is how I came to my third decision-making method: use a timer.

I’m in favor of using a timer to put a stop to dithering. Certainly a body with scarce time should move on when it perceives that it can’t add value. But this strikes me as a potentially costly reversion to System 1.

If a problem is strategic enough to make it to the board, but the board sees a landscape that prevents a clear decision, it ought to be straightforward to articulate why. Are there tradeoffs that make the payoff surface flat? The timer is a sensible response to that, because the decision doesn’t require precision. Are there competing feedback loops that suggest different leverage points, for which no one can agree about the gain? In that case, the consequence of an error could be severe, so the default answer should include a strategy for detection and correction. One ought to have a way to discriminate between these two situations, and a simple application of System 3 might be just the tool.
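A crude way to operationalize that discrimination: sweep the contested loop gain over its plausible range and see whether the options’ payoffs actually diverge. The payoff function below is a stand-in for a model run; everything in it is invented for illustration:

```python
import random

def payoff(option, loop_gain):
    """Stand-in for evaluating a strategy in a model run; the functional
    form and coefficients are invented."""
    return {"A": 10 + 2 * loop_gain, "B": 12 - 3 * loop_gain}[option]

random.seed(0)
gains = [random.uniform(0.0, 2.0) for _ in range(1000)]  # disputed loop gain
spreads = [abs(payoff("A", g) - payoff("B", g)) for g in gains]
flips = sum(payoff("A", g) > payoff("B", g) for g in gains)

print(f"max payoff spread across options: {max(spreads):.1f}")
print(f"ranking reverses in {flips}/1000 scenarios")
# Small spread everywhere: the payoff surface is flat, so set the timer.
# Large spread with frequent reversals: decide, but budget for detection
# and correction of a wrong call.
```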