Nelson Rules

I ran across the Nelson Rules in a machine learning package. These are a set of heuristics for detecting changes in statistical process control. Their inclusion felt a bit like navigating a 787 with a mechanical flight computer (which is a very cool device, by the way).

The idea is pretty simple. You have a time series of measurements, normalized to Z-scores, so most values fall within plus or minus 3 standard deviations of the mean. The Nelson Rules provide a way to detect anomalies: drift, oscillation, high or low variance, etc. Rule 1, for example, is just a threshold for outlier detection: it fires whenever a measurement is more than 3 SD from the mean.
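As a minimal sketch of Rule 1, assuming the measurements are already Z-scores: the function name `rule1_outliers` and the sample readings are made up for illustration and do not come from the package or the model.

```python
def rule1_outliers(z_scores, limit=3.0):
    """Return the indices of points more than `limit` SD from the mean."""
    return [i for i, z in enumerate(z_scores) if abs(z) > limit]


# Hypothetical readings, already normalized to Z-scores.
readings = [0.2, -1.1, 3.4, 0.7, -3.2, 2.9]
print(rule1_outliers(readings))  # [2, 4]
```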

In the machine learning context, it seems strange to me to use these heuristics when more powerful tests are available. This is not unlike the problem of deciding whether a random number generator is really random. It’s fairly easy to determine whether it’s producing a uniform distribution of values, but what about cycles or other long-term patterns? I spent a lot of time working on this when we replaced the RNG in Vensim. Many standard tests are available. They’re not all directly applicable, but the thinking is.

In any case, I got curious about how the Nelson Rules perform in the real world, so I developed a test model.

This feeds a test input (Normally distributed random values, with an optional signal superimposed) into a set of accounting variables that track metrics and compare them with the rule thresholds. Some of the rules are complex to track.

Rule 4, for example, looks for 14 points with alternating differences. That’s a little tricky to track in Vensim, where we’re normally more interested in continuous time. I tackle that with the following structure:

Difference = Measurement - SMOOTH(Measurement, TIME STEP)
**************************************************************
Is Positive = IF THEN ELSE(Difference > 0, 1, -1)
**************************************************************
N Switched = INTEG(
 IF THEN ELSE(Is Positive > 0 :AND: N Switched < 0,
  (1 - 2*N Switched)/TIME STEP,
  IF THEN ELSE(Is Positive < 0 :AND: N Switched > 0,
   (-1 - 2*N Switched)/TIME STEP,
   (Is Positive - N Switched)/TIME STEP)),
 0)
**************************************************************
Rule 4 = IF THEN ELSE(ABS(N Switched) > 14, 1, 0)
**************************************************************

There’s a trick here. To count alternating differences, we need to know (a) the previous count, and (b) whether the previous difference encountered was positive or negative. Above, N Switched stores both pieces of information in a single stock (INTEG). That’s possible because the count is discrete and positive, so we can overload the storage by giving it the sign of the previous difference encountered.

Thus, if the current difference is negative (Is Positive < 0) and the previous difference was positive (N Switched > 0), we (a) invert the sign of the count by subtracting 2*N Switched, and (b) augment the count, here by subtracting 1 to make it more negative.
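For readers who don’t use Vensim, here is a rough discrete-time sketch of the same signed-counter idea in Python. It is an illustration, not a translation of the model: the function names are hypothetical, and it takes differences between successive list elements rather than using SMOOTH.

```python
def update_signed_count(n_switched, difference):
    """One step of the signed run counter.

    abs(n_switched) is the number of consecutive alternating differences
    seen so far; its sign is the sign of the most recent difference.
    """
    is_positive = 1 if difference > 0 else -1
    if is_positive > 0 and n_switched < 0:
        return 1 - n_switched    # flip the sign to positive and add one
    if is_positive < 0 and n_switched > 0:
        return -1 - n_switched   # flip the sign to negative and add one
    return is_positive           # no alternation: restart the run at +/-1


def alternation_counts(measurements):
    """Signed alternation count after each successive difference."""
    n_switched = 0
    counts = []
    for previous, current in zip(measurements, measurements[1:]):
        n_switched = update_signed_count(n_switched, current - previous)
        counts.append(n_switched)
    return counts


print(alternation_counts([0, 1, 0, 1, 0, 0.5, 1.0]))  # [1, -2, 3, -4, 5, 1]
```

The last two differences are both positive, so the run is broken and the count restarts at 1.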

Similar tricks are used elsewhere in the structure.

How does it perform? Surprisingly well. Here’s what happens when the measurement distribution shifts by one standard deviation halfway through the simulation:

There are a few false positives in the first 1000 days, but after the shift, there are many more detections from multiple rules.
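A similar experiment can be sketched outside Vensim. The Python below is a rough illustration under stated assumptions, not the model itself: it covers only Rule 1 and Rule 2 (nine or more points in a row on the same side of the mean), and the function names, the 2000-step run length, and the seed are arbitrary choices of mine.

```python
import random


def rule1_flags(z, limit=3.0):
    """Flag each point more than `limit` SD from the mean."""
    return [abs(x) > limit for x in z]


def rule2_flags(z, run=9):
    """Flag each point that completes `run` or more points on one side of the mean."""
    flags = []
    streak = 0  # signed length of the current same-side streak
    for x in z:
        side = 1 if x > 0 else -1
        streak = streak + side if streak * side > 0 else side
        flags.append(abs(streak) >= run)
    return flags


def simulate(n=2000, shift=1.0, seed=1):
    """Count detections before and after a mean shift at the halfway point."""
    rng = random.Random(seed)
    half = n // 2
    z = [rng.gauss(0.0, 1.0) + (shift if t >= half else 0.0) for t in range(n)]
    counts = {}
    for name, rule in (("rule1", rule1_flags), ("rule2", rule2_flags)):
        flags = rule(z)
        counts[name] = (sum(flags[:half]), sum(flags[half:]))
    return counts


# Prints {rule name: (detections before the shift, detections after the shift)}.
print(simulate())
```

With a one-SD shift, roughly 84% of points land above the old mean, so long same-side runs become common and Rule 2 should fire far more often in the second half than in the first.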

The rules are pretty good at detecting a variety of pathologies: increases or decreases in variance, shifts in the mean, trends, and oscillations. The rules also have different false positive rates, which might be OK, as long as they catch nonoverlapping problems, and don’t have big differences in sensitivity as well. (The original article may have more to say about this – I haven’t checked.)

However, I’m pretty sure that I could develop some pathological inputs that would sneak past these rules. By contrast, I’m pretty sure I’d have a hard time sneaking anything past the NIST or Diehard RNG test suites.

If I were designing this from scratch, I’d use machine learning tools more directly – there are lots of tests for distributions, changes, trend breaks, oscillation, etc. that can be used online with a consistent likelihood interpretation and optimal false positive/negative tradeoffs.

Here’s the model:

NelsonRules1.mdl

NelsonRules1.vpm

Reforesting Iceland

The NYT has an interesting article on the difficulties of reforesting Iceland.

This is an example of forest cover tipping points.

Iceland appears to be stuck in a state in which “no trees” is locally stable. So, the system pushes back when you try to reforest, at least until you can cross into another basin of attraction that’s forested.

Interestingly, in the Hirota et al. data above, a stable treeless state is a product of low precipitation. But Iceland is wet. So, deserts are a multidimensional thing.

Bernoulli and Poisson are in a bar …

Bernoulli asks, “how long have we been here?” Poisson replies, “I have no idea.”

Bad joke aside, memoryless behavior is a key component of a toy model of car rentals I made a while ago. I recently noticed that I was a bit lazy in my choice of RANDOM functions, so I’ve produced an update.

The difference is in the use of Poisson and Binomial distribution functions. In the original, I used the Poisson distribution everywhere to represent arrival processes. That’s reasonable in the limit, where a large number of candidate arrivals are realized with a small probability, such that the expected arrivals occur at some finite rate.

Think of a lemonade stand on a busy street – there’s a very large population of potential lemonade buyers, but only a small fraction actually stop for a drink. Normally, we don’t want to model the street and the traffic generation process, so it’s reasonable to assume independent arrivals from a large pool at some rate that we can measure, using the Poisson distribution. This is similar to using a cloud in SD to indicate a source or sink that we aren’t modeling.
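The limit described above can be checked numerically. The sketch below is only an illustration (the function names and the rate of 3 arrivals are my assumptions, not taken from the car rental model): it holds the expected number of arrivals fixed while the pool of candidates grows, and reports the largest gap between the Binomial and Poisson probabilities.

```python
import math


def binomial_pmf(k, n, p):
    """P(k arrivals) when each of n candidates arrives with probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, rate):
    """P(k arrivals) for a Poisson process with the given expected count."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def max_pmf_gap(n, rate, k_max=20):
    """Largest absolute difference between the two pmfs for k = 0..k_max."""
    p = rate / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, rate))
        for k in range(min(k_max, n) + 1)
    )


# Same expected arrivals (3), increasingly large pool of candidates.
for n in (10, 100, 10_000):
    print(n, max_pmf_gap(n, rate=3.0))
```

The gap shrinks as the pool grows, which is why Poisson is a reasonable stand-in for the lemonade stand. With a small pool, the two distributions differ noticeably.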

Answer to A Bongard Problem

As a few people nearly guessed, the left side is “things a linear system can do” and the right side is “(additional) things a nonlinear system can do.”

On the left:

  • decaying oscillation
  • exponential decay
  • simple accumulation
  • equilibrium
  • exponential growth
  • 2nd order goal seeking with damped oscillation

On the right:

The original problem was here.

A Bongard problem

Bongard problems test visual pattern recognition, but there’s no reason to be strict about that. Here’s a slightly nontraditional Bongard problem:

The six on the left conform to a pattern or rule, and your task is to discover it. As an aid, the six boxes on the right do not conform to the same pattern. They might conform to a different pattern, or simply reflect the negation of the rule on the left. It’s possible that more than one rule discriminates between the sets, but the one that I have in mind is not strictly visual (that’s a hint).

If you’re stumped, you might go read this nice article about meta-rationality instead.

I’ll post the solution in a few days. Post your guess in comments (no peeking).

Update to Path Dependence, Competition, and Succession in the Dynamics of Scientific Revolution model

For the 2017 Balaton Group meeting, I’ve updated Sterman & Wittenberg’s Path Dependence, Competition, and Succession in the Dynamics of Scientific Revolution model. The new version is far more usable, with readable variable names and improved diagrams.

This is an extremely interesting model for our current situation of clashing paradigms, fake news and filter bubbles. I encourage you to take a look at the model and paper.

This is actually much more natural as a Ventity model, so watch for another update.

Dynamics of Dictatorship

I’m preparing for a talk on the dynamics of dictatorship or authoritarianism, which touches on many other topics, like polarization, conflict, terror and insurgency, and filter bubbles. I thought I’d share a few references, in the hope of attracting more. I’m primarily interested in mathematical models, or at least conceptual models that have clearly-articulated structure->behavior relationships.

Ad Experiment

In the near future I’ll be running an experiment with serving advertisements on this site, starting with Google AdSense.

This is motivated by a little bit of greed (to defray the costs of hosting) and a lot of curiosity.

  • What kind of ads will show up here?
  • Will it change my perception of this blog?
  • Will I feel any editorial pressure? (If so, the experiment ends.)

I’m generally wary of running society’s information system on a paid basis. (Recall the first deadly sin of complex system management.) On the other hand, there are certainly valid interests in sharing commercial information.

I plan to write about the outcome down the road, but first I’d like to get some firsthand experience.

What do you think?

Update: The experiment is over.

AI babble passes the Turing test

Here’s a nice example of how AI is killing us now. I won’t dignify this with a link, but I found it posted by a LinkedIn user.

I’d call this an example of artificial stupidity, not AI. The article starts off sounding plausible, but quickly degenerates into complete nonsense that’s either automatically generated or translated, with catastrophic results. But it was good enough to make it past someone’s cognitive filters.

For years, corporations have targeted on World Health Organization to indicate ads to and once to indicate the ads. AI permits marketers to, instead, specialize in what messages to indicate the audience, therefore, brands will produce powerful ads specific to the target market. With programmatic accounting for 67% of all international show ads in 2017, AI is required quite ever to make sure the inflated volume of ads doesn’t have an effect on the standard of ads.

One style of AI that’s showing important promise during this space is tongue process (NLP). informatics could be a psychological feature machine learning technology which will realize trends in behavior and traffic an equivalent method an individual’s brain will. mistreatment informatics during this method can match ads with people supported context, compared to only keywords within the past, thus considerably increasing click rates and conversions.