Another TED talk argues for replacing calculus with statistics at the pinnacle of mathematics education.

There’s an interesting discussion at Wild About Math!.

I’m a bit wary of the idea. First, I don’t think there needs to be a pinnacle – math can be a Bactrian camel. Second, some of the concepts are commingled anyway (limits and convergence, for example), so it hardly makes sense to treat them as competitors. Third, both are hugely important to good decision making (which is ultimately what we want out of education). Fourth, the world is a dynamic, stochastic system, so you need to understand a little of each.

Where the real opportunity lies, I think, is in motivating the teaching of both experientially. Start calculus with stocks and flows and physical systems, and start statistics with games of chance and estimation. Use both to help people learn how to make better inferences about a complex world. Then do the math as it gets interesting and necessary. Whether you come at the problem from the angle of dynamics or uncertainty first hardly matters.

After two decades of working with advanced multivariate statistics, especially analyzing observational data, I came to the conclusion that statistics on observational data is curve fitting, nothing more and nothing less. John Tukey, and David A. Freedman at UC Berkeley, expressed similar conclusions. I was puzzled by this at first, but now I know why. The problem is not statistics itself but observational data, in which the factors (predictors) are confounded, and the resulting collinearity hinders interpretation of the individual factors. Statistics on observational data is good for prediction (interpolation) but not useful for interpretation.

That’s a pretty interesting conclusion to convey to students all by itself.

Do you have a citation for Tukey/Freedman’s thoughts on this?

Mosteller, F., and Tukey, J. W., Data Analysis and Regression, Addison-Wesley, Reading, MA, 1977. See the chapter on the interpretation of regression coefficients (I don’t remember the exact chapter).

Freedman, D. A., Statistical models and shoe leather, Sociological Methodology, 21, pp. 291–313, 1991. The PDF is available online.

People might think that these are just a random selection from the vast literature on statistical analysis. I hope not, because they address two fundamental issues: 1) the difference between observational data and data from controlled experiments, and 2) collinearity (ill-conditioning, confounding).

Usually, regression coefficients are interpreted as measures of the importance of individual predictors (i.e., which predictor matters most in changing the output?). This interpretation assumes that the predictors are not collinear. The ill effects of collinear predictors include the following: (1) regression coefficients become highly sensitive to small perturbations in the sample (and thus are often uninterpretable), (2) important predictors are declared statistically insignificant, and (3) the signs of regression coefficients are frequently opposite to what common observation suggests (i.e., “wrong” signs in the regression coefficients).
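The first effect in that list, sensitivity to small perturbations in the sample, can be seen in a small simulation. This is only a sketch under invented assumptions: two predictors with true coefficients of 1 each, unit-variance noise, and a correlation `rho` between the predictors that is set by hand. The function names are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients via numpy's lstsq (no explicit inverse)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def coefficient_spread(rho, n=200, trials=500, seed=0):
    """Std. dev. of the fitted coefficient on x1 across freshly drawn samples.

    rho is the assumed correlation between x1 and x2; the true
    coefficients (an assumption of this sketch) are 1 and 1.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    estimates = []
    for _ in range(trials):
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        y = X @ np.array([1.0, 1.0]) + rng.normal(0.0, 1.0, size=n)
        estimates.append(fit_ols(X, y)[0])
    return float(np.std(estimates))

spread_independent = coefficient_spread(rho=0.0)
spread_collinear = coefficient_spread(rho=0.99)
print("spread with uncorrelated predictors:", spread_independent)
print("spread with rho = 0.99:             ", spread_collinear)
```

The model and the noise level are the same in both runs; only the correlation between the predictors changes, yet the estimate of the same coefficient scatters far more widely from sample to sample in the collinear case.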

There are two types of data with different implications for inferring cause-and-effect relationships: 1) data from controlled experiments and 2) data from observation.

A cause-and-effect relationship should be established by performing a variety of controlled experiments in which respondents are randomly assigned, the sample sizes are balanced between treatment and control groups, and, most importantly, the factors of interest are assigned so that they are not confounded. The sole objective of controlled experiments is to make the factors of interest (design variables, or predictors in regression) unconfounded so that the effect of each factor can be separated. When one factor is changed, the other factors should remain unchanged (thus they are orthogonal, i.e., not collinear); otherwise, they are confounded and the effects of the two factors are inseparable. This is why controlled experiments are recommended whenever possible.
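To make "orthogonal" concrete, here is a minimal sketch comparing a balanced two-factor design with made-up "observational" data in which the second factor almost always moves with the first. The -1/+1 coding, the 20 runs, and the two rows that break the pattern are all assumptions chosen for illustration.

```python
import numpy as np

# Balanced 2x2 factorial design: every combination of the two factors
# appears equally often (5 replicates each, 20 runs), coded -1/+1.
levels = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
X_design = np.tile(levels, (5, 1))

# Hypothetical observational data: factor 2 equals factor 1 in all
# but two of the 20 rows, so the two are nearly confounded.
x1 = np.repeat([-1.0, 1.0], 10)
x2 = x1.copy()
x2[0], x2[10] = 1.0, -1.0
X_observed = np.column_stack([x1, x2])

def cross_product(X):
    """X'X; an off-diagonal of zero means the columns are orthogonal."""
    return X.T @ X

gram_design = cross_product(X_design)
gram_observed = cross_product(X_observed)
print(gram_design)    # off-diagonal entries are 0
print(gram_observed)  # off-diagonal entries are 16 (out of a possible 20)
```

In the designed experiment the cross-product matrix is diagonal, so each factor's effect is estimated without reference to the other. In the observational version the large off-diagonal term is the confounding.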

y = Xb, where y = response, X = predictors, b = partial regression coefficients.

b = (X’X)^-1 X’y.
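As a worked instance of this formula, the sketch below uses four invented data points that lie exactly on the line y = 1 + 2x. It solves the normal equations (X’X)b = X’y directly and compares the result with NumPy's `lstsq`, which reaches the same answer without forming the inverse.

```python
import numpy as np

# Toy data (an assumption for illustration): intercept column plus one
# predictor, with responses lying exactly on y = 1 + 2x.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Normal equations: (X'X) b = X'y. Solving the linear system is
# equivalent to applying the inverse of X'X to X'y.
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Library least-squares routine for comparison.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b_normal)  # intercept 1, slope 2 (up to rounding)
print(b_lstsq)
```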

Collinearity is also known as ill-conditioning, near-singularity, or rank degeneracy in numerical analysis, and as confounding or aliasing in design of experiments (DOE).

Again, the problem is not statistics but the observational data, because the “analysis of variance” approach used in controlled experiments (DOE) and least squares estimation in regression are mathematically identical.
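A tiny illustration of that equivalence, using invented numbers for a balanced two-group experiment: the group-mean arithmetic of an analysis of variance and a least squares fit on a -1/+1 coded factor give the same grand mean and the same effect. The data and the coding are assumptions of this sketch.

```python
import numpy as np

# Made-up responses for a balanced experiment with one two-level factor.
control = np.array([4.0, 5.0, 6.0])
treatment = np.array([7.0, 9.0, 11.0])

# ANOVA-style arithmetic on group means.
grand_mean = (control.mean() + treatment.mean()) / 2
effect = (treatment.mean() - control.mean()) / 2   # half the difference

# The same quantities from least squares, coding the factor as -1/+1.
y = np.concatenate([control, treatment])
code = np.concatenate([-np.ones(3), np.ones(3)])
X = np.column_stack([np.ones(6), code])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

print(grand_mean, effect)  # 7.0 2.0
print(b)                   # intercept 7, slope 2 (up to rounding)
```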

Collinearity is a critical and universal issue, affecting everything from regression methods to structural dynamics, because the computation of an inverse matrix is involved. The ill effects of collinear variables arise in computing the inverse matrix (X’X)^-1. Gaussian elimination divides by pivots to find the inverse, and with collinear predictors those pivots become very close to zero. Dividing by a denominator that is nearly zero makes the result change considerably in response to tiny changes in the data, so the coefficients are not interpretable. Complex methods (e.g., artificial neural networks or nonlinear models) are widely adopted in the futile hope that some advanced method will handle collinearity, and that all one needs to do is find the right one in the pool of advanced methods. Actually, they make the ill-conditioning worse.
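One common way to quantify how close X’X is to singular is its condition number. The sketch below, on simulated predictors (the sample size, seed, and noise levels are arbitrary assumptions), compares two distinct predictors with two nearly identical ones.

```python
import numpy as np

def design_matrix(noise, n=100, seed=0):
    """Two predictors; x2 is x1 plus independent noise of the given size.

    A small `noise` makes the columns nearly collinear. All values here
    are simulated for illustration.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = x1 + noise * rng.normal(size=n)
    return np.column_stack([x1, x2])

X_distinct = design_matrix(noise=1.0)
X_collinear = design_matrix(noise=0.001)

# Condition number of X'X: the ratio of its largest to smallest
# singular value. Large values mean the inverse amplifies small
# changes in the data.
cond_distinct = np.linalg.cond(X_distinct.T @ X_distinct)
cond_collinear = np.linalg.cond(X_collinear.T @ X_collinear)

print("cond(X'X), distinct predictors: ", cond_distinct)
print("cond(X'X), collinear predictors:", cond_collinear)
```

Forming X’X squares the condition number of X itself, which is one reason numerical libraries typically solve least squares through a factorization of X rather than an explicit inverse. That helps the arithmetic, but it does not remove the collinearity in the data.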

As an example, there is a good correlation between foot size and IQ: the larger the foot, the higher the IQ. Well, one key factor is missing from the conclusion that foot size is related to IQ: age. As kids grow older, their feet get larger and, at the same time, their IQ increases. Usually, age is included in the regression as a covariate to “statistically” control for its effect on IQ. Unfortunately, foot size is collinear with age (i.e., older kids have larger feet), so it is difficult to separate the effect of foot size from that of age. Then there are gender differences (i.e., girls grow faster than boys) as well as nutrition (i.e., kids in poor countries), which might result from the parents’ income, which in turn may be affected by the parents’ education level, etc. As these factors are included in a regression, it is likely that they are collinear (high income is associated with high education). So there goes the never-ending discussion. There are practical limits to consider, of course: we cannot include all the important factors, and we have to draw a line somewhere. But even where those limits are accepted, collinearity usually goes unchecked, and few people are aware of it or of the difference between observational data and data from controlled experiments.
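The foot-size story can be mimicked with simulated data. Everything below is invented for illustration: the age range, the growth rate of feet, and a raw test score (standing in for the IQ measure in the example) that depends on age only and not on foot size at all.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Hypothetical children aged 6 to 16.
age = rng.uniform(6.0, 16.0, size=n)
# Foot length grows with age (made-up units and coefficients).
foot = 12.0 + 0.9 * age + rng.normal(0.0, 0.5, size=n)
# The score depends on age only; there is no foot term by construction.
score = 10.0 * age + rng.normal(0.0, 10.0, size=n)

corr_foot_score = float(np.corrcoef(foot, score)[0, 1])

def ols(columns, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

foot_alone = ols([foot], score)[1]          # foot is the only predictor
foot_adjusted = ols([foot, age], score)[1]  # age added as a covariate

print("correlation(foot, score):        ", corr_foot_score)
print("foot coefficient, foot only:     ", foot_alone)
print("foot coefficient, age included:  ", foot_adjusted)
```

Foot size correlates strongly with the score and gets a large coefficient on its own. Adding age moves the foot coefficient toward zero, but because foot size and age are collinear, that coefficient is estimated imprecisely and its value and sign vary from sample to sample.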

For technical aspects of ill-conditioning, see

Stewart, G.W., Collinearity and least squares regression, Statistical Science, 2, pp.68–100, 1987.

Moler, C., Numerical Computing with MATLAB, Society for Industrial and Applied Mathematics, 2004.

For a full explanation, see

Belsley, D. A., Kuh, E., and Welsch, R. E., Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, John Wiley, New York, 1980.

See also the Wikipedia articles on “confounding,” “multicollinearity,” “observational data,” and “controlled experiments.”

Hope this helps.

b = inverse(X’X) X’y, where ’ indicates transpose.

Thanks! We actually run into this issue a lot in consulting. I’ll follow up on your links and maybe do another post.