Data Science should be about more than data

There are lots of “top 10 skills” lists for data science and analytics. The ones I’ve seen are all missing something huge.

Here’s an example:

Business Broadway – Top 10 Skills in Data Science

Modeling barely appears here. Almost all the items concern the collection and analysis of data (no surprise there). Just imagine for a moment what it would be like if science consisted purely of observation, with no theorizing.

What are you doing with all those data points and the algorithms that sift through them? At some point, you have to understand whether the relationships that emerge from your data make any sense and answer relevant questions. For that, you need ways of thinking and talking about the structure of the phenomena you’re looking at and the problems you’re trying to solve.

I’d argue that one’s literacy in data science is greatly enhanced by knowledge of mathematical modeling and simulation. That could be system dynamics, control theory, physics, economics, discrete event simulation, agent based modeling, or something similar. The exact discipline probably doesn’t matter, so long as you learn to formalize operational thinking about a problem, and pick up some good habits (like balancing units) along the way.