Reading back over my last post from a few weeks ago, I realized that I sounded like a data-mining zealot or a pure empiricist. The post reads like it was written by someone who has no use at all for Aristotle or Huntington, or any of the other great political theorists… someone who believes that he can somehow pull pure truth out of a collection of social measurements.

No. That’s dumb. In fact, I think that data is worse than useless without theory. The fanciest statistical package in the world cannot make sense of a dataset by itself. Obviously, somebody needs to be gathering data, selecting an appropriate model, and choosing which independent variables to focus on, but that is still an overly facile representation of the problem.

Most packages at this point have automated procedures for selecting the “best” model according to some information criterion. These come in various flavors and levels of complexity, but almost always boil down to a measurement of model fit, or how closely the mathematical function you’ve cooked up approximates the observed phenomena.

A perfect fit is laughably easy to achieve. All you need to do is throw every explanatory variable you’ve got (at the theoretical limit, this means every measurable quantity in the universe) into the bin marked “factors that somehow cause phenomena xyz” and you’re done. Your graph will connect every single data point you’ve got in a stunning display of super-squiggliness, you’ll have an R-squared of 1, and you’ll be published in every top journal in the land.

Not hardly (and that’s not just because more sophisticated goodness-of-fit measurements penalize you for such “chance capitalization”). You need to be able to explain the process by which you arrived at your magic recipe. In other words, why should we believe that factors a, b and c (and perhaps a*c) cause xyz, and what does that sequence of events look like?
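To make the "perfect fit" point concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for illustration: the outcome and the "explanatory variables" are pure random noise, the helper names `r_squared` and `adjusted_r_squared` are made up for this post, and adjusted R-squared is only one common example of a fit measure that penalizes extra variables, not necessarily the one any particular package uses.

```python
import numpy as np


def r_squared(X, y):
    """Plain R-squared from an ordinary least squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ coef
    ss_res = float(resid @ resid)                  # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables.

    Only defined while n - k - 1 > 0, i.e. while degrees of freedom remain.
    """
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical data: 20 observations of an outcome that is pure noise,
# plus 19 "explanatory variables" that are also pure noise.
rng = np.random.default_rng(0)
n = 20
y = rng.normal(size=n)
X = rng.normal(size=(n, n - 1))

# Raw R-squared can only go up as more columns are thrown into the bin.
for k in (1, 5, 10, 15):
    r2 = r_squared(X[:, :k], y)
    print(f"k={k:2d}  R2={r2:.3f}  adjusted R2={adjusted_r_squared(r2, n, k):.3f}")

# With 19 variables plus an intercept there are as many coefficients as data
# points, so the fit passes through every observation and R-squared reaches 1.
print(f"k=19  R2={r_squared(X, y):.3f}  (adjusted R2 is undefined here)")
```

Nothing in `X` has anything to do with `y`, yet the raw R-squared climbs to 1 once there are as many coefficients as observations. The adjusted version charges for each added variable, which is why it can fall even when the raw number rises. Neither figure says anything about why a given variable should matter.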

This presents a chicken-and-egg quandary if we are really trying to discover “the truth” about xyz’s causes: if we don’t know what we’re looking for in the first place, we’ll have no idea what data to gather. And when we’re dealing with social outcomes, simply “measuring everything” doesn’t work. These are complex phenomena that demonstrate exceptionally high-dimensional causality. By discipline, and in causal order, political science is built out of mathematics, physics, chemistry, biology, psychology, and economics. We are still working on finding the elementary building blocks of matter; how are we supposed to know what to watch for five or six orders of magnitude later?

Note that I am not making an argument for extreme reductionism here. We’re not supposed to know, we’re supposed to make educated guesses. That’s what a theory is: a carefully considered, logically consistent guess, which in turn ultimately boils down to… intuition. Yes: when trying to explain what individual people, or crowds of people, or millions of people living in a state have done (or are going to do), and why they’ve done it (or are going to do it), I think our best bet is to have a flash of insight.

Where does that kind of insight come from? I have no idea, but reading lots and lots of work by the smartest, wisest, and most insightful people in history is a fantastic place to start.

