When Does Correlation Equal Causation?

Data purists would reply “Never” and beat me up for even asking this question.

On the other hand, “data sophists” who’re accustomed to lying with Big Data would wonder, “Duh, what’s the difference?”

If you’re like me, you don’t belong to either camp, and might wonder if there could be a golden mean between the two extremes.

Let me use the following two examples to get a feel for when correlation can equal causation and when it can’t.

EXAMPLE 1

Correlation: US spending on science, space and technology goes up or down in tandem with suicides by hanging, strangulation and suffocation.


Source: Spurious Correlations (http://tylervigen.com/)

Causation: If suicides by hanging etc. go up, US spending on science etc. will also go up.

Action: Monitor the rate of suicides by hanging. If it goes up, release more budget for R&D. If it goes down, downsize R&D.

Even a diehard Data Sophist would intuitively agree that correlation does not equal causation in this case.

EXAMPLE 2

Correlation: The attach rate of business loans with home loans is higher in the Norfolk office than in other offices.

Causation: If home loan sales go up in Norfolk, business loan sales will also go up.

Action: Monitor home loan volume. If it goes up, source additional funds for business loans. If it goes down, release funds earmarked for business loans.

As we saw in What The Obama Credit Card Decline Means For The Future Of Analytics, the correlation in this example made immediate business sense to the bank’s head of credit. He found out that, unlike in the bank’s other offices, business banking and retail banking salespeople sat in the same office in Norfolk. That practice led to a better exchange of information about mortgage buyers who were in-market for business loans and vice versa.

Therefore, intuitively, even a Data Purist would agree that correlation could equal causation in this case.


Correlation does not equal causation in the first example.

Correlation may equal causation in the second one.

If you’ve noticed my frequent use of the term “intuition” in this post, it’s intentional. Even after all the data is collected, crunched and visualized, many business actions are still guided by the gut to some extent, at least heuristic ones like developing a marketing plan, writing a book or recruiting a sales rep.

By abstracting the basic differences between the two examples, I propose that correlation can equal causation if the following three conditions are met:

  1. The measured variables belong to the same domain
  2. The correlation makes intuitive sense, thereby making the causation plausible
  3. The causation can be validated by backtesting it on past data
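
For the third condition, here is a minimal sketch of what a backtest could look like for Example 2. It assumes that "backtesting" means fitting a simple straight-line relationship on earlier periods and then checking how well it predicts later periods that the fit never saw. The monthly figures and the function names (pearson, fit_line, backtest) are hypothetical and made up purely for illustration; they are not the bank's data or method.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def backtest(xs, ys, split):
    """Fit on the first `split` periods, then score predictions on the rest.

    Returns the correlation seen in the fitting window and the mean
    absolute percentage error on the held-out periods.
    """
    slope, intercept = fit_line(xs[:split], ys[:split])
    predicted = [slope * x + intercept for x in xs[split:]]
    actual = ys[split:]
    error = mean(abs(p - a) / abs(a) for p, a in zip(predicted, actual))
    return pearson(xs[:split], ys[:split]), error


# Hypothetical monthly loan counts for one office (invented numbers).
home_loans = [100, 120, 90, 130, 110, 140, 95, 125, 135, 105, 150, 115]
business_loans = [31, 35, 28, 40, 33, 43, 29, 37, 41, 32, 44, 35]

r_train, holdout_error = backtest(home_loans, business_loans, split=8)
print(f"correlation on first 8 months: {r_train:.2f}")
print(f"mean abs. % error on last 4 months: {holdout_error:.1%}")
```

A small holdout error says the relationship held up on data it was not fitted to. That supports the causal story without proving it, which is why conditions 1 and 2 still matter.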

So, the answer to the question in the title of this post is, “sometimes”!