Suppose we find marketplace data stating:
“Customers who chat with the seller are more likely to make a purchase (higher conversion rate).”
At first glance, it sounds like the chat feature itself is boosting the purchase rate. So we might think: let’s add more chat features, because in the end that should raise the conversion rate.
This seems straightforward, but consider this perspective:
“In reality, customers who choose to chat may already have the intention to buy. It’s not that the chat is causing sales. The need to chat may be due to insufficient product information on the product page.”
If a product manager (PM) misreads the data this way, they might invest effort in the wrong area, spending resources without making any real business impact.
This situation involves two key concepts:
Correlation vs. Causation
Correlation
Two variables are correlated when they tend to move together: when one rises, the other tends to rise (or fall) as well.
Causation
Causation means a change in one variable actually produces a change in another, such as higher temperatures causing increased demand for air conditioners.
These two concepts are often confused, leading to the phrase:
“Correlation doesn’t imply Causation.” The fact that A and B move together doesn’t, by itself, mean that A causes B.
Common misinterpretations include:
1. Coincidence (Spurious Correlations)
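Two variables may move together purely by chance, with no real link between them. This happens easily when you compare many metrics over a short period: some pair will line up by luck alone. Note that the often-cited example of ice cream sales and drowning incidents, which both increase in summer, is not pure coincidence. Neither causes the other, but both are driven by hot weather, which makes it a third-variable case (see point 3).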
2. Reverse Causality
Here the direction of influence is misread: we assume A causes B when B actually causes A. For example, people who buy health supplements might not be healthier because of the supplements; rather, they buy supplements because they already care about their health.
3. Third Variables (Confounding Factors)
A third variable might be causing both A and B. For instance, an increase in both fan and ice sales might be due to rising temperatures, not because they cause each other.
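To make the third-variable case concrete, here is a minimal simulation sketch in Python. Every number in it is invented for illustration: temperature is set up to drive both fan sales and ice sales, and the two sales figures never influence each other. The helper names `pearson` and `residuals` are my own, not from any library.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up daily data: temperature drives each sales figure independently.
temperature = [rng.uniform(20, 40) for _ in range(2000)]
fan_sales = [5 * t + rng.gauss(0, 10) for t in temperature]
ice_sales = [8 * t + rng.gauss(0, 15) for t in temperature]

# Correlation as it appears on a dashboard.
raw_corr = pearson(fan_sales, ice_sales)

# Correlation after the part explained by temperature is removed from both.
adjusted_corr = pearson(residuals(fan_sales, temperature),
                        residuals(ice_sales, temperature))

print(f"fan vs ice, raw:                    {raw_corr:.2f}")
print(f"fan vs ice, temperature removed:    {adjusted_corr:.2f}")
```

In this toy setup the raw correlation is strong, while the adjusted one sits close to zero. Real data is rarely this clean, so treat it as an illustration of the idea rather than an analysis recipe.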
How does this relate to Product Managers?
From the chat feature example:
Using the chat feature correlates with higher conversion rates, but it doesn’t necessarily mean chat causes higher conversion. Without proper analysis, we might focus on making chat even more accessible, only to find no real improvement in the conversion rate. Perhaps the real issue is insufficient product information or that customers don’t trust the seller and need to chat to confirm details.
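The chat scenario can be sketched the same way. The simulation below is a hypothetical model, not marketplace data: purchase intent is assumed to drive both chatting and buying, and chat is deliberately given no effect on purchases. All probabilities are made up.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable


def simulate_customer():
    """One hypothetical customer. Chat has NO effect on purchase here."""
    high_intent = rng.random() < 0.3
    # Intent makes a customer more likely to chat...
    chatted = rng.random() < (0.6 if high_intent else 0.1)
    # ...and more likely to buy. Note that `chatted` is not used.
    purchased = rng.random() < (0.5 if high_intent else 0.05)
    return high_intent, chatted, purchased


def conversion_rate(rows):
    """Share of customers in rows who purchased."""
    rows = list(rows)
    return sum(purchased for _, _, purchased in rows) / len(rows)


customers = [simulate_customer() for _ in range(50_000)]

# Naive comparison: chatters vs. non-chatters.
chat_rate = conversion_rate(c for c in customers if c[1])
no_chat_rate = conversion_rate(c for c in customers if not c[1])
print(f"chatted:      {chat_rate:.1%}")
print(f"did not chat: {no_chat_rate:.1%}")

# Same comparison, but only among customers with the same intent.
gap_within = {}
for intent in (True, False):
    group = [c for c in customers if c[0] == intent]
    with_chat = conversion_rate(c for c in group if c[1])
    without_chat = conversion_rate(c for c in group if not c[1])
    gap_within[intent] = with_chat - without_chat
    label = "high intent" if intent else "low intent"
    print(f"{label}: chat gap = {gap_within[intent]:+.1%}")
```

Chatters convert far more often overall, yet once customers with the same intent are compared, the gap nearly disappears. In practice intent is not directly observable, so a real analysis would need a proxy for it or a controlled experiment.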
How to avoid the “Correlation doesn’t imply Causation” trap
Always ask:
- Could it be just a coincidence? Maybe the two variables move in the same direction by chance.
- Could the direction be reversed? Maybe B is driving A rather than A driving B.
- Are there other variables involved? Consider whether some other factor might be driving both variables, making them appear related without a direct causal link. In the chat case, maybe customers who already intend to buy are both more likely to chat and more likely to purchase.
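The coincidence question matters more than it may seem when a dashboard tracks many metrics. As a rough sketch, the Python snippet below generates 40 hypothetical metrics of pure random noise, 12 monthly readings each, and looks for the strongest correlation between any two of them. The metric count, the series length, and the `pearson` helper are all assumptions chosen for illustration.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable

# 40 unrelated "metrics", each with 12 monthly readings of pure noise.
metrics = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(40)]

pair_count = 0
strongest = 0.0
for a, b in combinations(metrics, 2):
    pair_count += 1
    strongest = max(strongest, abs(pearson(a, b)))

print(f"pairs compared: {pair_count}")
print(f"strongest |correlation| among unrelated metrics: {strongest:.2f}")
```

With hundreds of pairs and only a dozen data points each, some pair will usually look strongly related even though nothing connects them. That is why a single striking correlation picked out of many deserves extra scrutiny.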
In the end, whether you’re a PM or someone interested in data analysis, remembering that “Correlation doesn’t imply Causation” is crucial. Don’t just follow where the numbers lead because two metrics move together. Ask yourself why they’re related and whether other factors are at play. Taking a more thorough approach can help you avoid misguided decisions.