What Does a "Strong" Statistical Relation Really Mean?
When a researcher claims to have found a strong correlation between two variables, you probably picture a relationship like the one on the left in the image below.
A correlation coefficient close to 1.00, however, is an almost nonexistent phenomenon. It would mean that the two variables have identical determinants, and that does not happen in the reality of social science.
The Reality of Correlations in Social Sciences
As noted by Kahneman et al. (2021), an extensive review of research in social psychology, covering 25,000 studies and involving 8 million subjects over one hundred years, concluded that “social psychological effects typically yield a value of correlation coefficient equal to 0.21.” Higher correlations are common for physical measurements but are still far from perfect. For example, the correlation between adult height and foot size is approximately 0.60.
A correlation coefficient of 0.50 looks much less convincing, as you can see on the right side of the picture, yet correlations of this size, and even much smaller ones, are routinely reported as statistically significant. This can be confusing for readers. Statistical significance does not indicate a strong relationship between variables. It only suggests that the finding is unlikely to be the result of chance alone, and with a large enough sample even a weak correlation can pass that test.
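To see why significance and strength are different things, here is a minimal sketch that approximates the p-value of a correlation from its size and the sample size. It is not taken from the book: it assumes the Fisher z-transformation with a normal approximation, and the function name and the two sample sizes are hypothetical values chosen only for illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson correlation r from n pairs.

    Assumption of this sketch: Fisher z-transformation with a normal
    approximation. An exact test would use the t-distribution instead.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical sample sizes, chosen purely for illustration.
p_strong_small = approx_p_value(0.50, 10)    # "strong" correlation, tiny sample
p_weak_large = approx_p_value(0.21, 1000)    # typical correlation, large sample

print(f"r = 0.50, n = 10:   p = {p_strong_small:.3f}")
print(f"r = 0.21, n = 1000: p = {p_weak_large:.2e}")
```

Under these assumptions the "strong" correlation in the tiny sample misses the conventional 0.05 threshold, while the much weaker correlation in the large sample clears it easily. The p-value reflects the amount of evidence, not the strength of the relationship.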

Understanding vs. Predictions: Introducing Percent Concordant
Kahneman et al. draw an important distinction between understanding and prediction. The connection between two variables can be well understood and documented with statistically significant correlations and still be of little use for accurately predicting future events. They introduce a statistic called Percent Concordant. It can be calculated directly from the correlation coefficient and is much more intuitive to interpret. Please see the chart below. Percent Concordant is expressed as a percentage and indicates how likely it is that, of two people, the one who scores higher on one variable will also score higher on the other. A value of 50% means that the chances are equal and the variable tells you nothing about this specific outcome. Values of 0% and 100% mean that the outcome is fully determined by the variable.
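The conversion from a correlation coefficient to Percent Concordant can be sketched in a few lines of Python. The post does not spell out the formula, so treat this as an assumption: if both variables are jointly normally distributed, Percent Concordant equals 50% plus arcsin(r) divided by pi. The function name is my own.

```python
from math import asin, pi


def percent_concordant(r: float) -> float:
    """Convert a correlation coefficient r into Percent Concordant (in %).

    Assumption: the two variables follow a bivariate normal distribution,
    in which case PC = 50% + arcsin(r) / pi.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return 50.0 + 100.0 * asin(r) / pi


# Correlation values mentioned in this post.
for r in (0.0, 0.21, 0.50, 0.60, 1.0):
    print(f"r = {r:.2f} -> Percent Concordant = {percent_concordant(r):.1f}%")
```

Note how slowly Percent Concordant climbs: the typical social psychology correlation of 0.21 lands only a few points above a coin toss.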
Example: IQ and Academic Achievement
A correlation coefficient of 0.50 is considered strong in social science. For example, many studies have found correlations of around 0.50 between IQ scores and academic achievement. The corresponding Percent Concordant is 66.67%, which means that, of two students, the one with the higher IQ will have the higher academic achievement in 66.67% of cases. That is far from deterministic: in about one case in three, the student with the lower IQ will be the better performer.
Using an IQ score as the sole predictor of academic performance will therefore yield quite a high error rate, and in reality the inevitable impact of other variables will push it substantially higher.
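The two-in-three figure can be checked with a small simulation. This is only a sketch under stated assumptions: IQ and achievement are modelled as standardized, jointly normal scores with a correlation of 0.50, and the function name, number of pairs and random seed are arbitrary choices of mine.

```python
import random
from math import sqrt


def simulate_percent_concordant(r: float, n_pairs: int = 100_000, seed: int = 42) -> float:
    """Estimate Percent Concordant by comparing random pairs of simulated students.

    Assumption: both scores are standard normal with correlation r.
    """
    rng = random.Random(seed)
    noise_scale = sqrt(1.0 - r * r)

    def draw_student():
        iq = rng.gauss(0.0, 1.0)
        # Achievement shares a fraction r of the IQ score; the rest is independent noise.
        achievement = r * iq + noise_scale * rng.gauss(0.0, 1.0)
        return iq, achievement

    concordant = 0
    for _ in range(n_pairs):
        iq_a, ach_a = draw_student()
        iq_b, ach_b = draw_student()
        # A pair is concordant when the higher-IQ student also achieves more.
        if (iq_a > iq_b) == (ach_a > ach_b):
            concordant += 1
    return 100.0 * concordant / n_pairs


estimate = simulate_percent_concordant(0.50)
print(f"Simulated Percent Concordant for r = 0.50: {estimate:.1f}%")
```

The estimate should land close to the 66.67% quoted above, with small deviations due to sampling.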

Strategies for Improved Forecasting
1. Incorporate Multiple Variables
One way to reduce reliance on a single factor is to build the forecast from multiple variables rather than a single predictor. I wrote about the benefits of combining multiple factors into an integrated measure in the post about the SDG Index: #31 SDG Index - One number that rules them all
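How much a second variable helps can be sketched with the standard formula for the multiple correlation of two predictors. Everything here is illustrative: the three input correlations are hypothetical numbers, not results from any study, and the conversion to Percent Concordant again assumes jointly normal variables.

```python
from math import asin, pi, sqrt


def multiple_correlation(r1: float, r2: float, r12: float) -> float:
    """Correlation between an outcome and the best linear mix of two predictors.

    r1, r2: correlation of each predictor with the outcome.
    r12:    correlation between the two predictors.
    """
    if abs(r12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("the three correlations are not mutually consistent")
    return sqrt(r_squared)


def percent_concordant(r: float) -> float:
    """PC = 50% + arcsin(r) / pi, assuming jointly normal variables."""
    return 50.0 + 100.0 * asin(r) / pi


# Hypothetical inputs: two predictors, each correlating 0.50 with the outcome
# and 0.30 with each other.
single = percent_concordant(0.50)
combined_r = multiple_correlation(0.50, 0.50, 0.30)
combined = percent_concordant(combined_r)

print(f"One predictor:  r = 0.50, Percent Concordant = {single:.1f}%")
print(f"Two predictors: R = {combined_r:.2f}, Percent Concordant = {combined:.1f}%")
```

With these made-up inputs the second predictor lifts Percent Concordant by only a few points. The more the predictors overlap (a higher r12), the smaller the gain.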
2. Accept Objective Ignorance
While using multiple factors can enhance predictions, it won't completely overcome the limited predictability of individual outcomes. Therefore, a vital strategy is to accept "objective ignorance" - acknowledging the important unknowns that limit achievable accuracy. This caution helps prevent overconfidence bias, which can manifest in many fields, particularly in predictive endeavors.
Conclusion
Understanding the nuances of statistical relationships in social sciences is crucial for interpreting research findings and making informed decisions. While correlations can provide valuable insights, it's important to recognize their limitations, especially in predictive contexts. By employing strategies like incorporating multiple variables and acknowledging objective ignorance, we can develop a more realistic and effective approach to understanding and forecasting social phenomena.
References and Notes
Kahneman, D., Sibony, O., & Sunstein, C. R. (2021). Noise: A flaw in human judgment.
A classic meta-analysis by Judge et al. (2001) found the corrected correlation between job satisfaction and job performance to be 0.30. Some subsequent studies have found correlations closer to 0.5 in specific contexts. - Judge, T. A., Thoresen, C. J., Bono, J. E., & Patton, G. K. (2001). The job satisfaction–job performance relationship: A qualitative and quantitative review. Psychological Bulletin, 127(3), 376-407.
Many studies have found correlations between IQ scores and academic achievement to be around 0.5. For instance, a meta-analysis by Roth et al. (2015) found an overall correlation of 0.54 between intelligence and school grades. - Roth, B., Becker, N., Romeyke, S., Schäfer, S., Domnick, F., & Spinath, F. M. (2015). Intelligence and school grades: A meta-analysis. Intelligence, 53, 118-137.
Charts were developed with the support of Claude and ChatGPT, which were used to prepare Python code run later in Jupyter Notebook.