Happiness Index: A Multivariate Regression Analysis
In this project, I move from a GDP-only view to a multivariate analysis of what actually makes a country “happy.” Using the World Happiness Report and environmental data, I built and refined a Multivariate Linear Regression model to test whether factors like health, corruption, and carbon footprints explain happiness better than raw wealth (GDP) alone.
🧐 The data?
We’ve got a mix of economic, health, and environmental indicators for 112 countries:
- Happiness (Felicidad): Our target variable. A score measuring subjective well-being.
- log_GDP: The natural log of GDP per capita. We log-transformed this because the relationship between money and happiness isn’t a straight line: each extra dollar matters less as income rises (“diminishing returns”).
- Life Expectancy: The average years a person is expected to live. This turned out to be our “MVP” predictor.
- Corruption Index: A 0-100 scale of institutional transparency. Higher scores mean less perceived corruption.
- log_CO2: Log-transformed annual $CO_2$ emissions. We transformed this to handle extreme industrial outliers and found that, once log-transformed, it shows a statistically significant negative association with happiness.
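To make the “diminishing returns” idea behind the log transforms concrete, here is a minimal sketch using only the standard library. The GDP figures are made-up round numbers for illustration, not values from the dataset, and `log_gdp_gain` is a hypothetical helper name.

```python
import math

def log_gdp_gain(gdp_before, gdp_after):
    """Change in log GDP per capita when income moves from gdp_before to gdp_after."""
    return math.log(gdp_after) - math.log(gdp_before)

# Doubling income is the same step on the log scale, rich or poor (illustrative numbers).
double_poor = log_gdp_gain(1_000, 2_000)
double_rich = log_gdp_gain(40_000, 80_000)

# The same extra $1,000 is a much smaller step for a country that is already rich.
extra_1000_poor = log_gdp_gain(1_000, 2_000)
extra_1000_rich = log_gdp_gain(40_000, 41_000)

print(f"doubling, poor country: {double_poor:.3f}")
print(f"doubling, rich country: {double_rich:.3f}")
print(f"+$1,000, poor country:  {extra_1000_poor:.3f}")
print(f"+$1,000, rich country:  {extra_1000_rich:.3f}")
```

Because a linear model on log_GDP treats equal log steps as equal, it assumes that doubling income matters about the same everywhere, while an extra fixed amount matters less the richer a country already is.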
🛠️ The Tech Part
- Observations: 112 countries (after cleaning and merging).
- Model Performance: We boosted our explanatory power ($R^2$) from a weak 22% in the simple model to a robust 64.7% in the final multivariate version.
- Precision: Reduced the “Average Prediction Error” from 0.97 to 0.67 happiness points.
- Tools: Python (pandas, statsmodels for OLS, SciPy for F-tests, Matplotlib/Seaborn for distributions).
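The comparison between the simple and the multivariate model can be sketched roughly as follows. This is not the project's actual code: it uses plain NumPy least squares instead of statsmodels, the data are synthetic stand-ins with made-up coefficients, and the helper names (`fit_ols`, `nested_f`) are hypothetical. It also assumes the “Average Prediction Error” refers to the residual standard error, which may not be the exact metric used.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit; X must already contain an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)                    # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    n, k = X.shape
    return {
        "beta": beta,
        "rss": rss,
        "r2": 1.0 - rss / tss,
        "rse": (rss / (n - k)) ** 0.5,            # residual standard error
        "df_resid": n - k,
    }

def nested_f(restricted, full):
    """F statistic for testing whether the extra predictors in `full` add anything."""
    q = restricted["df_resid"] - full["df_resid"]  # number of added predictors
    return ((restricted["rss"] - full["rss"]) / q) / (full["rss"] / full["df_resid"])

# Synthetic stand-in for the 112-country table (assumed coefficients, not real data).
rng = np.random.default_rng(42)
n = 112
log_gdp = rng.normal(9.0, 1.2, n)
life_exp = 55.0 + 2.0 * log_gdp + rng.normal(0.0, 3.0, n)
corruption = rng.uniform(10.0, 90.0, n)
log_co2 = 0.8 * log_gdp + rng.normal(0.0, 1.0, n)
happiness = (-2.0 + 0.08 * life_exp + 0.02 * corruption
             - 0.2 * log_co2 + rng.normal(0.0, 0.6, n))

ones = np.ones(n)
simple = fit_ols(np.column_stack([ones, log_gdp]), happiness)
full = fit_ols(np.column_stack([ones, log_gdp, life_exp, corruption, log_co2]), happiness)
f_stat = nested_f(simple, full)

print(f"simple: R^2={simple['r2']:.3f}, residual SE={simple['rse']:.3f}")
print(f"full:   R^2={full['r2']:.3f}, residual SE={full['rse']:.3f}")
print(f"nested F statistic: {f_stat:.2f}")
```

To turn the F statistic into a p-value, `scipy.stats.f.sf(f_stat, q, full["df_resid"])` gives the upper-tail probability, with `q` being the number of added predictors. Keep in mind that $R^2$ never decreases when predictors are added, so the residual standard error and the F-test are the more honest signs of improvement.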
Project Files:
One of the coolest findings was the “GDP Paradox.” In the simple model, GDP looks like it’s everything. But in my multivariate model, the GDP coefficient was no longer statistically significant ($p=0.760$ in the 3-variable test) because Life Expectancy and Corruption absorbed most of its explanatory power. The pattern is consistent with the idea that money doesn’t make you happy directly; it buys the healthcare and honest government that actually do the job.
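A toy simulation shows how this kind of result can arise. In the sketch below, happiness is generated from life expectancy only, and life expectancy is in turn driven by log GDP. All numbers are invented for illustration, and `ols_coefs` is a hypothetical helper; it is not the project's model or data.

```python
import numpy as np

def ols_coefs(columns, y):
    """Return least-squares coefficients [intercept, b1, b2, ...] for the given predictors."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 112

# Assumed data-generating story: GDP raises life expectancy,
# and only life expectancy feeds into happiness.
log_gdp = rng.normal(9.0, 1.2, n)
life_exp = 55.0 + 2.0 * log_gdp + rng.normal(0.0, 1.5, n)
happiness = -5.0 + 0.15 * life_exp + rng.normal(0.0, 0.4, n)

# GDP coefficient when it is the only predictor vs. when life expectancy is controlled for.
gdp_alone = ols_coefs([log_gdp], happiness)[1]
gdp_controlled = ols_coefs([log_gdp, life_exp], happiness)[1]

print(f"GDP coefficient, simple model:        {gdp_alone:.3f}")
print(f"GDP coefficient, with life expectancy: {gdp_controlled:.3f}")
```

In the simple model GDP picks up the effect that flows through life expectancy; once life expectancy is in the model, the GDP coefficient shrinks toward zero. Real data can produce the same pattern for other reasons (for example, plain collinearity between predictors), so this illustrates one possible mechanism rather than confirming it.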
I also kept an eye on recent research regarding environmental psychology. It’s fascinating because it suggests that while we love industrial progress, the $CO_2$ “side effect” might be dragging down our collective mood more than we realized.