Longevity Study → Investigation 15

Does Wealth Buy Years?

30,000+ Americans aged 50+. Same health conditions, same chronic diseases, but different bank accounts. Does adding socioeconomic data to clinical features improve mortality prediction?

The Question

Investigation 13 showed that ML can predict 10-year mortality from health data alone. But people aren’t just their diagnoses — they’re also their bank accounts, their education, their social networks. The social determinants of health (SDOH) literature is clear: wealth predicts longevity. The question is whether adding wealth, income, education, and marital status to a model that already knows your chronic diseases, smoking status, and BMI actually improves prediction — or whether health status already captures most of what wealth does.

People Tracked: --
Died Within 10 Years: --
Health-Only AUC: --
Health+SDOH AUC: --

ADM Prediction (Made Before Running Models)

Predicted winner: Health+SDOH ML, but with a modest gain. Chetty et al. (2016) found a 10–15 year life expectancy gap between the top and bottom 1% of the income distribution. But most of this gap plausibly works through health behaviors and conditions already in our feature set. The interesting question isn’t whether wealth matters; it’s whether wealth adds information beyond what health status already captures.

Results

ROC Curves

Feature Importance (Top 8, Health+SDOH)

Wealth Mortality Gradient

Multi-Model Comparison

Subgroup Analysis: Does SDOH Help Everyone Equally?

The wealth-health gradient may differ by age, sex, or baseline health. Does adding SDOH features improve prediction more for some groups than others?
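One way to frame the subgroup question is to compare AUC within each subgroup using out-of-fold predictions from both models. The sketch below assumes a data frame with hypothetical columns `died`, `p_health`, and `p_full` (predicted probabilities from the Health-only and Health+SDOH models); the toy values are illustrative only and are not study results.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score


def subgroup_auc_gain(df, group_col, y_col="died",
                      base_col="p_health", full_col="p_full"):
    """AUC of each model within each subgroup, plus the difference."""
    rows = []
    for level, g in df.groupby(group_col, observed=True):
        if g[y_col].nunique() < 2:
            continue  # AUC is undefined when a subgroup has only one outcome class
        auc_base = roc_auc_score(g[y_col], g[base_col])
        auc_full = roc_auc_score(g[y_col], g[full_col])
        rows.append({group_col: level, "n": len(g),
                     "auc_health": auc_base, "auc_full": auc_full,
                     "gain": auc_full - auc_base})
    return pd.DataFrame(rows)


# Toy data (hypothetical values, not from the HRS cohort).
toy = pd.DataFrame({
    "age_band": ["50-64"] * 4 + ["65+"] * 4,
    "died":     [0, 0, 1, 1, 0, 0, 1, 1],
    "p_health": [0.1, 0.4, 0.3, 0.8, 0.2, 0.3, 0.6, 0.7],
    "p_full":   [0.1, 0.2, 0.3, 0.8, 0.2, 0.3, 0.6, 0.7],
})
gains = subgroup_auc_gain(toy, "age_band")
print(gains)
```

Small subgroups give noisy AUC estimates, so per-subgroup gains are best read alongside confidence intervals rather than as point values.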

Calibration: Predicted vs Observed

A well-calibrated model should match predicted probabilities to observed frequencies. Does adding SDOH improve calibration as well as discrimination?
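To make the two calibration metrics concrete, here is a minimal sketch of the Brier score and expected calibration error (ECE). It assumes equal-width probability bins and ten bins by default; the page does not state which binning scheme the investigation used, so treat those choices as assumptions.

```python
import numpy as np


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted average gap between observed and predicted risk per bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap  # weight by the share of samples in the bin
    return float(ece)


# Toy check: four people each given a 90% predicted risk, three of whom died.
y = [1, 1, 1, 0]
p = [0.9, 0.9, 0.9, 0.9]
print(brier_score(y, p), expected_calibration_error(y, p))
```

In the toy check, the observed rate is 0.75 against a predicted 0.9, so the ECE is the single-bin gap of 0.15.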

The ADM Insight

Wealth matters for longevity — but not as much as your doctor’s notes. Adding socioeconomic data improves prediction, confirming that the health-wealth gradient is real. But the marginal improvement suggests that most of wealth’s effect on mortality is mediated through the health conditions we already measure: diabetes, heart disease, obesity, depression. Wealth’s independent effect is real but smaller than the mediated pathway.

Cohort: HRS RAND respondents aged 50–90 (the same cohort as the Investigation 13 mortality analysis). Binary outcome: died within 10 years of baseline.

Three models compared: (1) Domain: Charlson-style published risk score using age, chronic conditions, smoking, BMI. (2) Health-only ML: GradientBoostingClassifier on demographics + chronic conditions + health behaviors + functional status. (3) Health+SDOH ML: same GradientBoosting architecture but adds wealth quintile, income quintile, education years, married/partnered status, and medication count.
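The comparison between models (2) and (3) can be sketched as follows: fit the same GradientBoostingClassifier on two feature sets and compare out-of-fold AUC. Everything here is synthetic and assumed for illustration: the feature names, the coefficients of the simulated outcome, and the sample size are not taken from the HRS data, and the AUCs printed say nothing about the study's results.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(42)
n = 2000

# Synthetic stand-ins for health features and one SDOH feature (hypothetical).
age = rng.uniform(50, 90, n)
n_conditions = rng.poisson(1.5, n)
smoker = rng.integers(0, 2, n)
wealth_q = rng.integers(1, 6, n)  # wealth quintile, 1 (lowest) to 5 (highest)

# Simulated outcome with made-up coefficients; wealth has a small direct effect.
logit = -9.0 + 0.1 * age + 0.4 * n_conditions + 0.5 * smoker - 0.2 * wealth_q
died = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_health = np.column_stack([age, n_conditions, smoker])
X_full = np.column_stack([age, n_conditions, smoker, wealth_q])


def cv_auc(X, y):
    """AUC of pooled out-of-fold predictions from 5-fold stratified CV."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    model = GradientBoostingClassifier(random_state=0)
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    return roc_auc_score(y, proba)


auc_health = cv_auc(X_health, died)
auc_full = cv_auc(X_full, died)
print(f"Health-only AUC: {auc_health:.3f}  Health+SDOH AUC: {auc_full:.3f}")
```

Using identical folds and hyperparameters for both feature sets keeps the comparison about the added features rather than about tuning differences.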

SDOH variables: Wealth and income quintiles computed within-cohort using pd.qcut. Education measured in years of schooling. Marital status coded as married/partnered vs not. Medication count is total number of prescription medications.
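As an illustration of the within-cohort coding described above, the sketch below builds quintiles with pd.qcut and a married/partnered indicator. The column names and the synthetic values are hypothetical. Real wealth data often contains many tied values (for example, zero wealth), in which case pd.qcut may need its `duplicates="drop"` option or a rank-based workaround.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the cohort; column names are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "wealth": rng.lognormal(mean=11.0, sigma=1.5, size=1000),
    "income": rng.lognormal(mean=10.0, sigma=1.0, size=1000),
    "marital_status": rng.choice(
        ["married", "partnered", "widowed", "divorced", "never married"],
        size=1000,
    ),
})

# Within-cohort quintiles: 1 = lowest fifth, 5 = highest fifth.
df["wealth_q"] = pd.qcut(df["wealth"], q=5, labels=[1, 2, 3, 4, 5]).astype(int)
df["income_q"] = pd.qcut(df["income"], q=5, labels=[1, 2, 3, 4, 5]).astype(int)

# Married/partnered vs not, coded as a 0/1 indicator.
df["partnered"] = df["marital_status"].isin(["married", "partnered"]).astype(int)

print(df[["wealth_q", "income_q", "partnered"]].head())
```

Because the cut points come from the sample itself, each quintile holds one fifth of the cohort by construction, which is why these quintiles need not match national ones.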

Evaluation: 5-fold stratified cross-validation. Bootstrap 95% CIs from 1,000 resamples. Calibration analysis (Brier score, ECE). Multi-model comparison (LogisticRegression, RandomForest, GradientBoosting).
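A percentile bootstrap is one common way to obtain the 95% CIs mentioned above. The sketch assumes that (outcome, prediction) pairs are resampled with replacement and that the 2.5th and 97.5th percentiles of the resampled AUCs form the interval; the page does not say which bootstrap variant was used. The demo data is synthetic.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def bootstrap_auc_ci(y_true, y_prob, n_boot=1000, alpha=0.05, seed=0):
    """Point-estimate AUC with a percentile bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    n = len(y_true)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # sample n row indices with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # only one class in this resample; AUC is undefined
        aucs.append(roc_auc_score(y_true[idx], y_prob[idx]))
    lower, upper = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(roc_auc_score(y_true, y_prob)), float(lower), float(upper)


# Synthetic demo: predictions equal the true risk, outcomes drawn from it.
demo_rng = np.random.default_rng(1)
risk = demo_rng.uniform(0.05, 0.6, 1000)
outcome = (demo_rng.random(1000) < risk).astype(int)

auc, lo, hi = bootstrap_auc_ci(outcome, risk)
print(f"AUC {auc:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

When two models are scored on the same people, bootstrapping the difference in AUC on shared resamples gives a more direct test of the SDOH gain than comparing two separate intervals.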

Wealth is measured at a single timepoint: Lifetime wealth trajectory may matter more than a snapshot. A recently bankrupt former millionaire and a lifelong low-income worker have very different histories, yet they look alike on current wealth.

HRS oversamples Black and Hispanic respondents: Unweighted wealth distributions may not be nationally representative. Within-cohort quintiles may not match population quintiles.

Medication count is ambiguous: It’s partly an SDOH variable (access to healthcare) and partly a health variable (disease burden). Its placement in the “SDOH” feature set is debatable.

Reverse causation: Poor health may cause low wealth (medical bankruptcy, inability to work), not just the other way around. With wealth measured only at baseline, the data cannot disentangle the direction.