The Data Paradox

The Evidence Is Real

These aren't weak correlations. Wealth and sleep are among the strongest predictors of health outcomes ever documented. The question is whether knowing them improves prediction beyond what the health variables already in the model provide.

Income & Mortality

2.3× mortality gap

Poorest income quintile: 28.0% die within 10 years. Highest: 12.3%. Data from 28,636 adults pooled across ten NHANES cycles (1999–2018).
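The headline figure is plain arithmetic on the two rates quoted above. A minimal check, using only those two numbers (the variable names are ours):

```python
# 10-year mortality rates as quoted in the text above.
poorest_rate = 0.280   # poorest income quintile
highest_rate = 0.123   # highest income quintile

# Relative gap: how many times more likely death is in the poorest quintile.
relative_risk = poorest_rate / highest_rate

# Absolute gap, in percentage points.
absolute_gap_points = (poorest_rate - highest_rate) * 100

print(f"relative: {relative_risk:.2f}x, absolute: {absolute_gap_points:.1f} points")
```

The ratio rounds to the 2.3× shown in the headline, and the same gap in absolute terms is 15.7 percentage points.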

If wealth predicts death this strongly, shouldn't it improve our models?

Sleep & Health Decline

69% vs 39% outcome rate

People with zero sleep problems have a 69% positive health outcome rate. One or more problems: 39–41%. Data from 24,155 people (HRS).

If sleep quality separates outcomes this clearly, shouldn't it help?

The ROC Curves Tell the Truth

ROC curves reveal how well a model distinguishes positive from negative cases across every possible threshold. If adding data helps, the curve should shift upward. Watch what actually happens.

Adding Income & Education Data

Q15: 28,636 NHANES adults · 6,626 deaths · 10-year follow-up

Adding Sleep Data

Q16: 24,155 people · 6-year health decline outcome

More data made it worse: Adding sleep variables to the domain model dropped performance from AUC 0.604 to 0.600. The extra parameters introduced noise without adding signal.
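The AUC figures above compress each ROC curve into one number: the chance that a randomly chosen positive case is scored higher than a randomly chosen negative one. The sketch below computes it that way on toy scores invented for illustration (they are not study data, and the study's own evaluation code is not shown here). It illustrates one way a new variable can change predicted risks while leaving AUC exactly where it was: if no positive/negative pair changes order, the curve cannot move.

```python
def auc(scores, labels):
    """AUC via pairwise comparison (the Mann-Whitney formulation).

    Returns the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels          = [1, 1, 1, 0, 0, 0, 0, 0]
base_scores     = [0.9, 0.6, 0.4, 0.7, 0.3, 0.20, 0.20, 0.1]
# Hypothetical "extended" model: two negatives get slightly different
# scores, but no positive/negative pair changes order.
extended_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.25, 0.15, 0.1]

auc_base = auc(base_scores, labels)
auc_extended = auc(extended_scores, labels)
delta = auc_extended - auc_base

print(f"base {auc_base:.3f}, extended {auc_extended:.3f}, delta {delta:+.3f}")
```

The pairwise loop is quadratic in sample size, which is fine for a toy example; at the scale of the cohorts above, a sort-based version would be the practical choice.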

The Reclassification Shell Game

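Net Reclassification Index (NRI) measures the net share of patients who move to a more appropriate risk category: people who had the outcome moving up and people who didn't moving down, minus the moves in the wrong direction. Here, for every person the new data helps, another is hurt.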
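One common categorical form of NRI can be sketched in a few lines. The function name, the 0–3 category coding, and the ten toy patients are our own assumptions for illustration; the study's actual reclassification tables are not reproduced here. The toy data is built so that helpful and harmful moves cancel.

```python
def net_reclassification(old_cat, new_cat, event):
    """Categorical NRI. Returns (events part, non-events part, total).

    old_cat / new_cat: risk category per person under the old and new model
                       (higher number = higher predicted risk).
    event:             1 if the person had the outcome, else 0.
    """
    up_e = down_e = up_n = down_n = 0
    for old, new, had_event in zip(old_cat, new_cat, event):
        if new > old:            # moved to a higher risk category
            if had_event:
                up_e += 1        # right direction for someone with the outcome
            else:
                up_n += 1        # wrong direction for someone without it
        elif new < old:          # moved to a lower risk category
            if had_event:
                down_e += 1      # wrong direction
            else:
                down_n += 1      # right direction
    n_event = sum(event)
    n_non = len(event) - n_event
    nri_events = (up_e - down_e) / n_event
    nri_nonevents = (down_n - up_n) / n_non
    return nri_events, nri_nonevents, nri_events + nri_nonevents


# Toy data, invented for illustration: 4 people with the outcome, 6 without,
# across 4 risk categories coded 0-3.
event   = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
old_cat = [1, 2, 2, 3, 0, 1, 1, 2, 0, 3]
new_cat = [2, 1, 2, 3, 0, 0, 2, 2, 0, 3]

nri_events, nri_nonevents, nri_total = net_reclassification(old_cat, new_cat, event)
print(nri_events, nri_nonevents, nri_total)
```

Four people change category in this example, yet the net result is zero in both groups: each correct move is offset by an incorrect one.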

SDOH Reclassification

28,636 adults across 4 risk categories

Sleep Reclassification

24,155 patients across 4 risk categories

The Signal Is Already Captured

Feature importance reveals why: the new variables rank near the bottom. Age, self-rated health, and existing conditions already carry the signal that wealth and sleep correlate with.

Health + SDOH Model Features

GradientBoosting importance — SDOH features highlighted

Health + Sleep Model Features

GradientBoosting importance — sleep feature highlighted

Why Wealth Doesn't Help

Low wealth doesn't kill directly; it works through the conditions and behaviors it is linked to: diabetes, heart disease, obesity, smoking. Those are already in the model. Adding wealth adds a proxy for information the model already has.
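The argument can be checked on synthetic data. In the sketch below every probability is invented, the variables are reduced to yes/no flags, and a simple lookup of training-set death rates stands in for the GradientBoosting models named above, so treat it as an illustration of the mechanism and not a reproduction of the study. Wealth influences death only through a health condition; the question is what wealth adds once the condition is known.

```python
import random
from collections import Counter, defaultdict


def auc(scores, labels):
    """Rank-based AUC with ties counted as half a win."""
    pos, neg = Counter(), Counter()
    for s, y in zip(scores, labels):
        (pos if y else neg)[s] += 1
    wins, neg_below = 0.0, 0
    for s in sorted(set(pos) | set(neg)):
        # Positives at this score beat every lower-scored negative
        # and tie with negatives at the same score.
        wins += pos[s] * (neg_below + 0.5 * neg[s])
        neg_below += neg[s]
    return wins / (sum(pos.values()) * sum(neg.values()))


def simulate(n, rng):
    """Rows of (low_wealth, condition, died). All probabilities are invented."""
    rows = []
    for _ in range(n):
        low_wealth = rng.random() < 0.5
        # Low wealth makes the health condition more likely...
        condition = rng.random() < (0.6 if low_wealth else 0.2)
        # ...but death depends on the condition only, not on wealth itself.
        died = rng.random() < (0.4 if condition else 0.1)
        rows.append((int(low_wealth), int(condition), int(died)))
    return rows


def evaluate(train, test, key):
    """Score each test row with the training death rate of its group."""
    deaths, totals = defaultdict(int), defaultdict(int)
    for row in train:
        totals[key(row)] += 1
        deaths[key(row)] += row[2]
    rates = {k: deaths[k] / totals[k] for k in totals}
    return auc([rates[key(r)] for r in test], [r[2] for r in test])


rng = random.Random(0)
train, test = simulate(10_000, rng), simulate(10_000, rng)

auc_wealth = evaluate(train, test, key=lambda r: r[0])        # wealth alone
auc_condition = evaluate(train, test, key=lambda r: r[1])     # condition alone
auc_both = evaluate(train, test, key=lambda r: (r[1], r[0]))  # condition + wealth
gain = auc_both - auc_condition

print(f"wealth {auc_wealth:.3f}  condition {auc_condition:.3f}  "
      f"both {auc_both:.3f}  gain {gain:+.3f}")
```

Under these assumptions wealth on its own predicts death clearly better than chance, yet adding it to a model that already sees the condition moves AUC by little more than sampling noise. That is the proxy effect in miniature.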

Why Sleep Doesn't Help

Poor sleep doesn't predict health decline independently here; it behaves as a symptom of the conditions that do. Self-rated health (the #1 feature, at 19.4% of importance) already captures what sleep problems signal. Adding sleep adds a downstream indicator.

Two Paradoxes, One Lesson

The longevity study surfaced two symmetrical findings. Together, they define the boundaries of the right-fidelity principle.

The Biology Paradox

More computation doesn't always help. ML alone (r = 0.478) was worse than a textbook formula (r = 0.527) for biological age.

The Data Paradox

More data doesn't always help. Adding wealth (+0.003 AUC) to the mortality model and sleep (+0.001 AUC) to the health-decline model changed virtually nothing.

The ADM Principle

The right model at the right fidelity. Not the most complex, not the most data-rich — the one matched to the question and the decision it supports.

The right-fidelity lesson isn't just "simpler is better." In 7 of 13 longevity investigations, ML genuinely outperformed domain knowledge, sometimes dramatically (diabetes risk on NHANES: AUC 0.784 → 0.851, a gain of 0.067). The lesson is that fidelity has a ceiling for each question, and adding complexity beyond that ceiling wastes resources without improving decisions. Analysis Driven Modeling™ finds that ceiling before you build.