Longevity Study → Investigation 6

How Will Health Evolve Over Time?

7,864 people tracked for up to 30 years. Domain models predict the average: self-rated health worsens by about 0.3 points per decade on a 1–5 scale. ML predicts the individual, with a 15–19% RMSE improvement at the longer horizons (8–12 years).

7,864 People Tracked (HRS RAND, 1992–2022)
30 Years of Data (up to 16 waves per participant)
15–19% ML Improvement (growing with prediction horizon)

ADM Prediction (Made Before Running Models)

Predicted winner: ML. Health trajectories are individual-specific. Population curves capture the average trend but miss individual slopes. Linear extrapolation works short-term but diverges at longer horizons because health transitions are nonlinear (disease onset creates step changes).

Expected: ML wins by 10–20% RMSE at longer horizons (8+ years). Actual: 15–19% RMSE improvement, growing with prediction horizon. Prediction confirmed.

The Challenge
Predicting Individual Health Trajectories

Health care systems plan resources based on population averages — but individuals diverge dramatically from the mean. Some maintain excellent self-rated health into their 80s while others experience rapid decline in their 50s. Predicting which trajectory a person will follow from their early health history is the clinical challenge.

Domain Model

Population age curves: mean self-rated health at each age, adjusted for sex. Predicts the average trajectory.

ML Model

GradientBoosting trained on first 3 waves: health ratings, BMI trend, depression, disease burden, lifestyle.

5 Trajectory Types

Stable (35%), gradual decline (30%), rapid decline (15%), recovery (10%), volatile (10%).

6 Prediction Horizons (up to 12 years)

2, 4, 6, 8, 10, and 12 years ahead. ML advantage grows with horizon length.

The domain model uses population-level age curves from the HRS dataset. For a given person's current age and sex, it looks up the average self-rated health (1=excellent to 5=poor) and projects it forward using the population mean trajectory. This is the simplest reasonable baseline — it knows that health declines with age but treats everyone the same.
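As a rough illustration of this kind of baseline, the sketch below builds an age-by-sex lookup of mean ratings and reads the prediction off at the person's future age. The function names and the toy records are hypothetical, and reading "projects it forward" as a lookup at the future age is an assumption; the study's actual code is not shown in the source.

```python
from collections import defaultdict
from statistics import mean


def fit_age_curve(records):
    """Build a population curve: mean self-rated health per (age, sex).

    records: iterable of (age, sex, rating) tuples, rating on the 1-5 scale.
    """
    buckets = defaultdict(list)
    for age, sex, rating in records:
        buckets[(age, sex)].append(rating)
    return {key: mean(values) for key, values in buckets.items()}


def predict_domain(curve, age, sex, horizon_years):
    """Predict by looking up the population mean at the future age.

    Everyone with the same age and sex gets the same prediction.
    Raises KeyError if the curve has no entry for the future age.
    """
    return curve[(age + horizon_years, sex)]


# Toy records, invented for illustration only (not HRS data).
records = [
    (60, "F", 2), (60, "F", 3), (70, "F", 3), (70, "F", 3),
    (60, "M", 2), (60, "M", 2), (70, "M", 3), (70, "M", 2),
]
curve = fit_age_curve(records)
pred = predict_domain(curve, age=60, sex="F", horizon_years=10)
print(pred)
```

Because the lookup ignores everything except age and sex, two 60-year-old women with very different histories receive the same 10-year forecast, which is the limitation the ML model is meant to address.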

The ML model is a GradientBoosting regressor trained on the first 3 waves of each participant's data. Features include: baseline health rating, BMI trajectory, activity level, age, disease count, depression score, education, and smoking history. It predicts health at horizons of 2 through 12 years.
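A minimal sketch of this setup follows, assuming "GradientBoosting" refers to scikit-learn's GradientBoostingRegressor (the source does not name the library). All data here is synthetic, the feature set is a reduced subset of the one listed above, and the BMI trend is approximated as a least-squares slope across the three waves; each of these is an illustrative assumption.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400  # synthetic participants, not the study's 7,864

# Synthetic stand-ins for the first 3 waves of each participant.
health_w = rng.integers(1, 6, size=(n, 3)).astype(float)  # ratings 1..5
bmi_w = (27 + rng.normal(0, 3, size=(n, 1))
         + rng.normal(0, 0.5, size=(n, 3)).cumsum(axis=1))
age = rng.integers(50, 80, size=n).astype(float)
disease_count = rng.poisson(1.0, size=n).astype(float)

# BMI trajectory summarized as the per-person slope across waves 0, 1, 2.
waves = np.array([0.0, 1.0, 2.0])
bmi_slope = np.polyfit(waves, bmi_w.T, 1)[0]

# One row per participant: baseline rating, mean early rating,
# BMI slope, age, disease count.
X = np.column_stack([
    health_w[:, 0], health_w.mean(axis=1), bmi_slope, age, disease_count,
])

# Synthetic future rating (hypothetical relationship, for demonstration).
y = np.clip(
    0.8 + 0.6 * health_w.mean(axis=1) + 0.3 * disease_count
    + 0.02 * (age - 50) + rng.normal(0, 0.3, size=n),
    1, 5,
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
print(f"held-out RMSE: {rmse:.3f}")
```

In a multi-horizon setting, one simple option is to fit a separate regressor per horizon; `model.feature_importances_` then gives normalized importances (summing to 1) for each fitted model.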

Both models are evaluated on held-out test sets using RMSE. The key finding: the ML advantage grows with prediction horizon. At 2 years, the two methods perform similarly because individual differences haven't had time to manifest. At the longer horizons (8–12 years), ML's advantage reaches 15–19%.
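For reference, the comparison metric can be sketched as below. The relative-improvement formula, (domain RMSE minus ML RMSE) divided by domain RMSE, is an assumption about how the percentage is computed, and the numbers are invented toy values rather than study results.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pct_improvement(rmse_domain, rmse_ml):
    """Percent reduction in RMSE of the ML model relative to the domain model."""
    return 100.0 * (rmse_domain - rmse_ml) / rmse_domain


# Toy held-out ratings and predictions (illustrative only).
y_true = [2, 3, 4, 1]
domain_pred = [3, 3, 3, 3]   # a population average applied to everyone
ml_pred = [2, 3, 3, 2]       # individualized predictions

rmse_domain = rmse(y_true, domain_pred)
rmse_ml = rmse(y_true, ml_pred)
improvement = pct_improvement(rmse_domain, rmse_ml)
print(f"domain={rmse_domain:.3f}  ml={rmse_ml:.3f}  improvement={improvement:.1f}%")
```

Computing this once per horizon, on the same held-out participants for both models, yields the kind of horizon-by-horizon comparison described here.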

Population Trajectory
Average Health Trajectory with Uncertainty

The population mean health trajectory shows a gentle decline over 12 waves. But the interquartile range (shaded bands) reveals massive individual variation. The domain model predicts only the center line; the ML model captures who falls where within the spread.

Self-rated health (higher = worse). Shaded bands = IQR. Domain model predicts center line only.

Horizon Comparison
RMSE by Prediction Horizon

Both models predict self-rated health at six horizons. The domain model uses population averages; the ML model uses individual features from early waves. The ML advantage widens as the horizon grows — individual patterns matter more over longer periods.

RMSE on held-out test sets. Lower is better. Percentage labels show ML improvement.

By Trajectory Type
Who Benefits Most from ML Prediction?

The ML model improves over the domain baseline across all trajectory types, but the improvement is largest for atypical trajectories — recovery and rapid decline — exactly where personalized insight matters most.

Domain accuracy vs ML accuracy by trajectory type. Higher is better.

What Matters
Top Predictors of Health Trajectory

Baseline health is the single strongest predictor — where you start determines where you're likely to go. But BMI trajectory, activity level, and disease burden add substantial predictive value beyond demographics alone.

GradientBoosting feature importances for trajectory prediction.

Self-rated health as outcome: The predicted variable is self-rated health (1-5 scale), which is subjective and influenced by mood, expectations, and cultural norms. Objective health measures (biomarkers, functional tests) might show different domain-vs-ML patterns.

First 3 waves as features: The ML model uses the first 3 waves (~6 years) to predict future health. People who drop out before wave 3 are excluded, creating survivorship bias toward healthier individuals.

RMSE improvement measured against a simple baseline: The 15–19% RMSE improvement is relative to population age curves. Comparing against stronger baselines, such as cubic splines or individual-level linear trends, would test the ML advantage more rigorously.

Trajectory types are post-hoc: The 5 trajectory types (stable, gradual decline, rapid decline, recovery, volatile) are clustering labels applied after observing all waves, not prospectively defined categories.
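One of the stronger baselines mentioned in the caveats above, an individual-level linear trend, could look roughly like the sketch below. It fits a least-squares line to one person's early ratings and extrapolates to the target horizon, clipping to the 1–5 scale. The function name, the wave timing (years 0, 2, 4), and the example ratings are hypothetical; this baseline is not part of the reported results.

```python
def linear_trend_forecast(years, ratings, horizon_years, lo=1.0, hi=5.0):
    """Extrapolate one person's ratings with an ordinary least-squares line.

    years: observation times of the early waves (e.g. [0, 2, 4]).
    ratings: self-rated health at those times (1 = excellent, 5 = poor).
    horizon_years: how far past the last observation to predict.
    The result is clipped to the valid rating range [lo, hi].
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, ratings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    target = years[-1] + horizon_years
    return min(hi, max(lo, intercept + slope * target))


# Hypothetical participant: ratings 2, 2, 3 at years 0, 2, 4.
forecast = linear_trend_forecast([0, 2, 4], [2, 2, 3], horizon_years=4)
print(round(forecast, 2))

# A steep early slope runs off the scale at long horizons and gets clipped.
clipped = linear_trend_forecast([0, 2, 4], [3, 4, 5], horizon_years=12)
print(clipped)
```

The clipping case shows why straight-line extrapolation tends to degrade at long horizons: a slope estimated from three points quickly leaves the valid range.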

ADM Insight
Population averages predict that everyone slowly declines. But 10% of people recover, 15% decline rapidly, and 10% are volatile. The domain model captures the average trajectory. The ML model captures the person. The biggest improvement is for "recovery" trajectories, the people who defy the average. That's exactly where personalized prediction changes clinical decisions. The right model for trajectory prediction needs individual features, not just population statistics.

Key Takeaways
  1. ML advantage grows with prediction horizon — 15–19% improvement at 8–12 years. At shorter horizons, population averages are nearly as good because individual differences haven't had time to manifest.
  2. Recovery and rapid decline trajectories benefit most — these are the hardest to predict but the most clinically important. The people who defy the average are exactly who we need to identify early.
  3. Baseline health is the strongest predictor — accounting for 28% of feature importance. But BMI trajectory, activity level, and disease burden add substantial predictive value beyond where you start.