ADM Prediction (Made Before Running Models)
Predicted winner: Hybrid. KDM provides a physiologically grounded prior using published normative aging curves. ML alone must learn these from scratch with only 1,907 samples — it will overfit. But KDM assumes linear biomarker-age relationships and ignores interactions. The hybrid uses KDM’s estimate as a strong prior, then learns residual nonlinearities.
Expected: KDM > ML alone (insufficient data); Hybrid > KDM (residual nonlinearities). Actual (Pearson r, forward split-half validation): KDM r=0.527, ML r=0.478, Hybrid r=0.631. Prediction confirmed.
Three grouped bar clusters showing Pearson r for KDM (domain), ML, and Hybrid across forward split-half, reverse split-half, and self-rated health correlation. Hybrid wins the forward validation that matters most for clinical use. ML wins on reverse (predicting biomarkers from outcomes) and self-rated health.
Three individuals illustrate the range of biological aging. The “healthy ager” (chronological age 56, biological age 47.5) shows how favorable biomarker values translate to a younger biological age. The “accelerated ager” (chronological age 45, biological age 60.1) shows how elevated HbA1c and CRP can add about 15 years of biological aging.
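KDM (the Klemera-Doubal method) regresses each biomarker on chronological age to obtain a slope, an intercept, and a residual standard deviation, then combines the per-biomarker age estimates with a weighted maximum-likelihood estimator that gives more weight to biomarkers that track age more tightly. The resulting biological age estimate for an individual depends only on that person's blood biomarkers.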
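The two steps can be sketched in a few lines of NumPy. This is a minimal illustration of the biomarker-only Klemera-Doubal estimator, not the code used in this investigation; the function names fit_kdm and kdm_age and the array layout are assumptions made for the example.

```python
import numpy as np

def fit_kdm(biomarkers, age):
    """Regress each biomarker on chronological age (one OLS fit per column).

    Returns intercepts q, slopes k, and residual standard deviations s.
    Names and layout are illustrative, not taken from the original analysis.
    """
    X = np.asarray(biomarkers, dtype=float)   # shape (n_people, n_biomarkers)
    a = np.asarray(age, dtype=float)          # shape (n_people,)
    a_centered = a - a.mean()
    k = (a_centered @ (X - X.mean(axis=0))) / (a_centered @ a_centered)
    q = X.mean(axis=0) - k * a.mean()
    resid = X - (q + np.outer(a, k))
    s = resid.std(axis=0, ddof=2)             # two fitted parameters per biomarker
    return q, k, s

def kdm_age(biomarkers, q, k, s):
    """Biomarker-only KDM estimate: sum((x - q) * k / s^2) / sum((k / s)^2)."""
    X = np.asarray(biomarkers, dtype=float)
    weights = k / s**2
    return ((X - q) @ weights) / np.sum((k / s) ** 2)
```

Each biomarker implies its own age estimate, (x - q) / k, and the formula averages those estimates with weights (k / s)^2, so a biomarker with a steep slope and little scatter around its regression line counts for more.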
Split-half validation splits the sample into two halves. KDM coefficients are fitted on Half A and then used to make predictions for Half B. Then we check: does the predicted biological age correlate with aging outcomes in Half B? This prevents the circular reasoning that inflated earlier studies (r=0.965 was just predicting chronological age).
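A minimal sketch of this procedure follows. The helper names (split_half_indices, split_half_validate) and the fit/predict callables are hypothetical; any model that can be fitted on one half and applied to the other would slot in.

```python
import numpy as np

def split_half_indices(n, seed=0):
    """Randomly partition indices 0..n-1 into two disjoint halves."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    half = n // 2
    return perm[:half], perm[half:]

def pearson_r(x, y):
    return float(np.corrcoef(x, y)[0, 1])

def split_half_validate(fit, predict, X, age, outcome, seed=0):
    """Fit on Half A, predict for Half B, correlate with the outcome in Half B.

    fit(X, age) returns model parameters; predict(X, params) returns
    biological age estimates. Both are supplied by the caller.
    """
    X, age, outcome = (np.asarray(v, dtype=float) for v in (X, age, outcome))
    a_idx, b_idx = split_half_indices(len(age), seed)
    params = fit(X[a_idx], age[a_idx])        # coefficients never see Half B
    pred_b = predict(X[b_idx], params)
    return pearson_r(pred_b, outcome[b_idx])
```

The key property is that the outcome variable is never passed to fit, and Half B rows are never used to estimate coefficients, so a high correlation cannot come from the model having seen what it is scored against.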
Self-rated health is fully external: people who rate their health as “Poor” should have higher biological age deltas than those who rate it “Excellent.” This provides independent confirmation.
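One simple way to run this check is to average the biological age delta (biological minus chronological age) within each self-rated health category and look for a rising trend from “Excellent” to “Poor.” The sketch below assumes a five-level scale and uses hypothetical function names; the trend statistic is a plain Pearson correlation on the ordinal codes, chosen for brevity rather than because the original analysis used it.

```python
import numpy as np

# Assumed five-point ordering, best to worst.
SRH_LEVELS = ["Excellent", "Very good", "Good", "Fair", "Poor"]

def mean_delta_by_srh(bio_age, chrono_age, srh):
    """Mean biological age delta within each self-rated health level present."""
    delta = np.asarray(bio_age, dtype=float) - np.asarray(chrono_age, dtype=float)
    srh = np.asarray(srh)
    return {
        level: float(delta[srh == level].mean())
        for level in SRH_LEVELS
        if np.any(srh == level)
    }

def srh_trend(bio_age, chrono_age, srh):
    """Correlation between the ordinal SRH code (0 = Excellent) and the delta."""
    delta = np.asarray(bio_age, dtype=float) - np.asarray(chrono_age, dtype=float)
    codes = np.array([SRH_LEVELS.index(level) for level in srh], dtype=float)
    return float(np.corrcoef(codes, delta)[0, 1])
```

A positive trend value means that worse self-rated health goes with a larger delta, which is the direction the validation expects.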
ADM Insight
This investigation had to be rewritten. The original hybrid model achieved r=0.965 — suspiciously perfect. It was predicting risk factors from the biomarkers that define those risk factors. Split-half cross-validation breaks this circularity: train on blood labs, validate against clinical outcomes. The hybrid drops to r=0.631 — genuine biology, not tautology. And the star finding: KDM_Delta (the domain knowledge variable) is the #1 feature in the hybrid model, with 44% of total importance. Domain knowledge isn't replaced by ML. It's the backbone that ML extends.
Why does ML alone (r=0.478) underperform KDM (r=0.527)? With only 1,907 samples and 7 biomarkers, the neural network overfits to noise. KDM provides a physiologically grounded prior — it knows which biomarkers should increase or decrease with age and weights them by their biological reliability. The hybrid model succeeds because KDM constrains the search space while ML learns the residual patterns KDM misses. This is the ADM thesis in miniature: encode what physiology tells you, then let data handle the rest.
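The "prior plus residual" idea can be illustrated with a two-stage fit. This is a deliberately small stand-in, not the hybrid model from this investigation: a ridge regression on quadratic features replaces the neural network, and the names fit_hybrid, predict_hybrid, and target are assumptions for the example.

```python
import numpy as np

def _quadratic_features(X):
    """Feature map [1, x, x^2] so the second stage can capture curvature."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X, X**2])

def fit_hybrid(kdm_est, X, target, ridge=1.0):
    """Stage 1: calibrate the KDM estimate. Stage 2: model what is left over."""
    kdm_est = np.asarray(kdm_est, dtype=float)
    target = np.asarray(target, dtype=float)
    # Stage 1: the KDM estimate acts as the prior (linear calibration only).
    A = np.column_stack([np.ones(len(kdm_est)), kdm_est])
    beta = np.linalg.lstsq(A, target, rcond=None)[0]
    resid = target - A @ beta
    # Stage 2: ridge regression of the stage-1 residuals on quadratic features.
    Phi = _quadratic_features(X)
    gamma = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(Phi.shape[1]), Phi.T @ resid)
    return beta, gamma

def predict_hybrid(kdm_est, X, params):
    """Prior prediction plus the learned residual correction."""
    beta, gamma = params
    kdm_est = np.asarray(kdm_est, dtype=float)
    return beta[0] + beta[1] * kdm_est + _quadratic_features(X) @ gamma
```

Because the second stage only has to explain the residual, it starts from the KDM answer rather than from nothing, which is the sense in which the prior constrains the search space.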
Small sample: N=1,907 from NHANES 2017-2018. This limits the ML model's ability to learn complex patterns and may explain why ML alone underperforms KDM. Larger cohorts (UK Biobank, N>500K) would likely narrow or reverse this gap.
Cross-sectional only: Biological age is estimated from a single snapshot. Longitudinal tracking (does higher bio-age predict faster decline?) would be a stronger validation than cross-sectional correlation.
Self-rated health as external validator: SRH is subjective and correlated with depression, socioeconomic status, and cultural factors. Stronger external validators would include mortality, disease onset, or functional decline — requiring longitudinal follow-up not available in NHANES.
Seven biomarkers only: KDM uses 7 blood biomarkers. Adding epigenetic clocks (DNA methylation), telomere length, or proteomics could substantially improve all three models.
No fairness analysis: Bio-age estimation may perform differently across racial/ethnic groups, sexes, and socioeconomic strata. Subgroup validation has not yet been performed.