From Point Estimates to a Pooled Posterior
Di et al. 2017 (Medicare, ages ≥65) and Krewski et al. 2009 (ACS CPS-II, ages ≥30) are both landmark PM2.5 mortality studies. Each reports a single hazard ratio (1.073 and 1.056 per 10 µg/m³, respectively) with a tight confidence interval. But the concentration-response function (CRF) is not a binary choice between them: it is a continuous quantity, subject to county-level spatial heterogeneity and measurement uncertainty, that both studies attempt to estimate.
Inv 21 replaces the pick-one-study approach with a proper Bayesian hierarchy: a normal-normal hierarchical model, fit by Gibbs sampling, over 58 California counties with partial pooling toward the state mean. The output is a posterior distribution over the CA-specific CRF, with residual uncertainty that can be propagated honestly into policy decisions.
The question: does the CA pooled posterior bracket both Di and Krewski, and how much decision-relevant uncertainty remains after pooling?
L1 Discrete Choice → L4 E-Value Sensitivity
L3 Gibbs sampler: 6,000 iterations, 1,000 burn-in, normal-normal conjugate updates (no PyMC needed). Partial pooling shrinks county-level estimates by 82%. E-value formula: E = HR + √(HR·(HR−1)), from VanderWeele & Ding 2017.
The Posterior CI Brackets Both CRFs
The L3 state-level posterior mean is HR = 1.0671 per 10 µg/m³, 95% CI [1.0504, 1.0840]. This interval covers both Di and Krewski, which means Di and Krewski are better understood as two samples from a single underlying CRF distribution than as competing hypotheses.
| Approach | HR per 10 µg/m³ (95% CI) | Coverage of Di (1.073) and Krewski (1.056) | T2 deaths avoided | Residual EVSI |
|---|---|---|---|---|
| L1 Di alone | 1.071–1.075 | Di only (Krewski outside) | 535 | — |
| L1 Krewski alone | 1.040–1.070 | Krewski only (Di outside) | 2,127 | — |
| L3 hierarchical (CA pooled) | 1.0504–1.0840 | Both Di and Krewski | 373–612 | $0.05B |
The T2 deaths-avoided range in the L3 row scales linearly with the posterior β. The residual EVSI is small because the posterior already integrates over a range that contains both Di and Krewski, so a definitive CRF resolution study would add little new information.
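The coverage column can be checked mechanically from the intervals in the table. The sketch below is only an illustration: the dictionary names and the `covered` helper are hypothetical, and the conversion to the log scale assumes that β denotes ln(HR) per 10 µg/m³, which the page does not state explicitly.

```python
import math

# Intervals and point estimates copied from the table above.
intervals = {
    "L1 Di alone": (1.071, 1.075),
    "L1 Krewski alone": (1.040, 1.070),
    "L3 hierarchical (CA pooled)": (1.0504, 1.0840),
}
points = {"Di": 1.073, "Krewski": 1.056}


def covered(interval, estimates):
    """Return the names of the point estimates that fall inside the interval."""
    lo, hi = interval
    return [name for name, hr in estimates.items() if lo <= hr <= hi]


coverage = {label: covered(ci, points) for label, ci in intervals.items()}

# Assumption: beta is the log hazard ratio per 10 µg/m³.
pooled_lo, pooled_hi = intervals["L3 hierarchical (CA pooled)"]
beta_lo, beta_hi = math.log(pooled_lo), math.log(pooled_hi)

for label, names in coverage.items():
    print(f"{label}: covers {names}")
print(f"pooled beta interval: [{beta_lo:.4f}, {beta_hi:.4f}]")
```

Only the pooled interval returns both study names, which is the sense in which the posterior brackets both CRFs.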
How Big Would a Confounder Need to Be?
The E-value is 1.33 at the posterior mean and 1.28 at the lower CI bound. An unmeasured confounder would therefore need a relative risk of at least 1.33 with both PM2.5 exposure and all-cause mortality to fully explain away the point estimate, and at least 1.28 to pull the lower CI bound down to the null. Published confounders in air pollution epidemiology (income, smoking, healthcare access) typically have RR < 1.3 after adjustment, so the PM2.5 signal is robust to plausible unmeasured confounding, although the margin at the lower bound is narrow.
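The two E-values can be reproduced from the formula quoted earlier, E = HR + √(HR·(HR−1)). The function name `e_value` is hypothetical; the reciprocal step for HR < 1 follows the usual convention for protective estimates and is not needed for the values on this page.

```python
import math


def e_value(hr):
    """E-value for a hazard ratio, using E = HR + sqrt(HR * (HR - 1))."""
    if hr <= 0:
        raise ValueError("hazard ratio must be positive")
    if hr < 1.0:
        # Protective estimates are inverted before applying the formula.
        hr = 1.0 / hr
    return hr + math.sqrt(hr * (hr - 1.0))


e_mean = e_value(1.0671)   # posterior mean HR
e_lower = e_value(1.0504)  # lower bound of the 95% CI

print(f"E-value at posterior mean: {e_mean:.2f}")
print(f"E-value at lower CI bound: {e_lower:.2f}")
```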
The Posterior Is the Anchor
The L3 posterior is the object that gets propagated forward into downstream investigations. The residual EVSI of a definitive CRF resolution study against this posterior is ~$0.05B: small, because the posterior 95% CI already brackets both Di and Krewski. The decision problem is not "which single CRF is right" but "where on the posterior continuum are we, and does it change the policy ranking?"
Pointer to Inv 24: that investigation stages the residual research budget across five candidate designs (meta-analysis, retrospective cohort, Di-Medicare extension, CA prospective, multi-cohort consortium), ranked by expected shrinkage per dollar against this posterior.
Normal-Normal Gibbs Sampler
The model:
y_i | β_i ~ N(β_i, s_i²),
β_i | μ, τ² ~ N(μ, τ²),
μ ~ N(0, 10²),
τ² ~ InverseGamma(2, 0.0001).
All three updates (β_i, μ, and τ²) are conjugate, so the Gibbs sampler uses closed-form normal and inverse-gamma draws with no Metropolis correction. A run of 6,000 iterations with a 1,000-step burn-in completes in ~200 ms of single-core wall time.
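A minimal sketch of such a sampler is shown below; it is not the code behind the reported numbers. It assumes that InverseGamma(2, 0.0001) is a shape/scale parameterization, that y_i is on the log-HR scale, and that the 1,000 burn-in draws are part of the 6,000 iterations. The function name `gibbs_normal_normal` and the toy data are hypothetical stand-ins for the 58 county inputs.

```python
import numpy as np


def gibbs_normal_normal(y, s, n_iter=6000, burn_in=1000, seed=0,
                        mu0=0.0, v0=10.0 ** 2, a0=2.0, b0=1e-4):
    """Gibbs sampler for y_i ~ N(beta_i, s_i^2), beta_i ~ N(mu, tau2),
    mu ~ N(mu0, v0), tau2 ~ InverseGamma(a0, b0) (shape, scale)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    s2 = np.asarray(s, dtype=float) ** 2
    n_units = y.size

    # Initialise at simple moment estimates.
    mu, tau2 = y.mean(), max(y.var(), 1e-6)
    mu_draws = np.empty(n_iter)
    tau2_draws = np.empty(n_iter)
    beta_draws = np.empty((n_iter, n_units))

    for t in range(n_iter):
        # beta_i | mu, tau2, y: precision-weighted normal.
        prec = 1.0 / s2 + 1.0 / tau2
        mean = (y / s2 + mu / tau2) / prec
        beta = rng.normal(mean, np.sqrt(1.0 / prec))

        # mu | beta, tau2: normal, combining the betas with the prior.
        prec_mu = n_units / tau2 + 1.0 / v0
        mean_mu = (beta.sum() / tau2 + mu0 / v0) / prec_mu
        mu = rng.normal(mean_mu, np.sqrt(1.0 / prec_mu))

        # tau2 | beta, mu: inverse-gamma, drawn as 1 / Gamma(shape, scale=1/b).
        a_post = a0 + 0.5 * n_units
        b_post = b0 + 0.5 * np.sum((beta - mu) ** 2)
        tau2 = 1.0 / rng.gamma(a_post, 1.0 / b_post)

        mu_draws[t], tau2_draws[t], beta_draws[t] = mu, tau2, beta

    keep = slice(burn_in, None)  # discard the burn-in draws
    return mu_draws[keep], tau2_draws[keep], beta_draws[keep]


# Hypothetical toy inputs: 58 units with made-up effects and standard errors.
data_rng = np.random.default_rng(42)
true_beta = data_rng.normal(np.log(1.067), 0.01, size=58)
s = data_rng.uniform(0.005, 0.03, size=58)
y = data_rng.normal(true_beta, s)

mu_draws, tau2_draws, beta_draws = gibbs_normal_normal(y, s)
hr_lo, hr_mid, hr_hi = np.exp(np.percentile(mu_draws, [2.5, 50, 97.5]))
print(f"pooled HR per 10 µg/m³: {hr_mid:.4f} [{hr_lo:.4f}, {hr_hi:.4f}]")
```

The β_i step is where partial pooling happens: each unit's draw is centred on a precision-weighted average of its own estimate and μ, so units with large s_i are pulled hardest toward the pooled mean.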
County observations are simulated from published CRF heterogeneity (Qian et al. 2022) because the county-level β fits from the Medicare linkage are proprietary. The simulation preserves regional offsets (higher CRF in the LA Basin and San Joaquin Valley, lower in the Sierra and North Coast) and population-scaled standard errors, so the partial-pooling structure is honest. Swapping in real county fits when they become available is a one-line change.
Sources: Di et al. 2017 (NEJM); Krewski et al. 2009 (HEI 140); VanderWeele & Ding 2017 (E-value); Qian et al. 2022 (spatial heterogeneity); Gelman, Carlin, Stern & Rubin 2013 (Bayesian Data Analysis, ch. 5 hierarchical models).
Convergence disclosure: The reported posterior comes from a single Gibbs chain (6,000 iterations, 1,000 burn-in). Because every update is conjugate normal/inverse-gamma there is no Metropolis rejection, and the chain mixes rapidly in simulation — but formal multi-chain R̂ (Gelman-Rubin) and effective-sample-size diagnostics have not been reported on this page. The pooling shrinkage (82%) is driven primarily by the likelihood-weighted county SEs, not chain length, so the ordinal conclusions are stable; the CI width on the global μ should be read as approximate (± ~1 MCMC SE). A production submission would quote R̂ < 1.01 across 4 chains and n_eff > 1,000.
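For reference, the multi-chain R̂ check could be sketched as follows. This is the classic (non-split) Gelman-Rubin statistic for one scalar parameter; the function name `gelman_rubin` is hypothetical, and the chains here are synthetic independent draws standing in for real sampler output, not results from this page.

```python
import numpy as np


def gelman_rubin(chains):
    """Classic (non-split) R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    if n_chains < 2:
        raise ValueError("R-hat needs at least two chains")
    within = chains.var(axis=1, ddof=1).mean()            # W
    between = n_draws * chains.mean(axis=1).var(ddof=1)   # B
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(var_hat / within))


# Synthetic stand-ins: four well-mixed chains, then one chain shifted away.
rng = np.random.default_rng(0)
good = rng.normal(0.065, 0.003, size=(4, 5000))
bad = good + np.array([[0.0], [0.0], [0.0], [0.01]])

rhat_good = gelman_rubin(good)
rhat_bad = gelman_rubin(bad)
print(f"R-hat, agreeing chains:    {rhat_good:.4f}")
print(f"R-hat, one shifted chain:  {rhat_bad:.4f}")
```

In practice the same function would be applied to the post-burn-in μ draws from each of four independently seeded runs of the sampler; established libraries also provide split R̂ and effective-sample-size estimates.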