California Freight Cleanup → Investigation 6-9
Does an independent model validate the NSGA-II runner-up portfolio?
VERDICT: AMBIGUOUS — midpoint $3.12M/death [$6.80M Di, $2.03M Krewski]
The NSGA-II runner-up portfolio (Q_nsga_2) concentrates $1.81B on indoor air quality improvement, a novel design the original surrogate had never evaluated at that scale. Investigation 6-9 tests that prediction by running each spending pathway through an independent model. The surrogate holds within 8% at the midpoint. The problem is the underlying health-risk science: the two anchor studies (Di and Krewski) disagree by 3.4×.
The decision
When we expanded the portfolio candidate set using Pareto optimization, a new winner emerged: Q_nsga_2, which concentrates its spending entirely on indoor air improvement with a small zero-cost power plant retirement and nothing on transport, building upgrades, or wildfire. That result depends on a linear surrogate model’s prediction of indoor air benefits at a scale and composition it had never actually seen.
This is the largest open caveat in the cascade. If the surrogate over-predicts the indoor benefit at that design point, the new winner is a modeling artifact, and the zero-cost free-lunch portfolio remains the conservative production recommendation. Investigation 6-9 addresses that caveat by running the indoor pathway through an independent model to see whether the surrogate’s prediction holds.
Q_nsga_2 x-vector decomposition
Q_nsga_2 = [wildfire=0, transport=$0B, building=$0B, indoor=$1.812B, dte_on=0.019]. Three distinct exposure pathways, validated independently:
| Pathway | Spend | Validation source | Deaths/yr (Di) | Deaths/yr (Krewski) | Deaths/yr (Midpoint) |
|---|---|---|---|---|---|
| Outdoor ISRM | $0 | Identity from x-vector (no transport/building/wildfire spend) | 0.0 | 0.0 | 0.0 |
| Indoor air | $1.812B | Inv 19 L4 personal-exposure (CHAD time-activity, 5 archetypes), scaled from $2B | 266.4 | 894.4 | 580.4 |
| DTE Stockton partial (0.019×) | $0 (free) | Inv 13 ISRM × NEI point-source (3.73 deaths/yr full retire), linearly scaled | 0.07 | 0.07 | 0.07 |
| TOTAL | $1.812B | | 266.5 | 894.5 | 580.5 |
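The totals row can be reproduced with simple arithmetic. The sketch below copies the per-pathway values from the table rather than rerunning Inv 19 or Inv 13; the `PATHWAYS` layout and the function name are illustrative assumptions, not the investigation's actual code.

```python
# Illustrative reconstruction of the pathway sum (values copied from the table).
DTE_FULL_RETIRE_DEATHS = 3.73  # Inv 13: deaths/yr avoided at full retirement
DTE_ON = 0.019                 # Q_nsga_2 design variable (fraction of full retire)

# Linear scaling of the DTE benefit by the continuous design variable.
dte_deaths = DTE_FULL_RETIRE_DEATHS * DTE_ON

# Deaths/yr avoided per pathway under each CRF anchor.
PATHWAYS = {
    "outdoor_isrm": {"di": 0.0, "krewski": 0.0},      # no outdoor spend
    "indoor_air": {"di": 266.4, "krewski": 894.4},    # Inv 19, scaled from $2B
    "dte_partial": {"di": dte_deaths, "krewski": dte_deaths},
}


def total_deaths(anchor: str) -> float:
    """Sum avoided deaths across all pathways for one CRF anchor."""
    return sum(pathway[anchor] for pathway in PATHWAYS.values())


totals = {anchor: total_deaths(anchor) for anchor in ("di", "krewski")}
totals["midpoint"] = (totals["di"] + totals["krewski"]) / 2

for anchor, value in totals.items():
    print(f"{anchor:9s} {value:6.1f} deaths/yr")
```

The DTE term contributes under 0.1 deaths/yr, so the totals are driven almost entirely by the indoor pathway.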
Validated vs. surrogate comparison
| Metric | Di-anchor | Krewski-anchor | Midpoint | Surrogate (Investigation M-3) |
|---|---|---|---|---|
| Deaths/yr | 266.5 | 894.5 | 580.5 | 631.2 |
| Validated / Surrogate | 0.42× | 1.42× | 0.92× | 1.00× |
| Cost $B | 1.812 | 1.812 | 1.812 | 1.812 |
| $/death ($M) | $6.80M | $2.03M | $3.12M | $2.87M |
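The ratio and cost-effectiveness rows follow directly from the deaths/yr row and the $1.812B cost. A minimal sketch of that arithmetic, with variable and function names chosen here for illustration:

```python
# Recompute the "Validated / Surrogate" and "$/death" rows from the table inputs.
COST_USD = 1.812e9          # Q_nsga_2 total spend
SURROGATE_DEATHS = 631.2    # Investigation M-3 surrogate prediction, deaths/yr

validated = {"Di": 266.5, "Krewski": 894.5, "Midpoint": 580.5}


def ratio_to_surrogate(deaths: float) -> float:
    """Validated deaths/yr as a multiple of the surrogate prediction."""
    return deaths / SURROGATE_DEATHS


def cost_per_death_musd(deaths: float) -> float:
    """Cost per avoided death, in millions of dollars."""
    return COST_USD / deaths / 1e6


for name, deaths in validated.items():
    print(f"{name:9s} ratio={ratio_to_surrogate(deaths):.2f}x "
          f"cost=${cost_per_death_musd(deaths):.2f}M/death")
print(f"Surrogate cost=${cost_per_death_musd(SURROGATE_DEATHS):.2f}M/death")
```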
The midpoint-vs-surrogate ratio (0.92×) is the most direct apples-to-apples comparison: Investigation M-3’s surrogate was calibrated to the Di/Krewski midpoint (indoor_deaths_per_b = 320.25 = (147 + 493.5)/2 at $2B). The surrogate holds within 8% of the ISRM-validated midpoint at this design point. The problem is not the surrogate’s midpoint estimate—the problem is that the Di/Krewski spread (3.4×) makes the decision contingent on which indoor CRF applies.
Verdict determination
VERDICT: AMBIGUOUS
Confirmed threshold (surrogate × 80%): 505.0 deaths/yr.
- Di-anchor (266.5 deaths/yr): BELOW threshold → refuted under Di
- Krewski-anchor (894.5 deaths/yr): ABOVE threshold → confirmed under Krewski
- Midpoint (580.5 deaths/yr): ABOVE threshold → confirmed under midpoint
Because one anchor is above and one is below, the verdict is AMBIGUOUS. The CRF choice is decision-controlling: the policy recommendation changes depending on which indoor PM2.5 CRF is accepted.
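The threshold test above can be written as a short decision rule. This is a sketch of the logic as described, not the investigation's code; the `CONFIRMED` and `REFUTED` labels for the cases where both anchors agree are assumptions, since the source only states the AMBIGUOUS case.

```python
# Threshold rule: an anchor is "confirmed" if it reaches 80% of the surrogate.
SURROGATE_DEATHS = 631.2   # Investigation M-3 surrogate prediction, deaths/yr
CONFIRM_FRACTION = 0.80    # confirmation floor as a fraction of the surrogate

validated = {"Di": 266.5, "Krewski": 894.5, "Midpoint": 580.5}


def anchor_status(deaths: float, threshold: float) -> str:
    return "confirmed" if deaths >= threshold else "refuted"


def overall_verdict(status: dict) -> str:
    """Combine the two CRF anchors; disagreement yields AMBIGUOUS."""
    anchors = {status["Di"], status["Krewski"]}
    if anchors == {"confirmed"}:
        return "CONFIRMED"   # assumed label for the both-above case
    if anchors == {"refuted"}:
        return "REFUTED"     # assumed label for the both-below case
    return "AMBIGUOUS"


threshold = SURROGATE_DEATHS * CONFIRM_FRACTION
status = {name: anchor_status(d, threshold) for name, d in validated.items()}
verdict = overall_verdict(status)

print(f"threshold = {threshold:.1f} deaths/yr")
print(status)
print(f"VERDICT: {verdict}")
```

Note that the midpoint status is reported but does not enter the verdict in this sketch: only disagreement between the Di and Krewski anchors drives the AMBIGUOUS outcome.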
Production recommendation under ambiguity: quote the midpoint-validated $/death ($3.12M) as the primary estimate, with bounds [$6.80M (Di), $2.03M (Krewski)]. Retain the Investigation 6-4 caveat language until a project-specific indoor CRF study resolves the Di/Krewski spread. Conservative fallback: A_free_lunch (zero cost, positive NB under all lenses, ISRM-confirmed).
Why “AMBIGUOUS” is not “fails”
AMBIGUOUS is the honest analytical verdict—not a policy failure.
- Midpoint cost-effectiveness ($3.12M/death) is favorable by any benchmark. EPA’s historical acceptance band for major air-quality rules is $10–30M/death. Even the Di-only estimate ($6.80M/death) falls below the lower end of that band.
- Q_nsga_2 avoids about 11× as many deaths as A_free_lunch at the midpoint CRF. A_free_lunch avoids 53.8 deaths/yr at $0 cost; Q_nsga_2 avoids about 580 deaths/yr at $3.12M/death. The question is not whether Q_nsga_2 beats A_free_lunch (it almost certainly does under either CRF) but whether the 631-deaths-per-year surrogate prediction can be defended in a regulatory proceeding where both CRF choices might be cross-examined.
- The spread is an epidemiological question, not a modeling question. The 3.4× Di/Krewski range reflects genuinely different cohorts, exposure ranges, and PM2.5 fractions. Running more ISRM simulations cannot reduce this spread; only an indoor-specific epidemiological study can.
- A_free_lunch and Q_nsga_2 are not alternatives—they are complements. Q_nsga_2’s x-vector includes DTE retire (a free-lunch component). The conservative path: deploy A_free_lunch now (zero risk, 53.8 deaths/yr, ISRM-confirmed), pursue the indoor CRF study in parallel, and escalate to Q_nsga_2 if the Krewski/midpoint CRF is validated.
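The benchmark comparison in the bullets above reduces to two checks: where each $/death estimate sits relative to the quoted $10–30M/death band, and how many times more deaths Q_nsga_2 avoids than A_free_lunch. A small sketch, with helper names that are illustrative only:

```python
# Position of each cost-effectiveness estimate relative to the quoted band.
EPA_BAND_MUSD = (10.0, 30.0)   # $M per avoided death, as quoted in the text

cost_per_death_musd = {"Di": 6.80, "Midpoint": 3.12, "Krewski": 2.03}


def band_position(cost: float, band: tuple) -> str:
    low, high = band
    if cost < low:
        return "below band (cheaper per death)"
    if cost > high:
        return "above band (costlier per death)"
    return "within band"


positions = {name: band_position(c, EPA_BAND_MUSD)
             for name, c in cost_per_death_musd.items()}

# Deaths avoided at the midpoint CRF versus the zero-cost fallback.
Q_NSGA_2_DEATHS = 580.5
A_FREE_LUNCH_DEATHS = 53.8
deaths_ratio = Q_NSGA_2_DEATHS / A_FREE_LUNCH_DEATHS

for name, pos in positions.items():
    print(f"{name:9s} {pos}")
print(f"Q_nsga_2 / A_free_lunch = {deaths_ratio:.1f}x")
```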
Monte Carlo envelope (indoor pathway)
The deterministic-pathway Krewski estimate above (894.5 deaths/yr) and the MC-mean Krewski below (948.7 deaths/yr) are two different summaries of the same indoor pathway. The 894.5 figure is the point-estimate ISRM × NEI pathway sum at the cascade’s baseline parameters; the 948.7 is the mean of a 10,000-draw MC over the Inv 19 indoor uncertainty envelope, which is upward-skewed by the lognormal indoor-CRF prior. Both are valid; quote the deterministic number for the threshold-comparison test (against the 505-deaths floor) and the MC mean for the uncertainty envelope.
| CRF anchor | Mean deaths/yr | P5 deaths/yr | P95 deaths/yr |
|---|---|---|---|
| Di (2017 NEJM) | 282.6 | 152.3 | 464.1 |
| Krewski (2009 HEI) | 948.7 | 511.2 | 1,558.1 |
| Midpoint (surrogate calibration) | 615.7 | 331.8 | 1,011.1 |
10,000-draw MC (seed 2026) over the Inv 19 indoor air uncertainty envelope. Even the Di-anchor P95 (464 deaths/yr) is below the Krewski-anchor P5 (511 deaths/yr), so the P5–P95 intervals under Di and Krewski do not overlap at all. This is genuine epistemic uncertainty, not Monte Carlo noise.
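To illustrate why a lognormal prior pushes the MC mean above the deterministic point estimate, the sketch below applies a single lognormal multiplier with median 1 to the Krewski point estimate. This is a strong simplifying assumption: the actual Inv 19 envelope is not specified here, and `sigma` is back-solved from the reported mean rather than taken from the source.

```python
import math
import random
import statistics

DETERMINISTIC_KREWSKI = 894.5   # point estimate, deaths/yr
MC_MEAN_KREWSKI = 948.7         # reported MC mean, deaths/yr
N_DRAWS = 10_000
SEED = 2026

# Assumption: uncertainty is one lognormal factor with median 1, for which
# mean = exp(sigma**2 / 2). Solve for the sigma implied by the reported mean.
sigma = math.sqrt(2.0 * math.log(MC_MEAN_KREWSKI / DETERMINISTIC_KREWSKI))

rng = random.Random(SEED)
draws = sorted(DETERMINISTIC_KREWSKI * rng.lognormvariate(0.0, sigma)
               for _ in range(N_DRAWS))

mean = statistics.fmean(draws)
median = statistics.median(draws)
p5 = draws[int(0.05 * N_DRAWS)]    # simple empirical percentiles
p95 = draws[int(0.95 * N_DRAWS)]

print(f"sigma={sigma:.3f} mean={mean:.1f} median={median:.1f} "
      f"P5={p5:.1f} P95={p95:.1f}")
```

Under this assumption the implied sigma is about 0.34 and the analytic 5th and 95th percentiles are roughly 509 and 1,573 deaths/yr, close to but not identical with the tabulated 511.2 and 1,558.1, so the real envelope presumably has more structure than a single multiplier.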
Note on Investigation 6-3 and indoor CRF
Investigation 6-3’s hierarchical beta (β = 0.02439, implying HR/10 = 1.28) was estimated from an outdoor ambient PM2.5 cohort (NHIS + AQS, 656K subjects). It is not used to weight Di vs. Krewski for the indoor pathway. Indoor and outdoor PM2.5 differ in particle composition, bioavailability, and exposure dynamics; applying the outdoor ambient beta to indoor exposures would be scientifically indefensible. The midpoint (Di + Krewski)/2 is the appropriate central estimate—and it matches the Investigation M-3 surrogate calibration exactly, making it the most direct comparison.
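The quoted conversion from β to a hazard ratio per 10 µg/m³ can be checked directly, assuming the usual log-linear concentration-response form HR(Δ) = exp(β·Δ). The function names below are illustrative.

```python
import math

BETA = 0.02439   # Investigation 6-3 hierarchical beta, per µg/m³


def hazard_ratio(beta: float, delta_ugm3: float) -> float:
    """Hazard ratio for a PM2.5 increment under a log-linear CRF (assumed form)."""
    return math.exp(beta * delta_ugm3)


def beta_from_hr(hr: float, delta_ugm3: float) -> float:
    """Inverse mapping: per-µg/m³ coefficient implied by a hazard ratio."""
    return math.log(hr) / delta_ugm3


hr10 = hazard_ratio(BETA, 10.0)
print(f"HR per 10 µg/m³ = {hr10:.2f}")
```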
Caveats
- Di/Krewski spread (3.4×) is genuine epistemic uncertainty, not modeling error. Di 2017 (NEJM Medicare cohort, HR = 1.073/10 µg/m³) and Krewski 2009 (ACS-CPS-II cohort, HR = 1.06/10 µg/m³) measured different populations, exposure ranges, and PM2.5 fractions. Running more ISRM simulations cannot reduce this spread; it requires an indoor-specific epidemiological study at residential exposures (5–7 µg/m³).
- Linear scaling from $2B to $1.812B. Coverage fraction ≈91% (1.812/2.0). Marginal-cost nonlinearity is unlikely to be material at this scale (Inv 19 stock model uses 5 continuous residential archetypes), but remains unvalidated below the full $2B anchor.
- dte_on = 0.019 is a continuous relaxation of a binary decision. The NSGA design variable maps to ≈2% capacity curtailment. The physical decision is binary (retire or not); the validated 0.07 deaths/yr is a lower bound on the discrete option (full retire = 3.73 deaths/yr). The NSGA optimizer would recommend full retire if the decision were binary.
- Surrogate midpoint calibrated at $2B, validated at $1.812B. The ≈91% coverage interpolation is within the calibration range, but the surrogate was designed for portfolio-level mixing, not a pure-indoor design point. This is the primary reason validation is necessary.
- Stale Investigation M-3 sha256 at time of last run. Investigation M-3 sha256 changed since Investigation 6-9 last ran. Q_nsga_2 x-vector outputs are unchanged in the diff table.
Provenance
| Field | Value |
|---|---|
| Investigation | 58_qnsga2-isrm-validation |
| Tier | Tier 1 |
| Run timestamp | 2026-05-04T07:48:40 |
| results.json sha256 | e8435cfe5128 |
| MC draws | 10,000 (seed 2026) |
| Verdict | AMBIGUOUS |
| Upstream: Inv 19 | sha256 9496484b2d20 (indoor air L4) |
| Upstream: Inv 13 | sha256 2a2c0cc9d19d (DTE point-source) |
| Upstream: Investigation 6-3 | sha256 3104ba850408 (CRF posterior, context only) |
| Upstream: Investigation 6-4 | sha256 cab2edc05333 (surrogate reference) |
| Upstream: Investigation M-3 | sha256 cfd8d8584269 (Q_nsga_2 x-vector) |
| CRF anchors | Di 2017 (NEJM 376:2513); Krewski 2009 (HEI Report 140) |
| Conservative fallback | A_free_lunch (53.8 deaths/yr, $0 cost, ISRM-confirmed) |
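A staleness check like the one behind the Investigation M-3 caveat can be sketched as a comparison between the recorded hash prefix and a freshly computed one. This assumes the 12-character values in the table are truncated SHA-256 hex digests of the upstream result files; the function names and the in-memory payload are hypothetical.

```python
import hashlib

PREFIX_LEN = 12   # assumed: provenance table stores the first 12 hex characters


def sha256_prefix(payload: bytes, length: int = PREFIX_LEN) -> str:
    """Truncated SHA-256 hex digest of an upstream artifact's bytes."""
    return hashlib.sha256(payload).hexdigest()[:length]


def is_stale(recorded_prefix: str, payload: bytes) -> bool:
    """True if the artifact no longer matches the hash recorded at run time."""
    current = sha256_prefix(payload, len(recorded_prefix))
    return current != recorded_prefix.lower()


# Hypothetical usage with an in-memory payload standing in for results.json.
payload = b'{"example": "upstream results"}'
recorded = sha256_prefix(payload)          # hash captured when the run happened
print(is_stale(recorded, payload))         # unchanged artifact
print(is_stale(recorded, payload + b" "))  # artifact modified after the run
```

A hash mismatch only says the upstream file changed, not that the consumed values changed, which is why the caveat also checks the Q_nsga_2 x-vector in the diff table.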