Does an independent model validate the NSGA-II runner-up portfolio?

VERDICT: AMBIGUOUS — midpoint $3.12M/death [$6.80M Di, $2.03M Krewski]

The NSGA-II runner-up portfolio (Q_nsga_2) concentrates $1.81B on indoor air quality improvement, a novel design the original surrogate had never evaluated at that scale. Investigation 6-9 tests that prediction by running each spending pathway through an independent model. The surrogate holds within 8% at the midpoint. The problem is the underlying health-risk science: the two anchor studies (Di 2017 and Krewski 2009) yield estimates that differ by 3.4×.

The decision

When we expanded the portfolio candidate set using Pareto optimization, a new winner emerged: Q_nsga_2, which puts all of its spending into indoor air improvement, adds a small zero-cost power plant retirement, and spends nothing on transport, building upgrades, or wildfire. That result depends on a linear surrogate model’s prediction of indoor air benefits at a scale and composition it had never actually seen.

This is the largest open caveat in the cascade. If the surrogate over-predicts the indoor benefit at that design point, the new winner is a modeling artifact, and the zero-cost free-lunch portfolio remains the conservative production recommendation. Investigation 6-9 addresses that caveat by running the indoor pathway through an independent model to see whether the surrogate’s prediction holds.

Q_nsga_2 x-vector decomposition

Q_nsga_2 = [wildfire=0, transport=$0B, building=$0B, indoor=$1.812B, dte_on=0.019]. Three distinct exposure pathways, validated independently:

Pathway	Spend	Validation source	Deaths/yr (Di)	Deaths/yr (Krew.)	Deaths/yr (Midpoint)
Outdoor ISRM	$0	Identity from x-vector (no transport/building/wildfire spend)	0.0	0.0	0.0
Indoor air	$1.812B	Inv 19 L4 personal-exposure (CHAD time-activity, 5 archetypes), scaled from $2B	266.4	894.4	580.4
DTE Stockton partial (0.019×)	$0 (free)	Inv 13 ISRM × NEI point-source (3.73 deaths/yr full retire), linearly scaled	0.07	0.07	0.07
TOTAL	$1.812B		266.5	894.5	580.5
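
The decomposition above can be sketched as a short calculation. This is a minimal illustration, not the investigation's code: it assumes the indoor pathway scales linearly with spend using the per-$B rates quoted in this report (147 and 493.5 deaths/yr per $B at the $2B anchor), and that the DTE pathway scales linearly with dte_on. All variable and function names are hypothetical.

```python
# Inputs quoted in the text; variable names are illustrative, not the project's.
INDOOR_SPEND_B = 1.812                                  # Q_nsga_2 indoor spend, $B
INDOOR_DEATHS_PER_B = {"di": 147.0, "krewski": 493.5}   # deaths/yr per $B, from the $2B run
DTE_FULL_RETIRE_DEATHS = 3.73                           # deaths/yr avoided by full retirement
DTE_ON = 0.019                                          # continuous retirement fraction


def pathway_deaths(anchor: str) -> dict[str, float]:
    """Deaths/yr avoided per pathway under one CRF anchor, assuming linear scaling."""
    indoor = INDOOR_DEATHS_PER_B[anchor] * INDOOR_SPEND_B
    dte = DTE_FULL_RETIRE_DEATHS * DTE_ON
    outdoor = 0.0  # no transport, building, or wildfire spend in the x-vector
    return {"outdoor": outdoor, "indoor": indoor, "dte": dte,
            "total": outdoor + indoor + dte}


di = pathway_deaths("di")
krewski = pathway_deaths("krewski")
midpoint_total = (di["total"] + krewski["total"]) / 2

print(f"Di total:       {di['total']:.1f}")
print(f"Krewski total:  {krewski['total']:.1f}")
print(f"Midpoint total: {midpoint_total:.1f}")
```

Under these assumptions the totals agree with the table to within a few tenths of a death per year; the remaining gap presumably reflects rounding in the published spend figure.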

Validated vs. surrogate comparison

Metric	Di-anchor	Krewski-anchor	Midpoint	Surrogate (Investigation M-3)
Deaths/yr	266.5	894.5	580.5	631.2
Validated / Surrogate	0.42×	1.42×	0.92×	1.00×
Cost $B	1.812	1.812	1.812	1.812
$/death ($M)	$6.80M	$2.03M	$3.12M	$2.87M
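
The ratio and cost-effectiveness rows follow directly from the deaths/yr row and the $1.812B cost. A minimal sketch of that arithmetic, with hypothetical names and the figures copied from the table:

```python
SURROGATE_DEATHS = 631.2   # Investigation M-3 surrogate prediction, deaths/yr
COST_B = 1.812             # portfolio cost, $B

validated = {"di": 266.5, "krewski": 894.5, "midpoint": 580.5}  # deaths/yr


def ratio_to_surrogate(deaths: float) -> float:
    """Validated deaths/yr as a multiple of the surrogate prediction."""
    return deaths / SURROGATE_DEATHS


def cost_per_death_musd(deaths: float) -> float:
    """Cost per avoided death in $M (cost is in $B, hence the factor of 1000)."""
    return COST_B * 1000.0 / deaths


for name, deaths in validated.items():
    print(f"{name:9s} ratio={ratio_to_surrogate(deaths):.2f}x  "
          f"cost=${cost_per_death_musd(deaths):.2f}M/death")
print(f"surrogate cost=${cost_per_death_musd(SURROGATE_DEATHS):.2f}M/death")
```

Rounded to two decimals, this reproduces the table: 0.42×, 1.42×, and 0.92× for the ratios, and $6.80M, $2.03M, $3.12M, and $2.87M per death.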

The midpoint-vs-surrogate ratio (0.92×) is the most direct apples-to-apples comparison: Investigation M-3’s surrogate was calibrated to the Di/Krewski midpoint (indoor_deaths_per_b = 320.25 = (147 + 493.5)/2 at $2B). The surrogate holds within 8% of the ISRM-validated midpoint at this design point. The problem is not the surrogate’s midpoint estimate—the problem is that the Di/Krewski spread (3.4×) makes the decision contingent on which indoor CRF applies.

Verdict determination

VERDICT: AMBIGUOUS

Confirmed threshold (surrogate × 80%): 505.0 deaths/yr.

Di-anchor (266.5 deaths/yr): BELOW threshold → refuted under Di
Krewski-anchor (894.5 deaths/yr): ABOVE threshold → confirmed under Krewski
Midpoint (580.5 deaths/yr): ABOVE threshold → confirmed under midpoint

Because one anchor is above and one is below, the verdict is AMBIGUOUS. The CRF choice is decision-controlling: the policy recommendation changes depending on which indoor PM_2.5 CRF is accepted.
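
The verdict rule can be written as a small function. This is a sketch of the logic as described here, not the investigation's implementation: the function name, the CONFIRMED and REFUTED labels for the two agreeing cases, and the use of >= at the threshold are assumptions.

```python
def verdict(di_deaths: float, krewski_deaths: float,
            surrogate_deaths: float, fraction: float = 0.80) -> tuple[str, float]:
    """Classify the validation result against a fraction of the surrogate prediction.

    Returns the verdict label and the threshold that was applied.
    """
    threshold = surrogate_deaths * fraction
    di_ok = di_deaths >= threshold            # assumption: ties count as confirmed
    krewski_ok = krewski_deaths >= threshold
    if di_ok and krewski_ok:
        return "CONFIRMED", threshold
    if not di_ok and not krewski_ok:
        return "REFUTED", threshold
    return "AMBIGUOUS", threshold             # the two anchors disagree


label, threshold = verdict(266.5, 894.5, 631.2)
print(label, round(threshold, 1))
```

With the figures above, the threshold is 631.2 × 0.80 ≈ 505.0 deaths/yr, the Di anchor falls below it and the Krewski anchor above it, so the function returns AMBIGUOUS.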

Production recommendation under ambiguity: quote the midpoint-validated $/death ($3.12M) as the primary estimate, with bounds [$6.80M (Di), $2.03M (Krewski)]. Retain the Investigation 6-4 caveat language until a project-specific indoor CRF study resolves the Di/Krewski spread. Conservative fallback: A_free_lunch (zero cost, positive NB under all lenses, ISRM-confirmed).

Why “AMBIGUOUS” is not “fails”

AMBIGUOUS is the honest analytical verdict—not a policy failure.

Midpoint cost-effectiveness ($3.12M/death) is favorable by any benchmark. EPA’s historical acceptance band for major air-quality rules is $10–30M/death. Even the Di-only estimate ($6.80M/death) falls below the lower end of that band.
Q_nsga_2 avoids ~11× more deaths than A_free_lunch at the midpoint CRF. A_free_lunch avoids 53.8 deaths/yr at $0 cost. Q_nsga_2 avoids 580 deaths/yr at $3.12M/death. The question is not whether Q_nsga_2 beats A_free_lunch (it almost certainly does under either CRF) but whether the 631-deaths-per-year surrogate prediction can be defended in a regulatory proceeding where both CRF choices might be cross-examined.
The spread is an epidemiological question, not a modeling question. The 3.4× Di/Krewski range reflects genuinely different cohorts, exposure ranges, and PM_2.5 fractions. Running more ISRM simulations cannot reduce this spread; only an indoor-specific epidemiological study can.
A_free_lunch and Q_nsga_2 are not alternatives—they are complements. Q_nsga_2’s x-vector includes DTE retire (a free-lunch component). The conservative path: deploy A_free_lunch now (zero risk, 53.8 deaths/yr, ISRM-confirmed), pursue the indoor CRF study in parallel, and escalate to Q_nsga_2 if the Krewski/midpoint CRF is validated.
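
The comparison with A_free_lunch can be checked under each CRF anchor, not only at the midpoint. A minimal sketch using the deaths/yr figures quoted in this report (names are hypothetical):

```python
FREE_LUNCH_DEATHS = 53.8   # A_free_lunch, deaths/yr avoided at $0 cost

q_nsga_2 = {"di": 266.5, "krewski": 894.5, "midpoint": 580.5}  # deaths/yr avoided

# Multiple of A_free_lunch's avoided deaths under each anchor.
multiples = {anchor: deaths / FREE_LUNCH_DEATHS for anchor, deaths in q_nsga_2.items()}

for anchor, multiple in multiples.items():
    print(f"{anchor:9s} {multiple:.1f}x A_free_lunch")
```

The midpoint multiple is about 10.8×, which is the ~11× figure quoted above; the Di anchor still gives roughly 5× and the Krewski anchor roughly 17×.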

Monte Carlo envelope (indoor pathway)

The deterministic-pathway Krewski estimate above (894.5 deaths/yr) and the MC-mean Krewski below (948.7 deaths/yr) are two different summaries of the same indoor pathway. The 894.5 figure is the point-estimate ISRM × NEI pathway sum at the cascade’s baseline parameters; the 948.7 is the mean of a 10,000-draw MC over the Inv 19 indoor uncertainty envelope, which is upward-skewed by the lognormal indoor-CRF prior. Both are valid; quote the deterministic number for the threshold-comparison test (against the 505-deaths floor) and the MC mean for the uncertainty envelope.

CRF anchor	Mean deaths/yr	P5 deaths/yr	P95 deaths/yr
Di (2017 NEJM)	282.6	152.3	464.1
Krewski (2009 HEI)	948.7	511.2	1,558.1
Midpoint (surrogate calibration)	615.7	331.8	1,011.1

10,000-draw MC (seed 2026) over the Inv 19 indoor air uncertainty envelope. The Di-anchor P95 (464.1 deaths/yr) lies below even the Krewski-anchor P5 (511.2 deaths/yr), so the two 90% intervals do not overlap. This is genuine epistemic uncertainty, not Monte Carlo noise.
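
To illustrate why a lognormal prior pushes the MC mean above the deterministic point estimate, the sketch below treats the whole indoor uncertainty envelope as a single lognormal multiplier with median 1. That is a simplifying assumption made only for illustration; the actual Inv 19 envelope is not specified here, and Python's random generator with seed 2026 will not reproduce the investigation's draws.

```python
import math
import random
import statistics

POINT_ESTIMATE = 894.5   # deterministic Krewski deaths/yr (from the text)
REPORTED_MC_MEAN = 948.7 # reported Krewski MC mean (from the text)
N_DRAWS = 10_000

# Assumption: one lognormal multiplier with median 1 (mu = 0).
# For a lognormal, mean = median * exp(sigma^2 / 2), so sigma can be
# backed out from the ratio of the reported mean to the point estimate.
sigma = math.sqrt(2.0 * math.log(REPORTED_MC_MEAN / POINT_ESTIMATE))

rng = random.Random(2026)
draws = sorted(POINT_ESTIMATE * rng.lognormvariate(0.0, sigma) for _ in range(N_DRAWS))

mean = statistics.fmean(draws)
median = statistics.median(draws)
p5 = draws[int(0.05 * N_DRAWS)]    # simple empirical 5th percentile
p95 = draws[int(0.95 * N_DRAWS)]   # simple empirical 95th percentile

print(f"sigma={sigma:.3f} mean={mean:.1f} median={median:.1f} p5={p5:.1f} p95={p95:.1f}")
```

Under this assumption the sample median stays near the point estimate while the mean sits above it, which is the upward skew described above. The implied theoretical 5th and 95th percentiles are roughly 509 and 1,573 deaths/yr, close to but not identical to the reported 511.2 and 1,558.1, so the single-multiplier picture is only an approximation.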

Note on Investigation 6-3 and indoor CRF

Investigation 6-3’s hierarchical beta (β = 0.02439, implying HR/10 = 1.28) was estimated from an outdoor ambient PM_2.5 cohort (NHIS + AQS, 656K subjects). It is not used to weight Di vs. Krewski for the indoor pathway. Indoor and outdoor PM_2.5 differ in particle composition, bioavailability, and exposure dynamics; applying the outdoor ambient beta to indoor exposures would be scientifically indefensible. The midpoint (Di + Krewski)/2 is the appropriate central estimate—and it matches the Investigation M-3 surrogate calibration exactly, making it the most direct comparison.
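
The relation between β and the hazard ratio per 10 µg/m³ can be checked with a few lines. The sketch assumes the usual log-linear CRF form, HR(ΔC) = exp(β · ΔC) with β per µg/m³; the function names are hypothetical.

```python
import math


def hr_per_10(beta: float) -> float:
    """Hazard ratio for a 10 µg/m³ increase, assuming HR = exp(beta * delta_c)."""
    return math.exp(10.0 * beta)


def beta_from_hr_per_10(hr: float) -> float:
    """Inverse of hr_per_10: per-µg/m³ coefficient implied by an HR per 10 µg/m³."""
    return math.log(hr) / 10.0


outdoor_hr = hr_per_10(0.02439)
print(f"HR/10 for beta=0.02439: {outdoor_hr:.2f}")
```

This gives HR/10 ≈ 1.28 for β = 0.02439, matching the figure quoted for Investigation 6-3.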

Caveats

Di/Krewski spread (3.4×) is genuine epistemic uncertainty, not modeling error. Di 2017 (NEJM Medicare cohort, HR = 1.073/10 µg/m³) and Krewski 2009 (ACS-CPS-II cohort, HR = 1.06/10 µg/m³) measured different populations, exposure ranges, and PM_2.5 fractions. Running more ISRM simulations cannot reduce this spread; it requires an indoor-specific epidemiological study at residential exposures (5–7 µg/m³).
Linear scaling from $2B to $1.812B. Coverage fraction ≈89%. Marginal-cost nonlinearity is unlikely to be material at this scale (Inv 19 stock model uses 5 continuous residential archetypes), but remains unvalidated below the full $2B anchor.
dte_on = 0.019 is a continuous relaxation of a binary decision. The NSGA design variable maps to ≈2% capacity curtailment. The physical decision is binary (retire or not); the validated 0.07 deaths/yr is a lower bound on the discrete option (full retire = 3.73 deaths/yr). The NSGA optimizer would recommend full retire if the decision were binary.
Surrogate midpoint calibrated at $2B, validated at $1.812B. The 89% coverage interpolation is within the calibration range, but the surrogate was designed for portfolio-level mixing, not a pure-indoor design point. This is the primary reason validation is necessary.
Stale Investigation M-3 sha256 at time of last run. Investigation M-3 sha256 changed since Investigation 6-9 last ran. Q_nsga_2 x-vector outputs are unchanged in the diff table.

Provenance

Field	Value
Investigation	58_qnsga2-isrm-validation
Tier	Tier 1
Run timestamp	2026-05-04T07:48:40
results.json sha256	`e8435cfe5128`
MC draws	10,000 (seed 2026)
Verdict	AMBIGUOUS
Upstream: Inv 19	sha256 `9496484b2d20` (indoor air L4)
Upstream: Inv 13	sha256 `2a2c0cc9d19d` (DTE point-source)
Upstream: Investigation 6-3	sha256 `3104ba850408` (CRF posterior, context only)
Upstream: Investigation 6-4	sha256 `cab2edc05333` (surrogate reference)
Upstream: Investigation M-3	sha256 `cfd8d8584269` (Q_nsga_2 x-vector)
CRF anchors	Di 2017 (NEJM 376:2513); Krewski 2009 (HEI Report 140)
Conservative fallback	A_free_lunch (53.8 deaths/yr, $0 cost, ISRM-confirmed)