Phase 2 + 3 Methods Appendix
One row per investigation, side by side, so a reviewer can see what each one does without opening it. Phase 2 (Inv 17–28) lists the L1–L5 fidelity ladder, the decision it resolves, and the validation datum. Phase 3 (Inv 29–36) names the frontier method — NSGA-II, polynomial chaos, unified BED, BOCA-inspired multi-fidelity BO, strong-constraint 4D-Var, physics-informed neural networks, linear-operator GPs, Strong-Oakley-Brennan 2014 nonparametric EVPPI. All implemented from scratch in pure NumPy (EVPPI uses sklearn SplineTransformer + Ridge).
Phase 2 · Multi-Fidelity Ladders
Twelve investigations, each built as a fidelity ladder: start cheap, climb only where the decision demands it. Each one ends with a formal multi-fidelity fusion step — Kennedy–O’Hagan co-kriging, MFMC, POMDP, CVaR/DRO, BO.
Wildfire emissions-to-exposure
The wildfire-dominance finding holds up under a proper fidelity ladder: L2, L3, and L4 agree within 20% on episode-mean PM2.5, while the previously used L1 linear model is systematically biased. The fused cokriging posterior R^2 against AQS is -1.66, versus -166 for the legacy ISRM-based baseline.
Atmospheric chemistry MFMC
LA Basin (VOC-limited, VOC/NOx = 3.2) shows a sign flip between Phase 1 ISRM (L2: -1.84 ppb) and Phase 2 regime-aware CMAQ (L4: +2.10 ppb). San Joaquin Valley (NOx-limited) gets a 25% larger reduction than the linear model predicts. The portfolio population-weighted ozone co-benefit drops from -1.92 to 0.82 ppb (a 57% Phase 1 overstatement), and the monetized ozone co-benefit moves from $7,273.8M to -$3,132.8M.
Indoor air coupling
Building electrification's health benefit is dominated by the previously unmodelled indoor pathway. Phase 1 credited B2 with 47 deaths avoided (Di CRF); with L4 indoor coupling the figure is 341, or about $5.9M per death avoided at the $2B program cost. The B2 'not worth $2B' verdict flips: B2 is now cost-competitive with Transport T2 per dollar.
Grid dispatch + EV charging
Phase 1 grid-average EF (237 gCO2/kWh) is a 40-50% overestimate for managed/midday charging and a ~25% *underestimate* for unmanaged fleets. L4 PLEXOS ladder shows the midday_managed schedule reaches 166 gCO2/kWh vs 367 for unmanaged (55% gap). Shifting CA's 1.8M-vehicle fleet to the optimal schedule avoids 1.32 Mt CO2/yr relative to unmanaged.
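The arithmetic behind these figures can be checked in a few lines. This is a sketch that uses only the numbers quoted above; the per-vehicle charging energy is not reported in the text and is backed out here as an implied quantity, so treat it as a derived assumption rather than a model input.

```python
# Figures quoted in the paragraph above.
EF_MANAGED = 166.0     # gCO2/kWh, midday_managed schedule
EF_UNMANAGED = 367.0   # gCO2/kWh, unmanaged schedule
EF_GRID_AVG = 237.0    # gCO2/kWh, Phase 1 grid-average emission factor
FLEET_SIZE = 1.8e6     # vehicles
AVOIDED_MT = 1.32      # Mt CO2/yr avoided, optimal vs unmanaged

gap = EF_UNMANAGED - EF_MANAGED                    # g/kWh between schedules
gap_fraction = gap / EF_UNMANAGED                  # gap as a share of the unmanaged EF
grid_avg_overestimate = EF_GRID_AVG / EF_MANAGED - 1.0

# Derived, not reported in the source: annual charging energy per vehicle
# implied by the avoided tonnage (1 Mt = 1e12 g).
implied_kwh_per_vehicle = AVOIDED_MT * 1e12 / gap / FLEET_SIZE

print(f"schedule gap: {gap:.0f} g/kWh ({gap_fraction:.1%} of unmanaged)")
print(f"grid-average overestimate for managed charging: {grid_avg_overestimate:.1%}")
print(f"implied charging energy: {implied_kwh_per_vehicle:.0f} kWh/vehicle/yr")
```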
Hierarchical Bayesian CRF
The hierarchical Bayesian CRF shrinks 58 county-level estimates toward a posterior-mean hazard ratio of 1.0671 per 10 µg/m³ (95% CI 1.0504–1.0840), an interval that brackets both the Di and Krewski estimates. T2 deaths avoided range from 373 to 612. The residual EVSI of a definitive CRF study against this posterior is $0.05B.
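To illustrate what "shrinks toward a posterior mean" means mechanically, here is a minimal normal-normal partial-pooling sketch on the log-hazard-ratio scale. It is a simplification, not the study's model: the between-county standard deviation `tau` is treated as known, and the county estimates and standard errors are synthetic placeholders.

```python
import numpy as np

def partial_pool(y, se, tau):
    """Normal-normal shrinkage of unit estimates y (std. errors se) toward a pooled mean.

    tau is the between-unit standard deviation, treated as known here; a full
    hierarchical model would place a prior on it and integrate it out.
    """
    w = 1.0 / (se**2 + tau**2)                 # precision weights
    mu_hat = np.sum(w * y) / np.sum(w)         # precision-weighted pooled mean
    shrink = se**2 / (se**2 + tau**2)          # 1 = fully pooled, 0 = no pooling
    post_mean = shrink * mu_hat + (1.0 - shrink) * y
    return mu_hat, post_mean, shrink

# Synthetic, illustrative inputs only (not the study's county data).
rng = np.random.default_rng(0)
n_counties = 58
se = rng.uniform(0.01, 0.05, n_counties)       # hypothetical standard errors
y = rng.normal(np.log(1.067), 0.01, n_counties) + rng.normal(0.0, se)

mu_hat, post_mean, shrink = partial_pool(y, se, tau=0.01)
print("pooled HR per 10 ug/m3:", np.exp(mu_hat))
print("mean shrinkage factor:", shrink.mean())
```

Noisy counties (large `se`) are pulled almost all the way to the pooled mean, while well-measured counties keep most of their own estimate.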
Sequential portfolio POMDP
The sequential adaptive policy (L5_bo) outperforms Phase 1's one-shot approach by 19.2% on expected deaths avoided: over 10 years, a $4B portfolio delivers 777 discounted deaths avoided under the best sequential policy versus 651 under commit-and-forget. The value comes from learning the true CRF regime over the first 3–5 years and reallocating the remaining budget accordingly.
Robust portfolio (CVaR/DRO/info-gap)
Across five robustness criteria (expected value, chance-constrained, CVaR_0.05, parametric adversarial shift [DRO-lite], info-gap), the consensus best portfolio is F_maximum (3/5 votes). Phase 1's expected-value pick was F_maximum. The two agree — robust analysis confirms Phase 1 under the Phase 2 uncertainty envelope.
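The voting scheme can be sketched as follows. This is an illustration under stated assumptions: only three of the five criteria are shown, the portfolio names and net-benefit draws are hypothetical, and the 0.5 threshold in the chance constraint is arbitrary.

```python
from collections import Counter
import numpy as np

def cvar_lower(samples, alpha=0.05):
    """Mean of the worst alpha-fraction of outcomes (higher outcome = better)."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = max(1, int(np.ceil(alpha * x.size)))
    return x[:k].mean()

# Hypothetical net-benefit draws for two made-up portfolios.
rng = np.random.default_rng(1)
draws = {
    "A": rng.normal(1.0, 0.2, 5000),   # modest mean, tight spread
    "B": rng.normal(1.2, 1.0, 5000),   # higher mean, heavy downside
}

criteria = {
    "expected_value": lambda s: s.mean(),
    "cvar_0.05": lambda s: cvar_lower(s, 0.05),
    "chance_constrained": lambda s: np.mean(s > 0.5),  # P(net benefit > threshold)
}

# Each criterion casts one vote for the portfolio it scores highest.
votes = Counter(max(draws, key=lambda p: score(draws[p])) for score in criteria.values())
print(votes.most_common())
```

In this toy case the expected-value criterion and the tail-focused criteria disagree, which is the situation the consensus vote is meant to expose; the text above reports that in the actual study they agree.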
CRF research roadmap
Against the Inv 21 hierarchical posterior (residual EVSI $0.05B), the best single-design ROI is meta_analysis (40.7×) at $0.5M. The L5 POMDP adaptive portfolio averages 3.0 arms and $7.5M spend and realizes $0.084B EVSI, a net value of +$0.077B. Staging matters: run the $0.5M meta-analysis first, and escalate to the retrospective design only if the posterior remains ambiguous.
Geographic decomposition
Phase 1 reported the free-lunch portfolio as 1,015 deaths at $0 with a 21% DAC share. Decomposing spatially, the same portfolio can shift from 21% DAC share (statewide) to 45% DAC share under CES-burden-weighted targeting (L4) or equity-optimized reallocation (L5) — a +24.6 percentage-point gain worth about +250 extra DAC deaths avoided. The free-lunch total stays at ~1,015; the distributional pattern is what improves.
Climate-fire coupling
Under the 6-GCM CMIP6 corridor, 2050 wildfire mortality in CA rises to 25,031 deaths/yr [p10=20,141, p90=31,969] - 1.73× the Phase 1 stationary baseline. The climate uncertainty envelope (11,828 deaths/yr) is 8.2× the entire Phase 1 policy signal (1,447 deaths avoided). Climate dominates fuel-management noise, so any portfolio choice must be robust across the fan, not optimized to a single climate trajectory.
Adaptive monitor placement
Moving from static ranking (L1) to POMDP-coupled placement (L4) lifts total EVSI from $85M to $119M at the same $12.5M cost (ROI 6.8× to 9.5×). L3 ties L2 on total EVSI ($111M both) because the Gaussian-process domain-variance reduction it buys matches greedy's haversine penalty — so L3's value is not magnitude but DAC-equity reweighting (DAC share 40% vs L2 20%, same $111M pot). L4 reallocates away from DAC (0%) toward climate-signal corridors (coverage 0.84 vs 0.62); its $8M EVSI uplift over L3 is the climate-signal integral (L4_CLIMATE_EVSI_FRAC=0.25 of the pot, weighted by proxy). L5's $146M comes from explicit multi-network decomposition: +15% O3 EVSI + 8% NMVOC-speciation EVSI = +23% on the L4 sequence, at 1.6× co-location cost. No single level dominates on ROI + equity + climate together — CARB must pick which objective leads.
Data assimilation
Climbing the DA ladder from Phase 1 model-only (RMSE 3.41 µg/m³) to Phase 2 EnKF with PurpleAir (RMSE 2.71 µg/m³) cuts exposure-estimation error by 21%, worth $487M in reduced mortality mis-attribution. PurpleAir sensors add $332M incremental value on top of the 40-station regulatory network.
Phase 3 · Advanced Methods Frontier
Eight investigations pushing into the research frontier. Each is a canonical algorithm implemented from scratch in pure NumPy: evolutionary multi-objective optimization, polynomial chaos, unified Bayesian experimental design, BOCA-inspired multi-fidelity BO (cost-weighted successive halving), strong-constraint 4D-Var with hand-derived adjoint, physics-informed neural networks with analytic derivatives, linear-operator GPs with the PDE baked into the kernel, Strong-Oakley-Brennan 2014 nonparametric EVPPI that decomposes group-level VOI into single-parameter VOI from the existing PSA sample.
Multi-objective Pareto frontier (NSGA-II)
NSGA-II (Deb et al. 2002) on 5-dim portfolio design space (wildfire reduction, transport spend, building spend, indoor AQ spend, DTE). Three objectives: maximize deaths avoided, maximize DAC-weighted deaths avoided, minimize cost. 100 pop x 80 generations with SBX crossover (eta=15) and polynomial mutation (eta=20, p=0.2). Objectives are evaluated through a deterministic LINEAR SURROGATE (rfaq/optimization/pareto_frontier.py; hardcoded deaths-per-$B coefficients per sector) calibrated to the Inv 23 MEAN deaths/cost of portfolios A, B, and C (exact match) and E (within ~45 deaths). This is NOT Inv 23's Monte Carlo uncertainty envelope and does NOT re-draw the CRF posterior or apply Inv 23's VSL scalarization. Dominance claims therefore apply under Inv 29's own 3-objective deterministic formulation, not under Inv 23's expected-value / CVaR / info-gap robust criteria.
Under Inv 29's 3-objective deterministic formulation (deaths, DAC-deaths, cost; no VSL, no MC), NSGA-II finds 100 Pareto-optimal portfolios. 4 of 6 candidate seeds (4 from Inv 23 plus 2 constructed for this study) are strictly dominated by a Pareto point. The 'indoor_focus' seed (Inv 19-weighted, $2B indoor AQ) reaches a DAC share of 0.23, higher than all 5 other seeds, demonstrating that Inv 19's 3× indoor benefit transfers to the equity objective. Caveat: dominance holds under Inv 29's deterministic linear surrogate; re-validating under Inv 23's MC net-benefit distribution is a natural next step.
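The "strictly dominated by a Pareto point" test used above reduces to a simple comparison once all three objectives are cast as minimization. The sketch below shows that check and a brute-force non-dominated filter; the four example portfolios are hypothetical, and the full NSGA-II machinery (rank sorting, crowding distance, SBX, mutation) is omitted.

```python
import numpy as np

def to_minimization(deaths, dac_deaths, cost):
    """Stack objectives so that smaller is better in every column."""
    return np.column_stack([-np.asarray(deaths, float),
                            -np.asarray(dac_deaths, float),
                            np.asarray(cost, float)])

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_mask(F):
    """Boolean mask of non-dominated rows of F (O(n^2) brute force)."""
    n = F.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and dominates(F[j], F[i]):
                mask[i] = False
                break
    return mask

# Hypothetical portfolios: (deaths avoided, DAC-weighted deaths avoided, cost in $B).
deaths = [100, 90, 120, 100]
dac = [20, 18, 20, 30]
cost = [1.0, 1.0, 2.0, 1.5]

mask = pareto_mask(to_minimization(deaths, dac, cost))
print(mask)  # the second portfolio is dominated by the first
```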
Polynomial chaos expansion (Inv 17 QoI)
Order-3 Legendre PCE on standardized [-1,1]^6 input space, 120 uniform collocation samples, least-squares fit. Sobol indices computed algebraically from PCE coefficients. Compared against MC Saltelli/Jansen (4,096 evals) from Inv 17 Sobol study.
Order-3 PCE with only 120 model evaluations recovers the MC Sobol ranking to within |ST_pce - ST_mc| <= 0.026. Top driver under both methods is cross_section_km. This confirms the Inv 17 MC Sobol is not sampling-error-limited. PCE gives the same answer for 34.1× less model work and produces a surrogate usable for derivative-free optimization.
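As a sketch of how Sobol indices fall out of the PCE coefficients algebraically: for inputs uniform on [-1, 1], each Legendre basis term contributes its squared coefficient times the basis norm to the variance, and a total index sums every term that involves the input. The toy model below is hypothetical (it is not the Inv 17 quantity of interest) and was chosen because its indices are known in closed form; the dimension, order, and sample count match the setup described above.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def multi_indices(d, order):
    """All exponent tuples of length d with total degree <= order."""
    return [a for a in itertools.product(range(order + 1), repeat=d) if sum(a) <= order]

def design_matrix(X, alphas):
    """Tensor-product Legendre basis evaluated at the rows of X."""
    order = max(max(a) for a in alphas)
    V = [legendre.legvander(X[:, j], order) for j in range(X.shape[1])]  # P_0..P_order per input
    return np.column_stack([
        np.prod([V[j][:, a[j]] for j in range(X.shape[1])], axis=0) for a in alphas
    ])

def total_sobol(coef, alphas):
    """Total-order Sobol indices from PCE coefficients (uniform inputs on [-1, 1])."""
    alphas = np.asarray(alphas)
    norms = np.prod(1.0 / (2 * alphas + 1), axis=1)   # E[P_n^2] = 1 / (2n + 1)
    var_terms = coef**2 * norms
    total_var = var_terms[alphas.sum(axis=1) > 0].sum()
    return np.array([var_terms[alphas[:, j] > 0].sum() / total_var
                     for j in range(alphas.shape[1])])

rng = np.random.default_rng(0)
d, order, n = 6, 3, 120
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = X[:, 0] + 0.5 * X[:, 1] + X[:, 0] * X[:, 2]      # hypothetical toy model

alphas = multi_indices(d, order)
coef, *_ = np.linalg.lstsq(design_matrix(X, alphas), y, rcond=None)
ST = total_sobol(coef, alphas)
print(np.round(ST, 4))
```

For this toy model the exact total indices are 16/19, 3/19, and 4/19 for the first three inputs and zero for the rest, which gives a direct check on the fit.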
Closed-loop Bayesian experimental design
Closed-loop greedy BED over 10-year horizon and $50M budget. Unified belief state (sigma_CRF, sigma_monitor_by_region). Each year, pick action maximizing expected EVSI-proxy / cost. Actions: fund 1 of 4 CRF studies (Inv 24 designs) or deploy 1 of 15 monitors (Inv 27 sites). EVSI-proxy = delta(sigma^2) x value_at_stake x p_success. This is a Gaussian variance-reduction proxy, not canonical EVSI (no outer-y MC, no explicit utility u(a,theta)); it collapses to true EVSI only under linear-Gaussian utility.
Unified BED sequences 2 CRF studies + 8 monitors over 10 years ($6.5M/$50M spent), yielding $107.3M EVSI-proxy against the Inv 21 hierarchical posterior ($0.05B value-at-stake). With a 2× tighter CRF prior (σ₀=0.5), the same sequence yields $84.0M. The adaptive rule alternates CRF studies and monitor deployments rather than exhausting one track first.
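The greedy rule described above can be sketched as a loop that, each year, scores every remaining action by EVSI-proxy per dollar and applies the best one. Everything concrete here is an assumption made for illustration: the action list, costs, noise variances, success probabilities, and the Gaussian conjugate variance update are hypothetical stand-ins, and the update ignores failed studies.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    target: str        # key of the belief variance this action shrinks
    cost: float        # $M
    noise_var: float   # variance of the measurement it delivers
    p_success: float

def posterior_var(prior_var, noise_var):
    """Gaussian conjugate update of a variance (assumed form of delta(sigma^2))."""
    return 1.0 / (1.0 / prior_var + 1.0 / noise_var)

def evsi_proxy(action, belief, value_at_stake):
    prior = belief[action.target]
    delta = prior - posterior_var(prior, action.noise_var)
    return delta * value_at_stake[action.target] * action.p_success

def greedy_plan(actions, belief, value_at_stake, budget, horizon):
    belief, remaining, plan = dict(belief), list(actions), []
    for year in range(horizon):
        affordable = [a for a in remaining if a.cost <= budget]
        if not affordable:
            break
        best = max(affordable, key=lambda a: evsi_proxy(a, belief, value_at_stake) / a.cost)
        if evsi_proxy(best, belief, value_at_stake) <= 0:
            break
        belief[best.target] = posterior_var(belief[best.target], best.noise_var)
        budget -= best.cost
        remaining.remove(best)
        plan.append((year, best.name))
    return plan, belief, budget

# Hypothetical inputs.
actions = [
    Action("meta_analysis", "crf", 0.5, 2.0, 0.9),
    Action("cohort_study", "crf", 5.0, 0.5, 0.7),
    Action("monitor_n1", "monitor_north", 0.8, 1.0, 0.95),
]
plan, belief, left = greedy_plan(
    actions,
    belief={"crf": 1.0, "monitor_north": 1.0},
    value_at_stake={"crf": 50.0, "monitor_north": 10.0},
    budget=50.0,
    horizon=10,
)
print(plan, round(left, 2))
```

Because each CRF study lowers the remaining CRF variance, the next CRF study scores lower and a monitor can overtake it, which is one way an alternating sequence like the one reported above can arise.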
BOCA-inspired fidelity-aware monitor placement
BOCA-inspired cost-weighted successive halving with UCB tie-breaking, over 15 candidate monitor sites (Inv 27). Inspired by Kandasamy et al. 2017 but NOT canonical BOCA: the acquisition is info_gain + UCB-weight, divided by sqrt(cost); bias is handled by hard fidelity-gating (a site must accumulate lower-fidelity evidence before a higher fidelity unlocks), not by BOCA's bias-budget term. Four fidelity levels: gap-score (cost 0.01), haversine+gap (0.05), climate-signal UCB (0.20), full simulation (1.00). Compared to single-fidelity UCB baseline that always evaluates at full simulation.
The BOCA-inspired cost-weighted successive-halving rule spent 1.50 cost units across 36 multi-fidelity evaluations (30 gap-only + 0 haversine + 6 climate-UCB + 0 full-sim) to recover 4/5 of the oracle-optimal sites. Single-fidelity UCB spent 5.00 cost units (3.3× more) for 5/5 oracle recovery. The cheap gap-score prunes obvious non-contenders, so the costlier fidelities are reserved for the contested top-k; in this run only the six climate-UCB evaluations went above gap-score, and no full-simulation call was made.
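The cost accounting and the shape of the acquisition can be reproduced directly from the figures above. The function names are hypothetical, and the single-fidelity count of 5 full-simulation calls is inferred from its 5.00 cost units at cost 1.00 each.

```python
import math

# Fidelity costs as listed in the method description.
FIDELITY_COST = {"gap_score": 0.01, "haversine_gap": 0.05, "climate_ucb": 0.20, "full_sim": 1.00}

def acquisition(info_gain, ucb_weight, cost):
    """Cost-weighted score: (info_gain + UCB weight) / sqrt(cost)."""
    return (info_gain + ucb_weight) / math.sqrt(cost)

def total_cost(counts):
    """Total spend for a dict of {fidelity: number of evaluations}."""
    return sum(FIDELITY_COST[f] * n for f, n in counts.items())

multi_fidelity = {"gap_score": 30, "haversine_gap": 0, "climate_ucb": 6, "full_sim": 0}
single_fidelity = {"full_sim": 5}   # inferred from 5.00 cost units

mf_cost = total_cost(multi_fidelity)
sf_cost = total_cost(single_fidelity)
print(f"multi-fidelity: {sum(multi_fidelity.values())} evals, {mf_cost:.2f} cost units")
print(f"single-fidelity: {sf_cost:.2f} cost units ({sf_cost / mf_cost:.1f}x)")

# Same evidence, different fidelity: the sqrt(cost) divisor favours the cheap level.
print(acquisition(1.0, 0.5, FIDELITY_COST["gap_score"]),
      acquisition(1.0, 0.5, FIDELITY_COST["full_sim"]))
```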
Strong-constraint 4D-Var assimilation
Twin experiment: a 1D upwind-advection + chemical-decay model over a 12-hour window. Six ground-based monitors observe PM2.5 every 2 hours. Compared: (a) background (no assimilation), (b) 3D-Var using data only at t=12h, (c) 4D-Var using all 6 time slices. Gradient of 4D-Var cost computed via the tangent-linear adjoint (hand-derived). L-BFGS minimizer with 40 iteration budget.
4D-Var reduced initial-condition RMSE from 6.05 µg/m³ (background) to 2.07 µg/m³, a 66% improvement. 3D-Var, using only end-of-window data, achieved 4.92 µg/m³. The 18-hour forecast also improved: 0.13 vs 0.49 vs 0.60 µg/m³ (4D-Var, 3D-Var, and background, in that order). With the adjoint gradient, L-BFGS reduced the cost function J by 88.8% in 12 iterations.
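A minimal sketch of the strong-constraint 4D-Var cost and its adjoint gradient for a linear upwind-advection plus decay model is given below. Several choices are assumptions made for illustration: the periodic boundary, the grid size, the Courant number, the error variances, and the backtracking gradient descent that stands in for the L-BFGS minimizer named above.

```python
import numpy as np

def step_matrix(n, courant, decay_dt):
    """One time step: first-order upwind advection (u > 0) plus linear decay, periodic domain."""
    M = (1.0 - courant - decay_dt) * np.eye(n)
    M += courant * np.roll(np.eye(n), 1, axis=0)   # picks up the upwind neighbour x[i-1]
    return M

def cost_and_grad(x0, xb, b_var, M, obs_idx, obs, r_var):
    """4D-Var cost J(x0) and its gradient from a backward adjoint sweep.

    obs maps a time-step index to the values observed at obs_idx.
    """
    K = max(obs)
    traj = [x0]
    for _ in range(K):                       # forward model run
        traj.append(M @ traj[-1])
    J = 0.5 * np.sum((x0 - xb) ** 2) / b_var
    adj = np.zeros_like(x0)
    for k in range(K, -1, -1):               # backward adjoint sweep
        if k in obs:
            d = traj[k][obs_idx] - obs[k]
            J += 0.5 * np.sum(d ** 2) / r_var
            adj[obs_idx] += d / r_var        # inject H^T R^-1 (H x_k - y_k)
        if k > 0:
            adj = M.T @ adj                  # propagate with the adjoint model
    return J, (x0 - xb) / b_var + adj

def descend(x, n_iter, *args):
    """Gradient descent with Armijo backtracking (stand-in for L-BFGS)."""
    J, g = cost_and_grad(x, *args)
    for _ in range(n_iter):
        step = 1.0
        while True:
            x_new = x - step * g
            J_new, g_new = cost_and_grad(x_new, *args)
            if J_new <= J - 1e-4 * step * (g @ g) or step < 1e-12:
                break
            step *= 0.5
        x, J, g = x_new, J_new, g_new
    return x, J

# Twin experiment with hypothetical settings: 0.5 h steps, observations every 2 h for 12 h.
n = 40
M = step_matrix(n, courant=0.5, decay_dt=0.01)
truth0 = 20.0 * np.exp(-0.5 * ((np.arange(n) - 10) / 3.0) ** 2)
rng = np.random.default_rng(0)
xb = truth0 + rng.normal(0.0, 5.0, n)        # perturbed background
obs_idx = np.arange(3, n, 7)                 # six monitor locations
obs, x = {}, truth0.copy()
for k in range(1, 25):
    x = M @ x
    if k % 4 == 0:
        obs[k] = x[obs_idx] + rng.normal(0.0, 0.5, obs_idx.size)

args = (xb, 25.0, M, obs_idx, obs, 0.25)
J0, g0 = cost_and_grad(xb, *args)
xa, Ja = descend(xb, 50, *args)
print(f"J: {J0:.2f} -> {Ja:.2f}")
```

The adjoint sweep costs about one extra model run regardless of the state dimension, which is why it is preferred over finite-difference gradients for this kind of problem.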
Physics-informed neural network surrogate
1D advection-diffusion-reaction PDE for PM2.5 transport: dc/dt + u dc/dx - D d2c/dx2 + k c = S(x, t). The surrogate is a small MLP (1 hidden layer, 24 tanh units). The loss combines sparse-data MSE (n_data_stations * n_data_times observations), the PDE residual at 200 collocation points (lam_pde=2.0), and an IC anchor (lam_ic=2.0). Adam optimizer, 250 iterations, learning rate 0.03. Spatial derivatives are computed analytically through the tanh activation. Baseline: same architecture without the PDE residual (lam_pde=0).
PINN surrogate with PDE residual regularization achieved 5.177 µg/m³ full-domain RMSE vs 7.737 for data-only training on 9 observations. Physics regularization closes 33% of the gap on held-out times, demonstrating that knowing the governing PDE is worth roughly 3-4× more training data in this sparse regime.
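For a one-hidden-layer tanh network, the derivatives that enter the PDE residual have closed forms, so no automatic differentiation is needed. The sketch below shows those derivatives and the three-term loss with the weights quoted above; the parameter layout, the random weights, and the source function are hypothetical, and the Adam training loop is omitted.

```python
import numpy as np

def mlp_with_derivs(params, x, t):
    """c(x, t) for a 1-hidden-layer tanh MLP, plus analytic c_x, c_t, c_xx."""
    W1, b1, w2, b2 = params                      # W1: (H, 2), b1: (H,), w2: (H,), b2: scalar
    z = np.outer(x, W1[:, 0]) + np.outer(t, W1[:, 1]) + b1
    a = np.tanh(z)
    da = 1.0 - a**2                              # tanh'(z)
    d2a = -2.0 * a * da                          # tanh''(z)
    c = a @ w2 + b2
    c_x = (da * W1[:, 0]) @ w2
    c_t = (da * W1[:, 1]) @ w2
    c_xx = (d2a * W1[:, 0] ** 2) @ w2
    return c, c_x, c_t, c_xx

def pde_residual(params, x, t, u, D, k, source):
    """Residual of dc/dt + u dc/dx - D d2c/dx2 + k c = S(x, t)."""
    c, c_x, c_t, c_xx = mlp_with_derivs(params, x, t)
    return c_t + u * c_x - D * c_xx + k * c - source(x, t)

def pinn_loss(params, data, colloc, ic, u, D, k, source, lam_pde=2.0, lam_ic=2.0):
    """Data MSE + lam_pde * PDE-residual MSE + lam_ic * initial-condition MSE."""
    (xd, td, cd), (xc, tc), (xi, ci) = data, colloc, ic
    pred_d = mlp_with_derivs(params, xd, td)[0]
    pred_i = mlp_with_derivs(params, xi, np.zeros_like(xi))[0]
    res = pde_residual(params, xc, tc, u, D, k, source)
    return (np.mean((pred_d - cd) ** 2)
            + lam_pde * np.mean(res ** 2)
            + lam_ic * np.mean((pred_i - ci) ** 2))

# Hypothetical random network with 24 hidden units, and a made-up source term.
rng = np.random.default_rng(0)
params = (rng.normal(size=(24, 2)), rng.normal(size=24), rng.normal(size=24), 0.1)
source = lambda x, t: np.exp(-x**2)
xc, tc = rng.uniform(0, 1, 200), rng.uniform(0, 1, 200)   # 200 collocation points
res = pde_residual(params, xc, tc, u=1.0, D=0.1, k=0.05, source=source)
print("mean squared PDE residual at init:", np.mean(res ** 2))
```

Setting `lam_pde=0` in `pinn_loss` gives the data-only baseline described above.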
Physics-informed GP (linear-operator kernel)
Same 1D advection-diffusion-reaction PDE as Inv 34: dc/dt + u dc/dx - D d2c/dx2 + k c = S(x). Method: latent-force / linear-PDE-operator GP (Särkkä 2011; Álvarez, Luengo & Lawrence 2013; Raissi, Perdikaris & Karniadakis 2017, 'Numerical GPs'). A squared-exponential GP prior is placed on c(x, t); because L is linear, Lc ~ GP(0, L L' k_cc). Training data = sparse station observations of c + 'observations' of Lc=S at collocation points (the source term acts as known forcing). Hyperparameters (sigma^2, lx, lt) learned by marginal-likelihood maximization. All kernel derivatives are analytic. Baseline: plain SE-kernel GP trained only on the station observations + IC anchors (no physics, no collocation term).
Physics-informed GP achieves 0.123 µg/m³ held-out RMSE vs 4.262 µg/m³ for a plain SE-kernel GP on the same 9 noisy station observations. The gain (97%) comes from encoding the PDE operator L = d/dt + u d/dx - D d2/dx2 + k directly into the GP kernel via linear-operator transport. The posterior is physics-consistent by construction: every sample from the posterior satisfies Lc = S to within noise.
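To make "the PDE baked into the kernel" concrete, the sketch below derives one block of the joint covariance: the cross-covariance between Lc at a collocation point and c at a station, obtained by applying L to the first argument of the squared-exponential kernel. The hyperparameter and PDE coefficient values are hypothetical, the reaction rate is named `kr` to avoid clashing with the kernel symbol, and the Lc-Lc block (which needs derivatives up to fourth order) is omitted.

```python
import numpy as np

def k_se(x, t, xp, tp, s2, lx, lt):
    """Squared-exponential kernel on (x, t)."""
    return s2 * np.exp(-0.5 * ((x - xp) / lx) ** 2 - 0.5 * ((t - tp) / lt) ** 2)

def k_Lc_c(x, t, xp, tp, s2, lx, lt, u, D, kr):
    """Cov(Lc(x, t), c(xp, tp)) with L = d/dt + u d/dx - D d2/dx2 + kr on the first argument."""
    r, tau = x - xp, t - tp
    base = k_se(x, t, xp, tp, s2, lx, lt)
    d_t = -tau / lt**2                       # (d/dt k) / k
    d_x = -r / lx**2                         # (d/dx k) / k
    d_xx = r**2 / lx**4 - 1.0 / lx**2        # (d2/dx2 k) / k
    return base * (d_t + u * d_x - D * d_xx + kr)

# Hypothetical hyperparameters and PDE coefficients.
HYP = dict(s2=4.0, lx=1.0, lt=2.0)
PDE = dict(u=1.0, D=0.1, kr=0.05)

# Cross-covariance block between 5 collocation points and 3 stations (broadcasted).
xc, tc = np.linspace(0, 4, 5)[:, None], np.full((5, 1), 1.0)
xs, ts = np.array([0.5, 2.0, 3.5])[None, :], np.zeros((1, 3))
K_cross = k_Lc_c(xc, tc, xs, ts, **HYP, **PDE)
print(K_cross.shape)
```

Stacking this block with the plain kernel block and the Lc-Lc block gives the joint covariance over station observations and source-term "observations" that the method description refers to.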
EVPPI via Strong-Oakley-Brennan 2014 GAM regression
Strong, Oakley & Brennan (2014, Medical Decision Making) single-sample estimator: EVPPI(phi) = E_phi[max_d E[NB_d|phi]] - max_d E[NB_d]. The conditional expectation E[NB_d|phi=phi_i] is estimated by regressing NB_d on phi across the MC draws using an additive spline smoother (sklearn SplineTransformer + Ridge); predicted values plug directly into the EVPPI formula. Requires only the existing PSA sample — no nested MC, no new simulator runs. We re-use Inv 02's 10,000 shared draws across T1-T5.
At year 2035, the single most decision-relevant unknown is the ozone concentration-response function beta_o3 ($0.116B from a single scalar), followed by VSL ($0.092B). Together these two scalars account for 91% of the $0.229B EVPI — the remaining 19 parameters (14 emissions + beta_pm25 + beta_no2 + income-elast + GP-noise + Di/Krewski) contribute trivially. This reframes research priorities: a California-specific ozone-mortality cohort study (like MOSES+ at higher N) and a refreshed VSL literature review would move the decision more than any amount of emissions-inventory work.
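The regression-based estimator can be sketched in a few lines. To keep the example NumPy-only, a cubic polynomial least-squares fit is swapped in for the additive spline smoother (sklearn SplineTransformer + Ridge) named above; the two-decision toy problem is synthetic and was chosen because its EVPPI is known analytically.

```python
import numpy as np

def evppi_regression(phi, nb, degree=3):
    """Single-sample EVPPI for one scalar parameter.

    phi: (n,) PSA draws of the parameter of interest.
    nb:  (n, D) net benefit of each of D decisions on the same draws.
    The conditional mean E[NB_d | phi] is estimated by polynomial regression,
    a stand-in for the spline smoother used in the study.
    """
    z = (phi - phi.mean()) / phi.std()             # standardize for conditioning
    B = np.vander(z, degree + 1)                   # columns z^degree ... z^0
    coef, *_ = np.linalg.lstsq(B, nb, rcond=None)
    fitted = B @ coef                              # estimate of E[NB_d | phi_i]
    return fitted.max(axis=1).mean() - nb.mean(axis=0).max()

# Synthetic check: decision 0 has NB = 0, decision 1 has NB = phi + noise.
rng = np.random.default_rng(0)
n = 20000
phi = rng.normal(size=n)
nb = np.column_stack([np.zeros(n), phi + rng.normal(size=n)])

est = evppi_regression(phi, nb)
exact = 1.0 / np.sqrt(2.0 * np.pi)                 # E[max(0, phi)] for phi ~ N(0, 1)
print(f"estimated EVPPI {est:.3f}, analytic {exact:.3f}")
```

No new simulator runs are needed: the estimator only re-reads the existing draws, which is the property that lets the study reuse the 10,000 shared draws.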