
California Freight Cleanup → Investigation 3-4

Can a multi-fidelity emulator chain three data sources into a validated PM2.5 surrogate?

2019 RMSE 2.76 µg/m³ • passes Tessum + Boylan-Russell • ρtop stable ±0.016 across 4 years • FAQSD rung null result

We used a multi-fidelity Gaussian process to chain together three data sources (the base transport model, an EPA-published fused product, and direct monitor readings) and produce concentration estimates that pass both standard regulatory accuracy tests. Two findings: the corrected surrogate hits RMSE 2.76 µg/m³; and adding the EPA fused product as a middle rung made no difference — a clean lesson about when more fidelity rungs actually help.

Two questions answered here. Does the Le Gratiet linear MFGP chain deliver regulatory-grade PM2.5 estimates (RMSE ≤ 5.0 µg/m³, |MFB| ≤ 0.6) at held-out AQS sites? Yes. Does adding FAQSD as an intermediate rung between CMAQ and AQS improve accuracy over the simpler 3-level chain? No.

Algorithm. The recursive linear MFGP of Le Gratiet & Garnier (2014). Each level fits a scaling coefficient ρk by OLS through the origin, then a GP residual δk(lat, lon) with a ConstantKernel × Matern(nu=1.5) + WhiteKernel kernel. At test time, the recursive prediction reconstructs higher-rung outputs from L1 only: CMAQ and FAQSD values at held-out test sites are never queried. This structurally sidesteps FAQSD’s AQS-leakage problem.
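The recursion above can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions, not the investigation's run.py: `fit_rho`, `recursive_predict`, and the `Residual` type are hypothetical names, and each residual callable stands in for the posterior mean of a fitted GP δk(lat, lon). The numbers in the demo are made up.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]           # (lat, lon)
Residual = Callable[[Point], float]   # stand-in for the GP posterior mean of delta_k


def fit_rho(lower: Sequence[float], upper: Sequence[float]) -> float:
    """OLS through the origin: rho = sum(x * y) / sum(x * x)."""
    sxx = sum(x * x for x in lower)
    if sxx == 0.0:
        raise ValueError("lower-rung values are all zero; rho is undefined")
    return sum(x * y for x, y in zip(lower, upper)) / sxx


def recursive_predict(l1_value: float, site: Point,
                      rungs: List[Tuple[float, Residual]]) -> float:
    """Walk up the chain from L1 only: y_k = rho_k * y_(k-1) + delta_k(site)."""
    y = l1_value
    for rho, delta in rungs:
        y = rho * y + delta(site)
    return y


# Toy demo with made-up numbers (not study data).
rho_1 = fit_rho([4.0, 6.0, 8.0], [2.0, 3.0, 4.0])           # 58 / 116 = 0.5
chain = [(rho_1, lambda site: 1.0), (1.0, lambda site: -0.5)]
estimate = recursive_predict(10.0, (36.7, -119.8), chain)   # (0.5*10 + 1.0)*1.0 - 0.5 = 5.5
```

Note that `recursive_predict` takes only the L1 value and the site coordinates as inputs, which mirrors the point that intermediate-rung values at held-out sites are never queried.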

Multi-year hybrid chain (Phase 6c.3). 2019 uses the full 4-level chain (L1→CMAQ EQUATES→FAQSD→AQS). 2020–2022 use a 3-level chain (L1→FAQSD→AQS) because CMAQ EQUATES v5.3.2 ATOTIJ is frozen at 2019 (no EPA/CMAS/AWS Open Data release of 2020+ as of May 2026). FAQSD substitutes as the mid rung for 2020–2022.

3-level vs 4-level A/B. The 2019 run simultaneously fits both the 4-level chain and a 3-level baseline (L1→CMAQ→AQS, same sites/folds, same kernel) to isolate the value of the FAQSD rung.
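The year-to-chain mapping and the 2019 A/B pairing can be written down as a small configuration. The sketch below is illustrative only: the rung labels and the names `CHAIN_BY_YEAR`, `BASELINE_2019`, `chains_to_fit`, and `n_scaling_coefficients` are assumptions, not identifiers from the project code.

```python
from typing import Dict, List

# Rungs are listed from lowest to highest fidelity. Labels are illustrative.
CHAIN_BY_YEAR: Dict[int, List[str]] = {
    2019: ["L1", "CMAQ", "FAQSD", "AQS"],  # full 4-level chain
    2020: ["L1", "FAQSD", "AQS"],          # no CMAQ EQUATES release after 2019
    2021: ["L1", "FAQSD", "AQS"],
    2022: ["L1", "FAQSD", "AQS"],
}

# 2019-only A/B baseline: same sites, folds, and kernel, without the FAQSD rung.
BASELINE_2019: List[str] = ["L1", "CMAQ", "AQS"]


def chains_to_fit(year: int) -> List[List[str]]:
    """Return every chain fitted for a year; 2019 also gets the 3-level baseline."""
    chains = [CHAIN_BY_YEAR[year]]
    if year == 2019:
        chains.append(BASELINE_2019)
    return chains


def n_scaling_coefficients(chain: List[str]) -> int:
    """One rho per adjacent pair of rungs."""
    return len(chain) - 1
```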

Splits. Investigation 3-1’s 5-fold basin-stratified site groupings are reused across all years to make rung-to-rung RMSE comparisons apples-to-apples. 5-fold spatial CV on 64 CA sites (2 dropped for missing FAQSD data).
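One way to build a deterministic, basin-stratified fold map that can be reused across years is sketched below. This is not the Investigation 3-1 procedure, which is not described on this page. It is a hypothetical round-robin assignment, and the function name, the input layout, and the placeholder basin labels are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def basin_stratified_folds(sites: Iterable[Tuple[str, str]], k: int = 5) -> Dict[str, int]:
    """Map site_id -> fold index, dealing each basin's sites across folds in turn.

    `sites` yields (site_id, basin) pairs. Sorting makes the map reproducible,
    so the same folds can be reused for every year.
    """
    by_basin: Dict[str, List[str]] = defaultdict(list)
    for site_id, basin in sites:
        by_basin[basin].append(site_id)

    folds: Dict[str, int] = {}
    offset = 0  # carried across basins so overall fold sizes differ by at most one
    for basin in sorted(by_basin):
        for i, site_id in enumerate(sorted(by_basin[basin])):
            folds[site_id] = (offset + i) % k
        offset += len(by_basin[basin])
    return folds


# Toy demo: 64 placeholder sites spread over 4 placeholder basins.
demo_sites = [(f"site_{i:02d}", f"basin_{i % 4}") for i in range(64)]
demo_folds = basin_stratified_folds(demo_sites)
```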

Figure: L4 MFGP cross-validation predicted vs observed PM2.5 at AQS sites, CA 2019. 5-fold spatial CV, 64 sites. RMSE 2.76 µg/m³; OLS slope and R² shown. Points colored by fold; dashed line is 1:1.

2019 anchor: RMSE 2.76 µg/m³ — passes both accuracy standards

5-fold CV-RMSE 2.762 µg/m³ (SD 0.892), MFB −0.055. Passes Tessum 2017 (≤ 5.0) and Boylan-Russell 2006 (|MFB| ≤ 0.6). Per-fold range 1.68–4.13 µg/m³.
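For reference, the two metrics and the thresholds quoted above can be computed as follows. This is a minimal sketch: the function names are hypothetical, the mean fractional bias uses the standard definition (mean of 2(M − O)/(M + O) over model/observation pairs), and the thresholds are the ones this page states. The demo numbers are made up.

```python
import math
from typing import Sequence


def rmse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square error in the units of the inputs (here µg/m³)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mean_fractional_bias(pred: Sequence[float], obs: Sequence[float]) -> float:
    """MFB = mean of 2 * (M - O) / (M + O); bounded between -2 and +2."""
    return sum(2.0 * (p - o) / (p + o) for p, o in zip(pred, obs)) / len(obs)


def passes_criteria(pred: Sequence[float], obs: Sequence[float],
                    rmse_max: float = 5.0, mfb_max: float = 0.6) -> bool:
    """Apply the RMSE <= 5.0 and |MFB| <= 0.6 thresholds quoted on this page."""
    return rmse(pred, obs) <= rmse_max and abs(mean_fractional_bias(pred, obs)) <= mfb_max


# Toy demo with made-up concentrations (not study data).
demo_pred = [10.0, 12.0]
demo_obs = [8.0, 12.0]
demo_ok = passes_criteria(demo_pred, demo_obs)
```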

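| Year | Chain | Sites | RMSE (µg/m³) | MFB | Tessum |
|---|---|---|---|---|---|
| 2019 | 4-level | 64 | 2.762 | −0.055 | Pass |
| 2020 | 3-level | 64 | 5.088 | 0.080 | Fail (wildfire) |
| 2021 | 3-level | 64 | 4.003 | 0.076 | Pass |
| 2022 | 3-level | 64 | 3.184 | 0.054 | Pass |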

Multi-year mean RMSE 3.759 µg/m³ (SD 1.025 across 4 years).
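The multi-year summary can be reproduced from the four per-year RMSE values. The short check below uses only the numbers reported on this page; the variable names are illustrative. It also shows that the quoted SD corresponds to the sample standard deviation (n − 1 denominator).

```python
from statistics import mean, stdev

# Per-year CV-RMSE in µg/m³, as reported in the table above.
rmse_by_year = {2019: 2.762, 2020: 5.088, 2021: 4.003, 2022: 3.184}
values = list(rmse_by_year.values())

multi_year_mean = mean(values)   # rounds to 3.759
multi_year_sd = stdev(values)    # sample SD (n - 1); rounds to 1.025

# RMSE threshold of 5.0 µg/m³ applied per year.
tessum_pass = {year: value <= 5.0 for year, value in rmse_by_year.items()}
```

Only 2020 exceeds the 5.0 µg/m³ threshold, which matches the Fail entry for the wildfire year.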

Adding the EPA fused product as a middle rung adds nothing — a clean example of redundant fidelity

The 4-level MFGP (2019, with FAQSD as L3) scores RMSE 2.762 µg/m³. The 3-level baseline (L1→CMAQ→AQS) scores 2.715 µg/m³. The 4-level chain is worse by 0.047 µg/m³, well inside fold noise (SD 0.892). The diagnostic is ρ3 ≈ 0.995 across all five folds: at training sites, FAQSD and AQS are essentially identical (FAQSD is fit on AQS, so they share information). Once the leakage is structurally blocked at test sites, FAQSD adds GP variance without adding bias correction.
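To put the 0.047 µg/m³ gap in scale, it can be compared with the fold-to-fold spread. The arithmetic below is a rough sketch: it treats the reported fold SD of the 4-level run as the noise scale and derives a standard error over 5 folds. A paired per-fold comparison would be more rigorous, but per-fold values for both chains are not listed on this page.

```python
import math

rmse_4_level = 2.762   # µg/m³, 2019 chain with FAQSD rung
rmse_3_level = 2.715   # µg/m³, 2019 baseline L1 -> CMAQ -> AQS
fold_sd = 0.892        # µg/m³, SD of per-fold RMSE reported for the 4-level run
n_folds = 5

diff = rmse_4_level - rmse_3_level            # about 0.047
standard_error = fold_sd / math.sqrt(n_folds) # about 0.399
diff_in_se = diff / standard_error            # well below 1
```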

This is the cleanest ADM lesson in the study: more fidelity rungs only help when each rung carries information that is not redundant with what follows it. FAQSD is redundant with AQS at the training-site level. The 3-level chain is the production surrogate.

The coupling between the corrected surrogate and monitors is stable year to year (±0.016)

Per-year ρtop (the top-rung→AQS coupling):

| Year | Chain | ρ1 (L1→mid) | ρtop |
|---|---|---|---|
| 2019 | 4-level (CMAQ mid) | 0.368 | 0.9954 |
| 2020 | 3-level (FAQSD mid) | 0.811 | 0.9929 |
| 2021 | 3-level (FAQSD mid) | 0.664 | 1.0023 |
| 2022 | 3-level (FAQSD mid) | 0.588 | 1.0093 |

ρtop spans 0.9929 to 1.0093 across four years, a range of 0.016. The linear MFGP chain generalizes without year-fixed effects: the constant-ρ assumption holds at the top rung. ρ1 varies by 0.44, but this is partly a definition-shift artifact (the mid rung changes from CMAQ in 2019 to FAQSD in 2020–2022).
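Both spreads follow directly from the per-year coefficients in the table. The check below uses only those reported values; the variable names are illustrative.

```python
# (rho_1, rho_top) per year, copied from the table above.
rho_by_year = {
    2019: (0.368, 0.9954),
    2020: (0.811, 0.9929),
    2021: (0.664, 1.0023),
    2022: (0.588, 1.0093),
}

rho_1 = [pair[0] for pair in rho_by_year.values()]
rho_top = [pair[1] for pair in rho_by_year.values()]

rho_1_range = max(rho_1) - min(rho_1)        # about 0.443
rho_top_range = max(rho_top) - min(rho_top)  # about 0.0164
```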

2020 wildfire year: RMSE 5.09 — the base model has no wildfire smoke; this is an input problem, not a chain failure

The 2020 RMSE (5.09, failing Tessum) reflects the LNU/SCU/August Complex fire season: ISRM × NEI does not include wildfire smoke (NEI smoke kernels are not in the ISRM matrix). FAQSD partially recovers fire-season PM via AQS data assimilation in the 3-level chain. The 2021 (4.00) and 2022 (3.18) results confirm this is a wildfire-year anomaly, not chain degradation.

Year-by-year: the corrected surrogate beats the base model every year; the satellite reference struggles in the 2020 wildfire season

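| Year | L1 RMSE (µg/m³) | L3 vD RMSE (µg/m³) | L4 RMSE (µg/m³) | Δ(L4−L1) (µg/m³) |
|---|---|---|---|---|
| 2019 | 6.443 | 1.081 | 2.762 | −3.68 |
| 2020 | 5.975 | 8.423 | 5.088 | −0.89 |
| 2021 | 5.725 | 3.494 | 4.003 | −1.72 |
| 2022 | 5.824 | 1.770 | 3.184 | −2.64 |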

Note: The L3 vD 2019 value (1.08 µg/m³) above is evaluated at the 64 training sites used by this investigation’s 2019 MFGP chain; Investigation 3-3’s headline 4.34 µg/m³ is evaluated across all 5 years and 66 sites (330 obs) against pooled AQS annual means. These are different evaluation scopes and should not be compared directly.

L3 van Donkelaar scores anomalously high RMSE in 2020 (8.42 µg/m³) because the satellite-fused field’s annual-mean product cannot resolve the extreme wildfire smoke episodes. L4 MFGP still beats L1 that year by 0.89 µg/m³.
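The Δ(L4−L1) column can be recomputed from the L1 and L4 RMSE columns. The check below uses only the values reported in the year-by-year table; the variable names are illustrative. A negative difference means L4 has the lower RMSE.

```python
# (L1 RMSE, L4 RMSE) in µg/m³ per year, copied from the year-by-year table.
rmse_l1_l4 = {
    2019: (6.443, 2.762),
    2020: (5.975, 5.088),
    2021: (5.725, 4.003),
    2022: (5.824, 3.184),
}

# L4 minus L1; negative values mean the corrected surrogate has lower error.
delta_l4_l1 = {year: round(l4 - l1, 2) for year, (l1, l4) in rmse_l1_l4.items()}

l4_beats_l1_every_year = all(delta < 0 for delta in delta_l4_l1.values())
smallest_gain_year = max(delta_l4_l1, key=delta_l4_l1.get)  # year with the smallest improvement
```

The smallest improvement falls in 2020, consistent with the 0.89 µg/m³ margin quoted for the wildfire year.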

| Item | Value |
|---|---|
| run.py | [internal artifact] |
| results.json | investigations/42_l4-mfgp-corrected/latest/results.json |
| Method label | le_gratiet_2014_multi_year_linear_mfgp |
| Kernel | ConstantKernel × Matern(nu=1.5) + WhiteKernel |
| CMAQ input | data/raw/cmaq_equates/ca_site_cmaq_pm25_2019.csv (sha256 35d812f9ba52) |
| FAQSD inputs | 2019–2022 daily .txt.gz (sha256s in results.json inputs_from) |
| Upstream: Investigation 3-1 folds + L1 | sha256 c63ae2d281ce |
| Upstream: Investigation 3-3 L3 predictions | artifact sha256 621a2d74fe13 |
| Upstream: Investigation 3-5 L5 2019 RMSE | 0.857 µg/m³ (sha256 278e28fe52db) |
| Last run | 2026-05-02 (results sha256 b89d8204eb15) |
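A local copy of an input can be checked against a listed hash along these lines. This sketch assumes the 12-character strings above are the leading hex characters of the full SHA-256 digest, which this page does not state explicitly. The function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_hex(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the full SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_prefix(path: Union[str, Path], expected_prefix: str) -> bool:
    """True if the file's digest starts with the given hex prefix.

    Assumes the listed value is a truncated leading prefix of the digest.
    """
    return sha256_hex(path).startswith(expected_prefix.strip().lower())
```

For example, `matches_prefix("data/raw/cmaq_equates/ca_site_cmaq_pm25_2019.csv", "35d812f9ba52")` would return True only if the local file is byte-identical to the one hashed for this run, under the prefix assumption above.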