Studies · CA Air Quality · Investigation 33 · Phase 3 · Scoping study

Using 12 hours of monitor data to reconstruct today's PM2.5

This is a 1D synthetic-twin scoping study, not a California-scale nowcast. Its role in the decision chain is to show what a full 3D WRF-Chem adjoint would buy relative to 3D-Var, quantifying the upper rung of the Inv 28 data-assimilation ladder. The experiment runs strong-constraint 4D-Var over a 12-hour window: the adjoint model provides the exact cost-function gradient, and L-BFGS finds the initial state consistent with all 6 time slices of observations. Initial-condition RMSE drops from 6.05 µg/m³ (background) to 2.07 µg/m³, and the RMSE of the downstream forecast at 18 hours (6 hours past the window) is 73.0% lower than 3D-Var's, on this simplified model. Production CA deployment requires the 3D adjoint (see "What this demo does not show").

4D-Var init RMSE: 2.07 µg/m³
RMSE cut vs background: 65.7%
RMSE cut vs 3D-Var: 57.9%
L-BFGS iterations: 12

The Question

Can we recover today's air-quality state from the past 12 hours of monitor data?

Air-quality nowcasting is a state-estimation problem: given a physics model, a first guess, and a stream of noisy ground-based monitor observations, what is the best estimate of the true PM2.5 field right now? 3D-Var uses one snapshot of data. 4D-Var uses the entire time window and enforces the forward model as a hard constraint; under Gaussian background and observation errors, its solution is the MAP estimate of the initial condition given all observations.
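
For reference, the textbook strong-constraint 4D-Var cost function and its gradient for a linear model can be written as below. The notation is an assumption introduced here, not taken from the study: x_b is the background, B and R_k are the background and observation error covariances, H_k is the observation operator, M_{0→k} propagates the initial state to observation time k, and K = 6 for this experiment.

```latex
J(\mathbf{x}_0) = \tfrac{1}{2}(\mathbf{x}_0-\mathbf{x}_b)^{\top}\mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)
  + \tfrac{1}{2}\sum_{k=1}^{K}\bigl(\mathbf{H}_k\mathbf{x}_k-\mathbf{y}_k\bigr)^{\top}\mathbf{R}_k^{-1}\bigl(\mathbf{H}_k\mathbf{x}_k-\mathbf{y}_k\bigr),
  \qquad \mathbf{x}_k = \mathbf{M}_{0\to k}\,\mathbf{x}_0

\nabla J(\mathbf{x}_0) = \mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)
  + \sum_{k=1}^{K}\mathbf{M}_{0\to k}^{\top}\mathbf{H}_k^{\top}\mathbf{R}_k^{-1}\bigl(\mathbf{H}_k\mathbf{x}_k-\mathbf{y}_k\bigr)
```

The transposed operator M_{0→k}^T in the gradient is what the adjoint model applies. Restricting the sum to the final time slice k = K gives one reading of the end-of-window 3D-Var comparison used in this study.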

This investigation runs a twin experiment on a 1D upwind-advection + decay model with 6 monitor stations reporting every 2 hours over a 12-hour window. We compare background (no assimilation), 3D-Var (end-of-window data only), and 4D-Var (full trajectory).
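
A minimal sketch of such a twin setup follows, to make the model and observation layout concrete. The 1200 km domain, the 400 km plume center, and the 120 km width follow the figures and text of this page; the wind speed, decay rate, time step, station cells, noise level, and plume amplitude are illustrative assumptions, not the study's settings.

```python
import numpy as np

# Illustrative assumptions (not the study's values): grid, wind, decay, stations.
NX, DX_KM = 120, 10.0            # 120 cells x 10 km = 1200 km domain
U_KMH, DECAY_PER_H = 30.0, 0.05  # constant wind speed and first-order decay rate
DT_H = 0.25                      # CFL = U*DT/DX = 0.75 <= 1, stable for upwind
STEPS_PER_OBS = 8                # 8 x 0.25 h = 2 h between observation times
N_SLICES = 6                     # 6 observation times over the 12-hour window
STATION_IDX = np.array([10, 30, 50, 70, 90, 110])  # hypothetical monitor cells


def step(c):
    """One first-order upwind advection + linear decay step (zero inflow on the left)."""
    cfl = U_KMH * DT_H / DX_KM
    upwind = np.concatenate(([0.0], c[:-1]))
    return (1.0 - DECAY_PER_H * DT_H) * (c - cfl * (c - upwind))


def observe_trajectory(c0, rng=None, noise_sd=0.0):
    """Run the model over the window and sample the stations every 2 hours."""
    c = c0.copy()
    slices = []
    for _ in range(N_SLICES):
        for _ in range(STEPS_PER_OBS):
            c = step(c)
        y = c[STATION_IDX]
        if rng is not None:
            y = y + rng.normal(0.0, noise_sd, y.shape)
        slices.append(y)
    return np.array(slices)


# "Truth" plume: center 400 km and sigma 120 km as stated in the text;
# the 30 µg/m³ peak is an assumption.
x_km = (np.arange(NX) + 0.5) * DX_KM
truth0 = 30.0 * np.exp(-0.5 * ((x_km - 400.0) / 120.0) ** 2)

obs = observe_trajectory(truth0, np.random.default_rng(0), noise_sd=1.0)
print(obs.shape)  # (6, 6): 6 time slices x 6 stations
```

In a twin experiment the truth run generates the synthetic observations, and the assimilation schemes only ever see `obs` and a deliberately wrong background.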

Assimilation ladder

From persistence to adjoint optimization

L1 (baseline): Persistence (t = now). No assimilation; forecast = observation at t=0. Iterations: n/a.
L2 (classical): Optimal interpolation. Weight observations by distance to grid cell; static B matrix. Iterations: n/a.
L3: 3D-Var (end-of-window). Minimize J using only observations at t = 12h. RMSE_init = 4.92 µg/m³. Iterations: 2.
L4: 4D-Var (this investigation). Minimize J over 6 time slices using adjoint gradient. RMSE_init = 2.07 µg/m³, a 57.9% improvement over 3D-Var. Iterations: 12.
L5 (future): Ensemble 4D-Var (hybrid). Flow-dependent B matrix from ensemble forecast; next-generation operational system (ECMWF, NOAA).

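
To make the L2 rung concrete, here is a small sketch of an optimal-interpolation update with a static, distance-weighted B matrix. Everything numeric is an illustrative assumption (grid, station cells, error standard deviations, correlation length, and the observation values); it is not the study's configuration.

```python
import numpy as np

# Illustrative assumptions: grid, station cells, error scales, correlation length.
n_cells, dx_km = 60, 20.0
stations = np.arange(5, 60, 10)            # 6 hypothetical monitor cells
sigma_b, sigma_o, corr_len_km = 6.0, 1.0, 100.0

x_km = (np.arange(n_cells) + 0.5) * dx_km
dist = np.abs(x_km[:, None] - x_km[None, :])
# Static background covariance: correlation decays with distance between cells.
B = sigma_b**2 * np.exp(-0.5 * (dist / corr_len_km) ** 2)
R = sigma_o**2 * np.eye(stations.size)     # uncorrelated observation error


def oi_update(xb, y):
    """x_a = x_b + B H^T (H B H^T + R)^{-1} (y - H x_b); H selects station cells."""
    innovation = y - xb[stations]
    BHt = B[:, stations]
    HBHt = B[np.ix_(stations, stations)]
    return xb + BHt @ np.linalg.solve(HBHt + R, innovation)


xb = np.full(n_cells, 10.0)                        # flat first guess (µg/m³)
y = np.array([12.0, 18.0, 25.0, 17.0, 11.0, 9.0])  # made-up station readings
xa = oi_update(xb, y)
```

For a linear observation operator, this update is also the minimizer of the single-time 3D-Var cost function, so L2 and L3 differ mainly in how the solution is computed.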
Initial condition recovery

Does the adjoint find the plume?

[Figure: initial-condition PM2.5 profiles. Series: Truth, Background, 3D-Var, 4D-Var, Obs. x-axis: Distance along wind axis (km), 0 to 1200. y-axis: PM2.5 (µg/m³), 0 to 30.]

The true plume (white) is centered at 400 km with σ = 120 km. The background (gray) is wrong in both location and width. 3D-Var (gold) pulls toward the observations at the end of the window but can't resolve the upwind shape. 4D-Var (green) uses the time evolution of the plume to reverse-engineer its initial position, recovering the shape to an RMSE of 2.07 µg/m³.

Background RMSE: 6.05 µg/m³
3D-Var RMSE: 4.92 µg/m³
4D-Var RMSE: 2.07 µg/m³
4D vs 3D improvement: 57.9%

18-hour forecast

Better initial condition → better forecast

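[Figure: forecast PM2.5 profiles 6 hours past the assimilation window. Series: Truth, Background, 3D-Var, 4D-Var. x-axis: Distance along wind axis (km), 0 to 1200. y-axis: PM2.5 (µg/m³), 0 to 20.]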

Running the forward model another 6 hours past the assimilation window, 4D-Var's forecast RMSE is 0.13 µg/m³ vs 0.49 µg/m³ for 3D-Var. Decay smooths everything so absolute errors shrink, but the 4D-Var forecast still tracks the truth almost exactly while 3D-Var retains a visible offset.
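
The headline percentages can be checked against the RMSE values printed on this page. The short calculation below uses only those two-decimal figures; the helper name `pct_cut` is introduced here for illustration.

```python
def pct_cut(reference, value):
    """Percent reduction of `value` relative to `reference`."""
    return 100.0 * (reference - value) / reference


# RMSE values as printed in the text (µg/m³), already rounded to two decimals.
bg, var3d, var4d = 6.05, 4.92, 2.07   # initial-condition RMSE
fc3d, fc4d = 0.49, 0.13               # forecast RMSE

print(round(pct_cut(var3d, var4d), 1))  # 57.9  (4D-Var vs 3D-Var, analysis)
print(round(pct_cut(bg, var4d), 1))     # 65.8  (4D-Var vs background)
print(round(pct_cut(fc3d, fc4d), 1))    # 73.5  (4D-Var vs 3D-Var, forecast)
print(round(var3d / var4d, 2), round(fc3d / fc4d, 2))  # 2.38 3.77
```

The rounded inputs reproduce the 57.9% figure exactly and land within about half a percentage point of the reported 65.7% and 73.0%, which would be consistent with those having been computed from unrounded RMSEs.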

Cost-function convergence

L-BFGS with adjoint gradient

[Figure: cost J vs L-BFGS iteration (0 to 12), log-scale y-axis from 10^1 to 10^2. Series: 4D-Var (12 iters), 3D-Var (2 iters).]

4D-Var's J dropped 88.84% in 12 iterations; 3D-Var converged in 2 iterations because the cost surface is simpler. The adjoint model provides the exact gradient, so L-BFGS can build a useful curvature approximation within a few steps and converge quickly.

Adjoint derivation is hand-written here. Production systems use automatic differentiation (TAPENADE, OpenAD) or pre-coded adjoints (WRF-DA, GEOS-Chem Adjoint).
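
As an illustration of what a hand-written adjoint looks like for this class of model, the sketch below pairs a linear upwind + decay step with its transpose, assembles the 4D-Var cost and gradient with one backward sweep, and hands both to SciPy's L-BFGS-B. It is not the study's code: the grid, CFL number, decay factor, station cells, error standard deviations, and the background plume are all assumptions, and a diagonal B and R are used for simplicity.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative assumptions throughout; none of these are the study's settings.
NX, DX_KM = 60, 20.0
CFL, DECAY = 0.75, 0.9875          # Courant number and per-step decay factor
STEPS_PER_SLICE, N_SLICES = 4, 6   # 6 observation times in the window
STATIONS = np.arange(5, 60, 10)    # 6 hypothetical monitor cells
SIGMA_B, SIGMA_O = 6.0, 1.0        # diagonal B and R


def step(c):
    """Forward step: upwind advection + decay, zero inflow at the left edge."""
    return DECAY * ((1 - CFL) * c + CFL * np.concatenate(([0.0], c[:-1])))


def step_adj(a):
    """Adjoint (transpose) of `step`: the shift runs in the opposite direction."""
    return DECAY * ((1 - CFL) * a + CFL * np.concatenate((a[1:], [0.0])))


def cost_and_grad(x0, xb, obs):
    """Strong-constraint 4D-Var cost J(x0) and its gradient via a backward sweep."""
    c = x0.copy()
    j = 0.5 * np.sum((x0 - xb) ** 2) / SIGMA_B**2
    forcings = []
    for k in range(N_SLICES):                 # forward sweep: accumulate misfits
        for _ in range(STEPS_PER_SLICE):
            c = step(c)
        d = c[STATIONS] - obs[k]
        j += 0.5 * np.sum(d**2) / SIGMA_O**2
        forcings.append(d / SIGMA_O**2)
    lam = np.zeros_like(x0)
    for f in reversed(forcings):              # backward sweep: adjoint model
        lam[STATIONS] += f                    # H^T R^{-1} (H x_k - y_k)
        for _ in range(STEPS_PER_SLICE):
            lam = step_adj(lam)
    return j, (x0 - xb) / SIGMA_B**2 + lam


x_km = (np.arange(NX) + 0.5) * DX_KM
truth0 = 30.0 * np.exp(-0.5 * ((x_km - 400.0) / 120.0) ** 2)
xb = 20.0 * np.exp(-0.5 * ((x_km - 550.0) / 200.0) ** 2)  # wrong place and width

rng = np.random.default_rng(0)
c, obs = truth0.copy(), []
for _ in range(N_SLICES):                     # synthetic noisy observations
    for _ in range(STEPS_PER_SLICE):
        c = step(c)
    obs.append(c[STATIONS] + rng.normal(0.0, SIGMA_O, STATIONS.size))
obs = np.array(obs)

res = minimize(cost_and_grad, xb, args=(xb, obs), jac=True, method="L-BFGS-B")
xa = res.x                                    # analysis (estimated initial state)
print(res.nit, res.fun)
```

A standard sanity check for any hand-written adjoint is to compare the returned gradient against a finite-difference directional derivative of J; because J is quadratic here, a central difference should agree to rounding error.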

Decision implication

Assimilate the hourly stream, not the snapshot

Recommendation: CEC operational forecasting should assimilate the full hourly AQS monitor stream, not just the latest snapshot. The 4D-Var framework's cost (adjoint model maintenance + L-BFGS) is justified by the RMSE reductions seen here relative to 3D-Var: about 2.4× on the analysis (4.92 to 2.07 µg/m³) and about 3.8× on the forecast (0.49 to 0.13 µg/m³), on this simplified model. This matters for exceedance nowcasting and for validating the Inv 18 MFMC uncertainty bounds.

Caveats

What this demo does not show

  • Synthetic twin experiment on a simplified 1D model; production 4D-Var requires 3D mesoscale adjoint (e.g., WRF-Chem 4D-Var or GEOS-Chem Adjoint).
  • Observation error assumed Gaussian and uncorrelated; real AQS data has correlated instrument/siting error.
  • Strong-constraint 4D-Var assumes perfect model; weak-constraint 4D-Var, which adds a model-error term to the cost function, is the next step for WRF-Chem.
  • B matrix assumed diagonal; a flow-dependent ensemble B would further reduce analysis RMSE.
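
For comparison with the strong-constraint setup used in this study, the standard weak-constraint formulation treats the model error at each step as an additional control variable. The notation below is an assumption introduced here: η_k is the model error at step k and Q_k its covariance; the remaining symbols follow common data-assimilation usage (background x_b, covariances B and R_k, observation operator H_k, one-step model M_{k-1→k}).

```latex
J(\mathbf{x}_0,\boldsymbol{\eta}_1,\dots,\boldsymbol{\eta}_K)
  = \tfrac{1}{2}(\mathbf{x}_0-\mathbf{x}_b)^{\top}\mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)
  + \tfrac{1}{2}\sum_{k=1}^{K}\bigl(\mathbf{H}_k\mathbf{x}_k-\mathbf{y}_k\bigr)^{\top}\mathbf{R}_k^{-1}\bigl(\mathbf{H}_k\mathbf{x}_k-\mathbf{y}_k\bigr)
  + \tfrac{1}{2}\sum_{k=1}^{K}\boldsymbol{\eta}_k^{\top}\mathbf{Q}_k^{-1}\boldsymbol{\eta}_k,
  \qquad \mathbf{x}_k = \mathbf{M}_{k-1\to k}\,\mathbf{x}_{k-1} + \boldsymbol{\eta}_k
```

Setting every η_k to zero recovers the strong-constraint problem, in which the model is treated as perfect.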