California Freight Cleanup → Investigation 8-3
How much does it cost to find the best monitor sites?
3.3× speedup (BOCA cost 1.5 vs full-sim 5.0) • 4/5 oracle overlap • 2% EVSI loss • full-simulation tier never needed

Investigation 8-2 identified which five candidate sites to deploy monitors to. This investigation asks the prior question: how much does it cost to find them? A full one-year simulation per candidate costs 100 times more than a cheap gap-score screen. We ran a tiered screening approach (cheap tests first, expensive tests only for survivors) and recovered 4 of the 5 optimal sites at 3.3 times lower cost, without ever running the most expensive tier.
The decision
Should a CEC or CARB site-evaluation programme run full simulations on all 16 candidate sites before choosing 5, or use a tiered screening hierarchy that applies cheap tests first and expensive tests only to survivors? The answer determines whether multi-fidelity evaluation is worth the algorithmic complexity at programme scale. Investigation 8-3 tests this on the curated 16-site set from Investigation 8-2.
Methodology
Fidelity ladder. Four evaluation levels are defined over the 16 Investigation 8-2 candidate sites:
| Fidelity | Label | Relative cost | Noise σ |
|---|---|---|---|
| 1 | Gap-score screen only | 0.01 | 0.25 |
| 2 | Haversine-adjusted gap | 0.05 | 0.15 |
| 3 | Climate-signal UCB | 0.20 | 0.08 |
| 4 | Full 1-year simulation | 1.00 | 0.02 |
Cost units are normalized so that one full simulation costs 1.00; the full-simulation baseline's realized budget of 5.0 units corresponds to one full-sim evaluation for each of the five selected sites. Total budget cap: 50 evaluation units.
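As a sketch, the ladder can be encoded as plain data. The class and label names here are illustrative, not taken from the investigation's code; the numbers are the table's.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fidelity:
    level: int
    label: str
    cost: float   # relative to one full 1-year simulation (1.00)
    sigma: float  # observation-noise std at this fidelity

# The four-level ladder from the table above.
LADDER = [
    Fidelity(1, "gap-score screen", 0.01, 0.25),
    Fidelity(2, "haversine-adjusted gap", 0.05, 0.15),
    Fidelity(3, "climate-signal UCB", 0.20, 0.08),
    Fidelity(4, "full 1-year simulation", 1.00, 0.02),
]

# Each step up the ladder buys lower noise at higher cost.
assert all(a.cost < b.cost and a.sigma > b.sigma
           for a, b in zip(LADDER, LADDER[1:]))
```

The 100× ratio between the cheapest screen (0.01) and the full simulation (1.00) is what makes tiered screening worth attempting at all.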
Algorithm. A BOCA-inspired successive-halving UCB (closer in mechanism to Jamieson & Talwalkar 2016 than to Kandasamy et al. 2017 BOCA) maintains an independent Gaussian conjugate posterior per candidate. The acquisition function is (info_gain + ucb_weight × UCB) / sqrt(cost). Promotion to a higher fidelity requires sufficient lower-fidelity evaluations, and a candidate is committed once its posterior standard deviation drops below 0.10 and at least one fid-3 or fid-4 evaluation is complete.
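A minimal sketch of the per-candidate posterior and the stated acquisition rule. The info_gain term is modeled here as the Gaussian entropy reduction from one more observation, which is an assumption (the text does not define the term exactly); the prior, `beta`, and `ucb_weight` values are likewise illustrative.

```python
import math

class CandidatePosterior:
    """Independent Gaussian conjugate posterior over one candidate's true EVSI."""
    def __init__(self, mu0=0.0, tau0=1.0):
        self.mu = mu0
        self.prec = 1.0 / tau0 ** 2  # precision = 1 / variance

    def update(self, y, sigma):
        # Standard conjugate update for a Gaussian observation with known noise.
        obs_prec = 1.0 / sigma ** 2
        self.mu = (self.prec * self.mu + obs_prec * y) / (self.prec + obs_prec)
        self.prec += obs_prec

    @property
    def std(self):
        return math.sqrt(1.0 / self.prec)

def acquisition(post, cost, sigma, ucb_weight=1.0, beta=2.0):
    # (info_gain + ucb_weight * UCB) / sqrt(cost), as stated above; info_gain
    # is taken as the Gaussian entropy reduction of one observation at noise
    # sigma -- an assumed form, not the investigation's exact definition.
    var = 1.0 / post.prec
    info_gain = 0.5 * math.log(1.0 + var / sigma ** 2)
    ucb = post.mu + beta * post.std
    return (info_gain + ucb_weight * ucb) / math.sqrt(cost)

# For an uncertain candidate, a cheap fid-1 screen (cost 0.01, sigma 0.25)
# scores far higher per unit cost than a full simulation (cost 1.00, sigma 0.02).
post = CandidatePosterior()
cheap = acquisition(post, cost=0.01, sigma=0.25)
full = acquisition(post, cost=1.00, sigma=0.02)
print(cheap > full)  # True
```

The sqrt(cost) denominator is what steers early evaluations toward the cheap tiers: under a wide prior, the cost discount dominates the noise penalty.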
Baseline. Single-fidelity UCB evaluates at fidelity 4 (full simulation) throughout, selecting the 5 highest-UCB candidates from up to 50 evaluations.
Oracle. The true EVSI ranking from Investigation 8-2’s L4 scoring formula applied to each candidate’s attributes. Oracle overlap — how many of the algorithm’s top-5 selections match the oracle top-5 — is the primary quality metric.
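The overlap metric itself is a set intersection. A one-line sketch, with hypothetical site IDs chosen so that the example reproduces the reported 4/5 score:

```python
def oracle_overlap(selected, oracle_top5):
    """How many of the algorithm's top-5 picks match the oracle top-5."""
    return len(set(selected) & set(oracle_top5))

# Hypothetical IDs: four shared picks plus one miss give overlap 4.
print(oracle_overlap(["s1", "s2", "s3", "s4", "s6"],
                     ["s1", "s2", "s3", "s4", "s5"]))  # 4
```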
Findings
3.3× faster at 2% value loss: tiered screening works
BOCA selects 5 sites at total cost 1.5 units vs. 5.0 for full-simulation UCB — a 3.3× reduction. Realized true EVSI: BOCA 2.5186 vs. full-sim 2.5720, a gap of −0.054 (−2.1%). At the 21k-cell full grid (Investigation 8-2 caveat), the speedup compounds to an estimated 5–8× because the cheap gap-score screening phase scales with candidate count while full-simulation cost grows with each candidate evaluated at full fidelity.
| Algorithm | Total cost (units) | Realized EVSI | Oracle overlap |
|---|---|---|---|
| BOCA screening | 1.5 | 2.5186 | 4/5 |
| Full-simulation UCB | 5.0 | 2.5720 | 5/5 |
The most expensive evaluation was never needed — cheap screens plus one mid-tier check resolved every site
36 total evaluations: 30 at fidelity 1 (gap score, cost 0.01), 6 at fidelity 3 (climate-UCB, cost 0.20). Fidelity 2 (haversine-adjusted gap) and fidelity 4 (full simulation) were never used. The acquisition function jumped directly from fid-1 to fid-3 commit for every top-ranked candidate — fid-2’s cost/noise ratio offered no additional discrimination worth paying for. This is not a shortcut. It is the multi-fidelity logic working as designed: cheap noisy screens identify the high-EVSI cluster; medium-cost confirmation commits; expensive full-sim is never required.
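The cost accounting behind the 1.5-unit total and the 3.3× ratio is direct arithmetic over the ladder prices:

```python
# Evaluation counts from the reported run, priced by the fidelity ladder.
fid1_cost, fid3_cost = 0.01, 0.20
n_fid1, n_fid3 = 30, 6

total_cost = n_fid1 * fid1_cost + n_fid3 * fid3_cost
baseline_cost = 5.0  # full-simulation UCB

print(round(total_cost, 2), n_fid1 + n_fid3)   # 1.5 units over 36 evaluations
print(round(baseline_cost / total_cost, 1))    # 3.3 (x speedup)
```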
The one missed site shows the predictable cost of cheap-screen-first logic
BOCA selects rest_ca_cell18810 (posterior mean 0.385) over oracle rank-5 sjv_cell9709 (true EVSI 0.370). sjv_cell9709 was eliminated by one noisy fid-1 observation of −0.217 at step 4, dragging its posterior mean to −0.017. The algorithm never promoted it to fid-3. One bad cheap draw eliminated a genuinely good candidate before any higher-fidelity evidence was collected. That is the cost of the 3.3× speedup. Tolerance for this tradeoff is a policy decision, not a methodology failure.
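The elimination mechanism can be reproduced with a one-step conjugate update. The observation (−0.217) and the fid-1 noise (σ = 0.25) are from the run; the prior N(0.05, 0.5²) is an illustrative assumption, so the resulting mean will not match the reported −0.017 exactly, but the mechanism (one cheap draw swamping a weak prior) is the same.

```python
# One-step Gaussian conjugate update: a single noisy fid-1 draw sinks a
# genuinely good candidate. Prior parameters are assumed, not from the run.
mu0, tau0 = 0.05, 0.5     # assumed prior over the candidate's EVSI
y, sigma = -0.217, 0.25   # observed fid-1 draw and its noise level (from the run)

prec0, obs_prec = 1 / tau0 ** 2, 1 / sigma ** 2
mu1 = (prec0 * mu0 + obs_prec * y) / (prec0 + obs_prec)
print(round(mu1, 3))  # -0.164: one bad draw pushes the posterior mean negative
```

With the posterior mean driven negative, the acquisition score never justifies promoting the candidate to fid-3, exactly as described above.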
Caveats
- Not canonical BOCA. The algorithm has no joint GP linking fidelity levels, no bias-budget term, and no kernel ridge regression between fidelity levels. The name “BOCA-inspired” is retained for the multi-fidelity successive-narrowing architecture; the algorithm docstring labels this distinction explicitly.
- Fidelity costs are nominal, not empirical. The 0.01/0.05/0.20/1.00 ladder represents a plausible order-of-magnitude ratio for gap-screen vs. field reconnaissance vs. short-term monitoring vs. full-year simulation. Real CEC procurement timelines would change these ratios.
- Fidelity-gating thresholds are hand-tuned. The promotion rules (fid-2 requires ≥1 lower-fid eval; fid-3 requires ≥2; fid-4 requires ≥4) are not derived from a bias-budget optimization; they are engineering choices.
- 16-site candidate set understates the problem at scale. On 16 sites, gap-score UCB signal alone is strong enough to identify most top candidates quickly. On the full 21k-cell grid, cheap fidelity screening would provide far larger absolute savings.
- Single seed (42) deterministic run. The 4/5 oracle overlap and −0.054 EVSI gap could shift to 3/5 or 5/5 under different seeds. The 3.3× speedup ratio is tied to the fidelity-cost ladder, not the seed.
Provenance
| Item | Details |
|---|---|
| run.py | [internal artifact] |
| results.json | investigations/32_monitor-multifidelity/latest/results.json (sha256 1b31413e6cd2) |
| analysis.md | investigations/32_monitor-multifidelity/latest/analysis.md |
| scenario.md | investigations/32_monitor-multifidelity/latest/scenario.md |
| Upstream (Investigation 8-2) | investigations/27_monitor-adaptive/latest/results.json (sha256 c92a5b8aface) — candidate sites, n_monitors, EVSI parameters |
| Kandasamy et al. 2017 | NeurIPS — multi-fidelity BO framing (inspiration, not algorithm) |
| Jamieson & Talwalkar 2016 | Successive halving (closer to implemented mechanism) |
| Srinivas et al. 2010 | GP-UCB acquisition function |
| Last run | 2026-05-01 (results sha256 1b31413e6cd2) |