Studies · CA Air Quality · Investigation 32 · Phase 3

Multi-fidelity search picks monitor sites for less simulation budget

A BOCA-inspired cost-weighted successive-halving rule (Kandasamy et al. 2017 framing; not a canonical BOCA implementation) adaptively chooses which fidelity proxy to evaluate each candidate at, escalating only when cheap proxies disagree. On the Inv 27 15-site candidate pool, the rule spends 3.3× less on evaluation than full-sim UCB and recovers 4 of 5 oracle sites.

  • 3.3× eval-cost speedup
  • 4/5 oracle-site recovery
  • 36 multi-fid evaluations
  • 0 full-sim calls (L4)
The Question

How much simulation budget does CEC need to pick the right 5 monitor sites?

Inv 27 found 5 high-value monitor sites using GP-UCB with full L4 scoring on every candidate. But a full L4 evaluation requires 1 year of co-located sensors (~$1M each). For 15 candidates, that's $15M in evaluation overhead just to pick where the real $2.5M deployment should go.

This investigation asks: can a multi-fidelity search recover the same decision at a fraction of the evaluation cost? BOCA (Kandasamy et al. 2017) is designed exactly for this — at each step the algorithm picks a (site, fidelity) pair to maximize information per unit cost.

The implementation here is cost-weighted successive halving with UCB tie-breaking, inspired by that framing rather than a literal BOCA port: a candidate must accumulate lower-fidelity evidence before a higher fidelity unlocks, and the acquisition score is (info_gain + UCB-weight) ÷ √cost. No joint GP over (site, fidelity), no explicit bias-budget term, no kernel ridge between fidelities — the cost-aware acquisition and successive-halving gate carry the practical behavior; the full BOCA machinery is the natural next step.
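A minimal sketch of that selection rule (function names, dict layout, and the threshold structure are my own, not the investigation's code; costs are the nominal units from the fidelity ladder):

```python
import math

# Nominal per-evaluation costs for the four fidelity levels.
COSTS = {1: 0.01, 2: 0.05, 3: 0.2, 4: 1.0}

def acquisition(info_gain, ucb_weight, fidelity):
    """Cost-weighted acquisition score: (info_gain + UCB weight) / sqrt(cost)."""
    return (info_gain + ucb_weight) / math.sqrt(COSTS[fidelity])

def unlocked(fidelity, evals_at, thresholds):
    """Hard successive-halving gate: fidelity k unlocks for a candidate only
    after it has accumulated thresholds[k-1] evaluations at fidelity k-1.
    Thresholds are hand-tuned, not derived from a bias-budget optimization."""
    if fidelity == 1:
        return True  # cheapest proxy is always available
    return evals_at.get(fidelity - 1, 0) >= thresholds[fidelity - 1]
```

At each step the rule evaluates the highest-scoring unlocked (site, fidelity) pair; the √cost denominator is what makes an L1 eval score 10× higher than an L4 eval for the same numerator (√0.01 = 0.1 vs √1.0 = 1.0).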

Fidelity ladder

Four proxies of EVSI — one decision

  • L1 gap_score only: cost 0.01, noise 0.25. Free information already on disk from Inv 13. 30 evals.
  • L2 haversine + gap: cost 0.05, noise 0.15. Adds redundancy penalty against existing network (Inv 27 L2). 0 evals.
  • L3 climate-signal UCB: cost 0.2, noise 0.08. Adds Inv 26 climate-signal bonus to site scoring. 6 evals.
  • L4 full 1-year simulation: cost 1.0, noise 0.02. Ground-truth EVSI after deploying and collecting 1 season of data. 0 evals.
  • L5 BOCA-inspired cost-weighted successive halving (this investigation): inspired by Kandasamy et al. 2017 (multi-fidelity BO framing); implemented as cost-weighted successive halving with UCB tie-breaking. Picks (site, fid) by info-gain plus UCB per √cost, with hard fidelity-gating replacing BOCA's bias-budget term. 3.3× speedup.
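For reference, the ladder can be written down as a small config table (a sketch: the dict layout and field names are mine; the costs, noise levels, and eval counts are the ones reported above):

```python
# Fidelity ladder: nominal cost units, noise std-dev, and realized eval counts.
FIDELITY_LADDER = [
    {"level": "L1", "proxy": "gap_score only",         "cost": 0.01, "noise": 0.25, "evals": 30},
    {"level": "L2", "proxy": "haversine + gap",        "cost": 0.05, "noise": 0.15, "evals": 0},
    {"level": "L3", "proxy": "climate-signal UCB",     "cost": 0.2,  "noise": 0.08, "evals": 6},
    {"level": "L4", "proxy": "full 1-year simulation", "cost": 1.0,  "noise": 0.02, "evals": 0},
]

# Realized evaluation cost of the whole run, in nominal units.
total_cost = sum(f["cost"] * f["evals"] for f in FIDELITY_LADDER)  # ≈ 1.5 units
```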
Where the budget went

Which fidelities got used?

Evaluation budget: 36 total evals across 4 fidelity levels.
  • L1 gap_score only: 30 evals
  • L2 haversine + gap: 0 evals
  • L3 climate-signal UCB: 6 evals
  • L4 full 1-yr simulation: 0 evals

The rule spent 30 of its 36 evaluations at L1 (cheap gap-score screening), then escalated to L3 (climate-signal UCB) only for the top contenders. It never needed L2 or L4: the L1 prior was informative enough to prune the obvious non-contenders, and L3 was sufficient to rank the top contested candidates.

In problems where L1 has larger bias against the top fidelity, the gating rule would force more L2/L4 evaluations. The cost split is adaptive in that sense, but it is governed by the hand-set unlock thresholds — not by BOCA's optimal bias-budget calculation.
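The 3.3× headline number is plain arithmetic over the realized eval counts and nominal costs; a quick back-of-envelope check (not code from the study):

```python
# Realized multi-fidelity budget: 30 L1 evals + 6 L3 evals (L2 and L4 unused).
multi_fid_cost = 30 * 0.01 + 0 * 0.05 + 6 * 0.2 + 0 * 1.0  # = 1.5 cost units

# Baseline: full-sim UCB evaluates its 5 picks at L4 cost 1.0 each.
full_sim_cost = 5 * 1.0                                     # = 5.0 cost units

speedup = full_sim_cost / multi_fid_cost
print(f"{speedup:.1f}x")  # prints 3.3x
```

Note that by cost (not eval count) L3 dominates: the 6 L3 evals account for 1.2 of the 1.5 units spent.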

Quality vs cost

3× cheaper, one site missed

  • Full-sim UCB: cost 5.00. 5 monitors picked via L4 only; 5/5 match oracle.
  • Cost-weighted successive halving: cost 1.50. 5 monitors picked via 36 multi-fid evals; 4/5 match oracle.
  • Speedup: 3.3× evaluation-cost reduction. Realized EVSI gap: -0.1454 units.
Finding
The BOCA-inspired rule recovers 4 of 5 oracle-optimal sites at 3.3× lower evaluation cost. The missed site (sierra_plumas) has low gap_score but high climate_signal — a known failure mode when the low-fidelity proxy's bias is structural rather than random. In practice, this is the price of the speedup: keep L3/L4 budget in reserve. A full BOCA implementation with a learned bias budget would likely catch this — the natural next step.
The oracle target

What the "correct" answer looks like

| Rank | Site           | Gap score | True EVSI | BOCA picked? | Full-sim picked? |
|------|----------------|-----------|-----------|--------------|------------------|
| 1    | sjv_merced     | 0.826     | 0.6573    | ✓            | ✓                |
| 2    | la_basin_E     | 1.000     | 0.6494    | ✓            | ✓                |
| 3    | sierra_plumas  | 0.581     | 0.5356    | ✗            | ✓                |
| 4    | sjv_stanislaus | 0.737     | 0.5084    | ✓            | ✓                |
| 5    | la_basin_S     | 0.653     | 0.4612    | ✓            | ✓                |

Oracle = ranking using the full L4 scoring function applied to every candidate, with no noise. In production we don't have the oracle — that's the whole point of the search.
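The 4/5 recovery and the -0.1454 gap can be reproduced from the table above (a sketch: `substitute_site` and its EVSI of 0.3902 are reconstructed from the stated gap, not reported directly by the investigation):

```python
# Oracle top-5 sites and their true EVSI, from the table above.
oracle = {
    "sjv_merced": 0.6573, "la_basin_E": 0.6494, "sierra_plumas": 0.5356,
    "sjv_stanislaus": 0.5084, "la_basin_S": 0.4612,
}

# Multi-fid picks: 4 oracle sites plus one substitute for the missed
# sierra_plumas. The substitute's EVSI (0.3902 = 0.5356 - 0.1454) is
# inferred from the stated realized-EVSI gap; its name is hypothetical.
picked = {s: v for s, v in oracle.items() if s != "sierra_plumas"}
picked["substitute_site"] = 0.3902

recovery = len(oracle.keys() & picked.keys()) / len(oracle)  # 4/5 = 0.8
evsi_gap = sum(picked.values()) - sum(oracle.values())       # ≈ -0.1454
```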

Decision implication

How much pilot budget to reserve

Recommendation: Use the BOCA-inspired rule for monitor-network design. It consumes 3.3× less simulation budget than single-fidelity GP-UCB and recovers 4 of 5 oracle-optimal sites. Replace exhaustive 1-year pilot evaluation (~$1M per candidate via co-located deployment) with ~$0.15M of multi-fidelity proxy screening plus $2–3M of targeted full-sim on the final 2–3 candidates.

For the CEC monitor-network RFP, run a two-stage pilot: screen all 15 candidates using cheap proxies ($0.15M total), then run targeted 1-year co-located deployments on the top 3–4 contenders ($3–4M). Total evaluation cost: ~$4M instead of $15M; quality cost is 4/5 oracle recovery — one bias-driven miss (sierra_plumas).
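The two-stage budget works out as follows (a back-of-envelope sketch: `finalists = 4` takes the top of the 3–4 range, and the $1M-per-L4-eval unit cost is the nominal figure used throughout):

```python
N_CANDIDATES = 15
L1_COST_M = 0.01   # cheap-proxy screen, $M per candidate (cost 0.01 × $1M)
L4_COST_M = 1.0    # 1-year co-located deployment, $M per candidate

screen = N_CANDIDATES * L1_COST_M   # $0.15M to screen all 15 candidates
finalists = 4                       # top 3-4 contenders; take 4 here
deploy = finalists * L4_COST_M      # $4.0M of targeted deployments
total = screen + deploy             # ≈ $4.15M, vs $15M for exhaustive L4
savings = N_CANDIDATES * L4_COST_M - total
```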

Caveats

What this rule still misses

  • Honest label: implementation is cost-weighted successive halving with UCB tie-breaking, NOT canonical BOCA. No joint GP over (site, fidelity), no BOCA bias-budget term, no kernel ridge regression between fidelities. Kandasamy et al. 2017 is the inspiration, not the algorithm.
  • Fidelity costs are nominal units; real-world ratio depends on CEC monitor procurement timelines.
  • Fidelity-gating thresholds (need N_k lower-fidelity evals before fid k+1 unlocks) are hand-tuned, not derived from a bias-budget optimization.
  • Assumes noise model is known per fidelity; a GP-UCB extension with heteroscedastic noise is the natural next step.