Four Fires, Eight Questions: What Should Decision-Makers Have Known?
What should emergency managers have known — and how much earlier could they have known it?
We rebuilt four real wildfires — Kincade, Camp Fire, Dixie, and Marshall — using real weather, real terrain, and real satellite data. Then we asked eight questions that emergency managers actually face: which communities are threatened, when does fire arrive, where should resources stage, what if the wind shifts? The right model delivers evacuation warnings an average of 5 hours earlier. And the answer holds across all four fires.
Four Fires. Four Characters.
Each fire tests the methodology under different conditions — terrain-driven vs. wind-driven, small vs. massive, urban-wildland interface vs. remote wilderness. The methodology doesn't change. The physics handles all four.
Kincade Fire
Sonoma County, California. Ignited by a downed PG&E transmission line in the Mayacamas Mountains during a Diablo wind event. Steep terrain channeled fire into valleys.
Camp Fire
Butte County, California. Destroyed the town of Paradise in hours. 85 deaths — the deadliest California wildfire in modern history. Extreme winds drove fire faster than evacuation.
Dixie Fire
Butte and Plumas Counties, California. Burned for over 100 days — the largest single fire in California history. Destroyed the town of Greenville. Wind-terrain interaction over weeks.
Marshall Fire
Boulder County, Colorado. 100+ mph Chinook winds drove grass fire through suburban neighborhoods in under 6 hours. The most destructive fire in Colorado history by structures lost.
Four fires, four characters — terrain-driven (Kincade), wind-driven (Camp Fire), long-duration (Dixie), and wind-driven grass/WUI (Marshall). Different states, different seasons, different scales. If the results are consistent across this portfolio, the methodology transfers.
Four Levels. Four Decisions.
Each rung adds complexity — and cost. The question isn't "is it better?" but "does it change the decision?" Here's what each level answers and what it costs.
Compute Cost vs. Decision Value
[Chart: compute cost rising left to right against the decision each level buys: “Which way to run,” “Who's threatened,” “How uncertain,” “When to evacuate.”]
The biggest fidelity jump isn't the most expensive. Going from deterministic Rothermel (1.2 seconds) to 200-draw Monte Carlo (4 minutes) is a 200× increase in compute. But it shifts evacuation decisions by 1–7 hours. The asymmetric trigger — the actual decision tool — costs nothing extra: it's computed directly from the MC output. Four minutes of compute to gain hours of warning. That's the fidelity sweet spot.
| Fidelity Level | Compute | Decision Supported | Limitation |
|---|---|---|---|
| Wind + slope heuristic | < 1 ms | Which direction to evacuate | 0–40% accuracy; fails when wind shifts |
| Rothermel CA + real ASOS | 1.2 s | Who's threatened, approximate timing | Over-predicts area ~2–4× (mean 2.76×, CV=0.33); single scenario |
| 200-draw Monte Carlo | ~4 min | Uncertainty bounds on arrival time | Perturbs weather only; no suppression model |
| Asymmetric cost trigger | +0 s | Optimal evacuation timing | Requires MC output; cost ratio is assumed |
| CFD / coupled fire-atmosphere | ~hours | Spotting, ember transport, structure ignition | 10,000× more compute; zero new community-scale decisions |
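To make the "free" trigger concrete: below is a minimal sketch, assuming a NumPy array of Monte Carlo arrival draws, of how an asymmetric cost trigger can be read directly off the MC output. The clearance time and the late-to-early cost ratio are illustrative assumptions (as the table notes, the cost ratio is an input, not a measurement), and nothing here is the report's exact implementation.

```python
import numpy as np

def optimal_trigger(arrivals, clearance_h=4.5, late_over_early=40.0, horizon=72):
    """Trigger hour minimizing expected asymmetric evacuation cost.

    arrivals        -- MC fire-arrival draws in hours (np.nan = never reaches)
    clearance_h     -- hours needed to empty the community (illustrative)
    late_over_early -- cost of being caught mid-evacuation, measured in idle
                       hours of too-early evacuation (assumed ratio)
    """
    a = np.asarray(arrivals, dtype=float)
    hit = a[~np.isnan(a)]                       # draws where the fire arrives
    if hit.size == 0:
        return None                             # fire never reaches; no trigger
    p_hit = hit.size / a.size
    best_t, best_cost = 0, float("inf")
    for t in range(horizon):
        done = t + clearance_h                  # evacuation completes here
        caught = (hit < done).mean()            # fire beats the evacuation
        idle = np.clip(hit - done, 0, None).mean()  # hours spent "too early"
        cost = p_hit * (late_over_early * caught + idle)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

rng = np.random.default_rng(7)
draws = rng.lognormal(mean=np.log(39.0), sigma=0.25, size=200)  # P50 ~ 39 h
print(optimal_trigger(draws))   # lands several hours before the hour-29 P10
```

With these stand-in numbers the trigger lands in the low twenties of hours, the same direction of shift as the hour-24-versus-hour-29 example in the capstone table below.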
Eight Questions. Eight Deep Dives.
Each question was answered at the fidelity it required. Click any card to see the full analysis across all four fires.
Which Direction Is the Fire Going?
Wind+slope heuristic captures 0–40% of satellite detections. It works when the initial wind is steady (Kincade 40%, Marshall 35%) and fails almost completely when terrain and wind interact unpredictably (Camp Fire 0.4%, Dixie 0%).
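For context, the heuristic being scored is roughly this simple. A minimal sketch, assuming fire is pushed downwind and pulled upslope, with the blend weight as an illustrative parameter rather than the report's exact formula:

```python
import math

def spread_bearing(wind_dir_deg, wind_speed_ms, aspect_deg, slope_pct, w_slope=0.3):
    """Blend the downwind push with the upslope pull into one compass heading.

    wind_dir_deg -- meteorological wind direction (where wind comes FROM)
    aspect_deg   -- downslope-facing direction of the terrain cell
    w_slope      -- relative weight of the upslope pull (illustrative)
    """
    push = (wind_dir_deg + 180.0) % 360.0      # fire is pushed downwind
    upslope = (aspect_deg + 180.0) % 360.0     # fire climbs toward the ridge
    w_wind, w_up = wind_speed_ms, w_slope * slope_pct
    x = w_wind * math.sin(math.radians(push)) + w_up * math.sin(math.radians(upslope))
    y = w_wind * math.cos(math.radians(push)) + w_up * math.cos(math.radians(upslope))
    return math.degrees(math.atan2(x, y)) % 360.0  # bearing fire moves toward
```

One plausible scoring of such a heuristic is the fraction of FIRMS detections that fall inside a sector around this bearing, which is how a single number like 40% or 0.4% can summarize an entire fire.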
How Far and How Fast?
FIRMS capture rate averages 96% for wildland fires (28% for urban Marshall). Perimeter over-prediction mean 2.76× (CV=0.33). The bias is consistent and calibratable — the model captures the vast majority of real satellite detections in wildland settings.
How Uncertain Is the Forecast?
Camp Fire has tight MC spreads (0% never-reached) — the fire always arrives. Kincade has wide spreads (36–38% never-reached). Fire character determines uncertainty more than model parameters.
When Should You Pull the Trigger?
Across 16 communities in 4 fires, MC-informed triggers provide 1–7 hours of additional warning (mean 5.0h). The asymmetric cost model makes the case: 200 MC draws cost 4 minutes but shift decisions by hours.
Could Firebreaks Have Helped?
Firebreaks reduce total burn by ~1%. Community perimeter rings are more effective than ridgeline breaks. But no firebreak survives a 90-degree wind shift. Necessary but not sufficient.
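Mechanically, the counterfactual is cheap: a firebreak is just fuel removed from the CA grid before the run. A minimal sketch, assuming a 2-D fuel array and a ring specified in grid cells (all names here are illustrative):

```python
import numpy as np

def ring_firebreak(fuel, center_rc, radius_cells, width_cells=2):
    """Zero out fuel in a ring around a community (counterfactual firebreak).

    fuel         -- 2-D array of fuel loads on the CA grid
    center_rc    -- (row, col) of the community center
    radius_cells -- ring radius in cells (100 m cells in the report's CA)
    """
    rows, cols = np.indices(fuel.shape)
    d = np.hypot(rows - center_rc[0], cols - center_rc[1])
    ring = (d >= radius_cells) & (d < radius_cells + width_cells)
    out = fuel.copy()
    out[ring] = 0.0      # no fuel: the CA cannot spread through these cells
    return out
```

Rerunning the same 200 draws with and without the mask and differencing the burned area is the ×2 compute listed for level 4 in the capstone matrix below.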
What's the Minimum Model?
The biggest fidelity jump is deterministic → Monte Carlo. Validated across four fires: every jump above MC has diminishing returns. CFD costs 10,000× more with zero new decisions at community scale.
Can They Actually Get Out?
Four of sixteen communities cannot evacuate even with perfect warning. Paradise: 26,682 people, 4.5h clearance, -2.5h margin. The bottleneck is infrastructure, not algorithms. Some communities need shelter-in-place plans.
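The margin arithmetic is simple enough to show. A minimal sketch, where vehicle occupancy, lane count, and per-lane flow are illustrative assumptions chosen only to land near the report's Paradise anchor points (4.5 h clearance against roughly 2 h of warning):

```python
def evacuation_margin(population, persons_per_vehicle, lanes_out,
                      veh_per_lane_hour, warning_h):
    """Margin = warning time minus clearance time. Negative = cannot get out."""
    vehicles = population / persons_per_vehicle
    clearance_h = vehicles / (lanes_out * veh_per_lane_hour)
    return warning_h - clearance_h, clearance_h

# Occupancy and road-capacity numbers are illustrative assumptions, not the
# report's inputs; only the Paradise population and anchor points are sourced.
margin, clearance = evacuation_margin(26_682, persons_per_vehicle=2.5,
                                      lanes_out=4, veh_per_lane_hour=590,
                                      warning_h=2.0)
print(f"clearance {clearance:.1f} h, margin {margin:+.1f} h")
```

No trigger optimization changes the sign of that margin; only road capacity or pre-positioning does, hence shelter-in-place plans.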
How Many Lives Does Each Hour Save?
MC-informed triggers reduce the at-risk population by 146,643 people (86%) across 16 communities. The compliance S-curve is why: extra warning hours push compliance from 30–60% to 85–95%. A 200-draw MC costs 4 minutes.
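The S-curve mechanism can be sketched with a plain logistic. The shape parameters below are illustrative stand-ins, not the fitted curve from Grajdura et al.:

```python
import math

def compliance(warning_h, midpoint_h=2.5, steepness=0.7):
    """Fraction of residents who actually leave, given hours of warning.
    Parameters are illustrative; the report grounds its curve in
    Grajdura et al. (2021) Camp Fire shelter-interview data."""
    return 1.0 / (1.0 + math.exp(-steepness * (warning_h - midpoint_h)))

for h in (1, 3, 5, 8):
    print(f"{h} h warning -> {compliance(h):.0%} comply")
# -> roughly 26%, 59%, 85%, 98%: a few extra hours move compliance from the
#    30-60% band into the 85-95% band, which is where the 86% reduction in
#    at-risk population comes from.
```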
Scope: One physics model (Rothermel cellular automaton) applied to four fires, with 200-draw Monte Carlo uncertainty quantification, asymmetric cost evacuation triggers, and counterfactual firebreak analysis. ~16 minutes of total compute across the four fires. 16 communities analyzed. 34,936 NASA FIRMS satellite detections validated. NIFC perimeters for area validation of all four fires. Real ASOS hourly weather for all four fires.
Finding the Right Model for the Decision
Each question above was answered at the fidelity it required. But scattered across eight questions, it's easy to miss the pattern: which dimensions of fidelity actually change the decision? Here it is in one place — the same community evaluated at increasing fidelity.
What a Fire Manager Concludes at Each Level
| Model Fidelity | Example Output | What the Manager Decides |
|---|---|---|
| Wind heuristic only | “Fire heading west” | “Evacuate communities to the west.” Right at best 40% of the time. |
| + Rothermel CA (real weather) | Geyserville: hour 29 | “Geyserville has ~29 hours. Mobilize.” Single estimate. |
| + 200-draw Monte Carlo | P10: 29h, P50: 39h, P90: 52h | “Could arrive as early as 29 hours. 38% chance it never reaches.” |
| + Asymmetric cost trigger | Optimal trigger: hour 24 | “Evacuate at hour 24 — 5 hours earlier than the deterministic estimate.” |
| + CFD / coupled atmosphere | Same community, same timing | “No new evacuation decision. Spotting details don't change the trigger.” |
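The Monte Carlo rows above are just order statistics over the draws. A minimal sketch, assuming the convention that np.nan marks draws where the fire never reaches the community:

```python
import numpy as np

def summarize_arrivals(draws):
    """Reduce MC arrival draws to the manager-facing numbers in the table."""
    a = np.asarray(draws, dtype=float)
    reached = a[~np.isnan(a)]               # draws where the fire arrives
    never = 1.0 - reached.size / a.size     # e.g. ~0.38 for Geyserville
    p10, p50, p90 = np.percentile(reached, [10, 50, 90])
    return {"P10_h": p10, "P50_h": p50, "P90_h": p90, "never_reached": never}
```

The asymmetric trigger sketched earlier consumes exactly this kind of array, which is why level 3b adds zero compute.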
The Capstone Matrix — What Each Fidelity Level Answers
| Level | Model | Compute | Decision | Marginal Value |
|---|---|---|---|---|
| 1 | Wind heuristic | < 1 ms | Direction to flee | Essential baseline |
| 2 | Rothermel CA | 1.2 s | Who's threatened, rough timing | High — spatial ordering of communities |
| 3 | MC 200 draws | 246 s | Uncertainty bounds on timing | Highest — shifts triggers 1–7h |
| 3b | Asymmetric trigger | +0 s | When to evacuate | Free — directly from MC |
| 4 | Counterfactual MC | ×2 | Firebreak / intervention value | Moderate — ~1% burn reduction |
| 5 | CFD / coupled physics | ×10,000 | Ember transport, spotting | Zero new community-scale decisions |
The decision determines the fidelity. If you need to know which direction to run, a weather report is enough. If you need to know when to pull the evacuation trigger, you need Monte Carlo — and nothing more. Every jump above MC has diminishing returns. This finding holds across four fires spanning 6,000 to 963,000 acres, two states, and four different fire characters.
Consistent Bias Across Four Fires
The single most important question: are these results stable across different fires, or artifacts of one fire's geometry? We ran identical methodology on all four fires and compared the metrics.
| Metric | Kincade | Camp Fire | Dixie | Marshall | Cross-Fire |
|---|---|---|---|---|---|
| Acres burned | 77,758 | 153,336 | 963,309 | 6,026 | 1.2M total |
| Q1: Direction accuracy | 40% | 0.4% | 0% | 35% | Mean 19% |
| Q2: FIRMS capture rate | 96.0% | 95.5% | 95.4% | 27.8% | 96% wildland |
| Q2: FIRMS detections | 1,431 | 4,492 | 28,905 | 108 | 34,936 total |
| Q2: Over-prediction ratio | 3.02× | 1.90× | 1.99× | 4.13× | Mean 2.76× (CV=0.33) |
| Q2: IoU | 0.307 | 0.489 | 0.465 | 0.084 | Mean 0.336 |
| Q2: Sørensen coefficient | 0.47 | 0.66 | 0.64 | 0.16 | Mean 0.59 (wildland) |
| Q4: Avg hours gained (MC) | — | 4.2 h | 5.6 h | — | Mean 5.0 h |
| Q4: Max hours gained | — | 6 h | 7 h | — | Range 1–7 h |
| Communities analyzed | 3 | 5 | 5 | 3 | 16 total |
| Total runtime | 240 s | 257 s | 248 s | 247 s | ~16 min total |
A 2–4× over-prediction across four fires is a systematic bias, not a model failure. Marshall's over-prediction (4.13×) and low FIRMS capture (28%) show the model struggles most with fast urban-interface fires. Despite the variability (CV=0.33), arrival ordering remains reliable for evacuation decisions. Area estimates require per-fire-type calibration rather than a single correction factor.
Published benchmark context. Our wildland fire Sørensen coefficient of 0.59 (mean of Kincade 0.47, Camp Fire 0.66, Dixie 0.64) falls between FARSITE with standard LANDFIRE fuels (SC=0.38, Anderson et al. 2022) and FARSITE with improved fuel selection (SC=0.70). For a 50×50 screening model using NLCD land cover rather than calibrated FBFM40 fuel models, this is competitive. As Finney (2000) documented: input data quality dominates accuracy far more than algorithm choice.
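For readers reproducing the comparison, both agreement scores are one-liners over rasterized burn masks. A sketch, assuming boolean grids (the model's burned cells versus the NIFC perimeter rasterized onto the same CA grid):

```python
import numpy as np

def perimeter_scores(pred, obs):
    """IoU, Sørensen coefficient, and over-prediction ratio for burn masks."""
    pred, obs = np.asarray(pred, bool), np.asarray(obs, bool)
    inter = np.logical_and(pred, obs).sum()
    iou = inter / np.logical_or(pred, obs).sum()
    sorensen = 2 * inter / (pred.sum() + obs.sum())
    return iou, sorensen, pred.sum() / obs.sum()

# The cross-fire spread statistics quoted above:
ratios = np.array([3.02, 1.90, 1.99, 4.13])  # Kincade, Camp, Dixie, Marshall
print(ratios.mean(), ratios.std() / ratios.mean())   # 2.76, CV ~= 0.33
```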
Why Marshall fails (IoU 0.084). The Marshall Fire was a fast grass/wind WUI fire with 100+ mph gusts, 6.3 km ember transport (NIST), and structure-to-structure ignition — phenomena our Rothermel CA on a 100 m grid cannot represent. The full breakdown is in “Marshall Fire: Where the Model Fails” below.
NIFC perimeters were obtained for all four fires, so over-prediction ratios are available across the portfolio. All four fires use real ASOS weather, real elevation data (USGS 3DEP for Dixie and Marshall; Open-Elevation API for Kincade and Camp Fire), and NLCD 2021 land cover mapped to fuel types. Benchmarks: Anderson et al. (2022) Fire Ecology 18:22; Finney (2000) USFS Research.
Marshall Fire: Where the Model Fails
One of our four fires has an IoU of 0.084 and a FIRMS capture rate of 28%. We keep it in the portfolio because it’s a fidelity boundary finding — but it needs to be explained honestly.
The Marshall Fire (Boulder County, Dec 30, 2021) was not a wildland fire. It was a fast grass/wind WUI fire with characteristics our model cannot represent:
| Factor | What Our Model Has | What Marshall Fire Had |
|---|---|---|
| Wind gusts | ASOS hourly averages | 100+ mph gusts (microbursts) |
| Ember transport | 2–5 cells (200–500m) | 6.3 km documented (NIST) |
| Fire spread mechanism | Cell-to-cell on terrain | Structure-to-structure ignition |
| Fuel type | Grass (60 m/h base) | Cured winter grass + urban structures |
| Duration | Days/weeks | 6 hours of catastrophic spread |
The right model for Marshall Fire is not a Rothermel CA. It’s a WUI fire behavior model with structure ignition modules (e.g., WFDS, FlamMap with WUI). Knowing when your model doesn’t apply is as important as knowing when it does. That is itself a fidelity finding.
Explore the Data
Three interactive tools to explore the wildfire findings directly.
Fire Replay
Watch the Kincade Fire spread hour by hour. Community markers, wind arrows, playback controls. See when each community gets threatened.
Evacuation Planner
Pick a community. Adjust the trigger hour, toggle contraflow, change mobilization time. Watch four panels update: MC arrivals, road capacity, compliance curve, and lives at risk. The physics doesn't negotiate.
Fire Behavior Variables
Select a real fire to see how wind speed, humidity, and temperature drive the spread estimates across all three model levels — heuristic, cellular automaton, and Monte Carlo.
Example readout (moderate-risk scenario): the heuristic model overpredicts spread by 3.0×, the cellular automaton captures terrain routing, and the Monte Carlo window shifts by 1–2 hours. For emergency planning, this uncertainty matters.
Data Sources
Satellite Fire Detection
NASA FIRMS VIIRS — Active fire detections from the Visible Infrared Imaging Radiometer Suite. 34,936 total detections across 4 fires (Kincade 1,431; Camp Fire 4,492; Dixie 28,905; Marshall 108). Used for perimeter validation and arrival-time ground truth.
Terrain
USGS 3DEP — Digital elevation models for Dixie and Marshall. Open-Elevation API — Elevation data for Kincade and Camp Fire.
Weather
IEM ASOS — Automated Surface Observing System hourly weather observations from the nearest station to each fire. Wind speed, wind direction, temperature, relative humidity. 7,917 total hours of weather data across all four fires.
Fire Perimeters
NIFC — National Interagency Fire Center progressive fire perimeter data. NIFC perimeters obtained for all four fires. Over-prediction ratios and IoU computed against NIFC final perimeters for Kincade, Camp Fire, Dixie, and Marshall.
Fuel Models
NLCD 2021 Land Cover — National Land Cover Database 2021 land cover classes mapped to Rothermel fuel types. All four fires use real NLCD land cover data for spatially varying fuel loads.
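In practice that mapping is a lookup table keyed on NLCD class codes. The codes below are the standard NLCD legend; the base spread rates are illustrative placeholders (only the 60 m/h grass figure appears elsewhere in this report), not the calibrated values:

```python
# NLCD class codes are standard; the base spread rates (m/h, no wind, flat
# ground) are illustrative placeholders, except grass at 60 m/h, which is
# the base rate quoted in the Marshall comparison table above.
NLCD_TO_FUEL = {
    71: ("grassland/herbaceous",  60.0),  # fastest carrier (cf. Marshall)
    52: ("shrub/scrub",           30.0),
    41: ("deciduous forest",      15.0),
    42: ("evergreen forest",      20.0),
    43: ("mixed forest",          18.0),
    81: ("pasture/hay",           40.0),
    21: ("developed, open space", 10.0),
    11: ("open water",             0.0),  # non-burnable
}

def base_spread_rate(nlcd_code):
    """Base spread rate for a cell; unmapped classes treated as non-burnable."""
    return NLCD_TO_FUEL.get(nlcd_code, ("unmapped", 0.0))[1]
```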
External Validation
NIST TN 2135 (Maranghides et al. 2021) — Camp Fire Progression Timeline. 2,200+ observations.
NIST TN 2252 (Maranghides et al. 2023) — Camp Fire NETTRA Life Safety Database. 2,600+ observations.
Used to validate Q4 trigger timing and Q7 road capacity findings against documented reality.
Evacuation Research
Grajdura et al. (2021) Safety Science 139:105258 — departure S-curves from Red Cross shelter interviews.
Grajdura et al. (2022) Transportation Research Part D 103:103147 — agent-based evacuation simulation.
Used to ground the Q8 compliance curve in published Camp Fire data.
Validation Benchmarks
Anderson et al. (2022) Fire Ecology 18:22 — FARSITE validation with standard vs. improved fuel models.
Finney (2000) USFS Research — foundational FARSITE validation methodology.
Used to contextualize our Sørensen 0.59 against published benchmarks (FARSITE standard: 0.38, improved: 0.70).
Honest limitations: Fuel models use NLCD 2021 land cover mapped to Rothermel fuel types — real spatial variation, but not the full LANDFIRE FBFM40 classification (the LFPS API was unavailable). The Rothermel model does not include fire suppression, spotting, or ember transport, which explains the consistent over-prediction of burned area (mean 2.76×, CV=0.33); no suppression model is attempted, and the bias is disclosed rather than corrected. Marshall Fire (IoU 0.084) is an acknowledged model boundary: the Rothermel CA cannot represent WUI-specific phenomena (structure-to-structure ignition, extreme ember transport, 100+ mph gusts), and the right model there is a WUI fire behavior model. The Camp Fire decision framework was validated against NIST TN 2135/2252: the model correctly identifies the need for advance warning, road capacity constraints, and compliance as the critical variable.