How Much Is Better Site Characterization Worth?

A $500K K_d-narrowing campaign moves the P50 arrival time from 53 to 59 years and raises the P95 remediation cost by $0.2M. VOI ≈ zero. Sometimes the honest answer is: don’t spend the money, because the decision is already made.

Key Concept

What Is K_d (Distribution Coefficient)?

K_d measures how much PFAS sticks to soil versus staying dissolved in water. A higher K_d means more contaminant sorbs to the solid phase — the plume moves slower, but cleanup takes longer because you have to flush more pore volumes to desorb the chemical.

For PFOS, published K_d values range from 0.5 to 20 L/kg depending on soil organic carbon content, mineralogy, and pH. That 40x range dominates cleanup time predictions. At K_d = 0.5, the plume lags groundwater only modestly (R of roughly 3 to 5 for typical sand bulk density and porosity). At K_d = 20, R exceeds 80 and the plume creeps. The retardation factor R = 1 + (ρ_b/n) · K_d controls everything downstream: arrival time, pump duration, remediation cost.
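To make the retardation formula concrete, here is a minimal sketch that evaluates R = 1 + (ρ_b/n) · K_d across the published PFOS range. The bulk density (1.6 kg/L) and porosity (0.3) are assumed typical values for a sandy aquifer, not parameters taken from this study.

```python
def retardation_factor(kd_l_per_kg, bulk_density_kg_per_l=1.6, porosity=0.3):
    """Dimensionless retardation factor R = 1 + (rho_b / n) * K_d.

    The default bulk density and porosity are illustrative assumptions.
    """
    return 1.0 + (bulk_density_kg_per_l / porosity) * kd_l_per_kg


# Low end, literature default, and high end of the K_d values quoted in the text
for kd in (0.5, 1.5, 20.0):
    r = retardation_factor(kd)
    # The plume front moves at roughly (groundwater velocity / R)
    print(f"K_d = {kd:4.1f} L/kg -> R = {r:6.1f}, plume speed = {100 / r:4.1f}% of groundwater")
```

Because R multiplies travel time directly, a 40x spread in K_d becomes a spread of nearly 30x in R under these assumed soil properties.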

The problem: most site investigations measure K_d from 3–5 soil samples. That's not enough to constrain a parameter with a 40x range. Uncertainty in K_d can propagate into millions of dollars of uncertainty in cleanup cost.

Value of Information

Sharper K_d Doesn’t Move the Decision

We compared two scenarios on a 50-realization Monte Carlo ensemble: wide K_d uncertainty (typical 3–5 sample investigation, log_std = 0.8) versus a targeted sorption campaign that narrows the prior (~$500K, log_std = 0.3). Then we recomputed the arrival distribution and the P95 P&T cost under each.
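The wide-versus-narrow comparison can be sketched as follows. This is an illustration, not the study's model: it varies only K_d, so it overstates how much K_d matters compared with an ensemble that also samples conductivity, gradient, and porosity. It assumes `log_std` is the standard deviation of the natural log of K_d, and every site parameter (median K_d, distance, groundwater velocity, ρ_b/n) is a hypothetical placeholder.

```python
import math
import random
import statistics

# All values below are hypothetical placeholders, not study inputs.
MEDIAN_KD_L_PER_KG = 1.5   # assumed prior median
RHO_B_OVER_N = 1.6 / 0.3   # assumed bulk density (kg/L) over porosity
DISTANCE_M = 1000.0        # assumed source-to-well distance
V_GW_M_PER_YR = 150.0      # assumed groundwater seepage velocity


def arrival_percentiles(log_std, n_draws=5000, seed=0):
    """Return (P5, P50, P95) arrival time in years for a lognormal K_d prior."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_draws):
        kd = rng.lognormvariate(math.log(MEDIAN_KD_L_PER_KG), log_std)
        retardation = 1.0 + RHO_B_OVER_N * kd
        times.append(DISTANCE_M * retardation / V_GW_M_PER_YR)
    cuts = statistics.quantiles(times, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    return cuts[0], cuts[9], cuts[18]


wide = arrival_percentiles(log_std=0.8)
narrow = arrival_percentiles(log_std=0.3)
print("wide   P5/P50/P95 (yr):", [round(t, 1) for t in wide])
print("narrow P5/P50/P95 (yr):", [round(t, 1) for t in narrow])
```

The same pattern extends to a full ensemble by drawing the other parameters inside the loop and recording non-arrival when the time exceeds the simulation horizon.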

Figure: Arrival time CDF, before vs. after characterization

The CDFs barely separate. Wide K_d: P5 = 23 yr, P50 = 53 yr, P95 = 76.6 yr. Narrow K_d: P5 = 36.5 yr, P50 = 59 yr, P95 = 77.5 yr. Both distributions show ~40% of realizations where the contaminant never reaches the well within the simulation horizon. Narrowing K_d shifts the median by 6 years and tightens the lower tail — but the P95 (the number that drives design) moves less than 1 year.

The cost calculation tells the same story. P95 remediation NPV under wide K_d: $89.8M. Under narrow K_d: $90.0M. The remediation cost barely moves (+$0.2M); add the $0.5M campaign and you’re $0.7M worse off for the same decision.

Net Project NPV Impact: −$0.7M

Characterization Cost: $500K

Remediation Cost Change: +$0.2M

Finding

Once a site is committed to long-horizon containment with three extraction wells, sharper K_d doesn’t change the build, the schedule, or the bill. VOI ≈ $0M. Not all uncertainty is worth resolving — only uncertainty that changes the decision.

50-realization Monte Carlo, 80-year horizon. Wide: K_d log_std = 0.8. Narrow: K_d log_std = 0.3. Cost basis: $30M P&T capex + NPV of $2M/yr opex over 100 years at 3%.
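The cost basis can be checked with a standard annuity formula. This sketch assumes end-of-year opex payments and a constant discount rate; the function name is illustrative. Under those assumptions the full 100-year figure comes out somewhat above the reported P95 values near $90M, which presumably reflect per-realization details not given here.

```python
def annuity_npv(payment_per_year, years, rate):
    """Present value of a level end-of-year payment stream (ordinary annuity)."""
    return payment_per_year * (1.0 - (1.0 + rate) ** -years) / rate


capex_m = 30.0                            # $M, from the cost basis above
opex_npv_m = annuity_npv(2.0, 100, 0.03)  # $2M/yr for 100 years at 3%
total_m = capex_m + opex_npv_m

print(f"Opex NPV:  ${opex_npv_m:.1f}M")
print(f"Total NPV: ${total_m:.1f}M")
```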

Sensitivity Analysis

Why K_d Doesn’t Move It

Decomposing the Monte Carlo variance shows where the spread actually comes from:

Figure: Contribution to arrival time variance (%)

K (hydraulic conductivity) alone accounts for 51% of arrival-time variance. K_d contributes 30%. Hydraulic gradient adds 18%; porosity is negligible at 2%. K_d matters — but it’s not the dominant lever. And under containment with three wells, even halving the K_d spread doesn’t change capture-zone design or the 100-year operating profile, because every realization in the ensemble is captured.
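One simple way to produce a breakdown like this is to correlate each sampled input with the output in log space, where arrival time is close to additive in the inputs. The sketch below is an assumption about method, not a description of how the percentages above were computed, and every distribution in it is a hypothetical placeholder. Squared correlations only approximate variance shares when inputs are independent and the log-space model is nearly linear.

```python
import math
import random

RHO_B_KG_PER_L = 1.6  # assumed bulk density
DISTANCE_M = 1000.0   # assumed source-to-well distance


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def variance_shares(n_draws=20000, seed=1):
    """Approximate each input's share of log-arrival-time variance as r squared."""
    rng = random.Random(seed)
    samples = {"K": [], "K_d": [], "gradient": [], "porosity": []}
    log_arrival = []
    for _ in range(n_draws):
        # Hypothetical priors, chosen only for illustration
        k = rng.lognormvariate(math.log(30.0), 1.0)      # conductivity, m/day
        kd = rng.lognormvariate(math.log(1.5), 0.8)      # L/kg
        grad = rng.lognormvariate(math.log(0.002), 0.5)  # hydraulic gradient
        n = rng.uniform(0.25, 0.35)                      # porosity
        retardation = 1.0 + (RHO_B_KG_PER_L / n) * kd
        seepage_velocity = k * grad / n                  # m/day
        arrival_days = DISTANCE_M * retardation / seepage_velocity
        samples["K"].append(math.log(k))
        samples["K_d"].append(math.log(kd))
        samples["gradient"].append(math.log(grad))
        samples["porosity"].append(math.log(n))
        log_arrival.append(math.log(arrival_days))
    return {name: pearson_r(vals, log_arrival) ** 2 for name, vals in samples.items()}


shares = variance_shares()
for name, share in shares.items():
    print(f"{name:9s} {100 * share:5.1f}% of log-arrival variance")
```

A variance-based method such as Sobol indices would be the more rigorous choice when interactions between inputs are strong.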

The decision-relevant question for a different site might be: would tighter K reshape the design? For this site, the capture-zone capacity is set by the wells, not by the parameter — so even the dominant variance contributor doesn’t flip the choice. That’s the whole VOI story.

The Decision

The Honest Answer: Don’t Spend the $500K

VOI ≈ $0M for this site, this decision. The remediation choice — three extraction wells, run for the planning horizon — is robust to K_d across the full prior. Spending $500K on a sharper K_d would buy a tighter histogram and the same build. That’s the difference between information and insight.

This is a value-of-information analysis with a counterintuitive result. The question isn’t “what’s the answer?” — it’s “how much would a better answer be worth?” For a site already committed to long-horizon containment, sharper K_d is worth roughly what you paid for the extra wells you didn’t need. Only a Monte Carlo framework can make this call — deterministic models can’t quantify the value of reducing uncertainty because they don’t represent it.

VOI flips on different sites. If the choice were monitored natural attenuation vs. active remediation, K_d would matter enormously — it controls whether the plume retreats on its own. If the well were closer (decade-scale arrival rather than half-century), K_d would shift the design horizon. Always ask: which decision does this measurement actually change? If the answer is “none,” the right move is to skip the campaign and put the money toward operating the wells.

50-realization Monte Carlo, 80-year horizon. Wide: K_d log_std = 0.8. Narrow: K_d log_std = 0.3. Reported VOI is project NPV under narrow minus project NPV under wide, minus the $0.5M campaign cost.
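The VOI definition in the note above reduces to a few lines of arithmetic. In this sketch project NPV is taken as the negative of the P95 remediation cost, which is an assumption about sign convention; the function name is illustrative.

```python
def net_npv_impact(cost_wide_m, cost_narrow_m, campaign_cost_m):
    """Project NPV under narrow minus NPV under wide, minus the campaign cost ($M)."""
    npv_wide = -cost_wide_m      # costs enter NPV with a negative sign
    npv_narrow = -cost_narrow_m
    return (npv_narrow - npv_wide) - campaign_cost_m


impact = net_npv_impact(cost_wide_m=89.8, cost_narrow_m=90.0, campaign_cost_m=0.5)
print(f"Net project NPV impact: {impact:+.1f}M")  # -0.7M
```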

The ML Experiment

Can Machine Learning Replace Measurement?

If K_d is a major uncertainty, and a targeted sorption campaign costs $500K, an obvious question arises: can we predict K_d from cheaper-to-measure soil properties instead? Soil organic carbon content, pH, clay percentage, and grain size are routinely measured during any environmental site assessment. If these properties could reliably predict K_d, you could skip the expensive sorption testing entirely.

We tested this idea using the largest available dataset of PFAS sorption measurements: 1,227 laboratory experiments compiled from 47 published studies, covering 47 PFAS compounds across 451 different soil types (Kühne et al. 2025, Environ. Sci. Technol.). Each experiment measured how much PFAS sorbed to a soil sample under controlled conditions, along with the soil’s organic carbon content, pH, clay/silt/sand fractions, and cation exchange capacity.

We trained three models at increasing sophistication — mirroring the fidelity progression used throughout this study.

Three Models, One Test

Model A — Pure Physics

Organic Carbon Partitioning

The standard textbook model: K_d = K_oc × f_oc, where K_oc is how strongly the chemical partitions to organic carbon (known from its molecular structure) and f_oc is the fraction of organic carbon in the soil (measurable for ~$50/sample). We added published corrections for pH and clay content. No data fitting — pure chemistry.

R²: 0.42

Cape Cod Error: 926%
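The physics backbone listed in the data notes (log K_d = log K_oc + log f_oc − 0.08(pH − 6) + 0.005(clay%)) can be sketched as below. Two things are assumptions: that the logs are base 10 with f_oc entered as a mass fraction, and the log K_oc value of 3.10, which is back-solved here from the reported 4.00 L/kg Cape Cod prediction rather than taken from the source.

```python
import math


def physics_kd(log_koc, f_oc, ph, clay_pct):
    """K_d (L/kg) from organic carbon partitioning with pH and clay corrections."""
    log_kd = log_koc + math.log10(f_oc) - 0.08 * (ph - 6.0) + 0.005 * clay_pct
    return 10.0 ** log_kd


# Assumed value, back-solved from the reported prediction; not a source parameter
LOG_KOC_ASSUMED = 3.10

# Cape Cod soil properties from the data notes: 0.3% organic carbon, pH 6.0, 5% clay
cape_cod_kd = physics_kd(LOG_KOC_ASSUMED, f_oc=0.003, ph=6.0, clay_pct=5.0)
print(f"Physics-model K_d at Cape Cod: {cape_cod_kd:.2f} L/kg")
```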

Model B — Pure Machine Learning

Random Forest

A random forest is an ensemble of 200 decision trees, each trained on a random subset of the data. It learns patterns from ALL available features — molecular weight, fluorine count, chain length, pH, organic carbon, clay content, and cation exchange capacity — without any physics constraints. The model finds whatever statistical relationships maximize prediction accuracy.

R²: 0.83

Cape Cod Error: 3,148%

Model C — Physics-Informed ML

Physics Backbone + Learned Correction

The physics model provides the baseline prediction. A second machine learning model (gradient boosting) is trained not on K_d directly, but on the error in the physics prediction — learning to correct the systematic biases that pure chemistry misses. The final prediction is: physics estimate + learned correction.

R²: 0.84

Cape Cod Error: 1,700%
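The difference between Model B and Model C is easiest to see in code. The sketch below uses the scikit-learn estimators and hyperparameters named in the data notes, but trains on synthetic data, because the sorption dataset is not reproduced here. The feature set, the assumed log K_oc of 3.1, and the synthetic bias term are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_samples = 600

# Synthetic soil features: organic carbon fraction, pH, clay percent
X = np.column_stack([
    rng.uniform(0.002, 0.05, n_samples),
    rng.uniform(4.5, 8.5, n_samples),
    rng.uniform(2.0, 45.0, n_samples),
])


def physics_log_kd(features, log_koc=3.1):
    """Physics backbone in log10 space; log_koc is an assumed value."""
    f_oc, ph, clay = features[:, 0], features[:, 1], features[:, 2]
    return log_koc + np.log10(f_oc) - 0.08 * (ph - 6.0) + 0.005 * clay


# Synthetic "truth": physics plus a systematic bias the physics misses, plus noise
y = physics_log_kd(X) + 0.3 * np.sin(X[:, 1]) - 0.01 * X[:, 2] + rng.normal(0, 0.1, n_samples)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Model B: random forest trained directly on log K_d
rf = RandomForestRegressor(n_estimators=200, random_state=42).fit(X_tr, y_tr)
r2_rf = r2_score(y_te, rf.predict(X_te))

# Model C: gradient boosting trained on the residual of the physics prediction
gbr = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=4, random_state=42
)
gbr.fit(X_tr, y_tr - physics_log_kd(X_tr))
hybrid_pred = physics_log_kd(X_te) + gbr.predict(X_te)
r2_hybrid = r2_score(y_te, hybrid_pred)

r2_physics = r2_score(y_te, physics_log_kd(X_te))
print(f"physics {r2_physics:.2f} | random forest {r2_rf:.2f} | hybrid {r2_hybrid:.2f}")
```

Training on the residual keeps the physics trend in place, so the learned part only has to capture what the backbone gets wrong.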

Reading the cards: R² measures prediction accuracy on held-out data. A score of 1.0 is perfect; 0.0 is no better than always predicting the mean. We used 10-fold cross-validation: train on 90% of the data, test on the remaining 10%, rotate 10 times, average the scores. The ML models score 0.83–0.84, which looks strong. But the held-out folds come from the same distribution as the training data. The real question is what happens outside it.
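For readers who want to see the scoring procedure itself, here is a dependency-free sketch of R² and 10-fold cross-validation. The straight-line model and the synthetic data stand in for the real regressors and dataset; they are assumptions made only to keep the example self-contained.

```python
import random


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; equals 0 when predictions are the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10, seed=42):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def cross_validated_r2(xs, ys, k=10):
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in test]
        scores.append(r_squared([ys[i] for i in test], preds))
    return sum(scores) / k


# Synthetic stand-in data: a noisy linear relationship
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
cv_score = cross_validated_r2(xs, ys)
print(f"10-fold cross-validated R^2: {cv_score:.3f}")
```

Every test fold here is drawn from the same pool as the training folds, which is why a high score says nothing about soils unlike those in the pool.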

The Cape Cod Test

Joint Base Cape Cod is our validation site. USGS measured PFOS concentrations at 62 monitoring wells in 2020. From the observed plume extent (2,700 meters in 55 years), we back-calculated the effective field K_d: 0.39 L/kg — far below the literature default of 1.5 L/kg (Anderson et al. 2019). Cape Cod’s glacial outwash sand has very little organic carbon (0.3%) and almost no clay (5%) — an extreme soil that sits at the far edge of the training data.
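The back-calculation follows from inverting the retardation formula: R is the ratio of groundwater velocity to plume velocity, and K_d = (R − 1) · n / ρ_b. The sketch below shows the arithmetic only. The groundwater velocity, bulk density, and porosity are hypothetical placeholders rather than the study's inputs, and the approach ignores dispersion and source history.

```python
def plume_velocity(extent_m, years):
    """Average plume-front velocity in m/yr."""
    return extent_m / years


def back_calculate_kd(extent_m, years, v_gw_m_per_yr,
                      bulk_density_kg_per_l=1.6, porosity=0.3):
    """Effective field K_d (L/kg) implied by observed plume travel.

    Assumes the plume front moves at (groundwater velocity / R).
    """
    retardation = v_gw_m_per_yr / plume_velocity(extent_m, years)
    return (retardation - 1.0) * porosity / bulk_density_kg_per_l


# 2,700 m in 55 years is from the text; v_gw = 150 m/yr is a hypothetical placeholder
kd_field = back_calculate_kd(extent_m=2700.0, years=55.0, v_gw_m_per_yr=150.0)
print(f"Implied effective K_d: {kd_field:.2f} L/kg")
```

The result is sensitive to the assumed groundwater velocity, so a field K_d obtained this way carries the uncertainty of the flow model with it.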

We gave each model Cape Cod’s soil properties and asked: what is the PFOS K_d?

Model	Predicted K_d	Error vs. Field
Literature default (Anderson 2019)	1.50 L/kg	285%
Pure physics (K_oc × f_oc)	4.00 L/kg	926%
Physics-informed ML	7.02 L/kg	1,700%
Pure ML (Random Forest)	12.67 L/kg	3,148%
Actual (field, USGS 2020)	0.39 L/kg	—
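The error column is a plain relative error against the field value, (predicted − field) / field. A short sketch reproduces it from the rounded numbers in the table; the last digit can differ slightly from the table, presumably because the original was computed from unrounded values.

```python
FIELD_KD = 0.39  # L/kg, back-calculated field value

predictions = {
    "Literature default": 1.50,
    "Pure physics": 4.00,
    "Physics-informed ML": 7.02,
    "Pure ML (Random Forest)": 12.67,
}


def percent_error(predicted, actual):
    return (predicted - actual) / actual * 100.0


def fold_error(predicted, actual):
    """How many times larger the prediction is than the field value."""
    return predicted / actual


for name, kd in predictions.items():
    print(f"{name:26s} {percent_error(kd, FIELD_KD):7.0f}%  ({fold_error(kd, FIELD_KD):4.1f}x)")
```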

Every model overshoots by at least an order of magnitude. The physics-informed model does beat pure ML, cutting the error roughly in half, but “half of terrible” is still terrible for soils as extreme as Cape Cod’s, and the pure physics model beats both. ML may perform substantially better on the typical (non-edge-case) soils that dominate the 451-soil training set; this finding is specifically about Cape Cod’s low-organic-carbon, sandy aquifer being out-of-distribution, not a universal indictment of ML for K_d estimation. The simple literature default of 1.5 L/kg, despite being “wrong,” outperforms all three models.

Why R² = 0.84 Can Be 1,700% Wrong

This is the most important lesson in the entire experiment. An R² of 0.84 means the model explains 84% of the variance in held-out samples drawn from the same dataset it was trained on. That data is overwhelmingly from moderate soils: the median organic carbon in the dataset is 1.3% and the median clay content is 24%. Within that range, the model interpolates well.

Cape Cod sits far outside that range: 0.3% organic carbon, 5% clay. The model has almost no training examples from soils this extreme. When it extrapolates — predicting outside the range of data it learned from — it fails catastrophically. This is the fundamental limitation of data-driven models: they learn patterns in the data they’ve seen, and those patterns don’t necessarily hold in new territory.
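A cheap guard against this failure is to check, before trusting a prediction, whether the query soil falls inside the bulk of the training features. The sketch below flags any feature outside the 5th to 95th percentile range. The training distribution is synthetic: only its medians (1.3% organic carbon, 24% clay) come from the text, while the lognormal shape and spreads are assumptions.

```python
import math
import random
import statistics


def feature_bounds(rows):
    """5th and 95th percentile of each feature across training rows (list of dicts)."""
    bounds = {}
    for name in rows[0]:
        cuts = statistics.quantiles([row[name] for row in rows], n=20)
        bounds[name] = (cuts[0], cuts[18])  # 5% and 95% cut points
    return bounds


def out_of_range(query, bounds):
    """Names of query features that fall outside the training bounds."""
    return [name for name, value in query.items()
            if not bounds[name][0] <= value <= bounds[name][1]]


# Synthetic stand-in for the 451 training soils (shape and spread are assumed)
rng = random.Random(42)
training = [
    {
        "organic_carbon_pct": rng.lognormvariate(math.log(1.3), 0.6),
        "clay_pct": rng.lognormvariate(math.log(24.0), 0.5),
    }
    for _ in range(451)
]
bounds = feature_bounds(training)

cape_cod = {"organic_carbon_pct": 0.3, "clay_pct": 5.0}
flagged = out_of_range(cape_cod, bounds)
print("Extrapolating on:", flagged)
```

A per-feature range check misses unusual combinations of in-range values, so it is a minimum screen rather than a full applicability-domain test.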

The Lab-to-Field Gap

Even if the training data included more sandy, low-carbon soils, a deeper problem remains. The 1,227 measurements are laboratory batch sorption experiments — a researcher takes a soil sample, crushes and sieves it, shakes it with PFAS-contaminated water, and measures how much PFAS sticks to the soil particles. This is a controlled, small-scale measurement.

In the field, PFAS transport through intact geological structure involves processes that crushed-soil experiments cannot capture:

Preferential flow — water (and contaminant) channels through high-permeability paths, bypassing most of the soil matrix
Air-water interface sorption — in unsaturated sands, PFAS accumulates at air-water boundaries that don’t exist in a shaken flask (Brusseau 2018)
Scale effects — a 10-gram lab sample cannot represent the heterogeneity of a 2,700-meter plume
Non-equilibrium transport — batch experiments assume the PFAS reaches sorption equilibrium; in flowing groundwater, it may not

The field K_d of 0.39 L/kg at Cape Cod is an effective value that integrates all of these real-world mechanisms. It is not the same quantity that lab experiments measure, even though both are called “K_d.”

The Fidelity Lesson

1,227 lab measurements and the best available ML architecture cannot predict one field site. The fidelity ordering holds in cross-validation (physics-informed beats pure ML beats pure physics), but at Cape Cod pure physics has the smallest error of the three, and none of them replaces a $500K field campaign. There is no shortcut past measurement.

For practitioners: If a vendor offers an “AI-powered K_d prediction” trained on published sorption data, ask them how it performs on out-of-distribution soils. A model that scores R² of 0.83–0.84 in cross-validation can be off by 18x to 32x on your site, as the two ML models were at Cape Cod. The $500K you spend on field characterization is not just buying a better number; it is buying a number that actually describes your aquifer.

Data: Kühne et al. (2025), “Modeling PFAS Sorption in Soils Using Machine Learning,” Environ. Sci. Technol., doi:10.1021/acs.est.4c13284. Dataset: 1,227 K_d entries, 47 PFAS, 451 soils, 47 source studies (supplementary file es4c13284_si_002.xlsx). Models: scikit-learn RandomForestRegressor (200 trees), GradientBoostingRegressor (200 trees, lr=0.05, max_depth=4). Physics backbone: log K_d = log K_oc(CF₂) + log f_oc − 0.08(pH − 6) + 0.005(clay%). 10-fold cross-validation, shuffle, seed=42. Cape Cod soil: Sand=85%, Silt=10%, Clay=5%, C_org=0.3%, pH=6.0, CEC=2.0 cmol+/kg (Walter et al. 2018, USGS SIR 2018-5139). Field K_d: 0.39 L/kg, back-calculated from observed plume extent at 62 USGS monitoring wells (Water Quality Portal, 2020 sampling campaign, 49 PFOS detections, 1.3–610 ng/L).
