Analysis Driven Modeling · Methodology

The Analysis Drives the Model. Never the Other Way Around.

Most modeling projects go wrong because they skip the hardest step: figuring out what to model, at what level of detail, for what decision. ADM is the discipline that prevents this.

Why Most Modeling Projects Actually Go Wrong

There's a persistent myth: projects fail because the technology isn't ready, or the data is insufficient, or the team lacks talent. Sometimes those things are true. But after twenty years of building simulation and analysis systems for defense and aerospace, I've watched the same failure pattern repeat dozens of times. The root cause is almost never technical.

The real failure is poor problem framing.

Teams build models before defining what question those models are supposed to answer. They choose methods because they're trendy, not because they fit the problem. They measure model performance against the wrong metrics. They over-build or under-build because they never established a principled basis for how detailed the model needs to be.

The result: months of effort producing a system that's technically impressive and practically useless. A simulation that runs beautifully but doesn't inform the decision it was built to support. A predictive model that scores well on test data but doesn't capture the dynamics that actually matter for the downstream decision.

Analysis Driven Modeling exists to prevent this. There's a structure — three steps and a feedback loop — but ADM is a discipline, not a framework. The structure serves the thinking, not the other way around. ADM exists because the alternative — picking a modeling approach before defining the question — wastes time and money.

STEP 01

Let the Question Set the Bar

Every question has a natural fidelity. “Is this system safe?” demands a different level of precision than “which design is cheapest?” The question tells you what kind of model you need and how precise the answer has to be. That’s the fidelity bar — the target that everything else follows from. Before writing a single line of code, four things need to be clear:

  • What decision will this model inform? Not "what will it simulate," but what decision will a human or system make differently because this model exists?
  • What fidelity does that decision require? A billion-dollar infrastructure investment requires different model resolution than a preliminary screening decision. The decision sets the fidelity, not the toolchain.
  • What physics and dynamics matter? Every real-world system has infinite complexity. The art is knowing which dynamics are load-bearing for your specific question and which can be safely abstracted.
  • What can be abstracted, simplified, or ignored? The most dangerous assumption in modeling is that more detail is always better. Every unnecessary detail costs compute and validation effort, and introduces error.

When MITRE needed to assess the effectiveness of a proposed kill chain, the first question wasn't "which simulation tool?" It was: "What decision does this analysis need to support, and what level of fidelity does that decision require?" The answer (a force-level trade study, not an engagement-level prediction) reduced the required model complexity by an order of magnitude and delivered results three months faster.

The same discipline applies to risk assessment. When the question is "given a pandemic, what else should we worry about?", the answer isn't a Monte Carlo simulation of disease spread — that answers a different question ("what will happen next?"). The right model is a Bayesian causal network: nodes for uncertain events, edges for causal relationships, evidence injection to update beliefs. The Global Risk Intelligence Network demonstrates this: 36 nodes across 6 domains, answering "what should I believe now?" — not "what should I do next?"
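A noisy-OR network of the kind described above can be sketched in a few lines. This is a generic illustration, not the GRIN implementation — the edges, strengths, and leak probability below are invented purely to show how evidence injection shifts a belief:

```python
def noisy_or(parent_states, strengths, leak=0.01):
    """P(effect = true | parents) under the noisy-OR assumption:
    each active parent independently fails to trigger the effect
    with probability (1 - strength); a small leak covers unmodeled causes."""
    p_none = 1.0 - leak
    for active, strength in zip(parent_states, strengths):
        if active:
            p_none *= (1.0 - strength)
    return 1.0 - p_none

# Hypothetical CPT for a "supply shock" node with two parents:
# pandemic (strength 0.7) and port closure (strength 0.5), leak 0.02.
strengths = [0.7, 0.5]

p_prior = noisy_or([False, False], strengths, leak=0.02)   # leak only
p_given_pandemic = noisy_or([True, False], strengths, leak=0.02)
print(round(p_prior, 3), round(p_given_pandemic, 3))
```

Injecting the evidence "pandemic = true" moves the belief from the leak-only baseline to roughly 0.7 — and the pathway is fully traceable, which is exactly the interpretability argument made below.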

This same rigor applies outside defense. When an energy planner asks "how detailed does our grid simulation need to be?", the answer depends entirely on what decision the simulation is supporting and what conditions it will face. A screening-level adequacy check has radically different fidelity requirements than a stochastic investment optimization. If you don't define the question first, you'll build the wrong model.

STEP 02

Build to the Bar — Not Past It

The question sets the fidelity bar. Now you build to it — not past it. This sounds simple. In practice, it's the step where most projects go wrong.

The fidelity spectrum is wider than most people realize. At one end: back-of-envelope calculations, lookup tables, statistical surrogates. At the other: full physics-based simulations with millions of interacting elements. Both ends have their place. The hard part is knowing where on this spectrum your specific question lives — and resisting the gravitational pull toward either extreme.

More fidelity isn't always better. A model that's too detailed for its training data will overfit. A simulation that's too complex for its validation basis will produce results that look precise but aren't accurate. Every layer of detail beyond what the question requires is cost without value. The discipline is knowing where to stop.

As Chief Analyst at MITRE, I watched this pattern constantly. Modelers would add detail to a simulation just because the framework allowed it — not because the analysis required it. The extra detail didn't just cost time. It introduced code paths that could interact in unintended ways, and it multiplied the number of components whose fidelity had to be assessed, whose accuracy had to be verified, and whose logic had to be understood by the next analyst who inherited the model.

In Bayesian risk networks, the same principle holds. A 36-node causal network with noisy-OR CPTs reveals cross-domain cascades that are invisible to simpler models — and every inference pathway is interpretable. Scale to 200 nodes and you might capture more correlations, but the decision-maker can no longer trace why a given risk shifted. The extra fidelity makes the model less useful, not more. The right fidelity is determined by the decision the model needs to support.

The Fidelity Ladder and Spiraling Fidelity

You rarely know the exact right fidelity before you build. So don’t try to guess upfront. Instead, use a fidelity ladder: start with the simplest model that could plausibly answer the question, validate whether it’s sufficient, and escalate selectively only where needed.

The process looks like this:

Start simple. A screening-level model—back-of-envelope calculations, analytical solutions, homogeneous assumptions, low-dimensional representations. Build it fast. It will reveal what actually matters and what doesn’t.

Run sensitivity analysis. Which parameters, variables, or model components actually move your answer? Which are decision-irrelevant? This is the critical step. A sensitivity sweep tells you where complexity is wasted and where it’s essential.

Spiral fidelity selectively. Don’t escalate uniformly across the whole model. Add detail only to the components sensitivity analysis identified as decision-critical. Keep the rest simple. A heterogeneous permeability field might matter (escalate that component), but the source term might not (keep it abstracted). A detailed student misconception model might change instruction decisions, but a simpler mastery estimate might be enough for another question.

Validate and repeat. Does this mixed-fidelity model give you the precision, confidence, or granularity the decision requires? If yes, you’re done. If no, identify the next-most-sensitive component and escalate there.

This spiral approach prevents two common failures: over-building (months of effort on components that don’t move the answer) and under-building (discovering too late that a key driver was oversimplified).

Groundwater contamination provides a clear example. A screening model (analytical, homogeneous) answers “does the plume reach the well?” quickly. Sensitivity analysis shows that spatial variation in permeability matters for remediation decisions, but the source term doesn’t. So you escalate the permeability (K) field to 2D heterogeneous transport—not the whole model. If uncertainty quantification matters for investment decisions, you escalate again to Monte Carlo, but only over the parameters that sensitivity flagged as important. The discipline is knowing where on the ladder to stop and which components deserve high fidelity.
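The first two rungs of that ladder can be sketched numerically. The aquifer numbers below are invented for illustration, not a site model: a Darcy seepage velocity gives a screening-level arrival time, and a one-at-a-time sweep shows which parameter earns the escalation.

```python
def arrival_time_years(distance_m, K_m_per_day, gradient, porosity):
    """Screening model: 1-D advective travel time to the well.
    Seepage velocity v = K * i / n (Darcy velocity over porosity)."""
    v = K_m_per_day * gradient / porosity          # m/day
    return distance_m / v / 365.25                 # years

# Hypothetical site: well 500 m downgradient, gradient 0.01, porosity 0.3.
baseline = dict(distance_m=500.0, K_m_per_day=10.0, gradient=0.01, porosity=0.3)
print(f"baseline arrival: {arrival_time_years(**baseline):.1f} yr")

# One-at-a-time sweep: which parameter actually moves the answer?
ranges = {"K_m_per_day": (1.0, 50.0),   # order-of-magnitude uncertain
          "porosity": (0.25, 0.35)}     # well constrained
for name, (lo, hi) in ranges.items():
    t_lo = arrival_time_years(**{**baseline, name: lo})
    t_hi = arrival_time_years(**{**baseline, name: hi})
    print(f"{name}: swing {abs(t_hi - t_lo):.1f} yr")
# K swings the answer by decades; porosity by a year or two.
# K is the component to escalate to a heterogeneous field.
```

The sweep is crude — real studies use variance-based or adjoint methods — but even this level of effort tells you where the next increment of fidelity should go.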

Uncertainty Quantification

Every model has uncertainty. The question is whether you quantify it or ignore it. ADM requires making uncertainty explicit: What are the model's assumptions? Where are the boundaries of its validity? What happens to the decision when those assumptions are wrong? Monte Carlo analysis, sensitivity studies, and ensemble methods aren't optional additions. They're part of the model.
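Making that uncertainty explicit can start with basic Monte Carlo propagation: sample the uncertain inputs, push each sample through the model, and report a range instead of a point. A minimal sketch with a hypothetical screening-level travel-time model and invented input distributions — a real study needs defensible distributions and convergence checks:

```python
import random

def travel_time_days(distance, K, gradient, porosity):
    """Toy screening model: advective travel time to a well."""
    return distance * porosity / (K * gradient)

random.seed(42)  # reproducible demo
samples = []
for _ in range(10_000):
    # Hypothetical input uncertainty: log-uniform K, uniform porosity.
    K = 10 ** random.uniform(0.0, 1.7)     # roughly 1 to 50 m/day
    n = random.uniform(0.25, 0.35)
    samples.append(travel_time_days(500.0, K, 0.01, n))

samples.sort()
p05, p50, p95 = (samples[int(q * len(samples))] for q in (0.05, 0.50, 0.95))
# The decision-maker sees a range, not a false point estimate.
print(f"arrival (days): p05={p05:.0f}, median={p50:.0f}, p95={p95:.0f}")
```

The spread between the 5th and 95th percentiles is itself decision information: if the decision looks the same across the whole range, the model is already good enough; if it flips inside the range, that is where the next escalation belongs.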

Method Selection

The method should follow from the question and the fidelity bar, not precede them. A physics-based simulation is the right choice when temporal dynamics and system interactions drive the answer. Monte Carlo analysis fits when the decision depends on tail risk and the uncertainty range matters more than the point estimate. Gradient boosting or logistic regression may be the right call when you have rich observational data and the question is about prediction. Sometimes the answer is a spreadsheet. Each method has a sweet spot defined by the structure of the problem, not its trendiness.

STEP 03

Validate What Matters

The model doesn’t need to be perfect — it needs to be good enough for the decision it serves. Sensitivity analysis tells you which inputs actually move the answer. Uncertainty quantification tells you how much to trust it. If the model is more precise than the decision requires, you’ve overbuilt.

The defense Modeling and Simulation community has spent decades developing rigorous Verification, Validation, and Accreditation (VV&A) practices. These practices exist because the consequences of trusting an unvalidated model in defense are measured in lives. But the underlying principle applies everywhere: a model without validation is a hypothesis, not a tool.

Verification asks: "Did we build the model right?" Does the code implement the intended equations? Are the algorithms numerically stable? Do the outputs make dimensional and physical sense?

Validation asks: "Did we build the right model?" Does the simulation's behavior match reality within acceptable tolerances for the intended use? Note the phrase for the intended use — this is where ADM connects to V&V. A model validated for one analysis question may be completely inappropriate for another, even if it uses the same underlying physics.

The V&V Blind Spot

Most AI teams have a V&V blind spot. They validate model accuracy on held-out test sets but never ask whether the model is actually useful for the decision it was built to support. A predictive model can have excellent test-set accuracy and still be useless if it fails to capture the dynamics that matter for the downstream decision. ADM insists on end-to-end validation: does the chain hold all the way from data through the model to the actual decision? If it breaks at any link, the model hasn't been validated for its intended purpose.

Each Answer Unlocks the Next Question

Hard problems are rarely answered by a single question and a single model. More often, they’re answered by the right sequence of questions — where each answer produces something that didn’t exist before, and that something makes the next question askable for the first time.

Often, the first question in the sequence is the one that needs real modeling and simulation. Not because simulation is the goal, but because the problem is too tangled for intuition or arithmetic alone. A grid with 180 GW of generation, a dozen fuel types, hourly demand swings, and weather-dependent renewables doesn’t simplify to a formula. You likely need a simulation to find the breaking point — and that breaking point is a number nobody had before the model produced it.

But that number usually does more than answer the first question. It tends to make the second question precise. “What will this cost consumers?” is a vague worry until you have specific generation requirements, investment timelines, and capacity shortfalls from a simulation. Then it becomes tractable — important analysis, but analysis that only works because the simulation gave it a foundation to stand on.

The second answer often sharpens a third question. The third can sharpen a fourth. Each step in the chain builds on everything below it. Some steps may need simulation. Some may need economic analysis. Some may need nothing more than careful reasoning with the numbers the earlier steps produced. The discipline isn’t applying simulation everywhere — it’s recognizing which question in the chain is the one where modeling and simulation is likely the only way to get a credible answer, and building exactly what that question requires.

When it works well, this kind of chain naturally builds toward the question the decision-maker actually cares about. An engineer might care most about reliability. A regulator might care about consumer rates. A CEO might care about whether to build their own power plant. The simulation at the base of the chain doesn’t need to be redesigned for each audience — it produced the quantitative foundation that every downstream question depends on, and each audience can read the chain at the level where their decision lives.

Consider a study on data center load growth in PJM territory. A dispatch simulation might answer “at what load does the grid start to struggle?” Those results — breaking points, investment requirements, generation costs — could feed the next question: “what happens to consumer electricity rates as data center demand grows?” And the rate projections could feed the next: “at what point does it become cheaper for data centers to generate their own power, and what would that exodus mean for everyone still on the grid?” Each question builds on the previous answer. The simulation runs once, at the base of the chain. Everything above it is analysis that the simulation made possible.
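At screening fidelity, the dispatch model at the base of such a chain can be surprisingly small. A toy merit-order sketch — the fleet capacities and marginal costs below are invented, purely to illustrate how a "breaking point" falls out as a number nobody had before:

```python
def dispatch(load_gw, fleet):
    """Stack generators cheapest-first until load is served.
    Returns (cost in $/hr, unserved load in GW)."""
    remaining, cost = load_gw, 0.0
    for capacity_gw, cost_per_mwh in sorted(fleet, key=lambda g: g[1]):
        used = min(remaining, capacity_gw)
        cost += used * 1000 * cost_per_mwh   # GW -> MW
        remaining -= used
    return cost, remaining

# Invented fleet: (capacity in GW, marginal cost in $/MWh). 180 GW total.
fleet = [(40, 0), (30, 25), (60, 35), (30, 70), (20, 150)]

# Sweep load upward to find the breaking point.
for load in range(100, 220, 10):
    cost, unserved = dispatch(load, fleet)
    if unserved > 0:
        print(f"grid breaks between {load - 10} and {load} GW")
        break
# -> grid breaks between 180 and 190 GW
```

The breaking point and the marginal cost curve it produces are exactly the quantitative foundation the downstream rate and self-generation questions stand on.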

From Question to Decision

The ideas above describe how I think about every modeling problem. Here’s what it looks like in practice — from first question to final answer.

The process is rooted in the OSD Mission Engineering Guide 2.0, the same framework that guides systems-level analysis across the Department of Defense. I’ve used it across dozens of M&S programs. It maps to any domain because the underlying discipline is identical: start with the question, build what the question requires, and make sure the answer holds up against reality.

Phase 01

Define the Problem

What decision are you trying to make, and what will you do differently once the model exists? Every engagement starts here — with the stakeholders, the constraints, and a clear picture of what success actually looks like. If we can't articulate this cleanly, we're not ready to build anything.

Phase 02

Assess the Current State

Before designing a solution, I need to understand what you have and what you don't. What data exists? What's been tried? Where are the real gaps versus the perceived ones? This phase is a structured look at what you have, what's missing, and what to do about it.

Phase 03

Architect the Solution

With the problem defined and the current state understood, we choose the approach. What method fits the structure of this specific problem — reinforcement learning, Bayesian optimization, a physics-informed model, a statistical surrogate? What level of fidelity does the decision actually require? Every design choice traces back to the question from Phase 1.

Phase 04

Build and Validate

Build at the right fidelity, not the maximum fidelity. Validate against the decision the model is supposed to support, not just test-set accuracy. Monte Carlo analysis, sensitivity studies, and progressive testing are part of the model — not afterthoughts bolted on at the end.

Phase 05

Deliver and Transition

The deliverable isn't a model — it's an answer to the question you started with. How confident should you be? Where does it break down? I also define what to watch going forward: the conditions under which the model's assumptions break down and the analysis needs to be revisited. The goal is a decision you can defend, not a system you depend on me to run.

AI coding agents accelerate every phase of this process — running sensitivity sweeps, exploring design spaces, building and validating models across thousands of conditions. But they don't change the process. The methodology is what tells them what to build and when to stop.

The Questions ADM Answers

Every engagement starts from a different place, but the questions underneath tend to be the same.

01

What should we build, and how detailed does it need to be?

You've got a decision that needs a model behind it, but there are competing approaches and the range of possible fidelity levels is huge. Full physics-based simulation or a statistical surrogate? Reinforcement learning or Bayesian optimization? ADM works backward from the decision the model needs to support. The method and the fidelity level follow from the question — not the other way around.

02

Is what we built actually good enough?

You have a simulation or a trained model, and it runs. Maybe it performs well on test data. But you're not confident it's actually fit for the decision it's supposed to support. ADM reframes the question: it's not about whether the model is accurate in the abstract. It's about whether the gaps between model and reality are the kind that would change the answer.

03

Where do we start?

You know AI or simulation could help, but you don't know where to start or what to build first. The usual approach is a maturity assessment — a scorecard and a list of prerequisites to check off before you're "ready." ADM starts from the other end: given where you actually are today, what's the highest-impact thing you could build right now? The output is a specific recommendation, not a roadmap.

The OSD Connection

ADM follows the same principles as the OSD Mission Engineering Guide — the framework DoD uses for modeling and simulation across every service.

"From the beginning, it's important to have a clear understanding of what goal or decision will be informed as this will drive subsequent choices throughout the process. [...] These decisions guide the specific questions for the activity as well as the degree of fidelity and level of analytic rigor needed from the results, findings, and conclusions."

Mission Engineering Guide 2.0, OUSD(R&E)

This is the same principle ADM is built on: the question sets the fidelity bar, the fidelity bar dictates the model and method, and the answer is validated against the decision it needs to support. Stakeholders should be able to trace from the decision, through the analysis, to the model assumptions that support it.

The difference is in execution. The Mission Engineering Guide tells you what to do; ADM is how to actually do it, grounded in two decades of practice across OUSD(R&E), MITRE, and dozens of programs that required exactly this kind of rigor.

ADM vs. Other Approaches

ADM isn't the only philosophy for building models. But it's the only one I've found that actually holds up in practice.

vs.

"Just Use the Biggest Model"

The assumption that more parameters, more data, and more compute will solve any problem. Sometimes true at internet scale. Rarely true for domain-specific problems where data is expensive, physics matters, and the cost of being wrong is high.

ADM says: match fidelity to the decision. A well-scoped model that answers the right question outperforms a massive model that answers the wrong one.
vs.

"Move Fast and Iterate"

The Silicon Valley default: ship quickly, gather feedback, improve. Excellent for consumer products. Dangerous for systems where iteration is expensive, feedback is delayed, and failure has consequences beyond a dip in engagement metrics.

ADM says: define the question first, then move fast. Moving fast without knowing what question you're answering just produces churn.
vs.

"Let the Data Decide"

The belief that sufficient data eliminates the need for domain expertise. This works when the data distribution matches the deployment distribution. It fails when there are distribution shifts, rare events, or physics that the data has never seen.

ADM says: the analysis question determines which data matters. Domain expertise isn't replaced by data — it's what tells you whether your data is sufficient for the decision at hand.

Start With the Question

Whether you’re evaluating simulation fidelity, building models for high-stakes decisions, or figuring out where to start — the methodology is the same. Let the question set the bar. Build to it. Validate what matters. Let’s talk about your problem.

michael@rightfidelity.ai  ·  Washington, D.C. Metro