ML vs FEA: Validation Within 1% for 95% of Conditions

Damir Herman, Ph.D.

Image credit: Photo by Jesús Esteban San José (pexels.com)

Rise of the Machines: ML v. FEA

It is increasingly common to see machine-learning models benchmarked against finite element analysis (FEA) and reported as achieving “within 1% accuracy for 95% of cases.”

On its own, that statement sounds impressive.

Without context, it is also meaningless.

What the Comparison Is Actually Saying

At a high level, this type of result usually means:

  • An ML model has been trained on outputs generated by an FEA solver
  • For a defined set of input conditions, the ML predictions closely match the FEA results
  • The comparison metric (often RMSE or relative error) falls within ±1% for most test samples

This demonstrates that the ML model can interpolate within the space it has already seen. It does not automatically demonstrate engineering validity, operational safety, or decision-readiness.
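The coverage claim itself is simple to compute. The sketch below, on synthetic data (all values and the 0.6% noise level are illustrative assumptions, not from any real benchmark), shows what "within 1% for X% of cases" actually measures: the fraction of test samples whose relative error against the FEA reference stays under the tolerance.

```python
import numpy as np

def coverage_within_tolerance(fea, ml, tol=0.01):
    """Fraction of samples whose relative error vs. the FEA reference
    falls within the given tolerance (0.01 = 1%)."""
    rel_err = np.abs(ml - fea) / np.abs(fea)
    return float(np.mean(rel_err <= tol))

# Synthetic illustration: ML predictions scattered around FEA outputs
rng = np.random.default_rng(0)
fea = rng.uniform(100.0, 500.0, size=1000)        # e.g. cable tensions in kN
ml = fea * (1.0 + rng.normal(0.0, 0.006, 1000))   # ~0.6% multiplicative noise
print(f"within 1%: {coverage_within_tolerance(fea, ml):.1%}")
```

Note that this number says nothing about *which* samples fail the tolerance, which is exactly the gap discussed next.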

Why “1% for 95%” Is Not Enough

That headline number hides several critical questions:

  • Which 95% of conditions?
    Mild sea states? Nominal vessel speeds? Straight lay only?

  • What defines the remaining 5%?
    Are they edge cases, or are they the exact conditions that drive risk?

  • What is the tolerance applied to?
    Tension? Curvature? Touchdown location? Fatigue damage rate?

  • What is the reference truth?
One FEA configuration? Which solver settings, mesh assumptions, soil models, load formulations, and wave spectra?

Without explicit answers, the metric collapses into a statistical artifact, not an engineering claim.
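One way to turn the first two questions into an engineering claim is to stratify the errors by operating condition instead of pooling them. The sketch below uses synthetic, illustrative data (the bucket names, sample counts, and noise levels are assumptions for demonstration): an aggregate "95% within 1%" can hold while the severe-condition bucket routinely breaches the tolerance.

```python
import numpy as np

def per_bucket_p95(rel_err, labels):
    """Group relative errors by condition label (e.g. sea state) and
    report each bucket's 95th-percentile error, exposing which
    conditions actually drive the aggregate claim."""
    return {lab: float(np.percentile(rel_err[labels == lab], 95))
            for lab in np.unique(labels)}

# Synthetic illustration: most samples come from calm conditions
rng = np.random.default_rng(1)
labels = np.repeat(np.array(["calm", "moderate", "severe"]), [800, 150, 50])
rel_err = np.concatenate([
    np.abs(rng.normal(0.0, 0.003, 800)),   # calm: comfortably within 1%
    np.abs(rng.normal(0.0, 0.006, 150)),   # moderate: borderline
    np.abs(rng.normal(0.0, 0.030, 50)),    # severe: routinely above 1%
])
for lab, p95 in per_bucket_p95(rel_err, labels).items():
    print(f"{lab}: p95 relative error = {p95:.2%}")
```

If the severe bucket is also the one that drives risk, the pooled headline number is actively misleading.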

The Environment Defines the Problem

FEA results are not universal truths. They are conditional outcomes based on:

  • Environmental definition (waves, currents, directionality, spectra)
  • Cable properties (weight, stiffness, diameter, coatings)
  • Vessel behavior and control assumptions
  • Soil interaction and seabed representation
  • Numerical tolerances and solver choices

An ML model that matches FEA well under one environmental definition may fail silently under another. This is not a flaw of ML. It is a failure of problem definition.

Guidance such as DNV-RP-C205 exists precisely to ensure that environmental inputs are traceable, bounded, and defensible, rather than convenient or optimistic.
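A minimal, concrete defense against silent failure is an explicit input-domain guard: refuse to serve a surrogate prediction outside the box the model was trained on. The feature names and bounds below are hypothetical placeholders, not values from any standard or real model.

```python
# Hypothetical training-domain bounds per input feature (illustrative only)
DOMAIN_BOUNDS = {
    "Hs_m": (0.5, 3.5),            # significant wave height covered in training
    "vessel_speed_kn": (0.5, 2.0),
    "water_depth_m": (20.0, 120.0),
}

def in_training_domain(sample: dict) -> bool:
    """Return True only if every input lies inside the bounds the
    surrogate was trained on, instead of extrapolating silently."""
    return all(lo <= sample[k] <= hi for k, (lo, hi) in DOMAIN_BOUNDS.items())

case = {"Hs_m": 4.2, "vessel_speed_kn": 1.0, "water_depth_m": 60.0}
if not in_training_domain(case):
    print("Outside validated domain: fall back to FEA; do not trust the ML output.")
```

A box check is the crudest possible guard (it ignores correlations between inputs), but even this much is more than most "1% accuracy" claims document.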

What DNV-RP-0665 Adds to the Discussion

DNV-RP-0665, the recommended practice for machine-learning applications, makes an explicit distinction that is often missing in ML–FEA comparisons: performance is not assurance.

The standard requires that ML applications demonstrate:

  • Clearly defined intended use
  • Explicit input domain boundaries
  • Known failure modes and uncertainties
  • Traceability between model outputs and engineering decisions
  • Evidence that residual risk is within tolerable limits, not merely small on average

Under this lens, a “1% error for 95% of cases” is not a conclusion; it is a single data point in a broader assurance argument. The unanswered question becomes whether the remaining uncertainty is acceptable given the consequences of being wrong.

Validation Is Not Accuracy Alone

Engineering validation answers a different question than statistical accuracy:

Can this model be relied upon to support decisions under uncertainty?

That requires:

  • Clearly defined input domains
  • Explicit operational tolerances
  • Known failure modes
  • Demonstrated behavior near limits, not just at the mean
  • A documented relationship between ML outputs and engineering acceptance criteria
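The last bullet, relating ML outputs to acceptance criteria, can be sketched concretely: instead of comparing the raw prediction to an allowable, inflate it by the model's validated error bound first. All numbers below are illustrative assumptions, not real design values.

```python
def acceptance_margin(predicted_tension_kn, allowable_kn, rel_model_err):
    """Check an ML-predicted tension against an acceptance criterion,
    inflating the prediction by the model's validated relative error
    bound rather than trusting the raw output. Positive margin passes."""
    conservative = predicted_tension_kn * (1.0 + rel_model_err)
    return allowable_kn - conservative

# Illustrative: a prediction close to the allowable passes on its raw
# value but fails once the model's error bound is accounted for.
print(acceptance_margin(990.0, 1000.0, 0.00))   # raw output: positive margin
print(acceptance_margin(990.0, 1000.0, 0.02))   # with 2% bound: negative margin
```

This is the sense in which a “small” average error can still be decision-relevant: near a limit, the error bound, not the mean error, determines whether the prediction is usable.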

Frameworks developed by DNV formalize this distinction: high performance does not equal high confidence unless claims, assumptions, and evidence are aligned.

The Real Takeaway

Achieving ML–FEA agreement within 1% for 95% of conditions is genuinely impressive.

But without:

  • disciplined environmental definition,
  • explicit tolerance mapping,
  • careful selection of output variables,
  • and an assurance process consistent with DNV-RP-0665,

the result is a headline number with nowhere to land.

Useful engineering models are not defined by how close they are to a solver. They are defined by how safely they support decisions when conditions drift, assumptions strain, and uncertainty grows.

That is the bar that matters.