AI Works… Until It Doesn't


Welcome To The Edge

Marsham Edge's Newsletter

Most AI tools don’t fail in obvious ways.

They fail quietly.

They work 95% of the time…
And then break on the one case that actually matters.

That’s the problem with edge cases.

The illusion of “working”

In most organisations, AI tools are tested like this:

  • Does it generate a report?
  • Does it summarise correctly?
  • Does it produce a usable output?

If the answer is “yes,” the tool is considered ready.

But that’s not where the risk lives.

The risk lives in the 5% of cases that don’t behave as expected:

  • incomplete or conflicting data
  • unusual project structures
  • edge scenarios no one thought to test

That’s where outputs become unreliable.
And in high-stakes environments, that’s where problems start.

Where this becomes dangerous

In infrastructure, energy, and project-led environments:

  • A slightly incorrect report isn’t just inefficient
  • A missed inconsistency isn’t just a nuisance

It can mean:

  • incorrect decisions
  • delays in delivery
  • or outputs that don’t stand up to scrutiny

This is how “working AI” quietly introduces risk into the system.

Why this happens

Most teams test for:

👉 expected scenarios

Very few test for:

👉 edge cases
👉 failure modes
👉 “what happens when the data isn’t clean”

And almost no one defines:

👉 when a human must step in

The role of experience

This is where “human in the loop” is often misunderstood.

It’s not about:

  • reviewing everything
  • slowing things down

It’s about:

👉 knowing when not to trust the output

That judgment comes from:

  • domain experience
  • understanding of the system
  • and exposure to failure patterns

Without that, teams become:

👉 overconfident in tools that haven’t been fully tested

What works instead

The teams that get this right do two things:

  1. They actively test edge cases
    Not just “does it work?” but “when does it break?”
  2. They define clear confidence thresholds
    When the system is reliable
    And when a human must intervene
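The second step, a confidence threshold that decides when a human must intervene, can be sketched as a simple gate. This is an illustrative Python sketch only, not any team's actual implementation; the threshold value, field names, and routing labels are assumptions for the example:

```python
# Illustrative sketch: route AI outputs by a confidence threshold.
# The 0.9 threshold, field names, and labels are hypothetical.

from dataclasses import dataclass


@dataclass
class AIOutput:
    text: str
    confidence: float  # model-reported score in [0.0, 1.0]


def route(output: AIOutput, threshold: float = 0.9) -> str:
    """Return 'auto' when the output clears the threshold,
    'human_review' when a person must step in."""
    if output.confidence >= threshold:
        return "auto"
    return "human_review"


print(route(AIOutput("Quarterly summary", 0.97)))          # auto
print(route(AIOutput("Report from conflicting data", 0.55)))  # human_review
```

The point of the gate is not the number itself but that the team has agreed on one: everything below it goes to a person, by rule rather than by habit.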

This is what turns AI from:
👉 a helpful tool

Into:
👉 a system you can actually rely on

Where this becomes practical

In most organisations, this shows up today as:

  • outputs being used without clear validation standards
  • inconsistent review across teams
  • no shared understanding of what is “safe to rely on”

That’s where risk accumulates.

Quietly.

A practical way to test this

What we do with teams is simple:

We take the tools they’re already using
…and stress-test them against:

  • edge cases
  • inconsistent data
  • real-world scenarios

Then we define:

  • where the system holds
  • where it breaks
  • and where human oversight is critical

No new systems.
No long rollout.

Just clarity on what actually works… and what doesn’t.

If this is something you want to understand in your environment, just reply with:

EDGE

or

TEST

I’ll share how we’re approaching this with other teams.


Muriel Demarcus
CEO, Marsham Edge

Engineer | Lawyer | Ultra-Runner

600 1st Ave, Ste 330 PMB 92768, Seattle, WA 98104-2246

Marsham Edge

Strategic AI insights for major project leaders. I share the frameworks and governance models needed to move infrastructure into the digital age, distilled from decades of executive experience in London, Sydney and Singapore.
