Human-in-the-Loop Checkpoints That Hold Without Killing Throughput · Securing Money-Moving Agents

A money-moving agent will, sooner or later, try to do something it should not. The scoped credentials from earlier in this course shrink the blast radius, and the tool gateway from the previous module caps what any single call can touch. But neither one asks a person "are you sure" before the funds leave. That is the job of the human-in-the-loop checkpoint, and it is where most teams either over-gate themselves into uselessness or under-gate themselves into a loss.

The hard part is not adding an approval prompt. It is deciding which actions get one, keeping the reviewer's judgment intact, and engineering the pause so it survives restarts. We will treat each of those as a separate problem.

What a checkpoint actually buys you

A checkpoint is a deliberate stop where execution halts and waits for an explicit human decision before an action commits. The reason it works is sequencing: the human reviews before the irreversible step, not after. An audit log (covered later in this course) tells you what happened; a checkpoint changes what happens.

That distinction matters because money movement is rarely reversible on agent timescales. A pushed payment, a vendor payout, a refund, a balance transfer all settle into someone else's control. The checkpoint is the last point at which a wrong decision is still cheap to undo.

So the goal is not maximum oversight. It is to spend the reviewer's attention only where reversal is expensive and the agent's confidence is not enough on its own.

Tier the actions, do not gate them all

The failure mode that quietly destroys these systems is uniform gating. If every agent action waits for a person, throughput collapses and, worse, reviewers start skimming. A human asked to approve fifty near-identical payouts a day stops reading the fifty-first. The checkpoint is still there; the judgment behind it is gone.

The fix is risk tiering. Sort actions into bands and apply a different control to each.

A workable three-tier split

Tier 1, auto-execute with logging. Read-only calls, balance checks, quote retrieval, and writes that move no money. No human in the path. These are the majority of actions and they should feel instant.
Tier 2, asynchronous review. Reversible or low-value money actions inside a known envelope: a refund under a set cap, a payout to an already-verified payee. The agent can proceed, but the action is queued for a reviewer who can claw it back inside a defined window.
Tier 3, synchronous approval required. Irreversible or high-value movement: a new-payee payout, a transfer above a threshold, anything outside the pre-approved budget or category. Execution blocks until a person approves.
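The three tiers can be sketched as a small classifier. This is a minimal illustration under stated assumptions, not a reference policy: the `Action` fields, the `classify` function, and the 5,000 threshold are hypothetical placeholders, and the rule that anything without a claw-back path goes to Tier 3 is one reading of the split above.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class Tier(Enum):
    AUTO = 1           # auto-execute with logging
    ASYNC_REVIEW = 2   # execute, then queue for claw-back review
    SYNC_APPROVAL = 3  # block until a person approves


@dataclass(frozen=True)
class Action:
    kind: str             # e.g. "balance_check", "refund", "payout"
    amount: Decimal       # zero for actions that move no money
    payee_verified: bool  # payee is already on the verified list
    can_claw_back: bool   # a reviewer can reverse it inside the window
    in_envelope: bool     # inside the pre-approved budget and category


# Hypothetical threshold; set the real one from your own loss data.
SYNC_THRESHOLD = Decimal("5000")


def classify(action: Action) -> Tier:
    # Tier 1: nothing moves, so no human is in the path.
    if action.amount == 0:
        return Tier.AUTO
    # Tier 3: irreversible, high-value, new-payee, or outside the envelope.
    if (
        not action.can_claw_back
        or not action.payee_verified
        or not action.in_envelope
        or action.amount > SYNC_THRESHOLD
    ):
        return Tier.SYNC_APPROVAL
    # Tier 2: everything else that moves money.
    return Tier.ASYNC_REVIEW
```

The checks are ordered so that a money-moving action falls into Tier 2 only after it has passed every Tier 3 test. An action the classifier cannot place confidently therefore lands in the stricter tier.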

This maps cleanly onto the mandate model from Module 4. AP2 frames the same split as Human-Present and Human-Not-Present. In a Cart Mandate, the human is present and signs the specific cart to authorize it, which is your Tier 3 synchronous case. In an Intent Mandate, the human is not present at execution time but has pre-signed the conditions (budget, categories, timing) the agent must stay inside, which is your Tier 1 and Tier 2 envelope. When the agent stays inside the signed envelope, no live approval is needed; when it tries to step outside, you force it back to Human-Present.
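The envelope check might look like the sketch below. The `IntentMandate` class and its fields are hypothetical and are not AP2's actual mandate schema; signature verification is left out entirely. The point is only that each pre-signed condition becomes a test, and any failed test forces the action back to live approval.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class IntentMandate:
    # Hypothetical fields standing in for the pre-signed conditions.
    max_amount: Decimal
    categories: frozenset[str]
    valid_from: date
    valid_until: date


def violations(
    mandate: IntentMandate, amount: Decimal, category: str, on: date
) -> list[str]:
    """Return the names of the envelope conditions this action would break."""
    broken = []
    if amount > mandate.max_amount:
        broken.append("amount")
    if category not in mandate.categories:
        broken.append("category")
    if not (mandate.valid_from <= on <= mandate.valid_until):
        broken.append("timing")
    return broken


def requires_human_present(
    mandate: IntentMandate, amount: Decimal, category: str, on: date
) -> bool:
    # Any broken condition means the agent has stepped outside the envelope.
    return bool(violations(mandate, amount, category, on))
```

Returning the list of broken conditions, rather than a bare boolean, gives the reviewer the specific reason the gate tripped.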

Set the thresholds from your own loss data, not from a vendor default. The threshold is a business decision about acceptable loss per unattended action, and it should be reviewed as volume grows.

Make the pause durable

A synchronous checkpoint means the agent run stops and waits, possibly for minutes or hours, while a person decides. If your process crashes, redeploys, or times out during that wait, you must not lose the pending decision or, worse, resume in a state where the action silently fires.

This is a durable-execution problem, and the orchestration layer should own it. The pattern that has settled across tools like LangGraph and Temporal is the same: the agent hits an interrupt, the runtime persists the run's state under a thread or run ID, and execution waits. A reviewer sees the pending action with its context, sends an approve or reject signal, and the runtime resumes at the interrupt with the human's decision as the return value. If the host dies mid-wait, a replacement process restores the run from the persisted state instead of starting over.
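The pause-and-resume pattern can be illustrated without any orchestration framework. The sketch below is not the LangGraph or Temporal API; `CheckpointStore`, `pause_for_approval`, and `resume` are hypothetical names, and SQLite stands in for whatever durable store the runtime uses.

```python
import json
import sqlite3


class CheckpointStore:
    """Hypothetical checkpointer: one JSON blob of run state per run ID."""

    def __init__(self, path: str):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(run_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self.db.commit()

    def save(self, run_id: str, state: dict) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (run_id, json.dumps(state)),
        )
        self.db.commit()

    def load(self, run_id: str) -> dict:
        row = self.db.execute(
            "SELECT state FROM checkpoints WHERE run_id = ?", (run_id,)
        ).fetchone()
        if row is None:
            raise KeyError(run_id)
        return json.loads(row[0])


def pause_for_approval(store: CheckpointStore, run_id: str, pending: dict) -> None:
    # Persist everything needed to continue, then stop. Nothing has moved yet.
    store.save(run_id, {"status": "awaiting_approval", "pending": pending})


def resume(store: CheckpointStore, run_id: str, decision: str) -> dict:
    # Runs in whichever process is alive when the reviewer decides.
    state = store.load(run_id)
    if state["status"] != "awaiting_approval":
        raise RuntimeError(f"run {run_id} is not waiting on a decision")
    # Anything other than an explicit "approve" is treated as a rejection.
    state["status"] = "approved" if decision == "approve" else "rejected"
    store.save(run_id, state)
    return state
```

Because the pending action lives in the store and not in process memory, a second process that opens the same database file can call `resume` after the first one has died.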

Two engineering rules follow. First, the action must commit only on the resume path, never speculatively before the pause. Second, the approval signal needs an idempotency key so a retried or duplicated approval cannot fire the payment twice.
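Both rules fit in a few lines. In this sketch, `PaymentLedger` is a hypothetical in-memory stand-in for the payment rail, and deriving the idempotency key from the run ID is an assumption about how approvals map to payments.

```python
class PaymentLedger:
    """Hypothetical stand-in for the payment rail; records committed payouts."""

    def __init__(self) -> None:
        self.committed: dict[str, dict] = {}

    def commit(self, idempotency_key: str, payout: dict) -> bool:
        # A key that was already used means this is a retry: do nothing.
        if idempotency_key in self.committed:
            return False
        self.committed[idempotency_key] = payout
        return True


def handle_decision(
    ledger: PaymentLedger, run_id: str, payout: dict, decision: str
) -> str:
    # Rule 1: the only call to commit() is here, on the resume path.
    if decision != "approve":
        return "rejected"
    # Rule 2: derive the key from the run, not from the delivery attempt,
    # so a duplicated approval signal maps to the same key.
    key = f"payout:{run_id}"
    return "committed" if ledger.commit(key, payout) else "duplicate_ignored"
```

In a real system the check-and-insert inside `commit` would need to be atomic, for example a unique constraint in the database or the payment provider's own idempotency mechanism. A dictionary lookup is only safe in a single-threaded sketch.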

A worked example

An agent manages supplier payouts. Its Intent Mandate, signed by the finance lead, authorizes payouts up to $5,000 to payees already in the verified vendor list, within the marketing category, for the current quarter.

The agent queues a $3,200 payout to an existing vendor. That is inside the signed envelope, so it lands in Tier 2: it executes, and a row appears in the reviewer's async queue with a defined claw-back window. No one is blocked.

Next the agent tries a $3,200 payout to a vendor that is not on the verified list. The bank details are new. This breaks the "verified payee" condition, so the action escalates to Tier 3. The orchestrator interrupts, serializes state, and surfaces the request to a human with the full context: amount, new payee, the new bank details, and the agent's stated reason. The reviewer sees the new-payee flag, recognizes it as a classic redirected-payment setup (the kind of promptware payload covered in Module 5), and rejects. The agent resumes on the reject path and the payout never commits.

The same checkpoint that blocked the bad payout let the routine one through untouched. That is the balance you are aiming for.

Design the review, not just the gate

A gate that produces a yes/no prompt with no context is theater. For Tier 3 to work, the reviewer needs the amount, the destination, the reason the agent gave, and what specifically tripped the gate, presented in one place. Bury any of that and you get reflexive approval under time pressure.
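One way to enforce that is to make the review request impossible to build without its context. The `ReviewRequest` class below is a hypothetical sketch; the field names follow the four items listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReviewRequest:
    amount: str        # e.g. "3200.00 USD"
    destination: str   # payee and account the money would go to
    agent_reason: str  # the agent's stated reason for the action
    tripped_rule: str  # what specifically sent this to Tier 3

    def __post_init__(self) -> None:
        # Refuse to build a prompt that is missing any piece of context.
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError(f"review request missing: {', '.join(missing)}")

    def render(self) -> str:
        # Everything the reviewer needs, in one place.
        return (
            f"Amount: {self.amount}\n"
            f"Destination: {self.destination}\n"
            f"Agent's reason: {self.agent_reason}\n"
            f"Gate tripped: {self.tripped_rule}"
        )
```

If the agent supplies no reason, the request fails at construction time, before a reviewer ever sees a context-free prompt.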

Two further controls keep the human layer honest. Put a timeout on every synchronous checkpoint that fails closed: if no one decides inside the window, the action is rejected, not auto-approved. And watch your approval rate. A Tier 3 queue running at 99 percent approval is not catching anything; either the tier is miscalibrated and those actions belong in Tier 2, or the reviewers have stopped reading.
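Both controls are simple to express. In the sketch below the function names, the 0.99 ceiling, and the minimum sample of 100 decisions are hypothetical; the ceiling mirrors the 99 percent figure above and should be tuned to your own queue.

```python
from typing import Optional


def resolve(decision: Optional[str], elapsed_seconds: float, timeout_seconds: float) -> str:
    """Return the effective outcome of a synchronous checkpoint.

    `decision` is "approved", "rejected", or None if nobody has answered.
    `elapsed_seconds` runs from the request to the decision, or to now
    if the checkpoint is still undecided.
    """
    if elapsed_seconds >= timeout_seconds:
        return "rejected"  # fail closed; a late approval does not count
    return decision or "pending"


def approval_rate(outcomes: list[str]) -> Optional[float]:
    decided = [o for o in outcomes if o in ("approved", "rejected")]
    if not decided:
        return None
    return sum(o == "approved" for o in decided) / len(decided)


def needs_recalibration(
    outcomes: list[str], ceiling: float = 0.99, min_sample: int = 100
) -> bool:
    # Flag the tier only once there is enough data to trust the rate.
    decided = [o for o in outcomes if o in ("approved", "rejected")]
    rate = approval_rate(decided)
    return rate is not None and len(decided) >= min_sample and rate >= ceiling
```

A flag from `needs_recalibration` does not say which of the two causes applies. It is a prompt to look at the queue, not an automatic reason to move actions into Tier 2.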

Takeaway

Checkpoints earn their keep by being selective. Tier actions by reversibility and value, let the signed mandate envelope carry the routine traffic, and reserve the synchronous human stop for irreversible money movement and anything outside the envelope. Engineer the pause as durable state that commits only on resume, give the reviewer real context, and fail closed on timeout. Done this way, the human is in the loop exactly where a wrong move is expensive and nowhere it is not.
