Building a Dispute-Ops Function That Scales · Disputes & Chargebacks, End to End

Most dispute teams start as one person doing representment between other tasks. That works until volume crosses a few hundred cases a month, and then the same operation that won cases at low volume starts missing deadlines and shipping weak evidence. Disputes do not fail because the underlying mechanism is hard. They fail because the operation around it does not scale.

This lesson treats dispute handling as an operations problem: a queue with hard deadlines, a quality bar set by reason codes and evidence rules, and a cost structure that has to stay flat or fall as case count rises. The earlier modules covered how the four-party flow, reason codes, and representment work. This one covers how to run that work as a function.

The work is a deadline-bound queue

Every dispute is a clock. On Visa, the network-level window for the acquirer's dispute response is 30 days from the dispute processing date; on Mastercard, the second presentment window is 45 days from the chargeback filing. Your acquirer or processor usually imposes a tighter internal deadline, often 5 to 10 days, so the working clock is shorter than the network limit. The issuer then gets its own window to review the response and escalate to pre-arbitration. A case that sits past its window is a guaranteed loss, however strong the evidence was.
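As a rough illustration of the working clock, the sketch below takes the earlier of the network deadline and the processor's internal deadline. The function names are hypothetical, and it assumes the internal deadline is counted from the same start date as the network window, which your processor may define differently.

```python
from datetime import date, timedelta

# Network-level response windows in days, as stated in the text above.
NETWORK_WINDOW_DAYS = {"visa": 30, "mastercard": 45}


def working_deadline(network: str, clock_start: date, internal_days: int) -> date:
    """Return the earlier of the network deadline and the internal deadline.

    Assumption: both windows are counted from the same clock_start date.
    """
    network_due = clock_start + timedelta(days=NETWORK_WINDOW_DAYS[network])
    internal_due = clock_start + timedelta(days=internal_days)
    return min(network_due, internal_due)


def days_remaining(due: date, today: date) -> int:
    """Days left on the working clock; a negative value means the case aged out."""
    return (due - today).days


due = working_deadline("visa", date(2024, 3, 1), internal_days=10)
print(due, days_remaining(due, today=date(2024, 3, 8)))
```

With a 10-day internal deadline, the 30-day network window never comes into play, which is the point: the queue should be sorted on the working deadline, not the network one.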

That single fact shapes the whole operation. The first metric is not win rate. It is the share of cases worked before deadline, because a missed deadline removes the case from the win-rate pool entirely. We have seen teams quietly inflate their reported win rate by only counting cases they actually responded to, while a fifth of the queue aged out unanswered. Measure response coverage and win rate as separate numbers, and never let the second hide the first.

The metrics that actually run the function

Track four numbers, in this order:

1. Response coverage: disputes responded to before deadline, divided by disputes received.
2. Win rate: cases won, divided by cases responded to.
3. Net recovery rate: revenue recovered after representment costs and second chargebacks, divided by disputed revenue. This is the number that pays for the team.
4. Cost per case: fully loaded operating cost divided by cases worked.

The gap between win rate and net recovery rate is where the real economics live. Industry benchmarks put representment win rate somewhere in the 40 to 55 percent range for merchants who fight, but net recovery, after fees and the second chargebacks that come back, lands far lower, often in the low double digits as a share of disputed revenue. Run the function on net recovery, not the headline win rate.
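The four metrics can be computed from one monthly record, as in this minimal sketch. The field names and the sample figures are hypothetical, and it treats "cases worked" as cases responded to on time.

```python
from dataclasses import dataclass


@dataclass
class MonthlyDisputeStats:
    received: int
    responded_on_time: int
    won: int
    disputed_revenue: float
    recovered_revenue: float         # gross revenue won back
    representment_costs: float       # fees and direct costs of fighting
    second_chargeback_losses: float  # wins later clawed back
    operating_cost: float            # fully loaded team and tooling cost


def dispute_metrics(s: MonthlyDisputeStats) -> dict:
    """Compute the four metrics in the order the text lists them."""
    net_recovered = (
        s.recovered_revenue - s.representment_costs - s.second_chargeback_losses
    )
    return {
        "response_coverage": s.responded_on_time / s.received,
        "win_rate": s.won / s.responded_on_time,
        "net_recovery_rate": net_recovered / s.disputed_revenue,
        "cost_per_case": s.operating_cost / s.responded_on_time,
    }


# Hypothetical month: a fifth of the queue aged out unanswered.
stats = MonthlyDisputeStats(
    received=1000, responded_on_time=800, won=360,
    disputed_revenue=80_000, recovered_revenue=28_800,
    representment_costs=8_000, second_chargeback_losses=6_400,
    operating_cost=30_000,
)
for name, value in dispute_metrics(stats).items():
    print(f"{name}: {value:.3f}")
```

In this made-up month the win rate is 45 percent, but coverage is only 80 percent and net recovery is 18 percent of disputed revenue, which is the gap the headline number hides.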

Sizing the team

A manual case takes real time. Pulling transaction records, retrieving customer communications and shipping confirmation, writing the rebuttal, and formatting it to network specification runs roughly 30 to 60 minutes per case, and complex cases run longer. That throughput figure is the basis for staffing.

A worked example

Say you receive 1,000 disputes a month. At an average of 40 minutes per worked case, that is about 667 analyst-hours of handling time. Assuming roughly 160 paid hours per analyst per month, of which maybe 70 percent is productive case-handling time, each analyst delivers about 112 handling hours, so the queue needs roughly six full-time analysts, not the four that the raw hours suggest. A competent dispute analyst costs on the order of $55,000 to $75,000 a year in salary, so a six-analyst team runs $330,000 to $450,000 a year in salary alone, and more once it is fully loaded with benefits, tooling, and management overhead.

Now do the recovery side. If those 1,000 disputes average $80 and you respond to 90 percent and win 45 percent of what you fight, you recover about 1,000 x 0.90 x 0.45 x $80, or roughly $32,400 a month (about $389,000 a year), before second chargebacks claw some back. Net recovery after fees might be half that. Against a six-analyst cost base, a fully manual operation at this volume is at best close to break-even on gross recovery and loses money on net recovery, which is why the build-versus-buy question is not optional.
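The staffing and recovery arithmetic can be checked with a short sketch. The function names are hypothetical, and the 160 paid hours per analyst per month is an assumed figure; the other inputs come from the worked example.

```python
def analysts_needed(
    cases_per_month: int,
    minutes_per_case: float,
    paid_hours_per_month: float = 160.0,  # assumption: one full-time month
    productive_share: float = 0.70,       # share of paid time spent on cases
) -> float:
    """Full-time analysts required to work the whole queue."""
    handling_hours = cases_per_month * minutes_per_case / 60
    return handling_hours / (paid_hours_per_month * productive_share)


def gross_monthly_recovery(
    cases: int, avg_amount: float, coverage: float, win_rate: float
) -> float:
    """Revenue won back before fees and second chargebacks."""
    return cases * coverage * win_rate * avg_amount


raw = analysts_needed(1000, 40, productive_share=1.0)
adjusted = analysts_needed(1000, 40)
recovery = gross_monthly_recovery(1000, 80.0, coverage=0.90, win_rate=0.45)

print(f"analysts, raw hours only: {raw:.1f}")
print(f"analysts, at 70% productive time: {adjusted:.1f}")
print(f"gross recovery per month: ${recovery:,.0f}")
print(f"gross recovery per year:  ${recovery * 12:,.0f}")
```

Under these assumptions the raw handling hours come to about 4.2 analysts, and the 70 percent productivity adjustment lifts that to about 6.0, against gross recovery of $32,400 a month.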

Build, buy, or outsource

The same case math drives the model choice. Three options, each with a different cost shape.

In-house manual gives you full control and institutional knowledge, but cost scales linearly with volume and you pay for idle capacity in slow months. It fits large operations with the volume and HR infrastructure to keep analysts busy and specialized.

Software and automation moves the cost from headcount to a platform fee, charged per dispute, by volume tier, or on recovered revenue. The win is consistency: evidence is assembled and formatted the same way every time, and nothing ages out because a person was on leave. The realistic fit is merchants with enough monthly volume to justify the fee plus at least one internal owner who runs the workflow and audits outputs.

Outsourcing hands the queue to an agency, usually priced as a share of recovered revenue (commonly in the 15 to 30 percent range) or a flat per-case fee. It removes the staffing problem entirely and is the fastest way to get coverage up, at the cost of margin and direct control of evidence quality.
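The three cost shapes can be compared side by side with a rough sketch like the one below. Every price in it is a hypothetical placeholder (the loaded analyst cost, the per-dispute platform fee, the owner cost, and the 25 percent revenue share), so substitute your own quotes before drawing conclusions.

```python
import math


def in_house_monthly_cost(
    cases: int,
    minutes_per_case: float = 40.0,
    productive_hours_per_analyst: float = 112.0,  # assumed: 160 paid hours x 70%
    loaded_cost_per_analyst: float = 7_500.0,     # hypothetical monthly figure
) -> float:
    """Headcount comes in whole analysts, so slow months still cost a full seat."""
    hours = cases * minutes_per_case / 60
    analysts = math.ceil(hours / productive_hours_per_analyst)
    return analysts * loaded_cost_per_analyst


def platform_monthly_cost(
    cases: int,
    fee_per_dispute: float = 15.0,  # hypothetical per-dispute fee
    owner_cost: float = 7_500.0,    # one internal owner running the workflow
) -> float:
    return cases * fee_per_dispute + owner_cost


def outsourced_monthly_cost(
    recovered_revenue: float,
    revenue_share: float = 0.25,    # within the 15 to 30 percent range cited
) -> float:
    return recovered_revenue * revenue_share


for cases in (100, 1000, 5000):
    # Recovery assumptions reuse the worked example: 90% coverage, 45% wins, $80.
    recovered = cases * 0.90 * 0.45 * 80.0
    print(
        cases,
        in_house_monthly_cost(cases),
        platform_monthly_cost(cases),
        outsourced_monthly_cost(recovered),
    )
```

The comparison is deliberately crude: it ignores differences in win rate and evidence quality between the models, which in practice can matter more than the fee itself.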

Most operations at scale end up hybrid: automation handles evidence assembly and the high-volume, low-complexity reason codes, while a small in-house team owns escalations, pre-arbitration, the network monitoring programs from module 5, and the judgment calls a template cannot make. The first-party-misuse cases from module 7 are exactly the kind that need a human, because winning them turns on a specific evidence narrative rather than a standard packet.

What tooling has to do

Tooling is not the same as automation. Before automating anything, the function needs three boring capabilities: a single queue with deadline tracking so nothing ages out, a connection to transaction and fulfillment data so evidence does not require five logins per case, and reason-code-aware templating so each rebuttal matches what the network actually wants for that code.
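Reason-code-aware templating can start as a simple checklist that flags what a packet is missing before it is filed. In the sketch below, the evidence lists are illustrative placeholders rather than the networks' actual requirements, and the function name is hypothetical.

```python
from typing import Iterable, List, Optional

# Illustrative only: real evidence requirements come from the network rules
# and your acquirer's guidance for each reason code.
REQUIRED_EVIDENCE = {
    "13.1": {"order_confirmation", "proof_of_delivery"},
    "10.4": {"avs_cvv_result", "device_or_ip_history", "prior_undisputed_orders"},
}


def missing_evidence(reason_code: str, attached: Iterable[str]) -> Optional[List[str]]:
    """Return the evidence items still missing, sorted by name.

    Returns None when there is no template for the reason code, which is a
    signal to send the case to a person rather than guess.
    """
    required = REQUIRED_EVIDENCE.get(reason_code)
    if required is None:
        return None
    return sorted(required - set(attached))


print(missing_evidence("13.1", ["order_confirmation"]))
print(missing_evidence("99.9", []))
```

Keeping the checklist as data rather than code makes it easy to audit and update when the rules for a reason code change.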

Automation sits on top of that and earns its keep on volume: the reason codes that recur, the evidence that is always the same, the formatting rules that never change. Reserve human attention for the cases where judgment moves the outcome. A platform that quickly auto-files a weak, mistargeted response is worse than a slow human who files the right one, because a lost representment can trigger a pre-arbitration you then have to defend.
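One way to keep automation on the repetitive cases is an explicit routing rule that defaults to a person. This is a minimal sketch under stated assumptions: the case fields, the stage names, and the set of automatable codes are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical: codes your own data shows to be high-volume and low-complexity.
AUTOMATABLE_CODES = {"13.1"}


@dataclass
class Case:
    reason_code: str
    stage: str                      # "first_dispute" or "pre_arbitration"
    evidence_complete: bool
    suspected_first_party_misuse: bool = False


def route(case: Case) -> str:
    """Return "auto" only when every condition for safe automation holds."""
    if case.stage != "first_dispute":
        return "human"              # escalations stay with the in-house team
    if case.suspected_first_party_misuse:
        return "human"              # needs a case-specific evidence narrative
    if case.reason_code in AUTOMATABLE_CODES and case.evidence_complete:
        return "auto"
    return "human"                  # default: never auto-file a weak packet


print(route(Case("13.1", "first_dispute", evidence_complete=True)))
print(route(Case("13.1", "first_dispute", evidence_complete=False)))
```

Defaulting to "human" means a new or unrecognized situation costs analyst time rather than a badly targeted filing.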

The takeaway

A dispute-ops function scales when three numbers hold as volume rises: response coverage stays near 100 percent, net recovery stays positive after all costs, and cost per case stays flat or falls. Size the team off real per-case handling time, choose build, buy, or outsource off the recovery math rather than instinct, and automate the repetitive codes so human judgment goes to the cases that actually turn on it. Run it as a deadline-bound queue with a quality bar, and win rate takes care of itself.
