✦ Agentic for Agentforce: we use AI agents to deploy yours · ✦ AI agent + Salesforce expertise: the combination that delivers results · ✦ Free Agentforce Readiness Assessment: book a call · ✦ 100+ Salesforce projects delivered: we know what works · ✦ Health Cloud specialists: PIPEDA-compliant implementations for Canadian healthcare · ✦ Canada-based: offices in Toronto & Mohali, India
Fundraising SaaS · Salesforce Apex

A bug that survived three rewrite proposals, closed in five commits.

On a fundraising platform built on Salesforce, two REST resources had been quietly leaking data — failing roughly one call in fifty, re-triaged every quarterly review, and surviving every "let's rewrite it" deck for three quarters running. The fix was small. The surrounding code had effectively zero coverage, so nobody could safely touch it. We closed the bug in five surgical commits, with 23 Apex tests landing alongside the fixes — no business logic rewritten, no managed-package churn.

Salesforce · Apex · REST API · SFDX · Apex Test · Tooling API

In one sentence

Five surgical Apex fixes plus 23 tests closed a chronic data-leak bug that had survived three rewrite proposals — coverage rose to 78% and 87% as a byproduct, not as the goal.

5 commits · Closed what three quarters of triage couldn’t
23 tests · 8 of them would have caught the bug pre-fix
78% / 87% · Apex coverage on the two REST resources, up from below threshold

The Problem

The team had stopped trying to fix it.

Two REST resources on a Salesforce-backed fundraising platform were intermittently returning the wrong slice of data — roughly one call in fifty, never the same input twice, never reproducible on demand. It was a customer-facing data leak. It generated compliance heat every time it surfaced. And it had been showing up on the same quarterly review slide for three quarters running.

The engineering org described it the way exhausted teams always do: as a thing that had become part of the furniture.

The team had given up trying to fix it the right way — they were patching symptoms quarterly and adding it to the rewrite-someday backlog.

— The VP of Engineering, during scoping

Every patch reduced the failure rate; none of them made it go away.

Every prior root-cause attempt landed on the same recommendation: rewrite both resources from scratch. And every rewrite proposal got rejected — for good reasons. The business logic encoded years of edge-case handling around donor records, recurring gifts, refunds, and partial allocations. Re-platforming carried more risk than the bug itself. The PM framed it bluntly to us in the first call: they would rather live with the bug than risk the rewrite.

That sentence was the brief. The deliverable wasn't a code change. It was a path that didn't require a rewrite.

The technical constraints fell out of that framing:

  • No rewrites — preserve every line of business logic
  • Tests must cover the fix and surrounding paths, not the whole org
  • Ship as a single managed-package update, not a multi-release campaign

The Approach

Reproducer first. Coverage as a byproduct.

Most teams in this situation reach for the same playbook: write tests for the whole resource first, build coverage up to a safe threshold, then refactor under the safety net. It's the textbook answer. It's also why this bug had been open for three quarters — the resource was thousands of lines, the business rules were undocumented, and writing a comprehensive test suite first was itself a quarter-long project nobody wanted to fund.

We did the opposite. A focused reproducer in a sandbox first. Then five surgical fixes, each mapped to that reproducer, with tests landing in the same commit as the fix. Coverage rose as a byproduct of proving each fix, not as a goal pursued for its own sake. The change in sequence is what made this shippable in a single managed-package update instead of a quarter-long rebuild.

Key insight

Write tests for the fix, not for the resource. A focused reproducer in a sandbox tells you exactly what to lock in; two hundred lines of preemptive coverage tell you only that you were afraid to change anything.

1. Isolate the failure before touching code

A focused reproducer in a sandbox pinned down the exact input shapes that triggered the leak. Once the reproducer ran red on demand, every conversation downstream — what to fix, what to test, what to ignore — anchored to it. We did not start writing Apex until the reproducer was deterministic.

2. Five fixes, each independently reviewable

Every fix was one Apex method or one query, scoped tight enough that a reviewer could read the diff in under two minutes. No fix touched adjacent code "while we were in there." When a fix tempted us to refactor neighbours, we wrote it down as a follow-up and moved on. The discipline kept the surface area small enough for a single managed-package release.

3. Tests in the same commit as the fix

Each commit landed with the Apex tests that proved the fix. The regression tests among them fail against the pre-fix code and pass once the fix is applied. No fix shipped alone. That rule made the change-log trivially auditable: every line of new code had a test pointing at it.

4. Coverage as a byproduct, not a goal

Coverage climbed from below threshold to 78% on one resource and 87% on the other. We never set those numbers as targets. They were the natural consequence of writing the tests that proved each fix — coverage was the residue of doing the work properly, not the design driver.

5. One managed-package update, not a campaign

All five fixes shipped together. The change-log document maps each fix to its tests to its reproducer input, so the next regression — whenever it surfaces — can be diagnosed in minutes instead of days. Future engineers inherit a chain, not a mystery.

Implementation

Five fixes, twenty-three tests, one change-log.

Apex

Five method-level edits across the two REST resources. Every fix is independently reviewable — bulkification preserved, governor-limit boundaries unchanged, no signature changes that would ripple to callers. The diff is small on purpose: the team that owns this code needs to be able to read it without a guided tour.
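One way a reviewer could check that a fix leaves bulkification intact is a test that measures how many SOQL queries a method issues for a whole batch. The sketch below is illustrative only: `GiftTotals.byDonor` is a hypothetical stand-in, not a method from the real resources, and the case study does not say this exact check was used. It also assumes a sandbox where plain `Account` and `Opportunity` fixtures can be inserted without extra required fields.

```apex
// --- GiftTotals.cls: hypothetical stand-in for a bulkified method under review ---
public with sharing class GiftTotals {
    // Issues a single aggregate query for the whole batch of donors.
    public static Map<Id, Decimal> byDonor(Set<Id> donorIds) {
        Map<Id, Decimal> totals = new Map<Id, Decimal>();
        List<AggregateResult> rows = [
            SELECT AccountId donorId, SUM(Amount) totalAmount
            FROM Opportunity
            WHERE AccountId IN :donorIds
            GROUP BY AccountId
        ];
        for (AggregateResult row : rows) {
            totals.put((Id) row.get('donorId'), (Decimal) row.get('totalAmount'));
        }
        return totals;
    }
}

// --- GiftTotalsTest.cls ---
@isTest(SeeAllData=false)
private class GiftTotalsTest {
    @isTest
    static void queryCountDoesNotGrowWithBatchSize() {
        // Explicit fixtures: 50 donors with one gift each.
        List<Account> donors = new List<Account>();
        for (Integer i = 0; i < 50; i++) {
            donors.add(new Account(Name = 'Fixture Donor ' + i));
        }
        insert donors;

        List<Opportunity> gifts = new List<Opportunity>();
        for (Account donor : donors) {
            gifts.add(new Opportunity(
                Name = 'Fixture Gift',
                AccountId = donor.Id,
                Amount = 10,
                StageName = 'Closed Won',
                CloseDate = Date.today()
            ));
        }
        insert gifts;

        Test.startTest();
        Integer queriesBefore = Limits.getQueries();
        Map<Id, Decimal> totals = GiftTotals.byDonor(new Map<Id, Account>(donors).keySet());
        Integer queriesUsed = Limits.getQueries() - queriesBefore;
        Test.stopTest();

        System.assertEquals(50, totals.size(), 'Every fixture donor should have a total');
        System.assertEquals(1, queriesUsed, 'Expected one query per batch, not one per donor');
    }
}
```

Measuring the delta in `Limits.getQueries()` around the call, rather than the absolute count, keeps the assertion independent of whatever queries the fixture setup itself triggers.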

Apex tests

23 Apex tests across the two resources. Each test uses @isTest(SeeAllData=false), sets up fixture data explicitly, and asserts both the happy path and the previously broken edge cases. Eight of the tests would have caught the original bug pre-fix; they exist specifically to lock the regression shut.
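A minimal sketch of what a test in this style could look like, under stated assumptions. `GiftResource`, its `donorId` parameter, and the edge case shown (a request with no `donorId`) are all invented for illustration; the case study does not disclose the real resources or the inputs that triggered the leak. The sketch also assumes a sandbox where plain `Account` and `Opportunity` records can be inserted without additional required fields.

```apex
// --- GiftResource.cls: hypothetical stand-in for one of the REST resources ---
@RestResource(urlMapping='/gifts/*')
global with sharing class GiftResource {
    @HttpGet
    global static List<Opportunity> getGifts() {
        String donorId = RestContext.request.params.get('donorId');
        // Assumed post-fix behaviour: a missing donorId returns nothing.
        if (String.isBlank(donorId)) {
            return new List<Opportunity>();
        }
        return [SELECT Id, AccountId, Amount
                FROM Opportunity
                WHERE AccountId = :donorId
                ORDER BY CloseDate];
    }
}

// --- GiftResourceTest.cls ---
@isTest(SeeAllData=false)
private class GiftResourceTest {

    // Explicit fixtures: two donors, one gift each, so a leak is detectable.
    @TestSetup
    static void makeFixtures() {
        Account donorA = new Account(Name = 'Donor A');
        Account donorB = new Account(Name = 'Donor B');
        insert new List<Account>{ donorA, donorB };
        insert new List<Opportunity>{
            gift('Gift A1', donorA.Id, 50),
            gift('Gift B1', donorB.Id, 75)
        };
    }

    private static Opportunity gift(String name, Id donorId, Decimal amount) {
        return new Opportunity(Name = name, AccountId = donorId, Amount = amount,
                               StageName = 'Closed Won', CloseDate = Date.today());
    }

    // Builds a fake GET request and invokes the resource method directly.
    private static List<Opportunity> callWith(Map<String, String> params) {
        RestRequest req = new RestRequest();
        req.httpMethod = 'GET';
        req.requestURI = '/services/apexrest/gifts/';
        for (String key : params.keySet()) {
            req.addParameter(key, params.get(key));
        }
        RestContext.request = req;
        RestContext.response = new RestResponse();
        return GiftResource.getGifts();
    }

    @isTest
    static void happyPathReturnsOnlyTheRequestedDonor() {
        Account donorA = [SELECT Id FROM Account WHERE Name = 'Donor A' LIMIT 1];
        List<Opportunity> rows = callWith(
            new Map<String, String>{ 'donorId' => String.valueOf(donorA.Id) });
        System.assertEquals(1, rows.size(), 'Donor A has exactly one gift');
        System.assertEquals(donorA.Id, rows[0].AccountId, 'Row must belong to donor A');
    }

    // Regression guard for the invented edge case.
    @isTest
    static void missingDonorIdReturnsNothing() {
        List<Opportunity> rows = callWith(new Map<String, String>());
        System.assertEquals(0, rows.size(), 'A request without donorId must return no gifts');
    }
}
```

The second fixture donor is what gives the guard its teeth: with only one donor in the test data, a response that ignored the filter would still look correct.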

Change-log discipline

Every fix has an entry in a change-log doc: which method, which test, which reproducer input, which failure mode it closes. The doc is the artifact reviewers read first; the code reads as confirmation. When the next intermittent failure surfaces — and on a system this old, there will be one — the change-log is the first thing the on-call engineer opens. Future regressions become diagnosable in minutes.

SFDX + sandbox-first flow

Everything moved through a dedicated sandbox before production. The reproducer ran in the sandbox until it ran red consistently; fixes ran there until tests went green and the reproducer went silent; only then did the bundle move to a release candidate. Standard practice — but the discipline matters more on a code path everyone else had been afraid to touch. No business logic was rewritten.

Results

What changed for the team.

  • The chronic bug is closed

    The reproducer no longer triggers. The failure mode is gone — not patched, not reduced in frequency. Gone.

  • 23 Apex tests green, 8 of them regression guards

    Net-new coverage on the two REST resources. Eight of those tests would have caught the original bug pre-fix; they exist to make sure it never comes back through the same door.

  • Coverage at 78% and 87%, up from below threshold

    Both REST resources moved from coverage-blocked to production-acceptable. Future changes are no longer gated on writing tests for code that should have had them years ago.

  • Zero business logic rewritten

    Every prior root-cause attempt had ended in a rewrite proposal. This delivery preserved every line of the edge-case handling the team had spent years tuning. The rewrite-someday slide came off the quarterly deck.

  • Future regressions diagnosable in minutes

    The change-log → tests → reproducer chain compresses the next investigation from days to minutes. The on-call engineer who picks this up in six months inherits a map, not a mystery.

Takeaway

If you take one thing from this case study: a focused reproducer is worth more than two hundred lines of coverage written to feel safe before changing anything. The reproducer tells you what to fix. The tests just lock it in. The teams that stay stuck on chronic Apex bugs are usually stuck on the sequence, not the code.

Sitting on a chronic Apex bug that's outlived three rewrite proposals?

Let's find the surgical path before the next "rewrite the resource" deck lands.

30 minutes. Share the failure mode, the rewrite history, the current coverage. We'll tell you whether 5 fixes and a test suite can replace a quarter of rebuild work.
