Fundraising SaaS · Salesforce Apex
A bug that survived three rewrite proposals, closed in five commits.
On a fundraising platform built on Salesforce, two REST resources had been quietly leaking data — failing roughly one call in fifty, re-triaged every quarterly review, and surviving every "let's rewrite it" deck for three quarters running. The fix was small. The surrounding code had effectively zero coverage, so nobody could safely touch it. We closed the bug in five surgical commits, with 23 Apex tests landing alongside the fixes — no business logic rewritten, no managed-package churn.
In one sentence
Five surgical Apex fixes plus 23 tests closed a chronic data-leak bug that had survived three rewrite proposals — coverage rose to 78% and 87% as a byproduct, not as the goal.
The Problem
The team had stopped trying to fix it.
Two REST resources on a Salesforce-backed fundraising platform were intermittently returning the wrong slice of data — roughly one call in fifty, never the same input twice, never reproducible on demand. It was a customer-facing data leak. It generated compliance heat every time it surfaced. And it had been showing up on the same quarterly review slide for three quarters running.
The engineering org described it the way exhausted teams always do: as a thing that had become part of the furniture.
The team had given up trying to fix it the right way — they were patching symptoms quarterly and adding it to the rewrite-someday backlog.
— The VP of Engineering, during scoping
Every patch reduced the failure rate; none of them made it go away.
Every prior root-cause attempt landed on the same recommendation: rewrite both resources from scratch. And every rewrite proposal got rejected — for good reasons. The business logic encoded years of edge-case handling around donor records, recurring gifts, refunds, and partial allocations. Re-platforming carried more risk than the bug itself. The PM framed it bluntly to us in the first call: they would rather live with the bug than risk the rewrite.
That sentence was the brief. The deliverable wasn't a code change. It was a path that didn't require a rewrite.
The technical constraints fell out of that framing:
- No rewrites — preserve every line of business logic
- Tests must cover the fix and surrounding paths, not the whole org
- Ship as a single managed-package update, not a multi-release campaign
The Approach
Reproducer first. Coverage as a byproduct.
Most teams in this situation reach for the same playbook: write tests for the whole resource first, build coverage up to a safe threshold, then refactor under the safety net. It's the textbook answer. It's also why this bug had been open for three quarters: the two resources ran to thousands of lines, the business rules were undocumented, and writing a comprehensive test suite first was itself a quarter-long project nobody wanted to fund.
We did the opposite. A focused reproducer in a sandbox first. Then five surgical fixes, each mapped to that reproducer, with tests landing in the same commit as the fix. Coverage rose as a byproduct of proving each fix, not as a goal pursued for its own sake. The change in sequence is what made this shippable in a single managed-package update instead of a quarter-long rebuild.
Key insight
Write tests for the fix, not for the resource. A focused reproducer in a sandbox tells you exactly what to lock in; 200 lines of preemptive coverage tell you nothing except that you were afraid to change anything.
Isolate the failure before touching code
A focused reproducer in a sandbox pinned down the exact input shapes that triggered the leak. Once the reproducer ran red on demand, every conversation downstream — what to fix, what to test, what to ignore — anchored to it. We did not start writing Apex until the reproducer was deterministic.
Five fixes, each independently reviewable
Every fix was one Apex method or one query, scoped tight enough that a reviewer could read the diff in under two minutes. No fix touched adjacent code "while we were in there." When a fix tempted us to refactor neighbours, we wrote it down as a follow-up and moved on. The discipline kept the surface area small enough for a single managed-package release.
Tests in the same commit as the fix
Each commit landed with the Apex tests that proved its fix: the regression tests fail against the pre-fix code and pass against the fixed code. No fix shipped alone. That rule made the change-log trivially auditable, because every line of new code had a test pointing at it.
Coverage as a byproduct, not a goal
Coverage climbed from below threshold to 78% on one resource and 87% on the other. We never set those numbers as targets. They were the natural consequence of writing the tests that proved each fix — coverage was the residue of doing the work properly, not the design driver.
One managed-package update, not a campaign
All five fixes shipped together. The change-log document maps each fix to its tests to its reproducer input, so the next regression — whenever it surfaces — can be diagnosed in minutes instead of days. Future engineers inherit a chain, not a mystery.
Implementation
Five fixes, twenty-three tests, one change-log.
Apex
Five method-level edits across the two REST resources. Every fix is independently reviewable — bulkification preserved, governor-limit boundaries unchanged, no signature changes that would ripple to callers. The diff is small on purpose: the team that owns this code needs to be able to read it without a guided tour.
Apex tests
23 Apex tests across the two resources. Each test uses @isTest(SeeAllData=false) and sets up its fixture data explicitly. Between them, the tests cover the happy path and the previously broken edge cases. Eight of the tests would have caught the original bug pre-fix; they exist specifically to lock the regression shut.
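To make the shape of such a regression guard concrete, here is a minimal sketch. It is not the engagement's code: the `GiftResource` class, its `/gifts/*` URL mapping, the `accountId` parameter, the donor names, and the `CL-EXAMPLE` change-log reference are all hypothetical, and the cross-donor leak it guards against is an assumed failure mode chosen for illustration. It also assumes an org where `Closed Won` is a valid Opportunity stage and no custom validation rules block the fixture inserts.

```apex
// --- GiftResource.cls (hypothetical resource, for illustration only) ---
@RestResource(urlMapping='/gifts/*')
global with sharing class GiftResource {
    @HttpGet
    global static List<Opportunity> getGifts() {
        // Read the donor account from the query string.
        String accountId = RestContext.request.params.get('accountId');
        // Scope the query to that donor; the assumed bug class is a
        // response that includes another donor's records.
        return [
            SELECT Id, Name, AccountId
            FROM Opportunity
            WHERE AccountId = :accountId
            ORDER BY CloseDate
        ];
    }
}

// --- GiftResourceTest.cls (hypothetical regression guard) ---
@isTest(SeeAllData=false)
private class GiftResourceTest {
    // Fixture data is created explicitly; nothing is read from the org.
    @testSetup
    static void makeData() {
        List<Account> donors = new List<Account>{
            new Account(Name = 'Donor A'),
            new Account(Name = 'Donor B')
        };
        insert donors;

        List<Opportunity> gifts = new List<Opportunity>();
        for (Account donor : donors) {
            gifts.add(new Opportunity(
                Name = donor.Name + ' gift',
                AccountId = donor.Id,
                StageName = 'Closed Won', // assumes the default stage exists
                CloseDate = Date.today()
            ));
        }
        insert gifts;
    }

    // Change-log reference (hypothetical): CL-EXAMPLE
    //   method: GiftResource.getGifts
    //   reproducer input: GET with accountId of a donor that has a sibling donor
    //   failure mode closed: response contains another donor's gifts
    @isTest
    static void returnsOnlyTheRequestedDonorsGifts() {
        Account donorA = [SELECT Id FROM Account WHERE Name = 'Donor A' LIMIT 1];

        // Simulate the inbound REST call.
        RestRequest req = new RestRequest();
        req.requestURI = '/services/apexrest/gifts/';
        req.httpMethod = 'GET';
        req.addParameter('accountId', String.valueOf(donorA.Id));
        RestContext.request = req;
        RestContext.response = new RestResponse();

        Test.startTest();
        List<Opportunity> result = GiftResource.getGifts();
        Test.stopTest();

        System.assertEquals(1, result.size(), 'Only Donor A has one gift in the fixture');
        System.assertEquals(donorA.Id, result[0].AccountId, 'No other donor\'s gift may be returned');
    }
}
```

The comment above the test method is one possible way to carry the fix, test, and reproducer mapping into the code itself, so a reviewer can move from a failing test straight to the matching change-log entry.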
Change-log discipline
Every fix has an entry in a change-log doc: which method, which test, which reproducer input, which failure mode it closes. The doc is the artifact reviewers read first; the code reads as confirmation. When the next intermittent failure surfaces — and on a system this old, there will be one — the change-log is the first thing the on-call engineer opens. Future regressions become diagnosable in minutes.
SFDX + sandbox-first flow
Everything moved through a dedicated sandbox before production. The reproducer ran in the sandbox until it ran red consistently; fixes ran there until tests went green and the reproducer went silent; only then did the bundle move to a release candidate. Standard practice — but the discipline matters more on a code path everyone else had been afraid to touch. No business logic was rewritten.
Results
What changed for the team.
The chronic bug is closed
The reproducer no longer triggers. The failure mode is gone — not patched, not reduced in frequency. Gone.
23 Apex tests green, 8 of them regression guards
Net-new coverage on the two REST resources. Eight of those tests would have caught the original bug pre-fix; they exist to make sure it never comes back through the same door.
Coverage at 78% and 87%, up from below threshold
Both REST resources moved from coverage-blocked to production-acceptable. Future changes are no longer gated on writing tests for code that should have had them years ago.
Zero business logic rewritten
Every prior root-cause attempt had ended in a rewrite proposal. This delivery preserved every line of the edge-case handling the team had spent years tuning. The rewrite-someday slide came off the quarterly deck.
Future regressions diagnosable in minutes
The change-log → tests → reproducer chain compresses the next investigation from days to minutes. The on-call engineer who picks this up in six months inherits a map, not a mystery.
Engagement artifact
The change-log methodology, redacted, on request.
Change-log + test inventory available on request
The full change-log document, test inventory, and reproducer scripts are confidential to the engagement. We share the methodology and a redacted sample on request — useful if your team has its own chronic-Apex-bug situation and wants to see what the artifact looks like before scoping a call. Email inquiry@growbizsolutions.com to ask.
Takeaway
If you take one thing from this case study: a focused reproducer is worth more than two hundred lines of coverage written to feel safe before changing anything. The reproducer tells you what to fix. The tests just lock it in. The teams that stay stuck on chronic Apex bugs are usually stuck on the sequence, not the code.
Sitting on a chronic Apex bug that's outlived three rewrite proposals?
Let's find the surgical path before the next "rewrite the resource" deck lands.
30 minutes. Share the failure mode, the rewrite history, the current coverage. We'll tell you whether 5 fixes and a test suite can replace a quarter of rebuild work.