Caching for correctness, not speed

Every outside reviewer told me to scale the database. I bet the cache was load-bearing for the contract — not just the latency — and that's why it worked.

Situation

Cash Clearance is the part of the Enterprise Reinsurance workflow where finance matches an inbound check against the open transactions it settles. A heavy treaty search returned 100,000+ rows, and the first query took three to five minutes — Lambda timed out before the response made it back. The app was unusable on exactly the workflows it had been built for.

Task

Lead Engineer for the application track; owned the architectural call across an 18-engineer program with no tolerance for a write freeze and ~50 daily actuarial and finance consumers downstream.

Action

The conventional moves (scale Aurora, raise the Lambda timeout, add more indexes) would each have shaved seconds off a query that took minutes. I made a different bet: the cost lived in the wrong layer, and the cache belonged above the database, not below it. I designed a DynamoDB (DDB) layer that stored the materialised result of each search, keyed by a hash of user + filter + version token. The first query still hit Aurora; every subsequent page, sort, or scroll-restore came from DDB in single-digit milliseconds. I split the core transaction tables on the dimensions the filters actually used so the first query, the one the cache couldn't help, finished in seconds, not minutes. Then I made the version token do double duty as an optimistic-concurrency check on writes, so two users marking against the same set couldn't double-clear.
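
To make the key and the version check concrete, here is a minimal sketch in Python. It is not the production code: an in-memory dict stands in for the DynamoDB layer, and every name in it (cache_key, SearchCache, mark_cleared, StaleVersionError) is hypothetical, as is the choice of SHA-256 and an integer version token. It only illustrates the two ideas described above: a key derived from user + filter + version token, and a write that is rejected when the caller's token is out of date.

```python
import hashlib
import json


def cache_key(user_id, filters, version_token):
    """Hash user + filter + version token into one deterministic key."""
    # sort_keys makes the key independent of the order filters were supplied in.
    payload = json.dumps(
        {"user": user_id, "filters": filters, "version": version_token},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StaleVersionError(Exception):
    """Raised when a write presents a version token that is no longer current."""


class SearchCache:
    """In-memory stand-in (an assumption) for the materialised-result store."""

    def __init__(self):
        self.version = 1   # current version token of the transaction set
        self.results = {}  # cache key -> materialised search result
        self.cleared = set()

    def get_or_load(self, user_id, filters, load_from_db):
        """Return (rows, version); only a cache miss runs the expensive query."""
        key = cache_key(user_id, filters, self.version)
        if key not in self.results:
            self.results[key] = load_from_db(filters)
        return self.results[key], self.version

    def mark_cleared(self, txn_ids, seen_version):
        """Apply a write only if the caller saw the current version."""
        if seen_version != self.version:
            raise StaleVersionError(
                f"saw version {seen_version}, current is {self.version}"
            )
        self.cleared.update(txn_ids)
        # Bumping the token means keys built from the old version never match
        # again, so stale results are no longer served.
        self.version += 1
        return self.version
```

In this sketch, two users who loaded the same result set both hold the same token; whichever write lands second raises StaleVersionError instead of clearing the same transactions twice. Against a real DynamoDB table the same check would most likely be expressed as a conditional write, though the source does not say how it was implemented.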

Result

3–5 minutes → under 5 seconds on 100K+ row queries. The DDB layer stopped being “a performance optimisation” once it became the place we read the truth from — the cache is now load-bearing for correctness, not just speed, and the pattern is reusable for every sync-API-over-expensive-query problem in the portfolio.