Case Study | PayPal

Distributed State Management

Resolving critical token-invalidation race conditions in the AU/Citi credit lifecycle using a "Request-in-Flight" locking pattern.

The Challenge

During the credit lifecycle within the AU/Citi Co-Branded Card domain, backend services received webhook events and synchronous user requests concurrently. When separate threads attempted to mutate the same transaction state within the same narrow time window, race conditions occurred.

These state collisions caused auth tokens to be invalidated prematurely, so legitimate user requests failed. The result was a highly visible consistency bug in the payments domain, where transaction reliability must be absolute.

Architecture

Abstracted view of the Request-in-Flight Pattern.

  • Concurrent Triggers: Webhooks & User Requests
  • Centralized Cache: Distributed Lock (Redis/Memcached)
  • Retry & Backoff: Non-blocking Collision Handling
  • State Mutator: Safe, Sequential Processing

The Solution

Request-in-Flight Locking

Concurrency Management

  • Distributed Locking: Implemented a "Request-in-Flight" concurrency control mechanism using a Centralized Cache. The first thread to process a transaction acquires an exclusive, short-lived lock for that specific token/ID.
  • Retry-with-Backoff Logic: Rather than failing a concurrent request as soon as it encounters a held lock, implemented retries with exponential backoff. Subsequent threads wait briefly and retry, giving the initial state mutation time to complete.
  • Idempotency: Guaranteed that overlapping events for the same credit lifecycle stage were safely serialized, preventing duplicate processing and premature invalidations.
  • Impact: Eliminated the token-invalidation race condition, achieving 100% transaction consistency for concurrent requests in a high-volume financial environment.
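
The pattern described above can be sketched roughly as follows. This is a minimal illustration and not the production implementation: `InMemoryCache`, `run_in_flight`, `LockNotAcquired`, `demo`, the key prefix, and every timing parameter are hypothetical names and values chosen for the example, and the in-memory class only stands in for the centralized cache so the sketch can run locally.

```python
import random
import threading
import time
import uuid


class InMemoryCache:
    """Hypothetical stand-in for the centralized cache, for local runs only."""

    def __init__(self):
        self._entries = {}  # key -> (owner, expiry timestamp)
        self._mutex = threading.Lock()

    def set_if_absent(self, key, owner, ttl_s):
        """Atomically claim `key` unless an unexpired claim already exists."""
        now = time.monotonic()
        with self._mutex:
            entry = self._entries.get(key)
            if entry is not None and entry[1] > now:
                return False
            self._entries[key] = (owner, now + ttl_s)
            return True

    def delete_if_owner(self, key, owner):
        """Release `key` only if `owner` still holds it."""
        with self._mutex:
            entry = self._entries.get(key)
            if entry is not None and entry[0] == owner:
                del self._entries[key]


class LockNotAcquired(Exception):
    """Raised when every retry attempt found the lock still held."""


def run_in_flight(cache, token_id, mutate, ttl_s=2.0, max_attempts=10,
                  base_delay_s=0.005, max_delay_s=0.25):
    """Run `mutate` while holding the in-flight lock for `token_id`."""
    key = f"in-flight:{token_id}"
    owner = uuid.uuid4().hex  # identifies this call as the lock holder
    for attempt in range(max_attempts):
        if cache.set_if_absent(key, owner, ttl_s):
            try:
                return mutate()
            finally:
                cache.delete_if_owner(key, owner)
        # Lock is held elsewhere: back off exponentially, with jitter so
        # that waiting workers do not all retry at the same instant.
        ceiling = min(max_delay_s, base_delay_s * 2 ** attempt)
        time.sleep(ceiling / 2 + random.uniform(0, ceiling / 2))
    raise LockNotAcquired(key)


def demo(workers=4):
    """Deliver two distinct events from several threads; return the version."""
    cache = InMemoryCache()
    state = {"version": 0, "applied": set()}

    def handle(event_id):
        def mutate():
            if event_id in state["applied"]:  # idempotency guard
                return
            current = state["version"]
            time.sleep(0.001)  # widen the window a race would exploit
            state["version"] = current + 1
            state["applied"].add(event_id)
        run_in_flight(cache, "token-123", mutate)

    threads = [threading.Thread(target=handle, args=(f"evt-{i % 2}",))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["version"]
```

In this sketch, `demo()` delivers each of two events from more than one thread, and the version advances once per distinct event because the read-modify-write runs only under the lock and repeated events are skipped. Against a real cache, the claim step would typically map to an atomic set-if-absent with expiry (for example Redis `SET` with the `NX` and `PX` options, or the Memcached `add` command); which call the production system used is not stated here. The expiry keeps a crashed holder from blocking the token indefinitely, and the owner check on release is one common way to avoid deleting a lock that has already expired and been re-acquired by another worker.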