
The 3 AM Wake-Up Call

Phone buzzes at 3 AM.

“Hey, merchants are reporting they’ve been charged payout fees twice. I checked and found several merchants with double charges…”

That message from our support team was my introduction to the harsh reality of webhook idempotency in payment systems.

Today I’m sharing the story of this production bug, why it happened, and, more importantly, how to build systems that tolerate webhooks being delivered “at least once” while ensuring your logic runs “exactly once.”

(Spoiler: It’s not as simple as checking the database.)

The Production Incident

The Context

Our payment system integrates with a payment gateway (let’s call it “PaymentProvider”). Here’s how payouts worked:

Flow:
1. Merchant requests payout (withdraw money to bank)
2. PaymentProvider processes payout
3. PaymentProvider sends webhook: "Payout successful"
4. Our system charges 0.5% payout fee
5. Update merchant balance

The Naive Implementation

Here’s what our code looked like initially:

@Controller('webhooks')
export class WebhookController {
  @Post('payment-provider')
  async handlePayoutWebhook(@Body() event: PayoutEvent) {
    const payout = await this.payoutService.findOne(event.payoutId);

    // ❌ What's wrong with this logic?
    if (payout.status !== 'completed') {
      // Charge 0.5% fee
      const fee = payout.amount * 0.005;
      await this.feeService.createPayoutFee({
        payoutId: payout.id,
        amount: fee
      });

      // Update status
      await this.payoutService.update(payout.id, {
        status: 'completed'
      });
    }

    return { received: true };
  }
}

Looks reasonable, right? We check the status before charging the fee. What could go wrong?

The Incident Timeline

10:15:00 - Merchant A requests payout of $10,000
10:15:30 - PaymentProvider processes payout, sends webhook #1
10:15:30 - Webhook #1 arrives at our server, starts processing
10:15:30 - Code checks status: 'pending' ✅
10:15:30 - Charges fee: $50
10:15:31 - PaymentProvider retries webhook (didn't get response in time)
10:15:31 - Webhook #2 arrives at our server
10:15:31 - Code checks status: still 'pending' ✅ (first request hasn't finished!)
10:15:31 - Charges ANOTHER fee: $50 ❌

Result: Merchant charged $100 instead of $50

Root Cause Analysis

Problem #1: Race Condition

- Webhook #1 is processing (hasn't updated status yet)
- Webhook #2 arrives and checks status
- Both see status = 'pending'
- Both decide "haven't charged fee yet"
- Both charge the fee → DOUBLE CHARGE
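
To make the interleaving concrete, here is a minimal in-memory replay of the race. It is a sketch, not our production code: the generator stands in for an async handler that is suspended while it waits on the database, and the names `naiveHandler` and `replayRace` are made up for illustration.

```typescript
// Deterministic replay of the race: each handler is a generator so it can be
// paused between the status check and the write, like an awaited DB call.
type Payout = { id: string; amount: number; status: "pending" | "completed" };

function* naiveHandler(payout: Payout, fees: number[]) {
  const notYetCharged = payout.status !== "completed"; // check
  yield; // suspended here while the "database" responds
  if (notYetCharged) {
    fees.push(payout.amount * 0.005); // act on a stale check
    payout.status = "completed";
  }
}

function replayRace(): number[] {
  const payout: Payout = { id: "payout_123", amount: 10000, status: "pending" };
  const fees: number[] = [];
  const webhook1 = naiveHandler(payout, fees);
  const webhook2 = naiveHandler(payout, fees);
  webhook1.next(); // #1 checks: pending
  webhook2.next(); // #2 checks: still pending
  webhook1.next(); // #1 charges the fee
  webhook2.next(); // #2 charges the fee again
  return fees;
}

const chargedFees = replayRace();
console.log(chargedFees.length); // 2 fee entries for a single payout
```

The check and the write are separate steps, so any pause between them lets a second request act on the same stale status.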

Problem #2: Webhook Delivery Semantics

Payment providers guarantee "at-least-once delivery"
- They don't guarantee "exactly-once"
- Webhooks can be sent multiple times due to:
  → Network timeouts
  → Retry logic
  → Provider-side failures
  → Our server being slow to respond

This is not a bug in the payment provider - it’s by design.

Understanding Webhook Delivery Guarantees

Three Types of Delivery Semantics

┌─────────────────┬──────────────────┬────────────────────┐
│   Type          │   Behavior       │   Real World       │
├─────────────────┼──────────────────┼────────────────────┤
│ At-most-once    │ Send once, no    │ Unreliable, rarely │
│                 │ retry if fails   │ used for webhooks  │
├─────────────────┼──────────────────┼────────────────────┤
│ At-least-once ← │ Send until gets  │ Stripe, PayPal,    │
│ (Most common)   │ success response │ Shopify, Square    │
│                 │ May duplicate    │ ← STANDARD         │
├─────────────────┼──────────────────┼────────────────────┤
│ Exactly-once    │ Guaranteed once  │ Expensive/complex  │
│                 │ (theoretical)    │ (Apache Kafka)     │
└─────────────────┴──────────────────┴────────────────────┘

Why “At-Least-Once” is Standard

Imagine you’re a payment provider:

Scenario 1: At-Most-Once

Provider: "I'm sending webhook for $10,000 payout"
*Network timeout*
Provider: "Oh well, not retrying"
→ Merchant never knows payout succeeded ❌

Scenario 2: At-Least-Once

Provider: "I'm sending webhook for $10,000 payout"
*Network timeout*
Provider: "No response, retrying in 5 seconds..."
Provider: "Sending again..."
*Server responds OK*
→ Merchant gets notified (might receive duplicates) ✅

Conclusion: At-least-once delivery is the best trade-off for reliability.
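
Scenario 2 can be modeled in a few lines. This is a hypothetical in-memory sender, not any provider's real retry logic; `deliverAtLeastOnce` and the `receive` callback are names invented for the sketch.

```typescript
type Delivery = { eventId: string; attempt: number };

// Retries until the receiver's acknowledgement is seen or attempts run out.
function deliverAtLeastOnce(
  eventId: string,
  receive: (delivery: Delivery) => boolean, // true only if the ACK reached the sender
  maxAttempts: number
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (receive({ eventId, attempt })) return attempt; // ACK seen: stop retrying
  }
  return maxAttempts; // gave up without knowing whether the event was processed
}

const receivedDeliveries: Delivery[] = [];

// The receiver processes every delivery, but its first response is lost
// (a timeout), so the sender cannot know that attempt 1 already succeeded.
const attemptsUsed = deliverAtLeastOnce(
  "evt_123",
  (delivery) => {
    receivedDeliveries.push(delivery);
    return delivery.attempt > 1;
  },
  5
);

console.log(attemptsUsed, receivedDeliveries.length); // 2 2
```

From the sender's side, a lost response is indistinguishable from a lost request, which is why the receiver ends up with the same event twice.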

Who’s Responsible for What?

Payment Provider's job: "Webhook will arrive at least once"
Our job: "Logic runs EXACTLY once"

→ This is called IDEMPOTENCY

What is Idempotency?

Definition

Idempotency: Performing an operation multiple times produces the same result as performing it once.

Real-world examples:

✅ Idempotent:
   SET temperature = 25°C
   (set 10 times → still 25°C)

❌ Not idempotent:
   INCREMENT temperature +5°C
   (increment 10 times → 50°C increase!)

✅ Idempotent:
   UPDATE users SET name = 'John' WHERE id = 1

❌ Not idempotent:
   INSERT INTO payments (amount) VALUES (100)
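
The thermostat example translates directly to code. A small sketch (the starting temperature of 20 and the helper `applyTimes` are assumptions for the demo):

```typescript
type Thermostat = { temperature: number };

// Idempotent: the result does not depend on how many times it runs.
function setTemperature(_current: Thermostat, value: number): Thermostat {
  return { temperature: value };
}

// Not idempotent: every extra call changes the result again.
function incrementTemperature(current: Thermostat, delta: number): Thermostat {
  return { temperature: current.temperature + delta };
}

function applyTimes(
  initial: Thermostat,
  op: (t: Thermostat) => Thermostat,
  times: number
): Thermostat {
  let state = initial;
  for (let i = 0; i < times; i++) state = op(state);
  return state;
}

const initialState: Thermostat = { temperature: 20 };
const afterSets = applyTimes(initialState, (t) => setTemperature(t, 25), 10);
const afterIncrements = applyTimes(initialState, (t) => incrementTemperature(t, 5), 10);

console.log(afterSets.temperature, afterIncrements.temperature); // 25 70
```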

Idempotency in Webhooks

// ❌ Not idempotent
async function chargePayoutFee(payoutId: string) {
  await db.insert('fees', {
    payoutId,
    amount: 50
  });
  // Called twice → 2 records → wrong
}

// ✅ Idempotent
async function chargePayoutFee(payoutId: string, idempotencyKey: string) {
  await db.insert('fees', {
    payoutId,
    amount: 50,
    idempotencyKey // ← Unique constraint
  });
  // Called twice → second insert fails with a duplicate key error,
  // which the caller catches → still exactly 1 record
}

The Idempotency Key

An idempotency key is a unique identifier for each webhook event.

Common implementations:

Stripe:   event.id (e.g., "evt_1ABC...")
PayPal:   event_id
Shopify:  X-Shopify-Webhook-Id header
Square:   event_id

→ Use this key to detect duplicate webhooks

Implementation: 3 Patterns for Idempotency

Pattern #1: Database Unique Constraint

This is the simplest and most reliable approach.

Step 1: Database Schema

CREATE TABLE payout_fees (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  payout_id UUID NOT NULL,
  amount DECIMAL(10, 2) NOT NULL,
  idempotency_key VARCHAR(255) NOT NULL UNIQUE, -- ← The magic
  created_at TIMESTAMP DEFAULT NOW()
);

-- The UNIQUE constraint on idempotency_key above already creates a
-- unique index, so no separate index is needed for it.

-- Also ensure one payout = one fee
CREATE UNIQUE INDEX idx_payout_fees_payout_id
  ON payout_fees(payout_id);

Step 2: Service Implementation

@Injectable()
export class PayoutFeeService {
  constructor(
    private readonly feeRepository: FeeRepository,
    private readonly logger: Logger
  ) {}

  async chargePayoutFee(
    payoutId: string,
    amount: number,
    idempotencyKey: string
  ): Promise<PayoutFee> {
    try {
      // Attempt to insert with idempotency key
      const fee = await this.feeRepository.create({
        payoutId,
        amount,
        idempotencyKey
      });

      this.logger.log(`Fee charged successfully: ${fee.id}`);
      return fee;

    } catch (error) {
      // Check if it's a duplicate key error
      if (this.isDuplicateKeyError(error)) {
        this.logger.log(
          `Duplicate webhook detected and ignored: ${idempotencyKey}`
        );

        // Return existing fee (idempotent response). Look it up by payoutId:
        // the violation can also come from the unique index on payout_id
        // when a different event ID arrives for the same payout.
        const existingFee = await this.feeRepository.findOne({
          payoutId
        });

        return existingFee;
      }

      // Re-throw other errors
      throw error;
    }
  }

  private isDuplicateKeyError(error: any): boolean {
    // PostgreSQL unique violation code
    return error.code === '23505';
  }
}

Step 3: Webhook Handler

@Controller('webhooks')
export class WebhookController {
  constructor(
    private readonly webhookService: WebhookService,
    private readonly payoutService: PayoutService,
    private readonly feeService: PayoutFeeService,
    private readonly logger: Logger
  ) {}

  @Post('payment-provider')
  async handlePayoutWebhook(
    @Body() event: PayoutWebhookEvent,
    @Headers('x-webhook-signature') signature: string
  ) {
    // Step 1: Verify webhook signature (security)
    await this.webhookService.verifySignature(event, signature);

    // Step 2: Use event ID as idempotency key
    const idempotencyKey = event.id; // e.g., "evt_123abc"

    // Step 3: Get payout details
    const payout = await this.payoutService.findOne(event.payoutId);

    // Step 4: Calculate fee
    const feeAmount = payout.amount * 0.005; // 0.5%

    // Step 5: Charge fee (idempotent!)
    await this.feeService.chargePayoutFee(
      payout.id,
      feeAmount,
      idempotencyKey // ← Magic happens here
    );

    return { received: true };
  }
}
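
One detail in the handler worth a second look is `payout.amount * 0.005`: floating-point math on money can drift by a cent once amounts stop being round numbers. A common alternative is integer arithmetic on minor units. The sketch below assumes amounts are stored as integer cents and that fees round half up; neither is stated above, and `payoutFeeCents` is a hypothetical helper.

```typescript
// 0.5% expressed in basis points (1 basis point = 0.01%).
const PAYOUT_FEE_BASIS_POINTS = 50;

// Assumption: amounts are integer cents and fees round half up.
function payoutFeeCents(amountCents: number): number {
  if (!Number.isInteger(amountCents) || amountCents < 0) {
    throw new RangeError("amountCents must be a non-negative integer");
  }
  // (amount * 50) / 10000, with +5000 before flooring to round half up.
  return Math.floor((amountCents * PAYOUT_FEE_BASIS_POINTS + 5000) / 10000);
}

console.log(payoutFeeCents(1000000)); // 5000 cents = $50 on a $10,000 payout
console.log(payoutFeeCents(333)); // 2 (1.665 cents rounds up)
console.log(payoutFeeCents(99)); // 0 (0.495 cents rounds down)
```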

Why This Works:

First webhook arrives:
- Insert with idempotency_key = "evt_123"
- Success → Fee charged

Second webhook arrives (duplicate):
- Try to insert with idempotency_key = "evt_123"
- Fails: Unique constraint violation
- Catch error, return existing fee
- No double charge! ✅
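
The same flow can be exercised without a database. The sketch below uses a hypothetical `InMemoryFeeTable` that imitates a unique index and throws an error carrying PostgreSQL's `23505` code. In a real system the atomicity comes from the database, not from this class.

```typescript
type Fee = { id: number; payoutId: string; amountCents: number; idempotencyKey: string };

// In-memory stand-in for a table with a UNIQUE index on idempotencyKey.
class InMemoryFeeTable {
  private rows = new Map<string, Fee>();
  private nextId = 1;

  insert(payoutId: string, amountCents: number, idempotencyKey: string): Fee {
    if (this.rows.has(idempotencyKey)) {
      // Mimic PostgreSQL's unique_violation error code.
      throw Object.assign(new Error("duplicate key"), { code: "23505" });
    }
    const fee: Fee = { id: this.nextId++, payoutId, amountCents, idempotencyKey };
    this.rows.set(idempotencyKey, fee);
    return fee;
  }

  findByKey(idempotencyKey: string): Fee | undefined {
    return this.rows.get(idempotencyKey);
  }

  count(): number {
    return this.rows.size;
  }
}

function chargeFeeOnce(
  table: InMemoryFeeTable,
  payoutId: string,
  amountCents: number,
  idempotencyKey: string
): Fee {
  try {
    return table.insert(payoutId, amountCents, idempotencyKey);
  } catch (error) {
    const existing = table.findByKey(idempotencyKey);
    if ((error as { code?: string }).code === "23505" && existing) return existing;
    throw error; // anything else is a real failure
  }
}

const feeTable = new InMemoryFeeTable();
const firstFee = chargeFeeOnce(feeTable, "payout_123", 5000, "evt_123");
const duplicateFee = chargeFeeOnce(feeTable, "payout_123", 5000, "evt_123");

console.log(firstFee.id === duplicateFee.id, feeTable.count()); // true 1
```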

Pros:

✅ Simple to implement
✅ Database enforces uniqueness (single source of truth)
✅ Atomic operation (no race conditions)
✅ No external dependencies (Redis, etc.)
✅ Works with any SQL database

Cons:

❌ Relies on database constraint
❌ Less flexible for complex scenarios

Pattern #2: Redis Distributed Lock

⚠️ Warning: This pattern is more complex than Pattern #1. Only use it if you have specific requirements.

When you might need this:

  • You’re processing webhooks on multiple servers simultaneously
  • Webhook processing takes >5 seconds (long operations)
  • You want to wait for first request to finish instead of failing fast

Core concept:

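async chargePayoutFee(payoutId: string, amount: number, idempotencyKey: string) {
  const lockKey = `lock:payout-fee:${idempotencyKey}`;
  const lockToken = randomUUID(); // from 'crypto'; identifies this lock holder

  // Try to acquire lock (NX = only if absent, EX = expire after 30 seconds)
  const lockAcquired = await redis.set(lockKey, lockToken, 'EX', 30, 'NX');

  if (!lockAcquired) {
    // Another server is processing, wait for it to finish
    await this.waitForCompletion(lockKey);
    return this.feeRepository.findOne({ idempotencyKey });
  }

  try {
    // We got the lock - check if already processed
    const existing = await this.feeRepository.findOne({ idempotencyKey });
    if (existing) return existing;

    // Process fee
    return await this.feeRepository.create({ payoutId, amount, idempotencyKey });

  } finally {
    // Release the lock only if we still own it. A plain DEL could remove a
    // lock that expired and was re-acquired by another server.
    await redis.eval(
      "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end",
      1,
      lockKey,
      lockToken
    );
  }
}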

Pros:

✅ Handles concurrent requests across multiple servers
✅ Prevents race conditions while the lock is held (it can still expire mid-processing, so keep the unique constraint as a backstop)
✅ Second request waits instead of failing

Cons:

❌ Significantly more complex than Pattern #1
❌ Requires Redis (external dependency)
❌ What if Redis goes down? (need fallback)
❌ Lock timeout needs careful tuning
❌ Waiting logic can be tricky
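
The lock timeout problem is easiest to see in a simulation. Below is a hypothetical in-memory stand-in for the Redis calls (`InMemoryLockStore` is not a real client), with an injectable clock so that expiry can be replayed deterministically. It shows why a release should check ownership and why an expired lock lets two workers run at once.

```typescript
// Models `SET key token NX EX ttl` plus a compare-and-delete release.
class InMemoryLockStore {
  private locks = new Map<string, { token: string; expiresAt: number }>();

  constructor(private now: () => number) {}

  acquire(key: string, token: string, ttlMs: number): boolean {
    const current = this.locks.get(key);
    if (current && current.expiresAt > this.now()) return false; // still held
    this.locks.set(key, { token, expiresAt: this.now() + ttlMs });
    return true;
  }

  // Only the holder that set the token may delete the lock.
  release(key: string, token: string): boolean {
    const current = this.locks.get(key);
    if (!current || current.token !== token) return false;
    this.locks.delete(key);
    return true;
  }
}

let nowMs = 0;
const lockStore = new InMemoryLockStore(() => nowMs);
const lockKey = "lock:payout-fee:evt_123";

const workerAAcquired = lockStore.acquire(lockKey, "token-A", 30000);
nowMs = 31000; // worker A is still busy, but its 30s lock has expired
const workerBAcquired = lockStore.acquire(lockKey, "token-B", 30000);
const workerAReleased = lockStore.release(lockKey, "token-A"); // must not free B's lock
const workerCAcquired = lockStore.acquire(lockKey, "token-C", 30000);

console.log(workerAAcquired, workerBAcquired, workerAReleased, workerCAcquired);
// true true false false
```

Workers A and B both believed they held the lock, which is the case a database unique constraint still catches.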

Reality check: Pattern #1 (unique constraint) is enough for roughly 90% of use cases. Only implement Redis locks if you have a proven need.


Pattern #3: State Machine with Pessimistic Locking

For workflows with complex state transitions:

// Database schema with state tracking
enum PayoutState {
  PENDING = 'pending',
  FEE_CHARGED = 'fee_charged',
  COMPLETED = 'completed'
}

// Entity
class Payout {
  id: string;
  amount: number;
  status: PayoutState;
  version: number; // Only needed for optimistic locking; unused by the pessimistic lock below
}

@Injectable()
export class PayoutFeeService {
  async chargePayoutFee(payoutId: string): Promise<PayoutFee> {
    // Use database transaction
    return this.entityManager.transaction(async (em) => {
      // Lock the payout row (SELECT FOR UPDATE)
      const payout = await em.findOne(Payout, payoutId, {
        lock: { mode: 'pessimistic_write' }
      });

      // Check state - only charge if pending
      if (payout.status !== PayoutState.PENDING) {
        this.logger.log('Fee already charged, skipping');
        return em.findOne(PayoutFee, { payoutId });
      }

      // Create fee
      const fee = em.create(PayoutFee, {
        payoutId: payout.id,
        amount: payout.amount * 0.005
      });
      await em.save(fee);

      // Update state atomically
      payout.status = PayoutState.FEE_CHARGED;
      await em.save(payout);

      return fee;
    });
  }
}

Pros:

✅ State machine is easy to reason about
✅ Transaction guarantees consistency
✅ No separate idempotency key needed
✅ Works well for complex workflows

Cons:

❌ Tight coupling with payout entity
❌ Pessimistic lock can slow things down
❌ Harder to scale horizontally
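
The state check at the heart of this pattern can be isolated from the database. The sketch below assumes the states move strictly pending → fee_charged → completed, which the enum suggests but the code above does not spell out; `ALLOWED_TRANSITIONS` and `chargeFeeIfPending` are illustrative names.

```typescript
type PayoutState = "pending" | "fee_charged" | "completed";

// Assumed transition order; adjust to the real workflow.
const ALLOWED_TRANSITIONS: Record<PayoutState, PayoutState[]> = {
  pending: ["fee_charged"],
  fee_charged: ["completed"],
  completed: [],
};

function canTransition(from: PayoutState, to: PayoutState): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Mirrors the check in chargePayoutFee: only a pending payout gets a fee.
function chargeFeeIfPending(state: PayoutState): { state: PayoutState; charged: boolean } {
  if (!canTransition(state, "fee_charged")) return { state, charged: false };
  return { state: "fee_charged", charged: true };
}

const firstAttempt = chargeFeeIfPending("pending");
const duplicateAttempt = chargeFeeIfPending(firstAttempt.state);

console.log(firstAttempt.charged, duplicateAttempt.charged); // true false
```

This check is only safe when it runs while the payout row is locked inside the transaction; without the lock it has the same race as the naive implementation.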

Comparison Table

┌────────────┬─────────────┬──────────────┬─────────────┐
│ Pattern    │ Complexity  │  Performance │   Use Case  │
├────────────┼─────────────┼──────────────┼─────────────┤
│ Unique     │   Simple    │    Fast      │ ✅ Default  │
│ Constraint │   ⭐         │    ⭐⭐⭐     │   choice    │
├────────────┼─────────────┼──────────────┼─────────────┤
│ Redis Lock │   Complex   │    Medium    │ High conc.  │
│            │   ⭐⭐⭐     │    ⭐⭐       │ multi-server│
├────────────┼─────────────┼──────────────┼─────────────┤
│ State      │   Medium    │    Fast      │ Complex     │
│ Machine    │   ⭐⭐       │    ⭐⭐⭐     │ workflows   │
└────────────┴─────────────┴──────────────┴─────────────┘

Recommendation: Start with Pattern #1 (Unique Constraint). It’s simple, reliable, and covers 90% of use cases.

Critical: Webhook Security

Before processing ANY webhook, you MUST verify its signature:

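// Needs Req and RawBodyRequest from '@nestjs/common' and Request from 'express'
@Post('webhooks/payment-provider')
async handleWebhook(
  @Req() req: RawBodyRequest<Request>,
  @Body() event: WebhookEvent,
  @Headers('x-webhook-signature') signature: string
) {
  // Step 1: ALWAYS verify signature first, against the raw request bytes.
  // Re-serializing the parsed body with JSON.stringify can change whitespace
  // or key order and break the HMAC. req.rawBody is only populated when the
  // app is created with NestFactory.create(AppModule, { rawBody: true }).
  const isValid = this.verifyWebhookSignature(
    req.rawBody,
    signature,
    process.env.WEBHOOK_SECRET
  );

  if (!isValid) {
    throw new UnauthorizedException('Invalid webhook signature');
  }

  // Step 2: Process webhook (idempotent)
  await this.processWebhook(event);
}

private verifyWebhookSignature(
  payload: Buffer | undefined,
  signature: string | undefined,
  secret: string | undefined
): boolean {
  // Reject missing inputs instead of throwing inside the crypto calls
  if (!payload || !signature || !secret) return false;

  const hmac = crypto.createHmac('sha256', secret);
  const digest = hmac.update(payload).digest('hex');

  const received = Buffer.from(signature);
  const expected = Buffer.from(digest);

  // timingSafeEqual throws if the lengths differ, so compare lengths first
  return (
    received.length === expected.length &&
    crypto.timingSafeEqual(received, expected)
  );
}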

Why this is critical:

❌ Without verification:
- Attackers can send fake webhooks
- Charge fees arbitrarily
- Manipulate balances
- Steal money

✅ With verification:
- Only legitimate provider can send webhooks
- Webhooks are cryptographically verified
- System is secure

Never skip signature verification in production!
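
Two details of signature checks are easy to get wrong: the HMAC has to be computed over the exact bytes the provider sent, and `crypto.timingSafeEqual` throws when the two buffers differ in length. The standalone sketch below assumes a hex-encoded HMAC-SHA256 over the raw body, as in the example above. Real providers each define their own scheme, and the secret and payload here are made up.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

function signBody(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

function verifyBody(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signBody(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so reject early instead.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

const webhookSecret = "whsec_example"; // hypothetical secret
const rawBody = '{"id": "evt_123", "payoutId": "payout_123"}'; // note the spaces
const validSignature = signBody(rawBody, webhookSecret);

// Parsing and re-serializing drops the spaces, so the bytes no longer match.
const reserializedBody = JSON.stringify(JSON.parse(rawBody));

const rawBodyAccepted = verifyBody(rawBody, validSignature, webhookSecret);
const reserializedAccepted = verifyBody(reserializedBody, validSignature, webhookSecret);
const malformedAccepted = verifyBody(rawBody, "abc", webhookSecret);

console.log(rawBodyAccepted, reserializedAccepted, malformedAccepted); // true false false
```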

Handling Out-of-Order Webhooks

Webhooks can arrive out of order:

Timeline (reality):
10:00 - Payout created (webhook #1)
10:01 - Payout processing (webhook #2)
10:02 - Payout completed (webhook #3)

Timeline (webhooks arrive):
10:00:10 - Webhook #1 arrives ✅
10:01:05 - Webhook #3 arrives ❌ (too early!)
10:01:15 - Webhook #2 arrives ❌ (delayed)

Solution: Use timestamp or version number:

async processPayoutWebhook(event: PayoutWebhookEvent) {
  const payout = await this.payoutRepo.findOne(event.payoutId);

  // Ignore outdated webhooks
  if (event.timestamp <= payout.lastWebhookTimestamp) {
    this.logger.log('Outdated webhook, ignoring');
    return;
  }

  // Process and update timestamp
  await this.updatePayoutStatus(payout, event);
  payout.lastWebhookTimestamp = event.timestamp;
  await this.payoutRepo.save(payout);
}
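
Replaying the timeline above against a pure function shows the effect of the timestamp check. This is an in-memory sketch with made-up timestamps. It assumes the timestamp is assigned by the provider when the event occurs (not when it arrives), and that skipped intermediate events carry no side effects of their own.

```typescript
type PayoutStatus = "created" | "processing" | "completed";
type PayoutWebhook = { id: string; status: PayoutStatus; timestamp: number };
type PayoutRecord = { status: PayoutStatus; lastWebhookTimestamp: number };

// Applies a webhook only if it is newer than the last one applied.
function applyWebhook(record: PayoutRecord, webhook: PayoutWebhook): PayoutRecord {
  if (webhook.timestamp <= record.lastWebhookTimestamp) return record; // stale
  return { status: webhook.status, lastWebhookTimestamp: webhook.timestamp };
}

// Arrival order from the timeline: #1, then #3, then the delayed #2.
const arrivals: PayoutWebhook[] = [
  { id: "evt_1", status: "created", timestamp: 1000 },
  { id: "evt_3", status: "completed", timestamp: 1120 },
  { id: "evt_2", status: "processing", timestamp: 1060 },
];

const initialRecord: PayoutRecord = { status: "created", lastWebhookTimestamp: 0 };
const finalRecord = arrivals.reduce(applyWebhook, initialRecord);

console.log(finalRecord.status, finalRecord.lastWebhookTimestamp); // completed 1120
```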

Idempotency Window: How Long to Store Keys?

Question: How long should we keep idempotency keys?

Recommended: 24-72 hours, and never shorter than your provider's retry window

Why 24-72 hours?
✅ Long enough for all webhook retry scenarios
✅ Providers typically stop retrying within 1 to 3 days (check your provider's docs)
✅ Prevents database bloat from old keys

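Clean up old keys with a cron job. The fee rows themselves are financial records and should be kept, so the job below assumes the keys are also recorded in a separate table (called processed_webhook_events here as a placeholder) and deletes only from that table:

@Cron('0 0 * * *') // Daily at midnight
async cleanupOldIdempotencyKeys() {
  const threeDaysAgo = subDays(new Date(), 3);

  // Never delete from payout_fees: that would erase the fee records
  await this.db.query(`
    DELETE FROM processed_webhook_events
    WHERE created_at < $1
  `, [threeDaysAgo]);
}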

Most payment providers (Stripe, PayPal) stop retrying webhooks after 3 days, so this window is safe.

Testing Idempotency

Local Testing: Simulate Webhooks with curl

Before writing automated tests, verify your implementation works locally:

# Step 1: Start your server locally
npm run dev

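# Step 2: Send the same webhook 5 times concurrently
# (the signature must be valid for this exact body, otherwise the handler
# rejects the request before the idempotency logic runs)
for i in {1..5}; do
  curl -X POST http://localhost:3000/webhooks/payment-provider \
    -H "Content-Type: application/json" \
    -H "x-webhook-signature: test_signature" \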
    -d '{
      "id": "evt_test_duplicate",
      "payoutId": "payout_123",
      "amount": 10000,
      "status": "completed"
    }' &
done
wait

# Step 3: Check database - should have only 1 fee
psql -d mydb -c "SELECT COUNT(*) FROM payout_fees WHERE idempotency_key = 'evt_test_duplicate';"
# Expected: 1 (not 5!)

Testing with a real payment provider:

# Use ngrok to expose local server
ngrok http 3000

# Update webhook URL in payment provider dashboard to:
#   https://your-ngrok-url.ngrok.io/webhooks/payment-provider

# Trigger test webhook from provider dashboard
# Then check your logs and database

Unit Test: Duplicate Webhooks

describe('PayoutFeeService', () => {
  it('should charge fee only once for duplicate webhooks', async () => {
    const payoutId = 'payout_123';
    const idempotencyKey = 'evt_abc';
    const amount = 50;

    // First webhook
    const fee1 = await service.chargePayoutFee(
      payoutId,
      amount,
      idempotencyKey
    );

    // Duplicate webhook (same idempotency key)
    const fee2 = await service.chargePayoutFee(
      payoutId,
      amount,
      idempotencyKey
    );

    // Should return same fee object
    expect(fee1.id).toBe(fee2.id);

    // Should only have 1 record in database
    const count = await feeRepo.count({ payoutId });
    expect(count).toBe(1);
  });
});

Integration Test: Concurrent Webhooks

describe('PayoutWebhookController (Integration)', () => {
  it('should handle 10 concurrent duplicate webhooks', async () => {
    const event = {
      id: 'evt_duplicate_test',
      payoutId: 'payout_123',
      amount: 10000,
      status: 'completed'
    };

    // Send 10 webhooks concurrently
    const promises = Array(10).fill(null).map(() =>
      request(app.getHttpServer())
        .post('/webhooks/payment-provider')
        .set('x-webhook-signature', generateSignature(event))
        .send(event)
        .expect(200)
    );

    await Promise.all(promises);

    // Should only create 1 fee
    const fees = await feeRepo.find({
      payoutId: 'payout_123'
    });
    expect(fees).toHaveLength(1);
    // DECIMAL columns are often returned as strings, so compare numerically
    expect(Number(fees[0].amount)).toBe(50); // 0.5% of 10000
  });
});

Manual Testing

# Test duplicate webhooks manually
for i in {1..5}; do
  curl -X POST http://localhost:3000/webhooks/payment-provider \
    -H "Content-Type: application/json" \
    -H "x-webhook-signature: <signature>" \
    -d '{
      "id": "evt_test_123",
      "payoutId": "payout_xyz",
      "amount": 10000,
      "status": "completed"
    }' &
done
wait

# Check database
psql -d mydb -c "
  SELECT COUNT(*) FROM payout_fees
  WHERE payout_id = 'payout_xyz';
"
# Expected: 1

Best Practices Checklist

Webhook Idempotency Checklist:

✅ Verify webhook signature BEFORE processing
✅ Use idempotency key (event.id) with unique constraint
✅ Handle duplicate key errors gracefully
✅ Return 200 OK even for duplicate webhooks
✅ Log duplicate webhooks for monitoring
✅ Handle out-of-order webhooks (timestamp check)
✅ Test with concurrent duplicate webhooks
✅ Clean up old idempotency keys (24-72h window)
✅ Use transactions for multi-step operations
✅ Monitor duplicate webhook rate

Security Checklist:

✅ Verify HMAC signature on every webhook
✅ Use timing-safe comparison (crypto.timingSafeEqual)
✅ Store webhook secret in environment variables
✅ Rotate webhook secrets periodically
✅ Rate limit webhook endpoints
✅ Log failed signature verifications (potential attacks)

The Results

After implementing idempotency with Pattern #1:

✅ Zero duplicate charges since deployment (6 months and counting)

✅ Handled 50,000+ webhooks without issues

✅ Processed 1,247 duplicate webhooks correctly (2.5% duplicate rate)

✅ No production incidents related to webhook processing

Monitoring metrics:

- Webhook success rate: 99.97%
- Duplicate webhook rate: 2.5% (normal)
- Average processing time: 120ms
- Failed signature verifications: 3 (potential attacks, blocked)
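
The duplicate rate is just duplicates divided by total webhooks. A quick check of the numbers above, treating "50,000+" as exactly 50,000 for the arithmetic; the 10% alert threshold is a placeholder, not a value we measured.

```typescript
function duplicateRatePercent(duplicates: number, total: number): number {
  if (total <= 0) throw new RangeError("total must be positive");
  return (duplicates / total) * 100;
}

// Hypothetical alert threshold; tune it to your provider's normal behaviour.
const DUPLICATE_RATE_ALERT_PERCENT = 10;

const duplicateRate = duplicateRatePercent(1247, 50000);
const shouldAlert = duplicateRate > DUPLICATE_RATE_ALERT_PERCENT;

console.log(duplicateRate.toFixed(1) + "%", shouldAlert); // 2.5% false
```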

Key Takeaways

1️⃣ Payment providers send webhooks “at-least-once” → Prepare for duplicates

2️⃣ Idempotency key = event.id → Use it with unique constraint

3️⃣ Database unique constraint is simple & effective → Pattern #1 covers 90% of cases

4️⃣ ALWAYS verify webhook signatures → Security is not optional

5️⃣ Test with concurrent requests → Race conditions are real

6️⃣ Clean up old keys → 24-72 hour window is enough

7️⃣ Log duplicate webhooks → Monitor for abnormal patterns

8️⃣ Return idempotent responses → Same input = same output

Remember: “Webhooks being sent multiple times is normal. Your system processing them exactly once is your responsibility.”


Have you dealt with webhook idempotency issues? What patterns have worked for you? Share your experience in the comments!