Stripe Webhook Retries: A Production Guide

When your server is down and Stripe fires a payment_intent.succeeded, what happens to the event? Does Stripe retry forever? For how long? And how do you reconcile the events you missed?

This is the single most common question I hear from developers integrating Stripe — and it matters because a webhook you fail to process is revenue your product can't act on. A customer just paid. Your fulfillment job never fires. Their dashboard never updates.

TL;DR

Stripe retries for 3 days with exponential backoff, then stops forever
Return 200 fast, do work async, log every event before processing, be idempotent
If you don't have time to build all of this, a webhook relay gives you retry + log + replay with a single URL change

How Stripe retries

Stripe expects your webhook endpoint to return a 2xx status code within 30 seconds. Anything else — including timeouts, 5xx, or network failures — counts as a failed delivery.

On failure, Stripe retries the event with exponential backoff for up to 3 days:

First retry: within minutes
Next retries: increasingly spaced out
Final window: around 72 hours from the original event

After 3 days, Stripe marks the event as failed and stops retrying. It does not wake you up. It does not alert you about the individual event. The event is simply gone from your delivery queue, though you can still see it in the Stripe Dashboard under Developers → Events.
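To get a feel for how few attempts fit into a 3-day window once the delays grow, here is a rough sketch of a capped exponential backoff. The base delay (5 minutes) and the multiplier (3) are numbers I made up for illustration; they are not Stripe's actual intervals, which this post does not pin down. backoffSchedule is a hypothetical helper.

```typescript
// Illustrative only: baseMinutes and factor are invented values, not Stripe's real schedule.
export function backoffSchedule(
  baseMinutes: number,
  factor: number,
  windowHours: number,
): number[] {
  if (baseMinutes <= 0 || factor < 1) {
    throw new RangeError("baseMinutes must be > 0 and factor must be >= 1");
  }
  const windowMinutes = windowHours * 60;
  const offsets: number[] = [];
  let delay = baseMinutes;
  let elapsed = 0;
  // Keep scheduling retries while the next attempt still falls inside the window.
  while (elapsed + delay <= windowMinutes) {
    elapsed += delay;
    offsets.push(elapsed); // minutes after the original delivery attempt
    delay *= factor;
  }
  return offsets;
}

const schedule = backoffSchedule(5, 3, 72);
console.log(schedule); // [5, 20, 65, 200, 605, 1820]
```

With these assumed numbers, only six retries fit in 72 hours, and the last gap is more than 20 hours long. Whatever the real intervals are, the shape is the same: most of the attempts happen early, so an outage that lasts a day burns through most of your chances.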

The three failure modes that bite

1. Your endpoint is slow. Stripe counts any response >30s as a failure. If your webhook handler synchronously writes to a slow database, syncs to Salesforce, and sends a customer email, you're flirting with the timeout on every event.

2. Your deploy window. Your server restarts during a deploy, misses ~10 events, and comes back up. Stripe retries those events, but only for up to 3 days. If the endpoint is still failing when that window closes and nobody notices, those events silently die.

3. A bug returns 500s. You push a deploy that throws on a specific event type. Stripe retries, but it keeps hitting the same error. After 3 days Stripe gives up. Now you have a quiet data gap.

In all three cases, Stripe is doing the right thing. The failure is on your side of the pipe.

Four things production-grade integrations do

1. Return 200 fast, do the work async

Don't do heavy work in the webhook handler. Accept the event, write it to a queue or durable log, return 200, and process asynchronously. This keeps you far from the 30-second timeout and decouples delivery from business logic.

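export async function POST(req: Request): Promise<Response> {
  let event;
  try {
    event = await verifyStripeSignature(req); // app-specific helper; throws on a bad signature
  } catch {
    return new Response("invalid signature", { status: 400 });
  }
  await queue.publish(event); // durable write; if it throws, the resulting 5xx makes Stripe retry
  return new Response(null, { status: 200 });
}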
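The handler above leans on a verifyStripeSignature helper that isn't shown. As a rough idea of what sits underneath it, here is a minimal sketch of checking a Stripe-Signature header by hand with Node's crypto module. The function name isValidStripeSignature and its parameters are my own, and it takes the raw body and header directly rather than a Request. In a real integration you would more likely call stripe.webhooks.constructEvent from the official Node library, which does this check for you.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hand-rolled check of a header shaped like "t=<unix seconds>,v1=<hex hmac>[,v1=...]".
// Returns true only if some v1 signature matches and the timestamp is within tolerance.
export function isValidStripeSignature(
  rawBody: string,
  header: string,
  secret: string,
  toleranceSeconds = 300, // assumption: five minutes
  nowSeconds = Math.floor(Date.now() / 1000),
): boolean {
  let timestamp = "";
  const candidates: string[] = [];
  for (const part of header.split(",")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue;
    const key = part.slice(0, idx).trim();
    const value = part.slice(idx + 1).trim();
    if (key === "t") timestamp = value;
    if (key === "v1") candidates.push(value);
  }

  const ts = Number(timestamp);
  if (!timestamp || !Number.isFinite(ts)) return false;
  // Reject old (or far-future) timestamps to limit replay of captured requests.
  if (Math.abs(nowSeconds - ts) > toleranceSeconds) return false;

  // The signed payload is "<timestamp>.<raw body>", HMAC-SHA256 keyed by the endpoint secret.
  const expected = createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest();

  return candidates.some((hex) => {
    const given = Buffer.from(hex, "hex");
    // timingSafeEqual throws on length mismatch, so check the length first.
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
}
```

The important detail is that the check runs over the raw request body. If your framework parses and re-serializes the JSON before you verify, the bytes change and the signature no longer matches.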

2. Log every event before processing

Persist the raw payload + headers before you touch business logic. If your logic throws, you can replay from the log. Stripe only retries failed deliveries for 3 days, and its Events API only lists events going back 30 days, so your log is the only thing that lets you recover after that.
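A minimal sketch of the log-then-process idea, with an in-memory array standing in for whatever durable store you actually use (a database table, an append-only file). EventLog, LoggedEvent, receive, and replay are all hypothetical names for illustration.

```typescript
export interface LoggedEvent {
  eventId: string;
  receivedAt: number; // epoch milliseconds
  headers: Record<string, string>;
  rawBody: string;
}

// In-memory stand-in for a durable store. Swap for a real table or file in production.
export class EventLog {
  private readonly entries: LoggedEvent[] = [];

  append(entry: LoggedEvent): void {
    this.entries.push(entry);
  }

  // Events received in [fromMs, toMs], in arrival order.
  between(fromMs: number, toMs: number): LoggedEvent[] {
    return this.entries.filter((e) => e.receivedAt >= fromMs && e.receivedAt <= toMs);
  }
}

// Log first, then process. If processing throws, the entry is already safe in the log.
export function receive(
  log: EventLog,
  entry: LoggedEvent,
  process: (e: LoggedEvent) => void,
): boolean {
  log.append(entry);
  try {
    process(entry);
    return true;
  } catch {
    return false;
  }
}

// Re-run the handler over a time window. Returns how many events succeeded.
export function replay(
  log: EventLog,
  fromMs: number,
  toMs: number,
  process: (e: LoggedEvent) => void,
): number {
  let succeeded = 0;
  for (const entry of log.between(fromMs, toMs)) {
    try {
      process(entry);
      succeeded += 1;
    } catch {
      // Still failing; leave it in the log for the next replay.
    }
  }
  return succeeded;
}
```

The ordering inside receive is the whole point: append happens before process, so a bug in your business logic can never cost you the payload. Replay only works safely if the handler is idempotent, because a window will usually contain events you already processed.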

3. Idempotency on your side

Stripe sends the same event.id on retries. Your handler must be idempotent: if you've already processed evt_1H..., don't process it again. A simple pattern:

INSERT INTO processed_events (stripe_event_id, processed_at)
VALUES ($1, NOW())
ON CONFLICT (stripe_event_id) DO NOTHING;

If the insert returns 0 rows affected, you've seen this event before. Skip.
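Here is roughly how that check wraps a handler in application code. This is a sketch under assumptions: ProcessedEvents is an in-memory stand-in for the processed_events table, and claim plays the role of the INSERT ... ON CONFLICT DO NOTHING above. handleOnce is a hypothetical name.

```typescript
// In-memory stand-in for the processed_events table.
export class ProcessedEvents {
  private readonly seen = new Set<string>();

  // Mirrors the INSERT: true if this call claimed the id, false if it was already there.
  claim(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    return true;
  }

  // Undo a claim so a later retry can process the event again.
  release(eventId: string): void {
    this.seen.delete(eventId);
  }
}

export function handleOnce(
  store: ProcessedEvents,
  eventId: string,
  handler: () => void,
): "processed" | "duplicate" {
  if (!store.claim(eventId)) return "duplicate";
  try {
    handler();
  } catch (err) {
    // The work failed, so give the id back. Otherwise the retry would be skipped as a duplicate.
    store.release(eventId);
    throw err;
  }
  return "processed";
}
```

The release-on-failure step is a design choice worth noticing. If you record the event as processed and then your handler throws, the next retry gets skipped and the event is lost. With a real database, one way to get the same effect is to do the insert and the business logic in a single transaction so both roll back together.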

4. Alerting on failures

If 5 events in a row fail, you want to know within minutes — not when a customer emails support. Track consecutive failures and page yourself.
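Tracking consecutive failures takes very little code. Below is a minimal sketch, assuming a threshold of 5 to match the example above. FailureStreak and the onAlert callback are hypothetical; in practice the callback would call your pager or chat integration.

```typescript
export class FailureStreak {
  private streak = 0;
  private readonly threshold: number;
  private readonly onAlert: (streak: number) => void;

  constructor(threshold: number, onAlert: (streak: number) => void) {
    this.threshold = threshold;
    this.onAlert = onAlert;
  }

  // Call once per processed event with whether it succeeded.
  record(success: boolean): void {
    if (success) {
      this.streak = 0; // any success resets the streak
      return;
    }
    this.streak += 1;
    // Fire exactly once when the threshold is crossed, not on every later failure.
    if (this.streak === this.threshold) this.onAlert(this.streak);
  }

  get current(): number {
    return this.streak;
  }
}
```

Alerting only at the moment the streak reaches the threshold keeps a long outage from paging you hundreds of times. A counter like this lives in one process, so if you run several workers you would need to keep the streak somewhere shared.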

Where AnyHook fits

AnyHook sits in front of your server and solves this at the transport layer. You point Stripe at in.anyhook.net/you/stripe instead of your origin, and we:

Return 200 to Stripe in <50ms — so the 30-second window is never a problem
Log every event with full headers and body, encrypted at rest
Retry with exponential backoff for longer than Stripe does, with per-plan retry counts
One-click replay: if your server was down for an hour, replay that hour's events. Replays don't count against your quota
Alerting on failure streaks — we email you at 1, 5, and 20 consecutive failures, and auto-pause after 20 to stop retry storms

No SDK. No code changes. One URL swap.

Takeaway

If you're losing Stripe events today, the fix is usually not "bigger retries" — it's persistence before processing so you always have something to replay from.