Retries & Auto-Disable
kirim.dev guarantees at-least-once delivery via an 8-attempt retry
pipeline with exponential backoff and jitter, spread across a
~24-hour total window. After the 8th attempt fails, the delivery is
marked failed and parks in the dead-letter queue. After 24
consecutive failed deliveries on the same subscription, the
subscription is auto-disabled.
Retry schedule
| Attempt | Delay before attempt |
|---|---|
| 1 | immediate |
| 2 | 10 s |
| 3 | 30 s |
| 4 | 2 m |
| 5 | 10 m |
| 6 | 1 h |
| 7 | 6 h |
| 8 | 24 h |
Each delay carries ±20% jitter to avoid thundering herds on shared infrastructure. Per-attempt HTTP timeout: 10 seconds.
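To make the ±20% figure concrete, the sketch below computes the window each delay can fall in. The base delays come from the table above; the names `BASE_DELAYS_S` and `jitter_window` are illustrative, and nothing here says how kirim.dev actually samples within the window.

```python
# Base delays (seconds) before attempts 2-8, taken from the retry schedule table.
BASE_DELAYS_S = [10, 30, 120, 600, 3600, 21600, 86400]
JITTER = 0.20  # +/-20%, as stated in the docs


def jitter_window(base_s: float) -> tuple[float, float]:
    """Return the (earliest, latest) delay in seconds for a given base delay."""
    return base_s * (1 - JITTER), base_s * (1 + JITTER)


for attempt, base in enumerate(BASE_DELAYS_S, start=2):
    low, high = jitter_window(base)
    print(f"attempt {attempt}: {low:.0f}s to {high:.0f}s")
```

A receiver that alerts on "late" retries should budget for the upper bound of each window rather than the nominal delay.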
After attempt 8 fails, the delivery row status flips to failed and
sits in the dead-letter queue until either:
- You replay it via the API or dashboard (see Replaying failed deliveries below), or
- The daily purge removes it 30 days after creation.
HTTP status semantics
| Response | Treated as | Retried? |
|---|---|---|
| 2xx (200, 201, 204, …) | success | — |
| 3xx | failure (kirim.dev does not follow redirects) | yes |
| 4xx (most) | permanent failure | no — marked failed immediately |
| 4xx 408 (request timeout) | transient failure | yes |
| 4xx 429 (rate limited) | transient failure | yes (Retry-After honoured) |
| 5xx | transient failure | yes |
| Network error / DNS / TLS / connection refused | transient failure | yes |
| kirim.dev timeout (>10 s) | transient failure | yes |
The “4xx (most) → failed immediately” rule reflects reality: a 401, 403, or 404 from your server almost always means a config bug (wrong URL, bad gateway auth) that retrying can’t fix. Fix the config, then replay.
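If you want to predict how kirim.dev will treat a response from your endpoint (for example in a test suite), the table can be expressed as a small function. This is a sketch of the documented rules, not kirim.dev's own code; the function name and the three labels are made up for illustration, and the handling of codes the table doesn't mention (such as 1xx) is an assumption.

```python
def classify_response(status: int) -> str:
    """Map an HTTP status code to 'success', 'retry', or 'failed' per the table above."""
    if 200 <= status < 300:
        return "success"
    if status in (408, 429):
        return "retry"   # the two 4xx codes treated as transient
    if 400 <= status < 500:
        return "failed"  # every other 4xx is permanent: no retry
    # 3xx (redirects are not followed) and 5xx are retried.
    # Assumption: anything outside the table is also treated as retryable.
    return "retry"
```

Network errors and timeouts never produce a status code, so they fall outside this function; the table treats all of them as retryable.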
Recommended endpoint behaviour
- Ack within 1-2 seconds. Persist the raw payload (and the X-Kirim-Event-Id) inside a quick DB write, then return 200. Hand off heavy processing to your own queue.

  ```js
  app.post('/webhooks/kirim', async (req, res) => {
    await persistRawEvent(req.headers['x-kirim-event-id'], req.body)
    res.status(200).send('ok')
    // Async processing kicks off via your own worker.
  })
  ```

- Return 503 if you’re overloaded rather than 200-then-drop. 503 triggers a retryable failure; you’ll get the same payload again after the backoff.
- Never return 2xx for a payload you couldn’t store. Acking prematurely breaks the at-least-once contract on your side.
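The three rules above reduce to one decision: which status code to send back. Here is a framework-agnostic sketch of that decision; `handle_webhook` and the injected `persist` callable are hypothetical names, and you would wire the returned status into whatever web framework you use.

```python
from typing import Callable


def handle_webhook(
    event_id: str,
    raw_body: bytes,
    persist: Callable[[str, bytes], None],
) -> int:
    """Return the HTTP status to send back to kirim.dev."""
    try:
        # Quick durable write only; heavy processing belongs in your own queue.
        persist(event_id, raw_body)
    except Exception:
        # Could not store the payload: answer 503 so the delivery is retried.
        return 503
    # Ack only once the payload is safely stored.
    return 200
```

Returning 503 rather than a 4xx matters here: per the status table, most 4xx responses are marked failed immediately and would not be retried.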
Idempotency on your side
Because deliveries are at-least-once, the same event can arrive more than once, typically when a retry fires after your server processed the original but didn’t respond in time.
Dedupe on the X-Kirim-Event-Id header. The same id always
represents the same logical event, regardless of attempt number or
whether the delivery is a manual replay.
```js
const fresh = await redis.set(`kirim:evt:${eventId}`, '1', { EX: 604800, NX: true })
if (!fresh) return res.status(200).send('duplicate-ack')
```

See Overview → Dedupe for the full pattern.
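If you don't run Redis, the same set-if-absent check can be done with any store that enforces a unique key. The sketch below swaps in SQLite from the Python standard library; the table name and function names are hypothetical. Unlike the Redis version, which expires keys after 604800 seconds (7 days), it keeps IDs forever unless you add your own cleanup.

```python
import sqlite3


def open_dedupe_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store holding every event ID seen so far."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen_events (event_id TEXT PRIMARY KEY)")
    return conn


def is_first_delivery(conn: sqlite3.Connection, event_id: str) -> bool:
    """Record event_id and return True only the first time it is seen."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen_events (event_id) VALUES (?)",
            (event_id,),
        )
    # rowcount is 1 when the row was inserted, 0 when the ID already existed.
    return cur.rowcount == 1
```

When `is_first_delivery` returns False, acknowledge with a 2xx and skip processing, as the Redis snippet does.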
Auto-disable
After 24 consecutive failed deliveries on the same subscription
(regardless of which retry path each took), kirim.dev flips the
subscription to status: 'disabled':
```json
{
  "id": "wbs_…",
  "object": "webhook_subscription",
  "status": "disabled",
  "disabled_reason": "auto_disabled_max_consecutive_failures",
  "consecutive_failures": 24,
  "last_failure_at": "2026-05-23T10:00:00Z",
  "...": "..."
}
```

disabled_reason is one of:
| Value | Meaning |
|---|---|
| auto_disabled_max_consecutive_failures | Hit the 24-in-a-row cap. |
| auto_disabled_tls_expired | Your TLS cert expired and kirim.dev gave up trying. |
| manually_disabled | A teammate (or you) flipped it via the API/dashboard. |
While disabled, no new events fan out to this subscription. New events that would have been sent are simply not enqueued — they are not buffered for later delivery, because re-enabling would otherwise trigger a thundering herd.
Org admins receive an email notification on auto-disable.
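The counting rule can be modelled in a few lines, which is handy for reasoning about how close a subscription is to the cap. This is a toy model, not kirim.dev's implementation: the class and method names are invented, and resetting the counter on a successful delivery is an assumption inferred from the word "consecutive".

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_FAILURES = 24  # threshold stated in the docs


@dataclass
class SubscriptionState:
    status: str = "active"
    consecutive_failures: int = 0

    def record_delivery(self, succeeded: bool) -> None:
        """Update the counter once a delivery reaches its final outcome."""
        if self.status != "active":
            return  # nothing fans out to a non-active subscription
        if succeeded:
            self.consecutive_failures = 0  # assumption: a success breaks the streak
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            self.status = "disabled"
```

Note that the unit being counted is a failed delivery, meaning one that has exhausted its retries or hit a non-retryable 4xx, not an individual failed attempt.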
Re-enabling
Once your endpoint is healthy:
```bash
curl -X PATCH \
  https://api.kirim.chat/v1/webhook_subscriptions/wbs_… \
  -H "Authorization: Bearer $KIRIM_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "status": "active" }'
```

```js
await kirim.webhookSubscriptions.update('wbs_…', {
  status: 'active',
})
```

```python
import os

import httpx

httpx.patch(
    "https://api.kirim.chat/v1/webhook_subscriptions/wbs_…",
    headers={"Authorization": f"Bearer {os.environ['KIRIM_KEY']}"},
    json={"status": "active"},
).raise_for_status()
```

Re-enabling resets consecutive_failures to 0. Future deliveries
resume immediately. Replaying old failed deliveries is opt-in —
they don’t fire automatically, so a stale endpoint doesn’t drown
itself the second it comes back online.
Pausing manually
To temporarily stop deliveries without losing the subscription (e.g. during a planned maintenance window):
```js
await kirim.webhookSubscriptions.update('wbs_…', { status: 'paused' })
```

Paused subscriptions don’t accumulate failed deliveries: kirim.dev
drops events that would have fanned out to them. Resume with
{ status: 'active' }.
Replaying failed deliveries
The dead-letter queue keeps failed deliveries for 30 days. Inspect, filter, and replay them once your endpoint is healthy again.
List failed deliveries
```bash
curl -G https://api.kirim.chat/v1/webhook_deliveries \
  --data-urlencode "status=failed" \
  --data-urlencode "subscription_id=wbs_…" \
  --data-urlencode "limit=50" \
  -H "Authorization: Bearer $KIRIM_KEY"
```

```js
// Paginate failed deliveries for a subscription.
for await (const delivery of kirim.webhookDeliveries.list({
  status: 'failed',
  subscription_id: 'wbs_…',
})) {
  console.log(delivery.id, delivery.attempt_count, delivery.last_response_status)
}
```

```python
import os

import httpx

resp = httpx.get(
    "https://api.kirim.chat/v1/webhook_deliveries",
    params={"status": "failed", "subscription_id": "wbs_…", "limit": 50},
    headers={"Authorization": f"Bearer {os.environ['KIRIM_KEY']}"},
)
resp.raise_for_status()
for delivery in resp.json()["data"]:
    print(delivery["id"], delivery["attempt_count"])
```

Replay a single delivery
```bash
curl -X POST \
  https://api.kirim.chat/v1/webhook_deliveries/wbd_…/replay \
  -H "Authorization: Bearer $KIRIM_KEY"
```

```js
const replay = await kirim.webhookDeliveries.replay('wbd_…')
console.log(replay.id, replay.replayed_from)
```

```python
import os

import httpx

delivery_id = "wbd_…"  # ID of the failed delivery to replay

httpx.post(
    f"https://api.kirim.chat/v1/webhook_deliveries/{delivery_id}/replay",
    headers={"Authorization": f"Bearer {os.environ['KIRIM_KEY']}"},
).raise_for_status()
```

Bulk replay
```bash
curl -X POST \
  https://api.kirim.chat/v1/webhook_deliveries/bulk_replay \
  -H "Authorization: Bearer $KIRIM_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "subscription_id": "wbs_…",
    "status": "failed",
    "created_after": "2026-05-20T00:00:00Z",
    "created_before": "2026-05-23T00:00:00Z"
  }'
```

```js
const result = await kirim.webhookDeliveries.bulkReplay({
  subscription_id: 'wbs_…',
  status: 'failed',
  created_after: '2026-05-20T00:00:00Z',
  created_before: '2026-05-23T00:00:00Z',
})
console.log(`enqueued ${result.enqueued} (capped: ${result.capped})`)
```

```python
import os

import httpx

resp = httpx.post(
    "https://api.kirim.chat/v1/webhook_deliveries/bulk_replay",
    headers={"Authorization": f"Bearer {os.environ['KIRIM_KEY']}"},
    json={
        "subscription_id": "wbs_…",
        "status": "failed",
        "created_after": "2026-05-20T00:00:00Z",
        "created_before": "2026-05-23T00:00:00Z",
    },
)
resp.raise_for_status()
print(resp.json()["data"])
```

Cap: 1000 deliveries per bulk_replay call. If your filter
matches more, the response capped field is true — paginate via
created_after/created_before and call again.
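One way to stay under the cap is to split the outage window into smaller time slices up front and issue one bulk_replay call per slice. The sketch below does that with an injected `bulk_replay` callable, a hypothetical wrapper around the request shown above that is assumed to return a dict with the `enqueued` and `capped` fields. It also assumes naive or UTC datetimes, and it doesn't know whether the API treats the window bounds as inclusive or exclusive.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterator

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # matches the timestamps used in the examples


def time_slices(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[tuple[datetime, datetime]]:
    """Yield consecutive (slice_start, slice_end) pairs covering start..end."""
    cursor = start
    while cursor < end:
        nxt = min(cursor + step, end)
        yield cursor, nxt
        cursor = nxt


def replay_in_slices(
    bulk_replay: Callable[[str, str], dict],
    start: datetime,
    end: datetime,
    step: timedelta,
) -> tuple[int, list[tuple[datetime, datetime]]]:
    """Call bulk_replay once per slice; return (total enqueued, slices that hit the cap)."""
    total = 0
    capped = []
    for lo, hi in time_slices(start, end, step):
        result = bulk_replay(lo.strftime(TIMESTAMP_FORMAT), hi.strftime(TIMESTAMP_FORMAT))
        total += result["enqueued"]
        if result["capped"]:
            capped.append((lo, hi))  # rerun these slices with a smaller step
    return total, capped
```

Rerunning a capped slice with a smaller step may replay some deliveries twice. That is tolerable as long as your endpoint dedupes on X-Kirim-Event-Id, since a replay carries the same event ID.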
Suggested recovery playbook
- Spot the auto-disable via the email notification, the dashboard banner, or by polling GET /v1/webhook_subscriptions/{id} for status: 'disabled'.
- Fix the underlying issue: TLS cert renewal, infra rollback, bug fix, whatever the failed deliveries’ response bodies indicate.
- Test against a single delivery first. Pick one failed wbd_… and replay it. Check that your endpoint returned 2xx.
- Re-enable the subscription with { status: 'active' }.
- Bulk-replay the backlog filtered to the outage window.
- Set up monitoring: alert on subscription.status != 'active' and on rising consecutive_failures so you catch the next outage before it auto-disables.
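The monitoring step can be sketched as a comparison between two successive polls of the same subscription. The `status` and `consecutive_failures` fields come from the subscription object shown earlier; the function name and the idea of keeping the previous poll around are illustrative choices, not part of any kirim.dev SDK.

```python
def alert_reasons(previous: dict, current: dict) -> list[str]:
    """Compare two polls of one subscription and list the reasons to alert."""
    reasons = []
    if current["status"] != "active":
        reasons.append(f"status is {current['status']}")
    if current["consecutive_failures"] > previous["consecutive_failures"]:
        reasons.append(
            f"consecutive_failures rose from {previous['consecutive_failures']} "
            f"to {current['consecutive_failures']}"
        )
    return reasons
```

An empty list means nothing to report. Because a paused subscription is also not 'active', you may want to suppress that first alert during planned maintenance.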
Observability
The dashboard’s Developers → Webhook Deliveries page shows every delivery (last 30 days), filterable by subscription, status, event type, and date range. Each row exposes:
- Attempt count + last/next attempt timestamps
- Response status + first 1 KB of the response body
- Full payload (pretty-printed JSON, copy-to-clipboard)
- Replay button
The same data is queryable via GET /v1/webhook_deliveries — see the
API reference.