Provider outage response — Kobbopay

01

Objective

Maintain operational control when the provider API or webhook delivery degrades, using freeze patterns and customer communication without inventing uptime guarantees.

02

Prerequisites

Health checks on webhook recency and API error rates.
Reconciliation freeze procedure documented.
Status communication templates that avoid unverified promises.

03

Operational signals

Webhook recency lag beyond internal threshold.
API read failures for payment status.
Growing population of payments stuck in Pending or Paid.
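
The three signals above can be evaluated together in one health check. The following is a minimal sketch, not Kobbopay's own tooling: the threshold values, the `HealthThresholds` name, and the `outage_signals` function are hypothetical, and the inputs are assumed to come from your own monitoring.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class HealthThresholds:
    # Hypothetical values; replace them with your own internal thresholds.
    max_webhook_lag: timedelta = timedelta(minutes=10)
    max_api_error_rate: float = 0.05
    max_stuck_growth: int = 20


def outage_signals(now, last_webhook_at, api_calls, api_errors,
                   stuck_before, stuck_now, thresholds=HealthThresholds()):
    """Return the names of the operational signals that are currently firing."""
    signals = []
    # Signal 1: no webhook has arrived within the allowed lag.
    if now - last_webhook_at > thresholds.max_webhook_lag:
        signals.append("webhook_recency_lag")
    # Signal 2: too many payment-status reads are failing.
    if api_calls > 0 and api_errors / api_calls > thresholds.max_api_error_rate:
        signals.append("api_read_failures")
    # Signal 3: the stuck Pending/Paid population is growing.
    if stuck_now - stuck_before > thresholds.max_stuck_growth:
        signals.append("stuck_population_growth")
    return signals
```

Returning signal names rather than a single boolean keeps the evidence visible to whoever makes the decisions in the next section.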

04

Decision points

Enable reconciliation freeze for auto-posting.
Pause high-risk fulfillment.
Switch to manual status checks if available.
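
A reconciliation freeze can be modeled as a gate in front of auto-posting: matches keep accumulating, but nothing reaches the ledger while the freeze is on. This sketch assumes an in-memory ledger and a hypothetical `ReconciliationFreeze` class; a real system would persist both.

```python
from dataclasses import dataclass, field


@dataclass
class ReconciliationFreeze:
    # Hypothetical freeze state; in practice this would live in shared storage.
    active: bool = False
    reason: str = ""
    held: list = field(default_factory=list)

    def enable(self, reason):
        """Turn the freeze on and record why, for the later sign-off review."""
        self.active = True
        self.reason = reason


def auto_post(match, freeze, ledger):
    """Post a matched payment to the ledger unless the freeze is active."""
    if freeze.active:
        # Hold the match for review instead of dropping it.
        freeze.held.append(match)
        return "held"
    ledger.append(match)
    return "posted"
```

Holding matches rather than discarding them means the outage window can be reviewed and posted once the freeze is cleared.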

05

Escalation paths

Operations → provider support with correlation IDs.
Customer support → operations for ticket surge.

06

Failure modes

Assuming an outage means payments failed, without evidence.
Disabling verification to accept unauthenticated callbacks.
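
To guard against the second failure mode, callback handling can fail closed, so that no outage switch is able to bypass verification. The sketch below assumes an HMAC-SHA256 hex signature over the raw request body. That scheme is an illustration only, since this page does not describe the provider's actual signature format, and `handle_callback` and `outage_mode` are hypothetical names.

```python
import hashlib
import hmac


def verify_callback(raw_body: bytes, signature_hex: str, secret: bytes) -> bool:
    """Check an assumed HMAC-SHA256 hex signature in constant time."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def handle_callback(raw_body, signature_hex, secret, outage_mode=False):
    """Return an HTTP status code for an incoming callback."""
    # outage_mode is accepted but deliberately never consulted here:
    # verification is not something an outage is allowed to relax.
    if not signature_hex or not verify_callback(raw_body, signature_hex, secret):
        return 401
    return 200
```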

07

Recovery patterns

Backfill the provider plane from the event log after recovery.
Re-run matchers for the outage window.
Clear the freeze with finance sign-off.
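
The recovery steps above can be sketched as a replay of the event log over the outage window, followed by a sign-off check before the freeze is lifted. The event shape (`payment_id`, `status`, `received_at`) and all function names are hypothetical; the returned ids are the payments a matcher re-run would need to revisit.

```python
from datetime import datetime, timezone


def events_in_window(event_log, start, end):
    """Select logged events received during the outage window (inclusive)."""
    return [e for e in event_log if start <= e["received_at"] <= end]


def backfill(provider_state, events):
    """Apply events in arrival order and return the payment ids that changed."""
    changed = []
    for event in sorted(events, key=lambda e: e["received_at"]):
        if provider_state.get(event["payment_id"]) != event["status"]:
            provider_state[event["payment_id"]] = event["status"]
            changed.append(event["payment_id"])
    return changed


def freeze_after_review(freeze_active, finance_signed_off):
    """Return the new freeze state: it stays on until finance signs off."""
    return freeze_active and not finance_signed_off
```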

Retries are normal. Webhook delivery is at-least-once. Design consumers to tolerate duplicates and out-of-order arrivals where possible.
Asynchronous by design. Payers, chains, and your servers operate on different clocks. UI and finance should not assume synchronous finality.
Eventual consistency. API reads, webhooks, and portal views may briefly diverge during transitions. Reconciliation jobs exist to converge truth.
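
A consumer that tolerates at-least-once delivery and out-of-order arrival might look like the sketch below. It assumes each event carries a unique `event_id` and a comparable `occurred_at` value; both field names are hypothetical, and the in-memory state stands in for durable storage.

```python
class PaymentConsumer:
    """Applies payment events idempotently, ignoring duplicates and stale updates."""

    def __init__(self):
        self.seen_event_ids = set()
        self.latest = {}  # payment_id -> (occurred_at, status)

    def handle(self, event):
        # Duplicate delivery: the same event id has already been processed.
        if event["event_id"] in self.seen_event_ids:
            return "duplicate"
        self.seen_event_ids.add(event["event_id"])

        # Out-of-order delivery: an older event must not overwrite newer state.
        current = self.latest.get(event["payment_id"])
        if current is not None and event["occurred_at"] <= current[0]:
            return "stale"

        self.latest[event["payment_id"]] = (event["occurred_at"], event["status"])
        return "applied"
```

If a Paid event arrives before the earlier Pending event for the same payment, the Pending event is recorded as seen but does not roll the status back.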

Walkthroughs: /operations