Scaling Connext to a billion API requests a day

Amy Chase

A year ago Connext handled around 25 million requests a day. Last week it crossed a billion. The interesting part isn't the number — it's that our on-call rotation didn't get worse along the way.

Push work to the edge

The single biggest lever was moving routing, auth and rate limiting to the edge. Requests that can be answered or rejected close to the client never reach the origin.

  1. Terminate TLS and authenticate at the edge.
  2. Apply rate limits before any business logic runs.
  3. Only forward what genuinely needs the origin.
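To make the ordering concrete, here is a minimal sketch of steps 1 to 3 as a single handler. It is not Connext's actual edge code: TokenBucket, makeEdgeHandler, authenticate, forward and the limit values are all hypothetical names chosen for illustration, TLS termination is assumed to happen in front of the handler, and the token bucket is just one common way to rate limit.

```javascript
// Hypothetical token bucket: holds up to `capacity` tokens and refills
// continuously at `refillPerSec` tokens per second.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now()) {
    this.capacity = capacity
    this.refillPerSec = refillPerSec
    this.tokens = capacity
    this.last = now
  }

  // Spends one token and returns true if the request is within the limit.
  take(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec)
    this.last = now
    if (this.tokens < 1) return false
    this.tokens -= 1
    return true
  }
}

// Builds an edge handler. `authenticate` and `forward` are stand-ins
// supplied by the caller, not Connext's real interfaces.
function makeEdgeHandler({ authenticate, forward, capacity, refillPerSec }) {
  const buckets = new Map() // one bucket per tenant

  return async function handle(req, now = Date.now()) {
    // Step 1: authenticate at the edge and reject without touching the origin.
    const tenantId = authenticate(req)
    if (tenantId === null) return { status: 401 }

    // Step 2: apply the rate limit before any business logic runs.
    let bucket = buckets.get(tenantId)
    if (!bucket) {
      bucket = new TokenBucket(capacity, refillPerSec, now)
      buckets.set(tenantId, bucket)
    }
    if (!bucket.take(now)) return { status: 429 }

    // Step 3: only requests that passed both checks reach the origin.
    return forward({ ...req, tenantId })
  }
}
```

The point of the shape is that both rejections return before forward is ever called, so a flood of unauthenticated or over-limit traffic costs the origin nothing.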

Make the slow path observable

Every request carries a trace. When latency spikes, we don't guess — we follow the signal:

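// tracer and forward are Connext internals; the route() wrapper is illustrative.
async function route(ctx) {
  const span = tracer.start('connext.route')
  span.set('tenant', ctx.tenantId) // tag the span so slow requests can be grouped by tenant
  try {
    return await forward(ctx)
  } catch (err) {
    span.set('error', String(err)) // record the failure before rethrowing
    throw err
  } finally {
    span.end() // always close the span, even when forward() throws
  }
}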

If you can't see the slow path, you'll spend your nights guessing at it.
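For readers without a tracing library at hand, here is a rough sketch of what an object with the same start/set/end shape could look like, plus a helper that surfaces the slow path. This is not our tracer: makeTracer, slowSpans, the finished array and the injectable clock are hypothetical, and a real system would export spans to a tracing backend instead of keeping them in memory.

```javascript
// Hypothetical in-memory tracer with the same start/set/end shape as the
// snippet above. `clock` is injectable so durations can be tested.
function makeTracer(clock = () => performance.now()) {
  const finished = []

  return {
    finished,
    start(name) {
      const startedAt = clock()
      const attrs = {}
      let ended = false
      return {
        // Attach a key/value attribute to this span.
        set(key, value) {
          attrs[key] = value
        },
        // Record the span with its duration; a second end() is ignored.
        end() {
          if (ended) return
          ended = true
          finished.push({ name, attrs, durationMs: clock() - startedAt })
        },
      }
    },
  }
}

// Returns finished spans at or above a latency threshold, slowest first.
function slowSpans(tracer, thresholdMs) {
  return tracer.finished
    .filter((s) => s.durationMs >= thresholdMs)
    .sort((a, b) => b.durationMs - a.durationMs)
}
```

Because every span carries the tenant attribute, a call like slowSpans(tracer, 250) immediately shows which tenants sit on the slow path. The 250 ms threshold is only an example value.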

The result is a system that scales with traffic instead of with headcount — which is the only kind of scaling that actually lasts.