
A year ago Connext handled around 25 million requests a day. Last week it crossed a billion. The interesting part isn't the number — it's that our on-call rotation didn't get worse along the way.
Push work to the edge
The single biggest lever was moving routing, auth and rate limiting to the edge. Requests that can be answered or rejected close to the client never touch the core.
- Terminate TLS and authenticate at the edge.
- Apply rate limits before any business logic runs.
- Only forward what genuinely needs the origin.
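The three steps above can be sketched as a small edge handler. This is a minimal sketch under stated assumptions, not Connext's actual code: `TokenBucket`, `handleAtEdge`, the `validTokens` map, the request shape, and the bucket size are all hypothetical, and TLS termination is assumed to have already happened in the edge proxy.

```javascript
// Hypothetical edge handler: authenticate, rate limit, then forward.
// All names and the request shape are assumptions for illustration.

// Token bucket: holds up to `capacity` tokens, refilled at `refillPerSec`.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now()) {
    this.capacity = capacity
    this.refillPerSec = refillPerSec
    this.tokens = capacity
    this.last = now
  }

  // Returns true and consumes one token if the request is allowed.
  tryTake(now = Date.now()) {
    const elapsedSec = Math.max(0, now - this.last) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec)
    this.last = now
    if (this.tokens < 1) return false
    this.tokens -= 1
    return true
  }
}

const buckets = new Map() // one bucket per tenant

function handleAtEdge(req, { validTokens, forward, now = Date.now() }) {
  // 1. Authenticate: reject unknown credentials without touching the origin.
  const tenantId = validTokens.get(req.token)
  if (tenantId === undefined) return { status: 401 }

  // 2. Rate limit before any business logic runs (placeholder: burst of 2, 1 req/s).
  if (!buckets.has(tenantId)) buckets.set(tenantId, new TokenBucket(2, 1, now))
  if (!buckets.get(tenantId).tryTake(now)) return { status: 429 }

  // 3. Only now forward to the origin.
  return forward({ ...req, tenantId })
}
```

The ordering is the point: a rejected request costs one map lookup and a little arithmetic at the edge, and `forward` is never called for it.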
Make the slow path observable
Every request carries a trace. When latency spikes, we don't guess — we follow the signal:
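// Illustrative wrapper (the name `route` is assumed): trace every routed request.
async function route(ctx) {
  const span = tracer.start('connext.route')
  span.set('tenant', ctx.tenantId)
  try {
    return await forward(ctx)
  } finally {
    // Always close the span, even when forward() throws.
    span.end()
  }
}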
If you can't see the slow path, you'll spend your nights guessing at it.
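To make the tracing snippet above concrete, here is a minimal sketch of a tracer with the same `start` / `set` / `end` shape. It is an in-memory stand-in written for illustration, not Connext's tracer or any real tracing library's API; `createTracer`, `traced`, `slowest`, and the injected `clock` are hypothetical names.

```javascript
// Hypothetical in-memory tracer with the start/set/end shape used above.
function createTracer(clock = () => Date.now()) {
  const finished = []
  return {
    start(name) {
      const span = { name, attrs: {}, startMs: clock(), durationMs: null }
      return {
        set(key, value) { span.attrs[key] = value },
        end() {
          if (span.durationMs !== null) return // ignore a second end()
          span.durationMs = clock() - span.startMs
          finished.push(span)
        },
      }
    },
    // Return the n slowest finished spans, slowest first.
    slowest(n) {
      return [...finished].sort((a, b) => b.durationMs - a.durationMs).slice(0, n)
    },
  }
}

// Wrap a handler so every call is traced, even when it throws.
async function traced(tracer, name, ctx, handler) {
  const span = tracer.start(name)
  span.set('tenant', ctx.tenantId)
  try {
    return await handler(ctx)
  } finally {
    span.end()
  }
}
```

With spans tagged per tenant, a call such as `tracer.slowest(5)` shows which tenants are on the slow path. A production setup would presumably export spans to a tracing backend instead of keeping them in memory.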
The result is a system that scales with traffic instead of with headcount — which is the only kind of scaling that actually lasts.
