Postgres Connection Exhaustion with Vercel Fluid

April 14, 2025

UPDATE: This post was updated to clarify the nature of the problem after a great discussion with David (@activenode), who independently ran his own tests and came to the same conclusions. The core issue is that while in-app connection pooling provides backpressure within a single process, it doesn't solve the problem of Vercel creating multiple processes under load, each with its own pool. This makes a centralized connection pooler like PgBouncer essential.

If you’re using Vercel’s new runtime (Fluid), be aware that it scales very aggressively under load. This includes traffic spikes like bot scraping. Fluid’s autoscaling does not expose tools to gate dynamic concurrency. It does not wait for resources like a connection pooler might. Instead, it scales up execution units freely and relies on your downstream infrastructure to absorb the impact.

Vercel says Fluid uses Lambda infrastructure under the hood but avoids cold-starts by reusing execution environments for concurrent requests. That makes it behave more like a traditional long-running process than a true Lambda. Based on that, I initially assumed Fluid would conservatively ramp up concurrency. Since React Server Components (RSC) don’t follow a typical Request-Response lifecycle like Express or Hono, I followed the common pattern of initializing app-wide resources (like database pools) at the module level and letting them persist across invocations.

This assumption broke down during scraping traffic with bursts of 1000 simultaneous requests. My initial pattern looked like this:

// Example: Global pool — not concurrency-safe under heavy load
const drizzle = drizzle({
  client: new Pool({ connectionString: env.DATABASE_URL }),
  schema,
});

This is dangerous because there’s no guarantee a single module-level pool will serve all requests, and it's completely opaque how many total processes will be created—and therefore how many connections will be used.

My first step was to implement better in-process backpressure. I moved from a global, module-level pool to a withDb(...) pattern that acquires and releases a connection from a process-specific pool for each task.

import { drizzle } from "drizzle-orm/node-postgres";
import { Pool, type PoolClient } from "pg";

const pool = new Pool({
  connectionString: env.DATABASE_URL,
  max: 5, // This limit is per-process, not global!
  idleTimeoutMillis: 3000,
  connectionTimeoutMillis: 2000,
});

export const withDb = async <T>(task: (db: Drizzle) => Promise<T>): Promise<T> => {
  const client = await pool.connect();
  const db = drizzle({ client, schema });
  try {
    return await task(db);
  } finally {
    client.release();
  }
};

This pattern is an improvement. It ensures connections are properly released and prevents a single process from overwhelming the database if it gets flooded with concurrent tasks.

However, this does not solve the core problem.

The Realization: Multiple Processes, Multiple Pools

The critical flaw in relying solely on an in-app pooler like the one above is that Vercel Fluid creates multiple, independent execution environments (processes) under load.

As my own testing and independent reports from others in the community have confirmed, a traffic spike doesn't just funnel requests into one process—it causes Vercel to spin up 1-n new processes. Each of these processes initializes its own instance of the Pool.

If my pool is configured with max: 5, and Vercel creates 10 processes to handle a scrape, my application will attempt to open 50 concurrent connections (10 processes x 5 connections/process).

While the withDb pattern provides backpressure within each process, the aggregate connection count still floods the database and hits the provider's hard limits. This makes it clear that the application layer, on its own, cannot safely manage database concurrency in this environment.

Backpressure is the ability of a system to slow down or reject new work when downstream resources (like a database) are at capacity. Without it, load simply builds up and overflows, often resulting in dropped connections or timeouts. In serverless environments, backpressure is not built-in — it must be explicitly implemented at the application or infrastructure level.

The True Solution: A Centralized Connection Pooler

To solve this problem of aggregate connection pressure, I needed a centralized gatekeeper that could sit between Vercel's army of ephemeral functions and my database.

My solution: I deployed my own PgBouncer proxy on Fly.io. It runs in the same AWS region as Supabase (us-east-1) and Vercel (Ashburn/IAD). It accepts tens of thousands of connections from Vercel and fans them into a much smaller number of pooled connections to Supabase. This allows me to independently scale app layer concurrency and DB concurrency.

I’m using Supabase Postgres. It’s been great overall — reliable, straightforward, nice DX. But its connection pooler (Supavisor) is hard-capped based on plan size. On the Pro plan, the XL instance allows ~1,000 concurrent client connections. Upgrading that limit means moving up instance sizes, which quickly becomes expensive. The limit feels more like pricing enforcement than a reflection of backend capacity.

There’s no way in Supabase to implement back pressure at the connection pooler level. There’s no “wait for a slot” option when the Supavisor pooler is full. If your traffic spikes, the pooler hard-refuses new connections.

My PgBouncer proxy on Fly.io fixes this. The edoburu/pgbouncer Docker image works well. You can configure it using environment variables and deploy it to Fly.io as a simple TCP service. I used Fly’s internal_port and allowed_public_ports features to expose port 5432 directly. I inject the Supabase connection string via DATABASE_URL.

This setup gives you proper decoupling. Vercel can scale as fast as it wants, and the pooler absorbs the spike. Supabase sees only as many connections as PgBouncer decides to allow.

Serverless has made a tradeoff: instant scale, opaque backpressure. That’s fine for stateless compute or HTTP caching layers. But it breaks when applied to I/O-bound services like databases, especially when those services don’t support transparent queuing or throttling. The app layer has to take responsibility — but the tooling is not built for that yet.

I considered Neon early on — it offers up to 10,000 pooled connections per project and is designed to handle these serverless-style workloads. In retrospect, I probably should have gone with it. It feels more aligned with Vercel’s runtime model. But I bet on Supabase’s scale and long-term viability. The Postgres has been solid. The pricing model and connection strategy are the pain points.

If you’re using Fluid or serverless runtimes in general, and your app is talking to Postgres, you need a strategy to buffer dynamic concurrency. Either use a provider that exposes a high-fanout connection layer (like Neon), or insert your own PgBouncer proxy.

For reference here’s the fly.toml:

app = 'pgbouncer'
primary_region = 'iad'

[experimental]
  auto_rollback = true

[build]

[env]
  AUTH_TYPE = "plain"
  DEFAULT_POOL_SIZE = '100'
  MAX_CLIENT_CONN = '10000'
  POOL_MODE = 'transaction'
  SERVER_RESET_QUERY = 'DISCARD ALL'

[http_service]
  internal_port = 5432
  force_https = true
  auto_stop_machines = 'stop'
  auto_start_machines = true
  min_machines_running = 1
  processes = ['app']

[[services]]
  protocol = 'tcp'
  internal_port = 5432

  [[services.ports]]
    port = 5432

[[vm]]
  memory = '1gb'
  cpu_kind = 'shared'
  cpus = 1

... and the Dockerfile:

FROM edoburu/pgbouncer:latest
EXPOSE 5432