Postgres Connection Exhaustion with Vercel Fluid

April 14, 2025

If you’re using Vercel’s new Fluid runtime, be aware that it scales very aggressively under load, including traffic spikes like bot scraping. Fluid’s autoscaling exposes no tools to gate dynamic concurrency, and it does not wait for resources the way a connection pooler might. Instead, it scales up execution units freely and relies on your downstream infrastructure to absorb the impact.

Vercel says Fluid uses Lambda infrastructure under the hood but avoids cold starts by reusing execution environments for concurrent requests. That makes it behave more like a traditional long-running process than a true Lambda. Based on that, I initially assumed Fluid would ramp up concurrency conservatively. Since React Server Components (RSC) don’t follow a typical request/response lifecycle like Express or Hono, I followed the common pattern of initializing app-wide resources (like database pools) at the module level and letting them persist across invocations.

This assumption broke down during scraping traffic with bursts of 1,000 simultaneous requests. Even though Fluid reuses workers, it still scales out to handle incoming load; there’s no guarantee that a single module-level pool will serve all requests. Furthermore, it’s completely opaque how many processes will be created, and therefore how many connections will be consumed. Ideally, Vercel Fluid would give developers more visibility into autoscaling and some configuration levers to pull to deal with backpressure.

Backpressure is the ability of a system to slow down or reject new work when downstream resources (like a database) are at capacity. Without it, load simply builds up and overflows, often resulting in dropped connections or timeouts. In serverless environments, backpressure is not built-in — it must be explicitly implemented at the application or infrastructure level.
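To make that concrete, here’s a minimal sketch of application-level backpressure, not taken from my codebase: a tiny bounded semaphore that caps concurrent work and rejects outright once its queue fills, rather than letting load pile up.

// Hypothetical sketch: a bounded semaphore that sheds excess load
// instead of queueing it forever.
class Semaphore {
  private active = 0;
  private waiters: Array<() => void> = [];

  constructor(
    private readonly maxConcurrent: number,
    private readonly maxQueued: number,
  ) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      if (this.waiters.length >= this.maxQueued) {
        // Backpressure: reject new work instead of letting it pile up.
        throw new Error("overloaded, try again later");
      }
      // Wait for a finishing task to hand over its slot.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiters.shift();
      if (next) next(); // transfer the slot to the next waiter
      else this.active--;
    }
  }
}

// e.g. cap DB-bound work at 5 concurrent tasks, 50 queued
const dbGate = new Semaphore(5, 50);

The important property is the rejection path: a bounded queue turns overload into a fast, explicit error you can surface or retry, instead of a pile of stalled requests.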

My initial pattern looked like this:

// Example: Global pool — not concurrency-safe under heavy load
import { drizzle } from "drizzle-orm/node-postgres";
import { Pool } from "pg";
import * as schema from "./schema"; // app-specific
import { env } from "./env"; // app-specific

const db = drizzle({
  client: new Pool({ connectionString: env.DATABASE_URL }),
  schema,
});

I’ve since moved to a safer execution model using a withDb(...) pattern that acquires and releases a connection for each task from a shared pool. This gives me explicit control over connection concurrency and lets me fail gracefully if the pool is exhausted:

import { drizzle, type NodePgDatabase } from "drizzle-orm/node-postgres";
import { Pool, type PoolClient } from "pg";
import * as schema from "./schema"; // app-specific
import { env } from "./env"; // app-specific

const pool = new Pool({
  connectionString: env.DATABASE_URL,
  max: 5, // hard cap on connections per process
  idleTimeoutMillis: 3000, // release idle connections quickly
  connectionTimeoutMillis: 2000, // fail fast when the pool is exhausted
});

type Db = NodePgDatabase<typeof schema>;

export const withDb = async <T>(task: (db: Db) => Promise<T>): Promise<T> => {
  // Acquire a dedicated client for this task, release it when done.
  const client = await pool.connect();
  const db = drizzle({ client, schema });
  try {
    return await task(db);
  } finally {
    client.release();
  }
};
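Call sites then look something like this (schema.users is a placeholder table from my drizzle schema):

const users = await withDb((db) => db.select().from(schema.users));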

This change gives me backpressure control at the app layer and ensures pooled connections are acquired and released promptly. If all connections are busy, a request waits briefly or fails fast instead of flooding the database. The caveat: when Vercel’s dynamic autoscaling kicks in during a spike, each new process brings its own pool, so the aggregate connection count still grows, and the time to first byte I sacrificed within one process stops protecting the database across processes. I still think a module-level pool is a useful pattern: it keeps connections warm at the process level, reduces cold-start latency, and complements the long-running process model Vercel Fluid uses.
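If you’d rather shed load explicitly than lean on connectionTimeoutMillis, node-postgres exposes live pool counters you can check before acquiring. A sketch layered on withDb; the threshold is arbitrary:

export const withDbFailFast = async <T>(task: (db: Db) => Promise<T>): Promise<T> => {
  // waitingCount is the number of acquire requests already queued
  // behind busy clients; past a point, more queueing only adds latency.
  if (pool.waitingCount > 10) {
    throw new Error("database pool saturated, shedding load");
  }
  return withDb(task);
};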

I’m using Supabase Postgres. It’s been great overall — reliable, straightforward, nice DX. But its connection pooler (Supavisor) is hard-capped based on plan size. On the Pro plan, the XL instance allows ~1,000 concurrent client connections. Upgrading that limit means moving up instance sizes, which quickly becomes expensive. The limit feels more like pricing enforcement than a reflection of backend capacity.

There’s no way in Supabase to implement backpressure at the connection pooler level. There’s no “wait for a slot” option when the Supavisor pooler is full and connections haven’t been released. If your traffic spikes, the pooler hard-refuses new connections.
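The only app-side mitigation I can offer is retrying the refused connection yourself. A hedged sketch; the retry count and delays are illustrative:

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry acquisition with exponential backoff, since Supavisor
// rejects new connections rather than queueing them.
const connectWithRetry = async (retries = 3, baseDelayMs = 100): Promise<PoolClient> => {
  for (let attempt = 0; ; attempt++) {
    try {
      return await pool.connect();
    } catch (err) {
      if (attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 100ms, 200ms, 400ms...
    }
  }
};

That papers over short spikes, but it doesn’t solve sustained overload.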

My solution: I deployed my own PgBouncer proxy on Fly.io. It runs in the same AWS region as Supabase (us-east-1) and Vercel (Ashburn/IAD). It accepts tens of thousands of connections from Vercel and fans them into a much smaller number of pooled connections to Supabase. This allows me to independently scale app layer concurrency and DB concurrency.

The edoburu/pgbouncer Docker image works well. You can configure it entirely through environment variables and deploy it to Fly.io as a simple TCP service. I exposed port 5432 directly via a [[services]] block with internal_port (see the fly.toml below) and inject the Supabase connection string via DATABASE_URL.

This setup gives you proper decoupling. Vercel can scale as fast as it wants, and the pooler absorbs the spike. Supabase sees only as many connections as PgBouncer decides to allow.
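On the app side, the only change is the connection string: DATABASE_URL now points at the Fly app’s public address instead of Supavisor. Hostnames and credentials below are placeholders:

# Before: straight to Supavisor
DATABASE_URL=postgres://user:password@aws-0-us-east-1.pooler.supabase.com:6543/postgres

# After: through the PgBouncer proxy on Fly.io
DATABASE_URL=postgres://user:password@pgbouncer.fly.dev:5432/postgres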

Serverless has made a tradeoff: instant scale, opaque backpressure. That’s fine for stateless compute or HTTP caching layers. But it breaks when applied to I/O-bound services like databases, especially when those services don’t support transparent queuing or throttling. The app layer has to take responsibility — but the tooling is not built for that yet.

I considered Neon early on — it offers up to 10,000 pooled connections per project and is designed for exactly these serverless-style workloads. In retrospect, I probably should have gone with it; it feels more aligned with Vercel’s runtime model. But I bet on Supabase’s scale and long-term viability. The Postgres itself has been solid; the pricing model and connection strategy are the pain points.

If you’re using Fluid or serverless runtimes in general, and your app is talking to Postgres, you need a strategy to buffer dynamic concurrency. Either use a provider that exposes a high-fanout connection layer (like Neon), or insert your own PgBouncer proxy.

For reference, here’s the fly.toml:

app = 'pgbouncer'
primary_region = 'iad'

[experimental]
  auto_rollback = true

[build]

[env]
AUTH_TYPE = 'plain'
  DEFAULT_POOL_SIZE = '100'
  MAX_CLIENT_CONN = '10000'
  POOL_MODE = 'transaction'
  SERVER_RESET_QUERY = 'DISCARD ALL'

[[services]]
  protocol = 'tcp'
  internal_port = 5432
  auto_stop_machines = 'stop'
  auto_start_machines = true
  min_machines_running = 1
  processes = ['app']

  [[services.ports]]
    port = 5432

[[vm]]
  memory = '1gb'
  cpu_kind = 'shared'
  cpus = 1

... and the Dockerfile:

FROM edoburu/pgbouncer:latest
EXPOSE 5432
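Deploying it is then a few flyctl commands (the connection string is a placeholder; it should point at Supabase’s direct, non-pooled port, since PgBouncer handles the pooling itself):

fly launch --no-deploy   # register the app using the fly.toml above
fly secrets set DATABASE_URL="postgres://postgres:password@db.<project>.supabase.co:5432/postgres"
fly deploy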