Setting Up asyncpg Connection Pool Size for High Concurrency
Set pool_size to the result of max_connections ÷ worker_count − max_overflow so that every deployed process can reach its burst limit without exhausting PostgreSQL's global connection ceiling; then wire pool_pre_ping=True and a realistic pool_recycle to keep those connections healthy across the lifetime of your application — full engine initialization patterns are covered in Configuring Async Engines and Connection Pools.
Quick Answer
Before — a naively oversized pool that ignores worker count and PostgreSQL limits:
# BEFORE: naive config — blows past max_connections with 4 workers
from sqlalchemy.ext.asyncio import create_async_engine
engine = create_async_engine(
"postgresql+asyncpg://app_user:secret@db.example.com:5432/shop",
pool_size=50, # × 4 workers = 200 connections — hits max_connections exactly
max_overflow=20, # × 4 workers = 80 more — FATAL: too many connections
pool_timeout=30,
# pool_pre_ping missing — stale sockets silently fail
# pool_recycle missing — sockets dropped by load balancer after idle timeout
)
After — correctly calculated pool that leaves headroom for admin connections and PgBouncer:
# AFTER: sized for 4 Uvicorn workers, PostgreSQL max_connections=200, 10 reserved for admin
# Formula: pool_size = floor((max_connections - reserved) / workers) - max_overflow
# = floor((200 - 10) / 4) - 5 = 47 - 5 = 42 → round down conservatively to 40
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker, AsyncSession
from sqlalchemy import text
engine = create_async_engine(
"postgresql+asyncpg://app_user:secret@db.example.com:5432/shop",
pool_size=40,
max_overflow=5, # burst headroom per worker: (40+5)*4 = 180 < 190 cap
pool_timeout=10, # fail fast rather than pile up waiting coroutines
pool_recycle=1800, # recycle before most load balancers drop idle sockets (3600s)
pool_pre_ping=True, # round-trip check on checkout — catches dropped TCP sockets
echo=False,
connect_args={
"command_timeout": 10,
"server_settings": {
"jit": "off",
"statement_timeout": "30000", # ms — kills runaway queries at DB level
"idle_in_transaction_session_timeout": "5000",
},
"statement_cache_size": 200,
},
)
AsyncSessionLocal = async_sessionmaker(
engine, class_=AsyncSession, expire_on_commit=False
)
async def get_session() -> AsyncSession:
async with AsyncSessionLocal() as session:
yield session
The core insight is that (pool_size + max_overflow) × worker_count must stay below max_connections − reserved_slots. Everything else — pre-ping, recycle, statement timeouts — defends that budget against real-world infrastructure surprises.
Execution Context & Async Workflow Integration
Why async changes the sizing math
With a threaded SQLAlchemy setup each OS thread holds a connection for the entire duration of its work, including time spent waiting on network I/O or sleeping between queries. Pool exhaustion is therefore almost certain under load: 32 threads means 32 connections consumed simultaneously even if most are idle waiting on Python code.
Async changes the picture because await yields control back to the event loop. A coroutine that calls await session.execute(...) releases the CPU while PostgreSQL processes the query. During that window another coroutine can borrow a different connection, do its own work, and return that connection before the first coroutine resumes. The pool therefore turns over far faster than a threaded pool of equal size.
This is why an async application with 8 CPU cores commonly runs well with pool_size=10–20, whereas the equivalent threaded app would need 50–100 connections to avoid stalling. The headroom you save translates directly into capacity for other processes — monitoring agents, migration runners, admin psql sessions — to connect without hitting max_connections.
AsyncAdaptedQueuePool vs QueuePool
create_async_engine uses AsyncAdaptedQueuePool by default. It wraps the standard QueuePool but routes checkout and checkin through asyncio primitives instead of threading locks, so waiting coroutines yield the event loop rather than blocking a thread. The practical consequence: pool_timeout is an asyncio-native timeout. If checkout exceeds it, asyncio.TimeoutError is raised and the waiting coroutine is cleanly cancelled — no thread is wasted sitting on a lock.
Because the pool lives inside a single event loop per process, connections are never shared across asyncio tasks concurrently. The pool guarantees that each checked-out connection belongs to exactly one coroutine at a time, which means you do not need any additional locking around session.execute() calls even when hundreds of tasks run concurrently.
Worked sizing example
Consider a realistic production deployment:
- PostgreSQL
max_connections: 200 - Reserved for admin / monitoring: 10 (leaving 190 usable)
- Uvicorn workers: 4 (each is a separate OS process with its own event loop)
- Desired max_overflow per worker: 5
pool_size = floor(190 / 4) - max_overflow
= floor(47.5) - 5
= 47 - 5
= 42
Round down to 40 for a safety margin. At peak, total connections reach (40 + 5) × 4 = 180, leaving 10 slots for admin and 10 for unexpected spikes from other tools. An 8-core host running 8 workers with the same PostgreSQL server would use pool_size = floor(190/8) − 5 = 18, totalling (18+5)×8 = 184 connections.
These numbers assume no PgBouncer. When PgBouncer sits between the application and PostgreSQL in transaction-pooling mode, the pool that matters is PgBouncer's own pool_size, not SQLAlchemy's. In that architecture, set SQLAlchemy pool_size generously (matches the expected coroutine concurrency, e.g. 100–200) and keep PgBouncer's PostgreSQL-facing pool tightly sized. See the interaction details in the FAQ below.
For guidance on handling the situations where those limits are breached, see Handling Connection Leaks and Pool Exhaustion.
Resolving Warnings, Errors & Common Mistakes
| Exact error string | Root cause | Production fix |
|---|---|---|
sqlalchemy.exc.TimeoutError: QueuePool limit of size X overflow Y reached, connection timed out, timeout Z | Pool is exhausted — all pool_size + max_overflow connections are checked out and no coroutine returned one within pool_timeout seconds. Often caused by an unfinished transaction held open across an await. | Audit session lifetime: ensure every async with session.begin() block is short and does not await external I/O (HTTP calls, cache writes) while holding a connection. Lower pool_timeout to 5–10 s to surface the problem early in staging. |
asyncpg.exceptions.ConnectionDoesNotExistError: connection was closed in the middle of operation | The underlying TCP socket was silently terminated by a cloud load balancer, AWS RDS proxy idle timeout, or a PostgreSQL tcp_keepalives_idle expiry — but the pool did not know. | Set pool_recycle below the infrastructure idle-timeout (commonly 3600 s on RDS — use 1800). Enable pool_pre_ping=True to issue a lightweight SELECT 1 on checkout and detect dead sockets before handing them to application code. |
asyncpg.exceptions.TooManyConnectionsError: FATAL: remaining connection slots are reserved for non-replication superuser connections | (pool_size + max_overflow) × worker_count has reached or exceeded max_connections. PostgreSQL reserves the last few slots for superusers. | Recalculate pool_size using the formula above. Temporarily increase PostgreSQL max_connections in postgresql.conf (requires restart) or introduce PgBouncer to multiplex many application connections onto fewer PostgreSQL connections. |
asyncpg.exceptions.PostgresConnectionError: connection rejected: pg_hba.conf rejects connection combined with intermittent OSError: [Errno 104] Connection reset by peer | A connection in the pool was created during a brief window when pg_hba.conf or network ACLs permitted it, then the rule changed. Subsequent checkout attempts for new connections that overflow the pool are rejected while recycled ones still work. | Trigger a pool recycle: call await engine.dispose() then recreate sessions. In Kubernetes, this typically means a rolling restart of pods after network policy changes. Add a startup probe that verifies pool connectivity before the pod accepts traffic. |
sqlalchemy.exc.InvalidRequestError: Can't reconnect until invalid transaction is rolled back | Application code caught an exception during a query but did not roll back the transaction before returning the session to the pool. The next user of that connection gets it in a broken state. | Always use async with session.begin() rather than managing begin()/commit()/rollback() manually. The context manager rolls back automatically on any exception. |
Advanced Pool Sizing Optimization
Monitoring pg_stat_activity to detect starvation and auto-tune at runtime
Static pool configuration is a starting point, not a finish line. Query patterns shift as new features ship; traffic spikes unevenly across time zones. The most reliable production technique is to instrument pg_stat_activity and react to observed starvation before users notice.
The following coroutine runs as a background task alongside your FastAPI or Starlette application. It queries PostgreSQL directly for the number of connections in a waiting state and the ratio of active to idle connections, then logs a structured warning when thresholds are breached.
import asyncio
import logging
from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine, AsyncEngine
logger = logging.getLogger("pool_monitor")
MONITOR_INTERVAL_SECONDS = 30
WAIT_WARNING_THRESHOLD = 3 # waiting connections before alerting
ACTIVE_RATIO_WARNING = 0.85 # fraction of pool in use before alerting
async def monitor_pool_health(engine: AsyncEngine) -> None:
"""
Background task: poll pg_stat_activity and emit structured warnings
when the connection pool shows signs of starvation.
Designed to run with asyncio.create_task() at application startup.
"""
pool = engine.pool
while True:
try:
async with engine.connect() as conn:
result = await conn.execute(
text(
"""
SELECT
count(*) FILTER (WHERE wait_event_type = 'Lock') AS lock_waits,
count(*) FILTER (WHERE state = 'active') AS active,
count(*) FILTER (WHERE state = 'idle') AS idle,
count(*) FILTER (WHERE state = 'idle in transaction'
AND now() - state_change > interval '5 seconds')
AS stale_txn
FROM pg_stat_activity
WHERE datname = current_database()
AND pid <> pg_backend_pid()
"""
)
)
row = result.mappings().one()
pool_size = pool.size()
checked_out = pool.checkedout()
active_ratio = checked_out / pool_size if pool_size else 0
if row["lock_waits"] >= WAIT_WARNING_THRESHOLD:
logger.warning(
"pool_starvation_detected",
extra={
"lock_waits": row["lock_waits"],
"active_db_connections": row["active"],
"idle_db_connections": row["idle"],
"stale_transactions": row["stale_txn"],
"pool_checked_out": checked_out,
"pool_size": pool_size,
"recommendation": "reduce transaction scope or increase pool_size",
},
)
if active_ratio >= ACTIVE_RATIO_WARNING:
logger.warning(
"pool_near_capacity",
extra={
"active_ratio": round(active_ratio, 2),
"pool_checked_out": checked_out,
"pool_size": pool_size,
"recommendation": "consider increasing pool_size or adding workers",
},
)
if row["stale_txn"] > 0:
logger.error(
"stale_idle_in_transaction_connections",
extra={
"stale_count": row["stale_txn"],
"recommendation": "check for unclosed sessions or missing session.begin()",
},
)
except Exception:
logger.exception("pool_monitor_query_failed")
await asyncio.sleep(MONITOR_INTERVAL_SECONDS)
# Wire up at startup (FastAPI lifespan example):
# async def lifespan(app: FastAPI):
# task = asyncio.create_task(monitor_pool_health(engine))
# yield
# task.cancel()
Feed the structured log output into your observability platform (Datadog, Grafana Loki, CloudWatch) and set alerts on pool_near_capacity and pool_starvation_detected. When active_ratio consistently exceeds 0.85 across multiple 30-second windows, increase pool_size by 10 and redeploy. When stale_txn is non-zero, investigate session lifecycle bugs before touching pool configuration — adding more connections will not fix a leak.
For a broader perspective on the connection pool configuration ecosystem and the other async engine options that sit alongside pool sizing, the Async Engines, Dialects, and Connection Pooling section covers the full landscape.
Frequently Asked Questions
How do I calculate pool_size when deploying multiple Uvicorn workers?
Each Uvicorn worker is a separate OS process with its own asyncio event loop and its own SQLAlchemy engine. Pool connections are not shared across processes. The total PostgreSQL connections consumed at peak is (pool_size + max_overflow) × worker_count. Work backwards from PostgreSQL max_connections: subtract reserved admin slots (10 is a safe minimum), divide by worker count, then subtract max_overflow. For a 200-connection PostgreSQL instance with 4 workers and max_overflow=5: pool_size = floor(190/4) − 5 = 42. Gunicorn with --preload creates the engine before forking; always call await engine.dispose() in each worker's post-fork hook so workers do not inherit the parent's file descriptors.
Does PgBouncer in transaction mode change how I configure the SQLAlchemy pool?
Yes, significantly. In transaction-pooling mode PgBouncer multiplexes many client connections onto far fewer PostgreSQL connections, and it returns a PostgreSQL connection to its own pool after each transaction — not after the client disconnects. SQLAlchemy's pool becomes a queue of long-lived PgBouncer client connections rather than real PostgreSQL connections. You can safely set pool_size much higher (matching expected coroutine concurrency) because PgBouncer absorbs the surplus. However, pool_pre_ping=True becomes less useful (PgBouncer presents a fresh logical connection on every checkout), and prepared statements must be disabled: pass prepared_statement_cache_size=0 in connect_args when using asyncpg behind PgBouncer.
What should I set pool_timeout to, and what happens when it expires?pool_timeout is the maximum number of seconds a coroutine will wait for a connection to become available before raising sqlalchemy.exc.TimeoutError. The right value depends on your SLA: if your API must respond within 500 ms, a pool_timeout of 10 s will hide pool exhaustion from monitoring while still causing user-facing timeouts. Prefer 5–10 s as a baseline and alert when the error appears rather than silently increasing pool_timeout. When the timeout expires in an async context, the waiting coroutine is cancelled cleanly — no connection is leaked and the pool remains intact.
Why does pool_size=0 or NullPool make sense for serverless environments?
AWS Lambda and Google Cloud Run spin up and tear down process instances frequently. A persistent pool of idle connections wastes PostgreSQL slots during scale-to-zero periods and causes ConnectionDoesNotExistError when the pool is reused after a cold-start pause. Use NullPool (via poolclass=NullPool in create_async_engine) so every async with engine.connect() opens a fresh TCP connection and closes it on exit. The overhead of TCP establishment is acceptable when invocations are infrequent; for high-frequency serverless workloads, PgBouncer in session mode or Amazon RDS Proxy is the correct answer.
Related
- Configuring Async Engines and Connection Pools — parent guide covering the full engine initialization API
- Handling Connection Leaks and Pool Exhaustion — diagnosing and fixing pool exhaustion once it occurs
- Tuning Connection Pools for Cloud Databases — cloud-specific constraints including RDS Proxy and Cloud SQL Auth Proxy
- Setting Up an Async Engine from Scratch — step-by-step engine creation for developers starting from zero