pgmonkey (4/8) — Connection Pooling

Summary: Configure and tune pgmonkey’s connection pools — sync and async — with the right pool size, health checks, lifecycle limits, and concurrent access patterns for your workload.

Key               Value
pgmonkey version  4.0.0
Python version    3.12
Config file       pgmonkey_config.yaml
Database name     tutorial_db
Pool min size     2
Pool max size     10

0. Prerequisites


1. Why Connection Pooling Matters

Opening a PostgreSQL connection is expensive. Each connection requires a TCP handshake, authentication, and backend process creation on the server. For a script that runs once, this overhead is negligible. For a web application handling hundreds of requests per second, it becomes a bottleneck.

Connection pooling solves this by maintaining a set of pre-opened connections. When your code needs a connection, it borrows one from the pool. When it is done, the connection goes back to the pool instead of being closed. Connections are reused, not recreated.

pgmonkey supports two pool types:

  • pool — synchronous, backed by psycopg_pool.ConnectionPool, thread-safe
  • async_pool — asynchronous, backed by psycopg_pool.AsyncConnectionPool, task-safe

Both are configured in the same YAML file.
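Concretely, a config carrying both sections might look like the sketch below. The surrounding `postgresql:` / `connection_type` / `connection_settings` layout is assumed from earlier parts of this series; adjust the keys to match your actual pgmonkey_config.yaml.

```yaml
postgresql:
  connection_type: 'pool'     # 'pool' for sync, 'async_pool' for async
  connection_settings:
    host: localhost
    dbname: tutorial_db
  pool_settings:              # used by the sync pool
    min_size: 2
    max_size: 10
  async_pool_settings:        # used by the async pool; same keys
    min_size: 2
    max_size: 10
```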


2. Pool Configuration Settings

Sync pools use the pool_settings section. Async pools use async_pool_settings. The settings are identical.

pool_settings:
  min_size: 2
  max_size: 10
  timeout: 30
  max_idle: 300
  max_lifetime: 3600
  check_on_checkout: false

Setting            Type  Default  Purpose
min_size           int   5        Minimum connections kept open at all times
max_size           int   20       Maximum connections the pool will create
timeout            int   30       Seconds to wait for a connection before raising an error
max_idle           int   300      Seconds a connection can sit idle before being closed
max_lifetime       int   3600     Maximum age of a connection before it is replaced
check_on_checkout  bool  false    Run SELECT 1 before handing a connection to the caller

3. Sizing Your Pool

Pool sizing depends on your workload. A pool that is too small causes requests to queue and time out. A pool that is too large wastes database server resources.

Start here:

  • Set min_size to the number of connections your application uses at rest (low traffic).
  • Set max_size to the peak concurrent database requests you expect.
pool_settings:
  min_size: 2
  max_size: 10

Rules of thumb:

  • For a Flask app with 4 Gunicorn workers, each worker gets its own pool. A max_size of 5-10 per worker is typical.
  • For a FastAPI app, a single async pool serves all coroutines. Start with max_size equal to 2-3x your expected concurrent requests.
  • PostgreSQL’s default max_connections is 100. Your total pool sizes across all application instances must stay below this limit.

Warning: If total pool connections exceed max_connections on the server, new connection attempts will fail. Part 6 covers the pgserverconfig --audit command, which checks your server’s max_connections against your pool configuration.
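The arithmetic behind that warning is worth doing explicitly. A back-of-envelope sketch, using the illustrative numbers from the rules of thumb above:

```python
# Back-of-envelope check: total pool capacity across all application
# instances must fit under the server's max_connections.
workers = 4        # e.g. Gunicorn workers, each with its own sync pool
max_size = 10      # per-worker pool max_size
server_max = 100   # PostgreSQL's default max_connections

total = workers * max_size
headroom = server_max - total
print(f"{total} pooled connections, {headroom} connections of headroom")
assert total < server_max, "pools would exhaust max_connections"
```

Remember that the server also reserves a few slots for superuser connections, so leave more headroom than the bare subtraction suggests.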


4. Health Checks

Enable check_on_checkout to validate connections before use.

pool_settings:
  check_on_checkout: true

With this enabled, every time your code borrows a connection from the pool, pgmonkey runs SELECT 1 first. If the connection is dead (server restarted, network blip, idle timeout), it is discarded and a fresh one is created.

The trade-off is a small latency increase per checkout — one extra round trip to the database. For most web applications, this is negligible and the reliability gain is worth it.

Tip: If your connections are short-lived and your PostgreSQL server is stable, you can leave check_on_checkout off and rely on max_idle and max_lifetime to cycle stale connections out naturally.


5. Pool Lifecycle Settings

Two settings control how long connections stay in the pool.

pool_settings:
  max_idle: 300
  max_lifetime: 3600

max_idle (seconds) — Close connections that have been sitting idle longer than this. This prevents idle connections from accumulating during low-traffic periods.

max_lifetime (seconds) — Close connections that have been open longer than this, regardless of activity. This ensures connections are periodically refreshed, picking up server configuration changes and preventing resource leaks from very long-lived connections.

Workload                          Suggested max_idle  Suggested max_lifetime
Web app with steady traffic       300 (5 min)         3600 (1 hr)
Bursty traffic with idle periods  60 (1 min)          1800 (30 min)
Long-running background workers   600 (10 min)        7200 (2 hr)
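Applied to the bursty-traffic row of the table, the config fragment would read:

```yaml
pool_settings:
  max_idle: 60        # 1 min: shed idle connections quickly between bursts
  max_lifetime: 1800  # 30 min: recycle connections more aggressively
```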

6. Concurrent Access — Sync Threads

The sync pool is thread-safe. Multiple threads can borrow and return connections concurrently.

import threading
from pgmonkey import PGConnectionManager

def worker(pool, worker_id):
    with pool as conn:
        with conn.cursor() as cur:
            cur.execute('SELECT pg_backend_pid();')
            pid = cur.fetchone()[0]
            print(f"Worker {worker_id}: backend PID {pid}")

def main():
    manager = PGConnectionManager()
    pool = manager.get_database_connection('pgmonkey_config.yaml', 'pool')
    threads = []
    for i in range(6):
        t = threading.Thread(target=worker, args=(pool, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    main()
python pool_threads.py
Worker 0: backend PID 12340
Worker 1: backend PID 12341
Worker 2: backend PID 12342
Worker 3: backend PID 12340
Worker 4: backend PID 12341
Worker 5: backend PID 12342

Notice the PID reuse — workers 3, 4, and 5 borrowed the same connections that workers 0, 1, and 2 returned. That is pooling in action.
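The reuse pattern is easy to see in a toy model that needs neither pgmonkey nor PostgreSQL: at its core, a pool is just a queue of reusable objects. Here the "connections" are fake backend PIDs borrowed and returned sequentially.

```python
import queue

# Toy model of a connection pool: a FIFO queue of pre-opened "connections"
# (represented by fake backend PIDs). Borrow = get, return = put.
pool = queue.Queue()
for pid in (12340, 12341, 12342):
    pool.put(pid)

seen = []
for worker_id in range(6):
    conn = pool.get()    # borrow from the pool
    seen.append(conn)
    print(f"Worker {worker_id}: backend PID {conn}")
    pool.put(conn)       # return it for the next borrower

print(seen)  # the three PIDs cycle: each one is reused by a later worker
```

Six workers, three connections: the second half of the run reuses exactly the connections the first half returned, which is the same cycle visible in the real output above.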


7. Concurrent Access — Async Tasks

The async pool is task-safe. Multiple asyncio tasks can share the same pool.

import asyncio
from pgmonkey import PGConnectionManager

async def worker(pool, task_id):
    async with pool as conn:
        async with conn.cursor() as cur:
            await cur.execute('SELECT pg_backend_pid();')
            pid = (await cur.fetchone())[0]
            print(f"Task {task_id}: backend PID {pid}")

async def main():
    manager = PGConnectionManager()
    pool = await manager.get_database_connection('pgmonkey_config.yaml', 'async_pool')
    tasks = [worker(pool, i) for i in range(6)]
    await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(main())
python async_pool_tasks.py

Same pattern, same PID reuse, but fully async. Each async with pool borrows a connection from the async pool and returns it when the block exits.
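The capacity limit a pool imposes can also be demonstrated without a database. In this toy model, an asyncio.Queue of three "connections" is shared by six tasks, so at most three tasks ever hold a connection at once; the rest wait, exactly as they would on a real pool of max_size 3.

```python
import asyncio

async def main():
    # Toy async pool: a queue of three "connections" shared by six tasks.
    pool = asyncio.Queue()
    for conn_id in (1, 2, 3):
        pool.put_nowait(conn_id)

    in_use = 0
    peak = 0

    async def worker(task_id):
        nonlocal in_use, peak
        conn = await pool.get()    # borrow; waits if all three are out
        in_use += 1
        peak = max(peak, in_use)
        await asyncio.sleep(0.01)  # simulate query latency
        in_use -= 1
        pool.put_nowait(conn)      # return so a waiting task can proceed

    await asyncio.gather(*(worker(i) for i in range(6)))
    return peak

peak = asyncio.run(main())
print(f"peak concurrent borrowers: {peak}")  # capped at the pool size
```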


8. Pool Behavior Summary

Behavior             What happens
Borrow a connection  with pool as conn: or async with pool as conn:
Return a connection  Automatic when the with block exits
Clean exit           Connection is committed, then returned to the pool
Exception            Connection is rolled back, then returned to the pool
Pool exhausted       Caller waits up to timeout seconds, then raises an error
Idle connection      Closed after max_idle seconds
Old connection       Replaced after max_lifetime seconds
Stale connection     Detected by check_on_checkout and replaced

Note: pgmonkey caches pools by config content. Calling get_database_connection() with the same config and connection type returns the same pool instance — no duplicate pools are created. This prevents the “pool storm” problem where each call creates a new pool.
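The caching behavior described in the note can be sketched as config-keyed memoization. This is an illustrative model, not pgmonkey's actual source: the first call with a given key builds the pool, and later calls with the same key return the cached instance.

```python
# Illustrative sketch of config-keyed pool caching (not pgmonkey's source).
_pools = {}

def get_pool(config_text, conn_type, factory):
    key = (config_text, conn_type)
    if key not in _pools:
        _pools[key] = factory()   # build the pool exactly once per key
    return _pools[key]

a = get_pool("cfg-A", "pool", list)
b = get_pool("cfg-A", "pool", list)        # same key -> cached instance
c = get_pool("cfg-A", "async_pool", list)  # new key -> new pool
print(a is b, a is c)
```

This is what prevents a "pool storm": a handler that calls get_pool on every request still ends up with one pool per (config, type) pair, not one per call.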


Summary

You configured pgmonkey’s connection pools for production workloads.

  • pool_settings controls sync pools; async_pool_settings controls async pools
  • min_size and max_size set the pool boundaries
  • check_on_checkout validates connections before use
  • max_idle and max_lifetime keep the pool fresh
  • Sync pools are thread-safe; async pools are task-safe
  • pgmonkey caches pools to prevent duplicate creation

Next up: pgmonkey (5/8) — Environment Variables and Secrets shows you how to keep credentials out of your config files using ${VAR} substitution, from_env, and from_file references.
