pgmonkey (4/8) — Connection Pooling
Summary: Configure and tune pgmonkey’s connection pools — sync and async — with the right pool size, health checks, lifecycle limits, and concurrent access patterns for your workload.
| Key | Value |
|---|---|
| pgmonkey version | 4.0.0 |
| Python version | 3.12 |
| Config file | pgmonkey_config.yaml |
| Database name | tutorial_db |
| Pool min size | 2 |
| Pool max size | 10 |
0. Prerequisites
- Completed Part 1 (pgmonkey installed, config file working)
- Familiarity with Part 2 (connection types — particularly `pool` and `async_pool`)
1. Why Connection Pooling Matters
Opening a PostgreSQL connection is expensive. Each connection requires a TCP handshake, authentication, and backend process creation on the server. For a script that runs once, this overhead is negligible. For a web application handling hundreds of requests per second, it becomes a bottleneck.
Connection pooling solves this by maintaining a set of pre-opened connections. When your code needs a connection, it borrows one from the pool. When it is done, the connection goes back to the pool instead of being closed. Connections are reused, not recreated.
pgmonkey supports two pool types:
- `pool` — synchronous, backed by `psycopg_pool.ConnectionPool`, thread-safe
- `async_pool` — asynchronous, backed by `psycopg_pool.AsyncConnectionPool`, task-safe
Both are configured in the same YAML file.
2. Pool Configuration Settings
Sync pools use the pool_settings section. Async pools use async_pool_settings. The settings are identical.
```yaml
pool_settings:
  min_size: 2
  max_size: 10
  timeout: 30
  max_idle: 300
  max_lifetime: 3600
  check_on_checkout: false
```
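Since the settings are identical, the async pool block carries the same keys under the other section name:

```yaml
async_pool_settings:
  min_size: 2
  max_size: 10
  timeout: 30
  max_idle: 300
  max_lifetime: 3600
  check_on_checkout: false
```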
| Setting | Type | Default | Purpose |
|---|---|---|---|
| min_size | int | 5 | Minimum connections kept open at all times |
| max_size | int | 20 | Maximum connections the pool will create |
| timeout | int | 30 | Seconds to wait for a connection before raising an error |
| max_idle | int | 300 | Seconds a connection can sit idle before being closed |
| max_lifetime | int | 3600 | Maximum age of a connection before it is replaced |
| check_on_checkout | bool | false | Run SELECT 1 before handing a connection to the caller |
3. Sizing Your Pool
Pool sizing depends on your workload. A pool that is too small causes requests to queue and time out. A pool that is too large wastes database server resources.
Start here:
- Set `min_size` to the number of connections your application uses at rest (low traffic).
- Set `max_size` to the peak number of concurrent database requests you expect.
```yaml
pool_settings:
  min_size: 2
  max_size: 10
```
Rules of thumb:
- For a Flask app with 4 Gunicorn workers, each worker gets its own pool. A `max_size` of 5-10 per worker is typical.
- For a FastAPI app, a single async pool serves all coroutines. Start with `max_size` equal to 2-3x your expected concurrent requests.
- PostgreSQL’s default `max_connections` is 100. Your total pool sizes across all application instances must stay below this limit (a quick budget check follows below).
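To sanity-check that last rule, a back-of-the-envelope budget calculation helps. The deployment numbers below are hypothetical, and this is plain arithmetic, not a pgmonkey API:

```python
# Pool budget check: worst-case connections across all app instances
# must stay below the server's max_connections. All numbers here are
# hypothetical; substitute your own deployment figures.
gunicorn_workers = 4            # sync worker processes, one pool each
max_size_per_pool = 10          # max_size from pool_settings
app_instances = 2               # e.g. two servers behind a load balancer
postgres_max_connections = 100  # PostgreSQL default

total = gunicorn_workers * max_size_per_pool * app_instances
print(f"Worst case: {total} of {postgres_max_connections} connections")
assert total < postgres_max_connections, "pool budget exceeds max_connections"
```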
Warning: If total pool connections exceed `max_connections` on the server, new connection attempts will fail. Part 6 covers the `pgserverconfig --audit` command, which checks your server’s `max_connections` against your pool configuration.
4. Health Checks
Enable check_on_checkout to validate connections before use.
```yaml
pool_settings:
  check_on_checkout: true
```
With this enabled, every time your code borrows a connection from the pool, pgmonkey runs SELECT 1 first. If the connection is dead (server restarted, network blip, idle timeout), it is discarded and a fresh one is created.
The trade-off is a small latency increase per checkout — one extra round trip to the database. For most web applications, this is negligible and the reliability gain is worth it.
Tip: If your connections are short-lived and your PostgreSQL server is stable, you can leave `check_on_checkout` off and rely on `max_idle` and `max_lifetime` to cycle stale connections out naturally.
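To see the health check work, you can run a sketch like the following (assuming the config above with `check_on_checkout: true`) and restart PostgreSQL between the two borrows. Without the check, the second borrow could hand you a dead connection; with it, the pool discards the stale connection and opens a fresh one:

```python
from pgmonkey import PGConnectionManager

def main():
    manager = PGConnectionManager()
    pool = manager.get_database_connection('pgmonkey_config.yaml', 'pool')

    with pool as conn:
        with conn.cursor() as cur:
            cur.execute('SELECT pg_backend_pid();')
            print('First borrow, backend PID:', cur.fetchone()[0])

    input('Restart PostgreSQL now, then press Enter... ')

    # With check_on_checkout enabled, the pool runs SELECT 1 here, notices
    # the old connection died with the server, and replaces it silently.
    with pool as conn:
        with conn.cursor() as cur:
            cur.execute('SELECT pg_backend_pid();')
            print('Second borrow, backend PID:', cur.fetchone()[0])

if __name__ == "__main__":
    main()
```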
5. Pool Lifecycle Settings
Two settings control how long connections stay in the pool.
```yaml
pool_settings:
  max_idle: 300
  max_lifetime: 3600
```
max_idle (seconds) — Close connections that have been sitting idle longer than this. This prevents idle connections from accumulating during low-traffic periods.
max_lifetime (seconds) — Close connections that have been open longer than this, regardless of activity. This ensures connections are periodically refreshed, picking up server configuration changes and preventing resource leaks from very long-lived connections.
| Workload | Suggested max_idle | Suggested max_lifetime |
|---|---|---|
| Web app with steady traffic | 300 (5 min) | 3600 (1 hr) |
| Bursty traffic with idle periods | 60 (1 min) | 1800 (30 min) |
| Long-running background workers | 600 (10 min) | 7200 (2 hr) |
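As a worked example, the bursty-traffic row translates into config like this:

```yaml
pool_settings:
  min_size: 2          # keep a small floor during idle periods
  max_size: 10
  max_idle: 60         # close idle connections after 1 minute
  max_lifetime: 1800   # replace every connection after 30 minutes
```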
6. Concurrent Access — Sync Threads
The sync pool is thread-safe. Multiple threads can borrow and return connections concurrently.
```python
import threading

from pgmonkey import PGConnectionManager

def worker(pool, worker_id):
    # Borrow a connection from the shared pool; it is returned
    # automatically when the with block exits.
    with pool as conn:
        with conn.cursor() as cur:
            cur.execute('SELECT pg_backend_pid();')
            pid = cur.fetchone()[0]
            print(f"Worker {worker_id}: backend PID {pid}")

def main():
    manager = PGConnectionManager()
    pool = manager.get_database_connection('pgmonkey_config.yaml', 'pool')

    # Six threads share one pool and borrow connections concurrently.
    threads = []
    for i in range(6):
        t = threading.Thread(target=worker, args=(pool, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    main()
```
```bash
python pool_threads.py
```

```text
Worker 0: backend PID 12340
Worker 1: backend PID 12341
Worker 2: backend PID 12342
Worker 3: backend PID 12340
Worker 4: backend PID 12341
Worker 5: backend PID 12342
```
Notice the PID reuse — workers 3, 4, and 5 borrowed the same connections that workers 0, 1, and 2 returned. That is pooling in action.
7. Concurrent Access — Async Tasks
The async pool is task-safe. Multiple asyncio tasks can share the same pool.
```python
import asyncio

from pgmonkey import PGConnectionManager

async def worker(pool, task_id):
    # Borrow a connection from the async pool; returned when the block exits.
    async with pool as conn:
        async with conn.cursor() as cur:
            await cur.execute('SELECT pg_backend_pid();')
            pid = (await cur.fetchone())[0]
            print(f"Task {task_id}: backend PID {pid}")

async def main():
    manager = PGConnectionManager()
    pool = await manager.get_database_connection('pgmonkey_config.yaml', 'async_pool')
    # Six tasks share the same pool concurrently.
    tasks = [worker(pool, i) for i in range(6)]
    await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(main())
```
```bash
python async_pool_tasks.py
```
Same pattern, same PID reuse, but fully async. Each async with pool borrows a connection from the async pool and returns it when the block exits.
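The pool also manages contention for you. In this sketch (assuming the `async_pool` config above with `max_size: 10` and the default `timeout`), 30 tasks compete for at most 10 connections; surplus tasks wait inside the pool for a free connection, up to `timeout` seconds before an error is raised:

```python
import asyncio

from pgmonkey import PGConnectionManager

async def worker(pool, task_id):
    async with pool as conn:
        async with conn.cursor() as cur:
            # Hold the connection briefly so tasks actually contend for it.
            await cur.execute('SELECT pg_sleep(0.1);')
            print(f"Task {task_id} done")

async def main():
    manager = PGConnectionManager()
    pool = await manager.get_database_connection('pgmonkey_config.yaml', 'async_pool')
    # 30 tasks, at most 10 connections: tasks beyond the first 10
    # queue inside the pool and wait for a connection to free up.
    await asyncio.gather(*(worker(pool, i) for i in range(30)))

if __name__ == "__main__":
    asyncio.run(main())
```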
8. Pool Behavior Summary
| Behavior | What happens |
|---|---|
| Borrow a connection | with pool as conn: or async with pool as conn: |
| Return a connection | Automatic when the with block exits |
| Clean exit | Connection is committed, then returned to pool |
| Exception | Connection is rolled back, then returned to pool |
| Pool exhausted | Caller waits up to timeout seconds, then raises an error |
| Idle connection | Closed after max_idle seconds |
| Old connection | Replaced after max_lifetime seconds |
| Stale connection | Detected by check_on_checkout and replaced |
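To make the exception row concrete, here is a small sketch; the `audit_log` table is hypothetical. The raised error aborts the block, the transaction is rolled back, and the connection still goes back to the pool:

```python
from pgmonkey import PGConnectionManager

manager = PGConnectionManager()
pool = manager.get_database_connection('pgmonkey_config.yaml', 'pool')

try:
    with pool as conn:
        with conn.cursor() as cur:
            # audit_log is a hypothetical table used only for illustration.
            cur.execute("INSERT INTO audit_log (note) VALUES ('never committed');")
            raise RuntimeError('simulated failure')
except RuntimeError:
    pass  # the INSERT was rolled back; the connection is back in the pool
```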
Note: pgmonkey caches pools by config content. Calling `get_database_connection()` with the same config and connection type returns the same pool instance — no duplicate pools are created. This prevents the “pool storm” problem where each call creates a new pool.
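You can verify the caching behavior directly; with the same config file and connection type, both calls should hand back the very same pool object:

```python
from pgmonkey import PGConnectionManager

manager = PGConnectionManager()
pool_a = manager.get_database_connection('pgmonkey_config.yaml', 'pool')
pool_b = manager.get_database_connection('pgmonkey_config.yaml', 'pool')
assert pool_a is pool_b  # same cached pool instance, not a duplicate
```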
Summary
You configured pgmonkey’s connection pools for production workloads.
- `pool_settings` controls sync pools; `async_pool_settings` controls async pools
- `min_size` and `max_size` set the pool boundaries
- `check_on_checkout` validates connections before use
- `max_idle` and `max_lifetime` keep the pool fresh
- Sync pools are thread-safe; async pools are task-safe
- pgmonkey caches pools to prevent duplicate creation
Next up: pgmonkey (5/8) — Environment Variables and Secrets shows you how to keep credentials out of your config files using ${VAR} substitution, from_env, and from_file references.
