Optimizing PostgreSQL with pgpool-II: Real-world Pooling, Load Balancing, and Caching Solutions – ITFROMZERO

Table of Contents

When PostgreSQL Becomes a “Bottleneck”

In the projects I’ve deployed, ranging from Node.js to Python, the most haunting error isn’t code logic but FATAL: remaining connection slots are reserved. By default, PostgreSQL only allows about 100 connections. When traffic spikes, server RAM gets consumed rapidly because each new connection in Postgres is a separate process, costing 2-3MB of memory.

Many developers immediately think of increasing max_connections. However, this is a fatal mistake as it exhausts the CPU trying to manage waiting processes. While PgBouncer is a good choice for pooling, if you need an “all-in-one” solution that includes Load Balancing and Query Caching, pgpool-II is the standout candidate.

After over 6 months of running this in production for an e-commerce platform peaking at 5,000 concurrent users, I’ve found that pgpool-II plays a vital role, acting like a “conductor” coordinating the entire database cluster.

3 Key Features of pgpool-II

To configure it correctly, you need to understand how pgpool-II intervenes in the data flow:

1. Connection Pooling (Reducing Initialization Overhead)

Instead of constantly opening and closing connections (which is CPU-intensive), pgpool-II maintains old connections in a “pool.” When the application requests one, it immediately provides an available connection. Based on actual measurements, this reduces query latency by about 15-20% by skipping the authentication handshake.

2. Load Balancing (Splitting Read/Write Traffic)

This feature makes it easy to scale out your system. If you have a Primary-Standby cluster, pgpool-II automatically inspects SQL statements. SELECT commands are pushed to Standby nodes (Read-Replicas), while INSERT/UPDATE commands are routed to the Primary node. This maximizes the hardware utility of your secondary servers.

3. Query Caching (Smart Memory Buffer)

pgpool-II saves SELECT results in memory. For complex queries that take 2-3 seconds to calculate, the second call will return results almost instantly (in a few milliseconds). However, be careful: only use this for data that doesn’t change frequently.

Practical Configuration Guide

Suppose your system consists of 3 nodes:

Primary: 192.168.1.10 (Port 5432)
Standby: 192.168.1.11 (Port 5432)
pgpool-II: 192.168.1.20 (Listening on Port 9999)

Step 1: Quick Installation

On Ubuntu, installation is quite straightforward:

sudo apt update && sudo apt install pgpool2 -y

Step 2: Defining Database Nodes

Open the /etc/pgpool2/pgpool.conf file. This is where you define the “backends” for your system:

# Primary Node
backend_hostname0 = '192.168.1.10'
backend_port0 = 5432
backend_weight0 = 1
backend_flag0 = 'ALLOW_TO_FAILOVER'

# Standby Node
backend_hostname1 = '192.168.1.11'
backend_port1 = 5432
backend_weight1 = 1.5 # Assign higher read priority to this node
backend_flag1 = 'ALLOW_TO_FAILOVER'

Note: backend_weight allows you to regulate traffic. If the Standby server has better hardware, increase this number so it handles more queries.

Step 3: Enabling Load Balancing Mode

Find and edit the following parameters to enable Load Balancing:

connection_cache = on
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream' # Used for Streaming Replication

Battle-Tested Lessons from Production

Deploying pgpool-II isn’t always smooth sailing. Here are 3 notes I’ve gathered after facing several challenges:

Avoid Caching for Real-Time Data: I once encountered an issue where customers completed a payment but still saw their old balance because pgpool-II returned a cached result. Advice: Only enable caching for catalog tables or articles.
Replication Lag Issues: Sometimes data written to the Primary hasn’t synced to the Standby yet, but pgpool-II reads from the Standby, leading to inconsistent data. Configure delay_threshold to manage this.
Health Checks: Ensure your failover scripts work perfectly. When the Primary node fails, pgpool-II needs to know exactly which node will take over to reroute traffic and prevent a total system crash.

Quick Health Check

After restarting the service, use the psql command directly on the pgpool-II port to view the cluster status:

psql -h 192.168.1.20 -p 9999 -U postgres -c "show pool_nodes"

If the status column shows up for all nodes, congratulations, you have successfully configured the system.

Conclusion

Switching from direct connections to using pgpool-II is a major architectural turning point. It helps your application handle higher loads, improves stability, and makes it easier to scale in the future. If your system is starting to slow down due to heavy SELECT statements, don’t hesitate to experiment with pgpool-II today.