How to Install and Configure PostgreSQL Streaming Replication: Real-time Data Synchronization between Primary and Standby – ITFROMZERO

Context & Why Streaming Replication is Needed?

Hello everyone! In today’s modern technology environment, data plays an extremely crucial role for every application. Whether you are developing an e-commerce website, an internal management system, or a mobile application, ensuring data availability and preventing loss is the top priority.

I have worked with many databases such as MySQL, PostgreSQL, and MongoDB. Each type has its own strengths, suitable for specific needs. However, regardless of the choice, we all need a robust data protection solution to prevent unexpected incidents.

Just imagine: if the primary server fails, recovering data from a backup could take several hours, or even a full day. During that time, the application is down. This could cost an e-commerce website hundreds of millions VND in revenue every hour, not to mention the negative impact on user experience and trust. This is why High Availability (HA) and Disaster Recovery (DR) solutions become extremely important.

PostgreSQL Streaming Replication is an extremely effective solution to address this challenge. Instead of relying solely on periodic backups, Streaming Replication allows you to maintain one or more secondary (Standby) servers that are almost instantly synchronized with the primary (Primary) server.

All changes on the Primary will be continuously streamed to the Standby. As a result, when the Primary encounters an issue, the Standby can quickly take over, minimizing downtime to a minimum, possibly just a few minutes or even tens of seconds.

In this guide, we will walk you through the step-by-step installation and configuration of PostgreSQL Streaming Replication. The goal is to help you build a robust database system that ensures continuous operation.

Installation (Environment Preparation)

Before diving into configuration, we need to prepare the environment. Assume you have two Linux servers (e.g., Ubuntu Server): one acting as Primary and one as Standby. Both machines must have PostgreSQL installed.

Minimum requirements:

Two Linux servers (e.g., Ubuntu 22.04 LTS).
PostgreSQL version 14 or higher installed on both machines.
Network connectivity between Primary and Standby.

To install PostgreSQL on Ubuntu, you can use the following command (if not already installed):

sudo apt update
sudo apt install postgresql postgresql-contrib -y

Assume the IP address of the Primary Server is 192.168.1.100 and the Standby Server is 192.168.1.101. We will use these addresses throughout the configuration process.

Check PostgreSQL service:

Ensure the PostgreSQL service is running on both machines:

sudo systemctl status postgresql

If the service is not running, start it:

sudo systemctl start postgresql

Also, make sure no firewall is blocking port 5432 (the default PostgreSQL port) between Primary and Standby. If there is, you need to open this port:

sudo ufw allow 5432/tcp
sudo ufw enable

Note: All configuration commands below will be executed with the postgres user or sudo -u postgres. My PostgreSQL version is 14, so the configuration path will be /etc/postgresql/14/main/ and the data directory will be /var/lib/postgresql/14/main/. Please adjust according to your PostgreSQL version.

Detailed PostgreSQL Streaming Replication Configuration

This is the core of the guide, where we will customize configuration files to set up Streaming Replication.

3.1. Configuration on Primary Server (192.168.1.100)

First, we will configure the Primary Server.

Step 1: Edit `postgresql.conf`

Open the main PostgreSQL configuration file:

sudo nano /etc/postgresql/14/main/postgresql.conf

Find and modify (or add) the following lines:

listen_addresses: Allows PostgreSQL to listen for connections from all IP addresses, or specify the Standby’s specific IP.
```
listen_addresses = '*'
```
wal_level: This value must be replica to enable Replication mode.
```
wal_level = replica
```
max_wal_senders: The maximum number of WAL sender processes. Set this value high enough to accommodate the number of Standbys you have (e.g., 5-10).
```
max_wal_senders = 5
```
wal_keep_size: The amount of WAL (MB) that the Primary retains for the Standby. Set large enough for the Standby to resynchronize after a short disconnection. A value of 1024 (1GB) is a good starting point.
```
wal_keep_size = 1024
```
archive_mode: Enable WAL archiving mode, useful for Point-In-Time Recovery (PITR) and ensuring greater data safety.
```
archive_mode = on
```
archive_command: Command to copy completed WALs. The example below will copy to the /var/lib/postgresql/14/main/archive directory. You need to create this directory and set permissions first:
```
sudo mkdir -p /var/lib/postgresql/14/main/archive
sudo chown -R postgres:postgres /var/lib/postgresql/14/main/archive
```
Then add to postgresql.conf:
```
archive_command = 'cp %p /var/lib/postgresql/14/main/archive/%f'
```

Step 2: Edit `pg_hba.conf`

The pg_hba.conf file controls authentication for connections to PostgreSQL. We need to allow the Standby Server to connect to the Primary using a replication user.

Open the file:

sudo nano /etc/postgresql/14/main/pg_hba.conf

Add the following line to the end of the file, replacing 192.168.1.101/32 with the Standby Server’s IP address:

host    replication     repuser         192.168.1.101/32        scram-sha-256

Step 3: Create Replication User

We need to create a special user on the Primary Server so the Standby can connect and synchronize data.

sudo -u postgres psql

Create the user repuser with REPLICATION privileges and set a password:

CREATE USER repuser REPLICATION LOGIN CONNECTION LIMIT -1 ENCRYPTED PASSWORD 'your_secure_password_here';
\q

(Replace your_secure_password_here with your strong password.)

Step 4: Restart Primary Server

For the configuration changes to take effect, we need to restart the PostgreSQL service on the Primary:

sudo systemctl restart postgresql

3.2. Configuration on Standby Server (192.168.1.101)

Now, let’s move on to configuring the Standby Server.

Step 1: Stop PostgreSQL service on Standby

Before copying data, ensure PostgreSQL on the Standby has stopped:

sudo systemctl stop postgresql

Step 2: Delete old data (if any)

The data directory on the Standby must be empty or not exist before performing pg_basebackup. If you have installed PostgreSQL and have default data, delete it:

sudo rm -rf /var/lib/postgresql/14/main/*

Step 3: Copy Base Backup from Primary

We will use the pg_basebackup command to copy all data from the Primary to the Standby. This command creates an exact copy of the data directory on the Primary, and automatically creates the necessary postgresql.auto.conf file for replication.

Run this command on the Standby Server:

sudo -u postgres pg_basebackup -h 192.168.1.100 -p 5432 -U repuser -D /var/lib/postgresql/14/main -F p -Xs -P -R -W

Explanation of options:

-h 192.168.1.100: IP address of the Primary Server.
-p 5432: Port of the Primary Server.
-U repuser: Replication user created on the Primary.
-D /var/lib/postgresql/14/main: Destination directory on the Standby to store the copied data.
-F p: Output format is “plain” (not compressed tar).
-Xs: Include necessary WAL files in the backup.
-P: Display copy progress.
-R: Automatically create standby.signal and postgresql.auto.conf files with appropriate primary_conninfo. This is an extremely useful option!
-W: Prompt for the repuser password.

When running the command, the system will ask for the repuser password that you set on the Primary.

Step 4: Edit `postgresql.conf` on Standby

After pg_basebackup completes, a postgresql.auto.conf file will be automatically created in the data directory. This file contains the important primary_conninfo line. Now, you just need to add one more parameter in postgresql.conf.

Open the file:

sudo nano /etc/postgresql/14/main/postgresql.conf

Find and ensure that hot_standby is enabled (this parameter allows the Standby to accept read queries):

hot_standby = on

Step 5: Restart Standby Server

Start the PostgreSQL service on the Standby so it begins connecting and synchronizing with the Primary:

sudo systemctl start postgresql

At this point, you have completed the configuration!

Check & Monitor Replication

Once the configuration is complete, the next step is to check if replication is working correctly as expected. This is an extremely important step.

4.1. Check Replication Status on Primary Server

sudo -u postgres psql -c "SELECT pid, usename, application_name, client_addr, state, sync_state, sync_priority, write_lag, flush_lag, replay_lag FROM pg_stat_replication;"

You will see a table with detailed information about the connected Standby servers. Some important columns:

usename: The replication user name (e.g., repuser).
application_name: The name of the connecting application.
client_addr: The IP address of the Standby Server.
state: The state of the replication connection. streaming means everything is working normally.
sync_state: Synchronization state. async (asynchronous) is the default and commonly used.
write_lag, flush_lag, replay_lag: The delay between Primary and Standby. Ideally, these values are very small (near 0) or zero.

If you don’t see any rows or the state is not streaming, recheck the configuration steps and PostgreSQL error logs.

4.2. Check Replication Status on Standby Server

sudo -u postgres psql -c "SELECT status, receive_start_lsn, latest_end_lsn, latest_end_time, conninfo FROM pg_stat_wal_receiver;"

Columns to pay attention to:

status: The state of the wal receiver. streaming is a good sign.
latest_end_lsn: The last LSN (Log Sequence Number) position that the Standby has received and replayed.
latest_end_time: The time of that last LSN.
conninfo: Connection information to the Primary.

4.3. Test Actual Data Synchronization

The best way to check is to try writing data to the Primary and see if it appears on the Standby.

On Primary Server:

sudo -u postgres psql -c "CREATE DATABASE test_replication; \c test_replication; CREATE TABLE my_data (id SERIAL PRIMARY KEY, value TEXT, created_at TIMESTAMP DEFAULT NOW()); INSERT INTO my_data (value) VALUES ('Hello from Primary');"

On Standby Server:

After a few seconds, try querying on the Standby (note that the Standby only allows reads, not writes):

sudo -u postgres psql -c "\c test_replication; SELECT * FROM my_data;"

If you see the line Hello from Primary appear on the Standby, then congratulations, your Streaming Replication is working successfully!

4.4. Basic Monitoring

To ensure long-term system stability, monitoring is indispensable. You can:

Check PostgreSQL logs: PostgreSQL log files (usually at /var/log/postgresql/postgresql-14-main.log) contain valuable information about replication status, and any errors.
Use monitoring tools: For production systems, I often integrate with Prometheus and Grafana to monitor metrics such as write_lag, replay_lag, disk space, CPU, RAM… This helps me detect potential issues early.

4.5. Failover Mechanism

Although we have set up Streaming Replication, the process of switching from Primary to Standby (Failover) in case of an incident will not be automatic with this configuration. This is a manual process. For automation, you will need specialized cluster management tools like Patroni, repmgr, or pg_auto_failover.

When the Primary Server fails, to promote the Standby to become the new Primary, we will need to:

Ensure the old Primary has completely stopped or is isolated from the network.
On the Standby Server, use the pg_ctl promote command.

sudo -u postgres pg_ctl -D /var/lib/postgresql/14/main promote

After promotion, the Standby will become the new Primary and begin accepting write queries. However, this needs to be carefully planned and practiced in a staging environment before being applied to production.