MySQL Semi-synchronous Replication: Zero Data Loss Without Group Replication

MySQL tutorial - IT technology blog

MySQL’s default async replication has a fatal weakness: once the master commits, it’s done — it doesn’t care whether the slave received the data. If the master crashes at that exact moment, you lose data.

I ran into this at 3 AM — server crash, slave lagging 15 seconds, and those 15 seconds of transactions were gone. That incident pushed me to find a better approach. I’ve also dealt with database corruption at 3 AM and had to restore from backup — ever since, I check backups daily and never use pure async for transactional databases.

Semi-synchronous Replication was built to solve exactly this problem — and it’s been built into MySQL since version 5.5, with no additional software required.

Three Replication Approaches — Which One Fits?

Before diving into configuration, you need to understand what you’re choosing between and the trade-offs each option involves:

1. Asynchronous Replication (Default)

Master commits → immediately returns result to the client. The slave receives the binlog whenever it gets around to it. This is the default configuration for a standard master-slave setup.

  • Speed: Fastest — master is never blocked by the slave
  • Risk: Data loss if the master crashes before the slave receives the binlog
  • Best for: Pure read replicas, reporting slaves, applications that can tolerate losing a few seconds of data

2. Semi-synchronous Replication

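With the default wait point (AFTER_SYNC, MySQL 5.7+), the master writes the transaction to its binlog → waits for at least one slave to acknowledge that it has written the events to its relay log → then commits in the storage engine and returns the result to the client. The slave doesn’t need to have applied the transaction yet; receiving it into the relay log is sufficient.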

  • Speed: Slightly slower than async — one additional network round-trip, typically 1–5ms on LAN
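  • Safety: No loss of acknowledged transactions while semi-sync is active, because the data exists in at least two places before the client sees a successful commit. If the master falls back to async after a timeout, that guarantee is suspended until a slave reconnects and catches up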
  • Best for: Production systems requiring high reliability without the complexity of Group Replication

3. Group Replication / InnoDB Cluster

Single-primary by default with an optional multi-primary mode, automatic failover, built on a Paxos-based consensus protocol. The most powerful solution but also the most operationally complex.

  • Speed: Depends on node count and network latency — typically slower than semi-sync
  • Safety: Highest — automatically handles split-brain and failover
  • Best for: Teams needing fully automated HA with experience operating distributed systems

Semi-sync Pros and Cons in Practice

After running all three approaches in real production environments, here’s the honest balance for each solution:

  • Data loss risk: Async = yes | Semi-sync = nearly none | Group Replication = none
  • Write latency overhead: Async = 0ms | Semi-sync = +1–5ms (LAN) | Group Replication = +5–20ms
  • Setup complexity: Async = low | Semi-sync = low | Group Replication = high
  • Automatic failover: Async = no | Semi-sync = no | Group Replication = yes
  • Minimum nodes: Async = 2 | Semi-sync = 2 | Group Replication = 3 (odd number)

Practical conclusion: Semi-sync is the best balance point for most production use cases — zero data loss with minimal setup effort. If your team isn’t ready to operate Group Replication, semi-sync is the right choice.

When to Use (and Avoid) Semi-sync

Semi-sync is a good fit when at least one of the following conditions applies:

  • Your database stores financial transactions or orders — data loss is absolutely unacceptable
  • You already have async master-slave running and want to upgrade without rebuilding from scratch
  • Network latency between master and slave is under 10ms (same datacenter or availability zone)
  • Your team doesn’t have experience operating Group Replication

When to avoid it: When master and slave are in different datacenters with high latency (every write slows down proportionally), or when your application has extremely high write throughput — thousands of TPS — that can’t tolerate even a few milliseconds of added latency.
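
To put rough numbers on "a few milliseconds", here is a back-of-envelope sketch. It assumes a single connection issuing commits strictly one after another; the function name and the sample values are my own illustration, not measurements from a real system.

```shell
#!/bin/sh
# Rough ceiling on commits per second for ONE connection that commits
# back to back, given the base commit time and the added semi-sync ACK wait.
# Both arguments are in milliseconds. Name and numbers are illustrative.
serial_commit_ceiling() {
  awk -v base="$1" -v ack="$2" 'BEGIN { printf "%d\n", 1000 / (base + ack) }'
}

# Examples:
#   serial_commit_ceiling 1 0    # async baseline, 1 ms commit
#   serial_commit_ceiling 1 2    # same commit plus a 2 ms ACK on LAN
#   serial_commit_ceiling 1 40   # ACK from another datacenter at 40 ms
```

Those three calls print 1000, 333 and 24. The ceiling is per connection: many concurrent connections overlap their ACK waits, so total throughput can be far higher, but any workload that commits serially feels the full added latency on every write.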

Configuring MySQL Semi-synchronous Replication

Prerequisites

  • MySQL 5.7+ (MySQL 8.0 recommended)
  • Existing async master-slave replication running stably
  • InnoDB tables (the durability guarantees discussed here assume a transactional storage engine)
  • Stable, low-latency network connection between master and slave

If you don’t have basic replication set up yet, read “Configuring MySQL Master-Slave Replication” first, then come back here. Semi-sync is simply an additional layer on top of an existing configuration.

Step 1: Enable the Semi-sync Plugin on the Master

Semi-sync uses a built-in MySQL plugin — no external packages needed.

-- Connect to the MySQL master
-- Install the semi-sync plugin for master
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

-- Enable semi-sync
SET GLOBAL rpl_semi_sync_master_enabled = 1;

-- Timeout (ms) before falling back to async if the slave doesn't respond
-- 10000ms = 10 seconds, adjust based on your requirements
SET GLOBAL rpl_semi_sync_master_timeout = 10000;

-- Minimum number of slaves that must ACK before the master commits (default is 1)
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;

-- Verify the configuration just set
SHOW VARIABLES LIKE 'rpl_semi_sync%';
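
Beyond SHOW VARIABLES, it can help to confirm that the plugin itself is active and to see which wait point the master uses (AFTER_SYNC is the default from MySQL 5.7; AFTER_COMMIT is the older behavior). The following is a minimal sketch, assuming the mysql client can log in without a prompt (for example through ~/.my.cnf) and that you are on the pre-8.0.26 plugin names. The function name is my own.

```shell
#!/bin/sh
# Sketch: confirm the master-side plugin is ACTIVE and report the wait point.
# Assumes password-less client login (e.g. ~/.my.cnf) and the
# rpl_semi_sync_master_* names used on MySQL versions before 8.0.26.
check_semisync_master() {
  plugin_status=$(mysql -N -e "SELECT PLUGIN_STATUS FROM information_schema.PLUGINS WHERE PLUGIN_NAME = 'rpl_semi_sync_master';")
  if [ "$plugin_status" != "ACTIVE" ]; then
    echo "plugin rpl_semi_sync_master is not active"
    return 1
  fi
  wait_point=$(mysql -N -e "SELECT @@GLOBAL.rpl_semi_sync_master_wait_point;")
  echo "plugin active, wait point: $wait_point"
}

# Example: check_semisync_master || echo "fix the plugin before continuing"
```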

To persist the configuration across restarts, add the following to /etc/mysql/mysql.conf.d/mysqld.cnf on the master:

[mysqld]
plugin-load-add = semisync_master.so
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 10000
rpl_semi_sync_master_wait_for_slave_count = 1

Step 2: Enable the Semi-sync Plugin on the Slave

Repeat the following on each slave node:

-- Connect to the MySQL slave
-- Install the semi-sync plugin for slave
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

-- Enable semi-sync slave
SET GLOBAL rpl_semi_sync_slave_enabled = 1;

-- Restart the IO thread to reconnect to the master in semi-sync mode
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;

Add the following to mysqld.cnf on the slave to persist:

[mysqld]
plugin-load-add = semisync_slave.so
rpl_semi_sync_slave_enabled = 1

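Note for MySQL 8.0.26+: New plugins using source/replica terminology replace the master/slave ones, which are deprecated. The system and status variable names change to match (for example rpl_semi_sync_source_enabled and Rpl_semi_sync_source_status), so adjust the other commands in this article accordingly: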

-- On MySQL 8.0.26 and later
-- Master node
INSTALL PLUGIN rpl_semi_sync_source SONAME 'semisync_source.so';
SET GLOBAL rpl_semi_sync_source_enabled = 1;

-- Slave node
INSTALL PLUGIN rpl_semi_sync_replica SONAME 'semisync_replica.so';
SET GLOBAL rpl_semi_sync_replica_enabled = 1;
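
If you script this across servers on mixed versions, you need to pick the right name prefix per server. Here is a minimal sketch of that decision. It assumes an Oracle MySQL version string such as the one returned by SELECT VERSION() (for example 8.0.26 or 5.7.44-log); the function name is hypothetical, and other forks with different version numbering would need their own rule.

```shell
#!/bin/sh
# Sketch: choose the semi-sync name prefix for the master/source side
# from a MySQL version string (e.g. the output of SELECT VERSION()).
# Assumes Oracle MySQL numbering: major.minor.patch with an optional suffix.
semisync_prefix() {
  sp_ver=${1%%-*}          # drop a suffix such as "-log"
  sp_major=${sp_ver%%.*}
  sp_rest=${sp_ver#*.}
  sp_minor=${sp_rest%%.*}
  sp_patch=${sp_rest#*.}
  sp_key=$(($sp_major * 10000 + $sp_minor * 100 + $sp_patch))
  if [ "$sp_key" -ge 80026 ]; then
    echo "rpl_semi_sync_source"
  else
    echo "rpl_semi_sync_master"
  fi
}

# Example: prefix=$(semisync_prefix "$(mysql -N -e 'SELECT VERSION();')")
```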

Step 3: Verify Semi-sync Is Working

-- Run on the MASTER
SHOW STATUS LIKE 'Rpl_semi_sync%';

-- Key values to check:
-- Rpl_semi_sync_master_status  = ON   -- Semi-sync is active
-- Rpl_semi_sync_master_clients = 1    -- One slave connected via semi-sync
-- Rpl_semi_sync_master_yes_tx  = 100  -- Transactions committed with ACK from slave
-- Rpl_semi_sync_master_no_tx   = 0    -- Transactions that fell back to async (should be 0)

-- Run on the SLAVE
SHOW STATUS LIKE 'Rpl_semi_sync_slave_status';

-- Expected output:
-- Rpl_semi_sync_slave_status = ON

If Rpl_semi_sync_master_clients = 0, the slave hasn’t reconnected after enabling semi-sync. Run STOP SLAVE IO_THREAD; START SLAVE IO_THREAD; on the slave again.
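
After restarting the IO thread, the client count can take a moment to update. A small polling helper saves re-running the status query by hand. This is a sketch under the same assumptions as the rest of the article (pre-8.0.26 status names) plus password-less client login; the function name and defaults are mine.

```shell
#!/bin/sh
# Sketch: poll the master until at least one semi-sync replica is connected.
# Assumes password-less client login (e.g. ~/.my.cnf) and pre-8.0.26 names.
# $1 = max attempts (default 10), $2 = seconds between attempts (default 1).
wait_for_semisync_client() {
  attempts=${1:-10}
  pause=${2:-1}
  while [ "$attempts" -gt 0 ]; do
    clients=$(mysql -N -e "SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Rpl_semi_sync_master_clients';")
    case "$clients" in
      ''|*[!0-9]*) clients=0 ;;   # empty or non-numeric: treat as not connected
    esac
    if [ "$clients" -ge 1 ]; then
      echo "semi-sync clients: $clients"
      return 0
    fi
    attempts=$(($attempts - 1))
    sleep "$pause"
  done
  echo "no semi-sync client connected"
  return 1
}

# Example: wait_for_semisync_client 30 2
```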

Step 4: Real-world Test — Simulating a Slave Offline

Test to confirm semi-sync is behaving as expected:

# Terminal 1: On the slave — stop the IO thread to simulate a disconnected slave
mysql -u root -p -e "STOP SLAVE IO_THREAD;"

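# Terminal 2: On the master, run an insert and measure response time
# Note: -p prompts for the password, and the time spent typing it counts toward
# the measurement; use ~/.my.cnf or a login path if you want a clean number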
time mysql -u root -p -e "
  INSERT INTO testdb.orders (user_id, amount, created_at)
  VALUES (1, 99.99, NOW());
"
# This command will block for about 10 seconds (matching the timeout set earlier) before succeeding
# The master automatically falls back to async after the timeout expires

# Terminal 1: Restart the slave IO thread
mysql -u root -p -e "START SLAVE IO_THREAD;"

# Check how many transactions fell back to async on the master
mysql -u root -p -e "SHOW STATUS LIKE 'Rpl_semi_sync_master_no_tx';"
# The value will increment by 1 — this is the transaction that just ran in async mode
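
If you repeat this test often, a tiny wrapper gives you the elapsed time without the password prompt skewing it. This is only a sketch: it assumes the client can log in without prompting (for example via ~/.my.cnf), it measures whole seconds, and the function name is my own.

```shell
#!/bin/sh
# Sketch: run one statement on the master and print the elapsed whole seconds.
# Assumes password-less client login (e.g. ~/.my.cnf), so no prompt time
# is included in the measurement.
timed_statement_seconds() {
  ts_start=$(date +%s)
  mysql -e "$1" || return 1
  ts_end=$(date +%s)
  echo $(($ts_end - $ts_start))
}

# Example, with the slave IO thread stopped as in the test above:
#   timed_statement_seconds "INSERT INTO testdb.orders (user_id, amount, created_at) VALUES (1, 99.99, NOW());"
```

With the slave disconnected and the 10000 ms timeout from Step 1, the printed value should land around 10. A second run right after should return almost immediately, because the master has already switched to async.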

Understanding Semi-sync Fallback and How to Monitor It

When a slave doesn’t respond within the timeout period (rpl_semi_sync_master_timeout), the master automatically falls back to async to avoid blocking the application. This is intentional behavior — it prevents production from locking up due to a slave issue.

However, during that fallback window, the system is running in async mode and the risk of data loss returns. You need to monitor this metric continuously:

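#!/bin/bash
# Script to monitor semi-sync status; run via cron every minute.
# Status names below are the pre-8.0.26 ones (Rpl_semi_sync_master_*).
# Storing the password in an option file is safer than passing it with -p.
MYSQL_CMD=(mysql -u monitor_user -pYOUR_PASSWORD -N -e)

STATUS=$("${MYSQL_CMD[@]}" "SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='Rpl_semi_sync_master_status';")
NO_TX=$("${MYSQL_CMD[@]}" "SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='Rpl_semi_sync_master_no_tx';")

# An empty STATUS also lands here (query failed or plugin not loaded)
if [ "$STATUS" != "ON" ]; then
  echo "ALERT: Semi-sync is not ON (status: '${STATUS:-unknown}'); the master may be running in async mode!"
  # Add your Slack/Telegram/PagerDuty alert command here
fi

# NO_TX is cumulative since server start; skip the check if it is empty or non-numeric
if [[ "$NO_TX" =~ ^[0-9]+$ ]] && [ "$NO_TX" -gt 0 ]; then
  echo "WARNING: $NO_TX transaction(s) have fallen back to async since the server started"
fi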

Add to crontab:

* * * * * /opt/scripts/check_semisync.sh >> /var/log/semisync_monitor.log 2>&1
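
One caveat with a per-minute check: Rpl_semi_sync_master_no_tx is a cumulative counter since server start, so after a single fallback it stays above 0 until the next restart. Alerting on the increase between runs is usually more useful. Below is a minimal sketch of that idea. The function name and the state file path in the example are hypothetical, and a value lower than the stored one is treated as a counter reset.

```shell
#!/bin/sh
# Sketch: report how much a cumulative counter grew since the previous run.
# $1 = current counter value, $2 = path of a state file (hypothetical location).
# A current value lower than the stored one is treated as a counter reset.
counter_increase() {
  ci_current=$1
  ci_state_file=$2
  ci_previous=0
  if [ -f "$ci_state_file" ]; then
    ci_previous=$(cat "$ci_state_file")
  fi
  case "$ci_previous" in
    ''|*[!0-9]*) ci_previous=0 ;;   # missing or corrupt state: start from 0
  esac
  echo "$ci_current" > "$ci_state_file"
  if [ "$ci_current" -lt "$ci_previous" ]; then
    echo "$ci_current"
  else
    echo $(($ci_current - $ci_previous))
  fi
}

# Example inside the monitoring script:
#   GROWTH=$(counter_increase "$NO_TX" /var/tmp/semisync_no_tx.last)
#   [ "$GROWTH" -gt 0 ] && echo "WARNING: $GROWTH new async fallback(s)"
```

On the very first run there is no stored value, so the full counter is reported once; after that only new fallbacks trigger a warning.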

Summary

Semi-synchronous Replication is a straightforward upgrade from default async replication that delivers zero data loss without rebuilding your infrastructure or learning Group Replication. If you already have master-slave in place, you can enable it in under 10 minutes.

Two metrics to monitor daily: Rpl_semi_sync_master_status must be ON, and Rpl_semi_sync_master_no_tx should not increase (it is a cumulative counter since server start, so watch for growth rather than the absolute value). When either of these looks off, your system is running in a less safe mode and needs immediate investigation. Don’t wait until 3 AM to find out.
