Configuring MySQL Active-Active Replication: Bidirectional Data Sync and Concurrent Write Conflict Handling

MySQL tutorial - IT technology blog
MySQL tutorial - IT technology blog

Quick Start — Setup Master-Master in 10 Minutes

I’ll demo this with 2 Ubuntu 22.04 servers, both running MySQL 8.0. Replace the IP addresses to match your environment:

  • Node1: 192.168.1.10
  • Node2: 192.168.1.20

Step 1: Create replication users on both nodes

Run the following on Node1:

CREATE USER 'repl_user'@'192.168.1.20' IDENTIFIED BY 'StrongPassword123!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'192.168.1.20';
FLUSH PRIVILEGES;

Do the same on Node2 — just swap the IP to 192.168.1.10:

CREATE USER 'repl_user'@'192.168.1.10' IDENTIFIED BY 'StrongPassword123!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'192.168.1.10';
FLUSH PRIVILEGES;

Step 2: Configure my.cnf

On Node1 (/etc/mysql/mysql.conf.d/mysqld.cnf):

[mysqld]
server-id                = 1
log_bin                  = /var/log/mysql/mysql-bin.log
binlog_format            = ROW
relay_log                = /var/log/mysql/relay-bin.log
log_slave_updates        = ON
auto_increment_increment = 2
auto_increment_offset    = 1

On Node2 — only server-id and auto_increment_offset differ:

[mysqld]
server-id                = 2
log_bin                  = /var/log/mysql/mysql-bin.log
binlog_format            = ROW
relay_log                = /var/log/mysql/relay-bin.log
log_slave_updates        = ON
auto_increment_increment = 2
auto_increment_offset    = 2
sudo systemctl restart mysql

Step 3: Capture the binlog position and activate replication

On Node1, temporarily lock the tables to get an accurate binlog position:

FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;

The output will look something like this — note down File and Position:

+------------------+----------+
| File             | Position |
+------------------+----------+
| mysql-bin.000001 |      154 |
+------------------+----------+

Then configure Node2 to point to Node1:

-- Run on Node2
CHANGE MASTER TO
  MASTER_HOST     = '192.168.1.10',
  MASTER_USER     = 'repl_user',
  MASTER_PASSWORD = 'StrongPassword123!',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS  = 154;

START SLAVE;
SHOW SLAVE STATUS\G

Repeat in the reverse direction — get Node2’s binlog position, then configure Node1 to point to Node2. Remember to unlock the tables first:

-- On Node1: unlock after capturing Node2's position
UNLOCK TABLES;

-- Configure Node1 as a slave of Node2
CHANGE MASTER TO
  MASTER_HOST     = '192.168.1.20',
  MASTER_USER     = 'repl_user',
  MASTER_PASSWORD = 'StrongPassword123!',
  MASTER_LOG_FILE = 'mysql-bin.000001',  -- position captured from Node2
  MASTER_LOG_POS  = 154;

START SLAVE;

Verify the status — both of these must show Yes:

SHOW SLAVE STATUS\G
-- Slave_IO_Running: Yes
-- Slave_SQL_Running: Yes

Understanding the Underlying Mechanism

At its core, Master-Master is simply two Master-Slave relationships running in opposite directions. Node1 is simultaneously a master (writing its binlog) and a slave (receiving Node2’s binlog). Node2 works the same way. Whatever you write to either node, the other receives it — there’s no distinction between them.

Why do auto_increment_increment and auto_increment_offset matter so much?

This is the most overlooked part — and the most painful if you skip it. Without these two settings, both nodes generate AUTO_INCREMENT values starting from 1. Node1 creates ID=1, Node2 also creates ID=1, and replication immediately halts with a duplicate key error.

With auto_increment_increment=2:

  • Node1 (offset=1) generates: 1, 3, 5, 7, 9…
  • Node2 (offset=2) generates: 2, 4, 6, 8, 10…

The two sequences never collide. Scaling up to 3 nodes? Set increment=3 with offsets 1, 2, and 3 respectively.

Advanced — Handling Concurrent Write Conflicts

Write conflicts are the hardest problem in Active-Active setups. Imagine Node1 and Node2 simultaneously UPDATE the same row:

-- Node1 executes
UPDATE products SET stock = 10 WHERE id = 100;

-- Node2 executes AT THE SAME TIME
UPDATE products SET stock = 5 WHERE id = 100;

Whichever node replicates last wins. This behavior is called last-write-wins — no error, no rollback, data is silently overwritten. MySQL does not intervene.

Detecting when replication is broken

# Quick check on both nodes
mysql -u root -p -e "SHOW SLAVE STATUS\G" | grep -E "Slave_SQL_Running|Last_Error|Seconds_Behind"

Seeing Slave_SQL_Running: No? Replication has stopped. Read Last_Error to identify the cause, then skip the current error if needed:

-- Skip the current error and continue (use with caution)
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

Automatically skipping duplicate key errors

To skip duplicate key errors (error code 1062) instead of letting replication halt entirely, add this to my.cnf:

[mysqld]
# Only use this when you fully understand the trade-off — data may become inconsistent
slave_skip_errors = 1062

Architectural solution: prevent conflicts at the source

My production database runs MySQL 8.0 with around 50GB of data. The approach to conflict handling isn’t detect-and-fix — it’s smart routing from the start. Specifically: using ProxySQL to route all writes to a single active node. The other node handles reads only. When the active node fails, writes switch to the other node within seconds. This is far simpler than dealing with conflicts after they occur.

Other common patterns to avoid conflicts:

  • Partition by tenant/region: User A always writes to Node1, User B always writes to Node2. The two nodes never write to the same row.
  • Application-level locking: Use Redis SETNX to acquire a distributed lock before writing, ensuring only one writer at a time.

Practical Tips

Use GTID for easier management

MySQL 5.6+ supports GTID (Global Transaction Identifier) — a complete replacement for manually tracking binlog file and position. Failover and recovery become significantly easier:

[mysqld]
gtid_mode                = ON
enforce_gtid_consistency = ON

When using GTID, drop MASTER_LOG_FILE and MASTER_LOG_POS entirely:

CHANGE MASTER TO
  MASTER_HOST          = '192.168.1.10',
  MASTER_USER          = 'repl_user',
  MASTER_PASSWORD      = 'StrongPassword123!',
  MASTER_AUTO_POSITION = 1;  -- MySQL automatically finds the correct position

Continuously monitor replication lag

# Watch lag in real-time, refreshing every 5 seconds
watch -n 5 'mysql -u root -p -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master'

A steadily increasing Seconds_Behind_Master means the slave can’t keep up with the master. Common causes: disk I/O bottleneck, underpowered CPU, or too many concurrent writes — lag exceeding 30 seconds usually warrants immediate investigation.

Pre-production checklist

  • ✅ Both nodes running the same MySQL version
  • binlog_format = ROW — avoid STATEMENT as it easily causes inconsistency with functions like NOW(), RAND()
  • auto_increment_increment and auto_increment_offset configured differently across nodes
  • ✅ Firewall allows port 3306 between both nodes in both directions
  • ✅ Monitoring alert in place when Slave_SQL_Running = No
  • ✅ Runbook ready for handling split-brain when the network between nodes goes down

When should you avoid Master-Master?

Master-Master isn’t always the right choice. Avoid it when:

  • Your application frequently UPDATEs the same rows from multiple nodes — conflicts will be constant
  • You need strong consistency — Active-Active is eventually consistent by nature
  • Your team lacks replication experience — troubleshooting Master-Master is significantly harder than Master-Slave

In those cases, MySQL Group Replication or Galera Cluster are better fits. Both have built-in conflict detection and resolution. The trade-off: slightly higher write latency due to the synchronous consensus required between nodes.

Share: