MySQL Isolation Levels: Don’t Let Transactions Ruin Your Data

MySQL tutorial - IT technology blog
MySQL tutorial - IT technology blog

When Data Suddenly… Doesn’t Add Up

When I first managed an e-commerce site with a scale of about 500 orders per minute, I encountered a “spooky” bug that kept the whole team awake. The system ran perfectly in local testing. However, when pushed to production with an orders table exceeding 10 million rows, irrational cases began to appear. Customers were charged, but orders still showed “Pending payment,” or the same inventory report would show different numbers every time I refreshed the page.

I realized that just using BEGIN and COMMIT wasn’t enough. When hundreds of transactions hit the same data table simultaneously, a lack of control over the Isolation Level leads to database chaos. We need to clearly understand how MySQL manages sessions to avoid concurrency issues.

Three Classic Phenomena Causing Data Inconsistency

Before configuring, you need to clearly distinguish the three issues that every backend developer must face.

1. Dirty Read

This phenomenon occurs when Transaction A reads a row that Transaction B is modifying but has not yet COMMITTED. Imagine Customer A just deposited $500; the system has only temporarily written it to the DB. If Transaction B reads that $500, but Transaction A subsequently fails and undergoes a ROLLBACK, Customer A actually still has $0. Transaction B has read “dirty” or garbage data.

2. Non-repeatable Read

You SELECT a balance for the first time and see $100. Immediately after, another process UPDATEs this balance to $50 and COMMITs. You SELECT a second time (still within the old transaction) and find it has changed to $50. Data changing in the middle of a single transaction often breaks calculation and accounting logic.

3. Phantom Read

Unlike modifying existing rows, Phantom Reads involve INSERTs or DELETEs. For example: You SELECT COUNT(*) to count orders and see 10. A second later, someone inserts a new order. You SELECT again and see 11. These “phantom” rows are the main culprits for discrepancies in end-of-period statistical reports.

4 Isolation Levels in MySQL and How They Work

MySQL (via the InnoDB engine) provides four customizable levels. The higher the level, the safer the data, but the price is a significant drop in performance due to locking mechanisms.

Level 1: READ UNCOMMITTED

This is the loosest level. Transactions can see each other’s uncommitted changes. In practice, I rarely use this level because the risk of Dirty Reads is too high.

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
-- You will see temporary changes from others
SELECT balance FROM accounts WHERE id = 1;

Level 2: READ COMMITTED

This is the default choice for SQL Server or PostgreSQL. It ensures you only read what has been COMMITTED. This level prevents Dirty Reads, but still allows Non-repeatable Reads and Phantom Reads.

Level 3: REPEATABLE READ (MySQL Default)

InnoDB uses a very smart MVCC (Multi-Version Concurrency Control) mechanism. Within a transaction, no matter how many times you SELECT, the data remains identical to the first time. MySQL also uses a technique called Next-Key Locking to prevent most cases of Phantom Reads. This is a major advantage that helps MySQL balance speed and safety.

Level 4: SERIALIZABLE

This is the most extreme level. Transactions must be executed sequentially. If Transaction A is reading data range X, Transaction B must wait if it wants to insert into it. Processing speed will plummet if the system has a high volume of requests.

Practical Guide: Checking and Changing Isolation Levels

To check the current mode, use the command:

SELECT @@transaction_isolation;

If you are processing a sensitive financial transaction, you can force that session to run at the highest level:

SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
-- Execute payment logic
UPDATE accounts SET balance = balance - 100 WHERE user_id = 10;
COMMIT;

Practical Experience: Which Level to Choose for Optimization?

Here are a few rules I’ve gathered after years of working with large databases:

  • Prioritize REPEATABLE READ: Suitable for 90% of web applications. MySQL has optimized this exceptionally well using MVCC, so you don’t need to worry too much about speed.
  • When to use READ COMMITTED? When you have massive data tables and can accept slight data changes in exchange for logging speed or to minimize Deadlocks.
  • Avoid SERIALIZABLE if possible: Only use it for extremely strict banking cores. If you apply this to an e-commerce site during a Flash Sale, the system will definitely hang as transactions wait for each other’s locks.

An important note: When a table exceeds 10 million rows, SELECT statements lacking an Index will cause Gap Locking. The database will lock a large range of data, preventing other processes from INSERTing. Always check the Index for columns used in WHERE clauses.

Summary

Choosing an Isolation Level is not just about changing a configuration; it’s a trade-off between accuracy and performance. Don’t just leave it to the defaults. Base your choice on the specific nature of your data. If your system is experiencing Deadlock errors or data inconsistencies during multi-threaded operations, reviewing the Isolation Level should be your first step.

Share: