Problem Introduction: When the Database “Groans”
For system administrators, a slow database is a nightmare. Everything lags, users complain, and the team struggles to find the cause.
I vividly remember one project that scaled up until the users table swelled to over 10 million rows. Simple queries, such as searching for users by name or email, suddenly slowed to a crawl. That was when I realized that understanding Indexes and reading EXPLAIN output was no longer optional: it had become a requirement for keeping the system from collapsing under load.
I will share practical experience on how our team used Indexes and EXPLAIN to “rescue” MySQL performance. I hope this knowledge is useful for you all, especially beginners or anyone facing similar issues.
Core Concepts: Index and EXPLAIN
What is an Index and why do we need it?
Imagine an Index as the table of contents in a book. When you need to find specific information, you don’t have to read the entire book. Instead, the table of contents will point you exactly to the page you need. Indexes in a database work the same way.
- Definition: An Index is a special data structure (often B-Tree) created by the database engine. It helps speed up data retrieval and queries. An Index stores a set of values from one or more columns in a table, along with pointers to the actual location of the data rows.
- Key Benefits:
  - Increased query speed: This is the primary benefit. With an Index, MySQL quickly locates data rows without needing to scan the entire table (full table scan).
  - Faster data sorting: ORDER BY and GROUP BY clauses use Indexes to sort and group data more quickly. This reduces or eliminates costly “filesort” operations.
  - Ensuring uniqueness: Indexes like PRIMARY KEY and UNIQUE INDEX not only speed things up but also ensure that the data in a column is unique.
- Downsides of Indexes:
  - Storage consumption: Indexes occupy disk space.
  - Reduced write speed: Every time data is added (INSERT), modified (UPDATE), or deleted (DELETE), the database must update both the main table and its associated Indexes. This increases I/O costs and reduces write performance.
  - Management overhead: MySQL requires resources to maintain Indexes.
Therefore, weigh the cost of each Index carefully before creating it; don’t create them indiscriminately.
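To make the storage cost concrete, the query below is a minimal sketch that compares data size and index size for each table in the current schema. It relies on the standard information_schema.tables view; the aliases data_mb and index_mb are just names chosen for this example. For InnoDB, the sizes are approximate, and data_length includes the clustered (primary key) index, so index_length reflects secondary indexes only.

```sql
-- Approximate data vs. index size per table in the current database.
-- The figures come from table statistics, so treat them as estimates.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 1) AS data_mb,   -- rows + clustered index
       ROUND(index_length / 1024 / 1024, 1) AS index_mb   -- secondary indexes
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY index_length DESC;
```

If index_mb approaches or exceeds data_mb for a table, that is a hint (not proof) that some of its Indexes deserve a second look.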
Common Index Types in MySQL
In MySQL, there are several main Index types commonly used:
- PRIMARY KEY: The primary Index. Ensures each row in the table is unique and cannot be NULL. Each table can only have one PRIMARY KEY.
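- UNIQUE INDEX: Similar to PRIMARY KEY, but allows NULL values (if the column permits). Because NULLs are not considered equal to each other, several rows may hold NULL in the same column. All non-NULL values must be unique.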
- NORMAL INDEX (or Non-Unique Index): A regular Index. Allows duplicate values and NULLs. Used for columns frequently searched.
- FULLTEXT INDEX: Used for full-text search. The blog already has a separate article on this, so I won’t go into detail here.
- COMPOSITE INDEX (Multi-column Index): An Index created on multiple columns. The order of columns in this Index is extremely important.
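As a rough illustration of how the types listed above are declared, here is a sketch on a hypothetical customers table. The table, its columns, and the index names are invented for this example and are not part of the project described in this article.

```sql
-- Hypothetical table, used only to show the declaration syntax of each index type.
CREATE TABLE customers (
    id          INT AUTO_INCREMENT,
    email       VARCHAR(255) NOT NULL,
    last_name   VARCHAR(100) NOT NULL,
    first_name  VARCHAR(100) NOT NULL,
    signup_date DATE,
    bio         TEXT,
    PRIMARY KEY (id),                          -- one per table, unique, never NULL
    UNIQUE INDEX uq_email (email),             -- rejects duplicate emails
    INDEX idx_signup_date (signup_date),       -- regular index, duplicates and NULLs allowed
    INDEX idx_name (last_name, first_name),    -- composite index, column order matters
    FULLTEXT INDEX ft_bio (bio)                -- full-text search on a text column
) ENGINE = InnoDB;

-- List the indexes that were created, one row per indexed column.
SHOW INDEX FROM customers;
```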
What is EXPLAIN and how does it help?
If an Index is a weapon, EXPLAIN is the analysis tool to know if that weapon is effective. The EXPLAIN command in MySQL displays the execution plan of an SQL statement. It shows how MySQL accesses data, whether it uses Indexes, and how many steps it takes to complete the query.
Simple syntax:
EXPLAIN SELECT * FROM your_table WHERE your_column = 'value';
The result of EXPLAIN is a table with several important information columns:
- id: The sequence number of the steps in the query.
- select_type: The type of query (SIMPLE, PRIMARY, SUBQUERY, UNION, etc.).
- table: The name of the table being processed by the query.
- type: This is an extremely important column, indicating how MySQL accesses data. Good values are system, const, eq_ref, ref, range. Poor values include index (scanning the entire Index) and ALL (scanning the entire table, which is very bad).
- possible_keys: Indexes that MySQL could potentially use.
- key: The actual Index that MySQL chose to use.
- key_len: The length (in bytes) of the Index part used.
- ref: The columns used with the key for searching.
- rows: The estimated number of rows MySQL will have to examine to find the result. Smaller is better.
- Extra: Extremely useful additional information about how MySQL processes the query, for example: “Using filesort”, “Using temporary”, “Using index”.
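Besides the tabular output, MySQL offers two other forms of EXPLAIN that can be worth trying. The sketch below reuses the placeholder names your_table and your_column from the syntax example. Note that EXPLAIN ANALYZE requires MySQL 8.0.18 or later and actually executes the query, so be careful with it on expensive statements.

```sql
-- The same plan as a JSON document, including the optimizer's cost estimates.
EXPLAIN FORMAT=JSON
SELECT * FROM your_table WHERE your_column = 'value';

-- MySQL 8.0.18+: runs the query and reports measured times and row counts
-- next to the estimates, which helps spot badly wrong estimates.
EXPLAIN ANALYZE
SELECT * FROM your_table WHERE your_column = 'value';
```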
Detailed Practice: Using EXPLAIN and Creating Indexes
To make it easier to visualize, I’ll use a specific example. Suppose I have the following products table:
-- No secondary Indexes yet; they are added step by step below
CREATE TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(255) NOT NULL,
    category_id INT NOT NULL,
    price DECIMAL(10, 2) NOT NULL,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO products (product_name, category_id, price) VALUES
('Laptop Dell XPS 15', 1, 1500.00),
('Smartphone Samsung S23', 2, 999.00),
('Smart TV Sony 55 inch', 3, 750.00),
('Logitech Mechanical Keyboard', 1, 120.00),
('Razer Gaming Mouse', 1, 70.00),
('Electrolux Washing Machine', 4, 600.00),
('Panasonic Refrigerator', 4, 850.00),
('JBL Bluetooth Speaker', 2, 150.00),
('Sony WH-1000XM5 Headphones', 2, 350.00),
('Dell UltraSharp Monitor', 1, 450.00);
-- Add a lot of dummy data to simulate a large table
DELIMITER //
CREATE PROCEDURE InsertDummyProducts()
BEGIN
    DECLARE i INT DEFAULT 0;
    -- One transaction for the whole loop avoids a commit per row
    START TRANSACTION;
    WHILE i < 1000000 DO -- Add 1 million products
        INSERT INTO products (product_name, category_id, price) VALUES
            (CONCAT('Product ', FLOOR(RAND() * 1000000)),  -- random product name
             FLOOR(1 + RAND() * 4),                         -- category_id from 1 to 4
             ROUND(RAND() * 2000 + 50, 2));                 -- price from 50.00 to 2050.00
        SET i = i + 1;
    END WHILE;
    COMMIT;
END //
DELIMITER ;
CALL InsertDummyProducts();
Now, suppose I want to find all products with prices within a certain range but without an Index on the price column yet.
Query Analysis with EXPLAIN (before creating an Index)
I run the following query:
EXPLAIN SELECT product_name, price FROM products WHERE price BETWEEN 100 AND 200;
The result might look like this:
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | products | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 1000010 |    11.11 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
Here the type column is ALL, key is NULL, and rows reaches 1,000,010. Together these indicate that MySQL is performing a full table scan: it reads roughly 1 million rows to find the products that satisfy price BETWEEN 100 AND 200. On large tables, this is a major cause of slow queries.
Creating and Managing Indexes for Optimization
Now I’ll create an Index on the price column to improve performance:
CREATE INDEX idx_price ON products (price);
After creating the Index, run the EXPLAIN command again for the previous query:
EXPLAIN SELECT product_name, price FROM products WHERE price BETWEEN 100 AND 200;
The results change significantly:
+----+-------------+----------+------------+-------+---------------+-----------+---------+------+-------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key       | key_len | ref  | rows  | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+-----------+---------+------+-------+----------+-------------+
|  1 | SIMPLE      | products | NULL       | range | idx_price     | idx_price | 5       | NULL | 10000 |   100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+-----------+---------+------+-------+----------+-------------+
Now, the type column is range (much better than ALL). The key column shows idx_price (MySQL used the Index!). Most importantly, the rows estimate has dropped to about 10,000. The exact figure is an estimate from index statistics and will vary with your data, but it shows that MySQL now scans only a small portion of the Index instead of the entire table.
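Since the rows column is an estimate derived from index statistics, it can help to inspect the Index and refresh those statistics after a bulk load. The statements below are a small sketch using the products table and the price range from this example; comparing the real count with the estimate is my own suggestion, not a required step.

```sql
-- Show the indexes on the table, including the Cardinality estimate per index.
SHOW INDEX FROM products;

-- Recalculate index statistics, for example after inserting a large batch of rows.
ANALYZE TABLE products;

-- Compare the optimizer's estimate with the real number of matching rows.
SELECT COUNT(*) FROM products WHERE price BETWEEN 100 AND 200;
```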
Composite Index (Multi-column Index)
Sometimes, the WHERE condition involves multiple columns. For example, to find products in a specific category and within a price range:
EXPLAIN SELECT product_name FROM products WHERE category_id = 1 AND price BETWEEN 100 AND 500;
If only idx_price exists, MySQL might use it, but it wouldn’t be the most optimal. MySQL might still need to filter further by category_id on the rows already found. A Composite Index on (category_id, price) is much more effective:
CREATE INDEX idx_category_price ON products (category_id, price);
Now, run EXPLAIN again:
EXPLAIN SELECT product_name FROM products WHERE category_id = 1 AND price BETWEEN 100 AND 500;
You will see that the key is idx_category_price, and the type is range or ref. The number of rows will decrease even more significantly because MySQL uses this Index to filter both conditions simultaneously.
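Important note on column order in Composite Indexes: MySQL uses a Composite Index from left to right, so what matters is the order of columns in the Index definition, not the order in which the conditions are written in the WHERE clause. An Index on (col1, col2) is useful for queries using WHERE col1 = ... or WHERE col1 = ... AND col2 = .... However, it is of little or no use if you only filter with WHERE col2 = .... As a rule of thumb, place columns compared with = before columns used in range conditions, as (category_id, price) does above.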
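The left-to-right rule can be checked directly with EXPLAIN. The sketch below assumes that idx_category_price (category_id, price) exists on products and that no separate Index on price is present; the literal values are arbitrary. The comments describe what I would typically expect, but the optimizer makes the final choice based on its statistics.

```sql
-- Uses the leftmost column only: the composite index is usable.
EXPLAIN SELECT product_name FROM products
WHERE category_id = 2;

-- Both columns, written in "reverse" order in the WHERE clause:
-- the order of conditions does not matter, the index is still usable.
EXPLAIN SELECT product_name FROM products
WHERE price = 999.00 AND category_id = 2;

-- Skips the leftmost column: the composite index typically cannot be used
-- for the lookup, and the plan usually falls back to a full table scan.
EXPLAIN SELECT product_name FROM products
WHERE price = 999.00;
```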
Covering Index
A Covering Index is an Index that contains all the columns required by a query, both in the SELECT clause and the WHERE clause. When MySQL can retrieve all data from the Index without needing to access the main table, the query will be very fast.
For example: if you frequently run SELECT category_id, price FROM products WHERE category_id = 1 AND price > 100;, the Index (category_id, price) not only helps with the WHERE condition. It also “covers” the columns in the SELECT clause. In such cases, EXPLAIN will display Extra: Using index. This indicates that MySQL only needs to read the Index to complete the query, which is extremely efficient.
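One way to see the effect is to compare two nearly identical queries, as in this sketch. It assumes the Index on (category_id, price) from the previous section exists. Adding product_name to the second query means the Index no longer holds every requested column, so I would expect "Using index" to disappear from Extra.

```sql
-- Every referenced column is in the index: Extra should include "Using index".
EXPLAIN SELECT category_id, price
FROM products
WHERE category_id = 1 AND price > 100;

-- product_name is not part of the index, so MySQL must also read the table rows
-- (or may even choose a full scan if many rows match).
EXPLAIN SELECT category_id, price, product_name
FROM products
WHERE category_id = 1 AND price > 100;
```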
Dropping Unnecessary Indexes
If an Index is no longer used or imposes a burden on write operations, you can drop it:
DROP INDEX idx_price ON products;
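Before dropping an Index, it is safer to check whether anything still uses it. The sketch below shows two approaches I find useful, under a few assumptions: the sys schema is installed and the Performance Schema is enabled (usage is only counted since the last server start), and for the second part the server is MySQL 8.0 or later, which supports invisible Indexes.

```sql
-- Indexes in the current database with no recorded use since server start.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE();

-- MySQL 8.0+: hide the index from the optimizer without dropping it.
ALTER TABLE products ALTER INDEX idx_price INVISIBLE;

-- If queries slow down, make it visible again; otherwise drop it for real.
ALTER TABLE products ALTER INDEX idx_price VISIBLE;
```

An invisible Index is still maintained on writes, so this is a trial step before the drop, not a way to save write cost.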
Important Considerations When Using Indexes
- Only Index frequently queried columns: Don’t create Indexes indiscriminately. Columns that often appear in WHERE, JOIN, ORDER BY, GROUP BY clauses are ideal candidates.
- Column Cardinality: Indexes are most effective on columns with many unique values (high cardinality), such as email or product codes. Columns with few unique values (low cardinality), such as gender (male/female), rarely need an Index. This is because MySQL can often scan the entire table faster than using an Index and then filtering.
- Avoid using functions on Indexed columns: If you use a function on an indexed column in the WHERE clause (e.g., WHERE YEAR(created_at) = 2023), MySQL often won’t use that Index. Instead, rewrite the query to compare directly with the column (e.g., WHERE created_at BETWEEN '2023-01-01' AND '2023-12-31 23:59:59').
- LIKE operator: Indexes can be used with LIKE 'prefix%' (searching for a start). However, they will not be used with LIKE '%suffix' or LIKE '%substring%'. This is because MySQL cannot leverage the tree structure of the Index in these cases.
- OR and Indexes: Using OR in the WHERE condition can prevent MySQL from using an Index. Sometimes, splitting the query into UNION or UNION ALL is more optimal, especially if each part of the OR can use its own Index.
- Small tables don’t need Indexes: For tables with only a few hundred or a few thousand rows, the cost of maintaining an Index might outweigh the benefits. MySQL can often scan the entire table very quickly.
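The rewrites mentioned in these notes can be sketched on the products table as follows. The Indexes idx_created_at and idx_product_name are hypothetical and created only for this illustration; the OR example also assumes Indexes led by category_id and by price exist. Prefix each SELECT with EXPLAIN to compare the plans on your own data.

```sql
-- Hypothetical indexes for this illustration only.
CREATE INDEX idx_created_at ON products (created_at);
CREATE INDEX idx_product_name ON products (product_name);

-- Function on the column: the index on created_at usually cannot be used.
SELECT id FROM products WHERE YEAR(created_at) = 2023;
-- Equivalent range on the bare column: the index can be used.
SELECT id FROM products
WHERE created_at >= '2023-01-01' AND created_at < '2024-01-01';

-- Prefix search can use the index; a leading wildcard cannot.
SELECT id FROM products WHERE product_name LIKE 'Laptop%';
SELECT id FROM products WHERE product_name LIKE '%Laptop%';

-- OR across two different columns...
SELECT id, product_name FROM products
WHERE category_id = 3 OR price = 999.00;
-- ...rewritten so each branch can use its own index.
-- UNION (not UNION ALL) removes rows that match both conditions.
SELECT id, product_name FROM products WHERE category_id = 3
UNION
SELECT id, product_name FROM products WHERE price = 999.00;
```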
Conclusion: Continuous Monitoring and Optimization
In my experience, optimizing MySQL queries using Indexes and EXPLAIN is an essential skill for anyone working with databases. It not only helps the system run faster but also provides a deeper understanding of how the database works.
However, this is not a one-time task. Systems and data constantly change and grow. Continuously monitoring performance, using EXPLAIN to check new or slow queries, and adjusting Indexes accordingly plays a vital role in the system’s development lifecycle.
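One practical way to find the queries worth running through EXPLAIN is the slow query log. The settings below are a sketch: the 1-second threshold is an arbitrary starting point, changing global variables requires administrative privileges, and values set with SET GLOBAL are lost on restart unless you also put them in the server configuration.

```sql
-- Log statements that take longer than 1 second (threshold is an assumption).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Optionally also log statements that run without using an index.
-- This can be noisy on busy servers.
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Find out where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```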
View EXPLAIN as a trusted companion. It helps you “see through” how MySQL processes data, enabling you to make the smartest Index optimization decisions. I wish you success in “harnessing” the power of MySQL!
