Optimizing MySQL 8 with EXPLAIN ANALYZE: Don’t Let the Optimizer Mislead You – ITFROMZERO

Table of Contents

When Traditional EXPLAIN Becomes Just “Empty Promises”

If you work with MySQL, EXPLAIN is likely your go-to command whenever the system reports a slow query. Adding this keyword to the beginning of a statement to see how MySQL plans to execute it is a familiar task. However, the old EXPLAIN is only a forecast based on probability. It’s like checking Google Maps and seeing a clear road with a 10-minute estimate, only to find yourself stuck in a traffic jam because of an accident that just happened.

The MySQL Optimizer relies on statistics to estimate rows and cost. The problem is that this data is often outdated or fails to reflect the actual data on the disk. I once handled a tricky case: EXPLAIN reported only 50 rows scanned, but the query took 30 seconds. It turned out to be an I/O bottleneck when reading actual data from an old hard drive. At that time, I wished I could see exactly what MySQL was doing under the hood.

That’s why EXPLAIN ANALYZE (introduced in MySQL 8.0.18) is a lifesaver. Instead of just guessing, it directly executes the statement and records a detailed log. This is the difference between “listening to a weather forecast” and “actually stepping outside.”

Why EXPLAIN ANALYZE is a Must-Have Tool in Your Arsenal

Reality is Always More Reliable Than Estimates

The biggest difference lies in the “Actual” figures placed right next to the “Estimated” ones. You’ll have complete visibility over:

Time to first row: How long it takes to receive the first record (critical for web apps requiring fast responses).
Time to all rows: Total time to complete that step.
Actual rows: Actual number of records scanned.
Loops: Actual number of iterations when performing Joins.

Looking at the results, you’ll immediately see where the bottleneck is. No more arguing over whether an Index is effective; the actual numbers speak for themselves.

Caution: Real Measurements Require Real Resources

Since EXPLAIN ANALYZE actually executes the SQL statement, data will be affected if used with INSERT, UPDATE, or DELETE. For large tables with around 100 million records, running this command takes exactly as long as a normal query—sometimes longer due to detailed logging. Never “test” this directly on a production DB during peak hours to avoid hanging the system. Ideally, bring the query to a Staging environment with equivalent data for dissection.

How to Read the Data to Diagnose Queries

The syntax is very simple; just add EXPLAIN ANALYZE before the SELECT statement:

EXPLAIN ANALYZE 
SELECT e.first_name, d.dept_name
FROM employees e
JOIN dept_emp de ON e.emp_no = de.emp_no
JOIN departments d ON de.dept_no = d.dept_no
WHERE e.last_name = 'Facello';

The returned result will be in a Tree format. Focus on reading from the inside out and bottom up to see the execution sequence:

-> Inner hash join (de.dept_no = d.dept_no)  (cost=120.50 rows=150) (actual time=0.512..0.890 rows=180 loops=1)
    -> Table scan on d  (cost=1.10 rows=10) (actual time=0.011..0.025 rows=10 loops=1)
    -> Hash
        -> Inner hash join (e.emp_no = de.emp_no)  (cost=100.20 rows=140) (actual time=0.300..0.750 rows=150 loops=1)
            -> Index lookup on e using last_name (last_name='Facello')  (cost=20.10 rows=100) (actual time=0.100..0.250 rows=100 loops=1)
            -> Hash
                -> Table scan on de  (cost=50.50 rows=1000) (actual time=0.050..0.400 rows=1000 loops=1)

The Three Most Important Metrics to Watch

actual time=0.100..0.250: 0.100ms for the first row, 0.250ms to finish that step. If the gap between these two numbers is too large, that step is processing a massive amount of data.
rows=180: If this number deviates significantly from the estimated rows (e.g., estimated 10 but actually 1000), it means the Optimizer is being misled by outdated statistics.
loops=1: In a Nested Loop Join, if loops jump into the thousands, that’s exactly where you need to add an Index immediately.

Real-World Experience: Lessons from an E-commerce System

Recently, I optimized a website where loading the order list for VIP customers took over 5 seconds. EXPLAIN reported that MySQL was using an index on customer_id and only scanning 50 rows. Theoretically, it looked perfect, but it was still slow in practice.

When I enabled EXPLAIN ANALYZE, the truth was revealed: the Index lookup step took 4.5 seconds. Even though there were only 50 rows, the data for this customer was scattered across different disk partitions. MySQL had to perform continuous Random I/O to read the data. I decided to create a Covering Index that included all the necessary columns. As a result, MySQL only had to read from the index and didn’t need to access the disk. The execution time dropped from 5 seconds to just 0.04 seconds.

The lesson learned: Don’t trust what the Optimizer predicts; look at what it actually does.

A 3-Step Optimization Process

Quick Check: Use standard EXPLAIN to eliminate basic errors like missing indexes or type=ALL (full table scans).
Dissection: Use EXPLAIN ANALYZE to compare different SQL writing options or check which step consumes the most milliseconds.
Verification: After fixing code or adding an Index, run EXPLAIN ANALYZE again to ensure the actual time numbers have decreased as expected.

If you are using MySQL 8, make this tool a habit. It helps you eliminate vague guesswork and focus on solutions based on real-world data.