When Your Database Is Slow and You Don’t Know Where to Start
There’s a situation every seasoned backend developer eventually faces: the application suddenly slows down, users start complaining, but when you check the server, CPU is only at 30% and RAM is comfortable. So where’s the problem?
The answer usually lies in the database — a handful of queries consuming a disproportionate amount of time. I’ve worked with MySQL, PostgreSQL, and MongoDB across different projects. Each has its strengths, but when it comes to debugging performance issues, MySQL has a weapon that many people overlook: pt-query-digest from the Percona Toolkit.
This article gets straight to the point: enable Slow Query Log, collect data, then use pt-query-digest to identify exactly which queries need optimization — no more guessing.
Understanding the Two Key Tools Before You Begin
What Is Slow Query Log?
MySQL has a built-in feature that logs all queries exceeding a defined time threshold to a log file. This file is called the Slow Query Log. MySQL doesn’t enable this by default because it adds extra I/O overhead, but when you need to debug performance issues, it’s the most valuable source of information you have.
The problem is that this log file can get very long and messy. Reading it manually makes it nearly impossible to draw any useful conclusions. A production system running for just a few hours can easily generate tens of thousands of lines.
What Does pt-query-digest Do?
pt-query-digest is a tool from the Percona Toolkit — a collection of command-line utilities developed by Percona for managing and optimizing MySQL/MariaDB. It reads the Slow Query Log, groups similar queries together (even when parameter values differ), and produces statistics: which queries run most frequently, which consume the most total time, and which have the highest average execution time.
The result: 50,000 lines of log condensed into a 2-page report with a concrete priority ranking.
Enabling Slow Query Log in MySQL
No MySQL restart required — enable it directly in the running session:
-- Connect to MySQL with root privileges
mysql -u root -p
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
-- Time threshold (seconds) — queries exceeding this will be logged
SET GLOBAL long_query_time = 1;
-- Also log queries not using indexes
SET GLOBAL log_queries_not_using_indexes = 'ON';
-- Check where the log file is located
SHOW VARIABLES LIKE 'slow_query_log_file';
To enable it permanently (persisting across restarts), add the following to /etc/mysql/mysql.conf.d/mysqld.cnf (Ubuntu) or /etc/my.cnf:
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 1
log_queries_not_using_indexes = 1
Then restart: sudo systemctl restart mysql
To generate test data right away, run some deliberately slow queries:
-- Query without an index — will be captured in the log
SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- Or simulate real load with SLEEP
SELECT SLEEP(2);
Installing Percona Toolkit
On Ubuntu/Debian:
# Add Percona repository
wget https://repo.percona.com/apt/percona-release_latest.$(lsb_release -sc)_all.deb
sudo dpkg -i percona-release_latest.$(lsb_release -sc)_all.deb
sudo apt-get update
# Install Percona Toolkit
sudo apt-get install percona-toolkit
# Verify installation
pt-query-digest --version
On CentOS/RHEL:
sudo yum install https://repo.percona.com/yum/percona-release-latest.noarch.rpm
sudo yum install percona-toolkit
Analyzing the Log with pt-query-digest
After letting the system run for a while (at minimum a few hours in production to gather enough data), run the analysis command:
# Basic analysis
pt-query-digest /var/log/mysql/mysql-slow.log
# View only the top 5 worst queries
pt-query-digest --limit 5 /var/log/mysql/mysql-slow.log
# Analyze only the last 1 hour of logs
pt-query-digest --since '1h' /var/log/mysql/mysql-slow.log
# Save the report to a file for later review
pt-query-digest /var/log/mysql/mysql-slow.log > /tmp/slow-query-report.txt
Reading the Report Output
The pt-query-digest output has two main sections. The first is an overview of the entire log:
# 120 queries in 0.5k lines, 0.00 QPS, 0x concurrency ______________
# Time range: 2024-01-15 09:00:00 to 2024-01-15 11:00:00
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 245s 100ms 8s 2s 5s 1s 2s
# Lock time 120ms 0us 30ms 1ms 5ms 3ms 500us
# Rows sent 1.50k 1 500 13 100 50 5
# Rows examined 45.00k 1 5000 375 2000 800 100
The second section lists each query group, ranked by total execution time (Response time) — this is the critical part, read from top to bottom:
# Query 1: 0.50 QPS, 4.00x concurrency, ID 0xABC123 at byte 1234
# Scores: V/M = 0.10
# Time range: all events
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 5 60
# Exec time 40% 98s 1s 8s 1s 5s 1s 1s
# Rows examined 60% 27000 50 5000 450 2000 800 100
# EXPLAIN /*!50100 PARTITIONS*/
SELECT * FROM orders
WHERE YEAR(created_at) = 2024
AND status = 'pending'\G
The three most important numbers to watch:
- Count: How many times this query ran during the analysis period
- Exec time (total): Total time — 40% of all slow query time belongs to this single query
- Rows examined: How many rows MySQL had to scan to return results — a high number here signals a missing index
From Report to Optimization Action
Once you’ve identified the worst query, the next step is running EXPLAIN to see how MySQL is executing it:
EXPLAIN SELECT * FROM orders
WHERE YEAR(created_at) = 2024
AND status = 'pending';
The type column showing ALL is a red flag: MySQL is doing a full table scan without using any index. This is exactly what caused the 8-second exec time in the report.
The problem with that query is the YEAR() function wrapping the column, which prevents MySQL from using an index. Rewrite it like this:
-- Instead of using a function (index cannot be used)
SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- Rewrite so MySQL can use the index on created_at
SELECT * FROM orders
WHERE created_at >= '2024-01-01'
AND created_at < '2025-01-01'
AND status = 'pending';
-- Create a composite index for both columns
CREATE INDEX idx_orders_created_status
ON orders (created_at, status);
After creating the index, run EXPLAIN again — the type column should change from ALL to range or ref, and the rows count should drop significantly.
I always add one more step after deploying the fix: let the log run for a few more hours, then run pt-query-digest again. Compare the before and after reports — does that query still appear at the top? Did execution time drop? This is the only reliable way to confirm the optimization actually worked.
Wrapping Up
pt-query-digest isn’t complicated, but it’s effective where it counts. From tens of thousands of messy log lines, you get a ranked report with concrete numbers. In practice, fixing just the top 2–3 queries is usually enough to see a clear difference — sometimes response time drops from several seconds to under 100ms just by adding the right index.
The workflow: enable Slow Query Log → let it run for a few hours → run pt-query-digest → EXPLAIN the top queries → add indexes or rewrite → compare before/after reports. Repeat until nothing alarming remains at the top.

