Mastering MySQL & PostgreSQL with Prometheus Exporter: Ending the ‘Guessing Game’ of Slow Queries

The Nightmare of a “Silent” Database

Have you ever been woken up at 2 AM because the website was slow, while the database remained a “black box” with no answers? Early in my career, my only reflex was to SSH into the server, frantically typing top, htop, and then iotop to see if the disk was “screaming.” If everything looked green, I’d painstakingly grep through gigabytes of log files hunting for slow queries. This approach was both painful and hit-or-miss.

To be honest, after six months on the front lines with the Prometheus and Grafana duo, I finally understood the value of observability. Having a dashboard is like driving a car with a full set of speed and temperature gauges. Every production metric is clearly visible through numbers; there’s no room for vague complaints like “it feels like the DB is lagging.”

Evaluating Monitoring Solutions

Before settling on Prometheus Exporter, I tried several other options:

  • Cloud Solutions (CloudWatch, Azure Monitor): Extremely low maintenance with just a few clicks. However, the bill at the end of the month will shock you if you want high-frequency custom metrics.
  • Zabbix/Netdata: Zabbix has great alerting, but its dashboard interface is a bit “vintage.” Customizing time-series data is also a major challenge. Netdata is too detailed at the OS level, which sometimes creates noise when you’re looking for specific info.
  • Prometheus Exporter: This was the optimal choice for me. The exporter queries directly into MySQL’s performance_schema or Postgres’s pg_stat_activity. It captures “deep” internal metrics that OS-level tools like top or iotop can never reach.

Pros and Cons of Exporters

Pros:

  • Captures internal metrics: buffer pool hit ratio, transaction locks, or second-accurate replication lag.
  • Extremely low system overhead (typically only 1-2% CPU).
  • Strong community support. You can just import a ready-made Grafana dashboard template and use it immediately.

Cons:

  • Requires installing a small agent on the server or running a sidecar container.
  • Requires strict security configuration for the port the Exporter listens on.

Getting Started with MySQL Monitoring using mysqld_exporter

For security, never use the root user. You should create a dedicated user in MySQL with minimum privileges for the Exporter to function.

-- Create a secure monitoring user
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'MatKhauSieuKho123' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
FLUSH PRIVILEGES;

If you prefer tidiness, use Docker Compose. This makes managing versions and environments extremely simple.

# docker-compose.yml
services:
  mysql-exporter:
    # Note: mysqld-exporter v0.15.0+ dropped DATA_SOURCE_NAME in favor of a
    # .my.cnf-style config file, so pin an older tag if you use this env var.
    image: prom/mysqld-exporter:v0.14.0
    container_name: mysql-exporter
    environment:
      - DATA_SOURCE_NAME=exporter:MatKhauSieuKho123@(mysql-host:3306)/
    ports:
      - "9104:9104"
    restart: always
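Prometheus still needs to be told where to scrape. A minimal scrape-job sketch, assuming Prometheus can resolve the container name from the compose file above:

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: "mysql"
    scrape_interval: 15s
    static_configs:
      - targets: ["mysql-exporter:9104"]  # container name:port from docker-compose.yml
```

After reloading Prometheus, the `mysql_up` metric should show 1 if the exporter can reach the database.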

Pro tip: To track slow queries effectively, ensure slow_query_log is enabled and long_query_time is set to a sensible threshold. The exporter exposes MySQL’s cumulative Slow_queries status counter (as mysql_global_status_slow_queries) for your charts; to see the actual query text, you still need the slow log itself or performance_schema.

Setting Up PostgreSQL Monitoring with postgres_exporter

With PostgreSQL, the power lies in the Stats Collector. I often combine it with pg_stat_statements to drill down into the queries consuming the most resources.

-- Add this line to postgresql.conf and restart the DB
shared_preload_libraries = 'pg_stat_statements'

-- After the restart, enable the extension in the target database
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Then configure the user for the exporter (pg_monitor requires PostgreSQL 10+)
CREATE USER postgres_exporter PASSWORD 'MatKhauSieuKho123';
ALTER USER postgres_exporter SET search_path TO postgres_exporter,pg_catalog;
GRANT pg_monitor TO postgres_exporter;
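Once pg_stat_statements is active, you can drill into the most expensive queries directly. A sketch, assuming PostgreSQL 13+ (older versions name the column total_time instead of total_exec_time):

```sql
-- Top 5 queries by total execution time
SELECT query,
       calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
```

This is the same data the exporter can surface on a dashboard, but running it by hand is handy during an incident.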

Launch the Postgres Exporter with a single command:

docker run -d \
  --name postgres-exporter \
  -e DATA_SOURCE_NAME="postgresql://postgres_exporter:MatKhauSieuKho123@db-host:5432/postgres?sslmode=disable" \
  -p 9187:9187 \
  prometheuscommunity/postgres-exporter

4 “Golden” Metrics to Keep an Eye On

After many “firefighting” sessions, here is the list of metrics I always prioritize in the most visible spots on my Dashboard.

1. Connection Pool

When active connections approach max_connections, the database starts rejecting new clients outright (“too many connections”). I usually set alerts at 80% to handle it before disaster strikes.
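That 80% threshold translates directly into a Prometheus alerting rule. A sketch using mysqld_exporter’s metric names (the alert name, severity label, and 5-minute hold are assumptions you would tune):

```yaml
# alert-rules.yml (fragment)
groups:
  - name: mysql-connections
    rules:
      - alert: MySQLConnectionsHigh
        expr: >
          mysql_global_status_threads_connected
            / mysql_global_variables_max_connections > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "MySQL connections above 80% of max_connections"
```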

2. Slow Queries

Don’t just look at the count; look at the trend. If the number of queries running over 100ms spikes after a deployment, the dev team likely forgot to add an index somewhere.
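In PromQL, the trend is easiest to read as a rate rather than a raw counter. mysqld_exporter exposes MySQL’s cumulative Slow_queries variable, so a sketch would be:

```promql
# Slow queries per second, averaged over 5 minutes
rate(mysql_global_status_slow_queries[5m])
```

Note that MySQL only counts queries exceeding long_query_time, so set that variable to match your threshold (e.g. 0.1 for 100ms).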

3. Replication Lag

In Master-Slave systems, lag is a critical metric. If the replica lags by more than 5-10 seconds, the data users read from it will be stale. In MySQL, track Seconds_Behind_Master closely (exposed by mysqld_exporter as mysql_slave_status_seconds_behind_master).
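Because the exporter exposes this as a gauge, the alert condition is a one-liner (the 10-second threshold is the assumption from above):

```promql
# Fires when the replica is more than 10 seconds behind the master
mysql_slave_status_seconds_behind_master > 10
```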

4. Cache Hit Ratio

A healthy database should have a cache hit ratio above 95%. If this drops, the DB is reading from disk too often; that usually means the buffer pool or shared_buffers needs more memory, or a query pattern is scanning far more data than it should.
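Both exporters ship the raw counters needed to compute this yourself. For MySQL’s InnoDB buffer pool, a common sketch is:

```promql
# InnoDB buffer pool hit ratio over 5 minutes (1.0 = every read served from memory)
1 - (
  rate(mysql_global_status_innodb_buffer_pool_reads[5m])
    / rate(mysql_global_status_innodb_buffer_pool_read_requests[5m])
)
```

The PostgreSQL equivalent uses pg_stat_database counters: blks_hit / (blks_hit + blks_read).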

Real-world Case Study: When a “Hung” Query Bloated Disk Space by 50GB

Once, thanks to postgres_exporter charts, I discovered a transaction stuck in “idle in transaction” for 2 hours. Its open snapshot held back autovacuum, which could no longer remove dead tuples, and the DB size bloated by 50GB overnight. If I had only used the standard ps aux command, I definitely would have missed it.
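The exporter surfaced the symptom, but the culprit session can be confirmed in pg_stat_activity. A sketch (the 1-hour cutoff is an arbitrary choice):

```sql
-- Find sessions stuck in "idle in transaction" for over an hour
SELECT pid,
       usename,
       now() - xact_start AS tx_age,
       left(query, 60)    AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - xact_start > interval '1 hour'
ORDER BY xact_start;
```

Setting idle_in_transaction_session_timeout in postgresql.conf is a common guard against this whole class of incident.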

My advice: Don’t wait for an incident to install monitoring. Implement it starting from Staging. Looking at charts daily helps you understand the “heartbeat” of your database, allowing you to predict when to scale resources before the system crashes.

Finally, mastering the numbers gives you more confidence when working with the Dev team. Instead of saying “I think the DB is a bit slow,” say “Query X is taking up 40% of IOPS and causing a 5-second lag on the Slave.” Sounds much more professional and convincing, doesn’t it?
