pg_stat_statements in PostgreSQL: Find and Optimize Resource-Hungry SQL Queries

Database tutorial - IT technology blog
Database tutorial - IT technology blog

When Your Database Slows Down and You Don’t Know Why

PostgreSQL running slow is a familiar situation: CPU hits 100%, response time spikes, but there’s nothing obvious in the logs. If you’ve used EXPLAIN ANALYZE to analyze individual queries, you know how effective that tool is — but the challenge is knowing which query to analyze first.

pg_stat_statements was built to solve exactly this problem. This extension tracks execution statistics for all SQL statements throughout the database’s lifetime — not sampling, not manually enabling logs and grepping. Every query is aggregated by template, and you can ask precisely: which queries consume the most time? Which queries read the most blocks? Which queries run thousands of times a day with a worrying mean execution time?

Unlike EXPLAIN ANALYZE — which only analyzes a specific query on demand — pg_stat_statements runs in the background and continuously accumulates data. When an incident occurs, you already have the metrics to investigate, with no need to reproduce the problem.

Installing the Extension

This extension comes bundled with PostgreSQL — no external packages required. The process involves 3 mandatory steps that must be followed in order.

Step 1: Declare in shared_preload_libraries

Find the postgresql.conf file:

sudo -u postgres psql -c "SHOW config_file;"
# Ubuntu/Debian typically: /etc/postgresql/15/main/postgresql.conf
# CentOS/RHEL typically: /var/lib/pgsql/15/data/postgresql.conf

Open the file and add or edit the line:

# If no libraries are loaded yet:
shared_preload_libraries = 'pg_stat_statements'

# If other libraries are already present, append — separated by comma:
shared_preload_libraries = 'pg_stat_statements,other_library_name'

This step is mandatory — the extension must be loaded when PostgreSQL starts. You cannot simply run CREATE EXTENSION and skip this step.

Step 2: Restart PostgreSQL

sudo systemctl restart postgresql

Step 3: Create the Extension in the Target Database

-- Connect to the database you want to monitor
\c your_database

CREATE EXTENSION pg_stat_statements;

-- Verify the extension is working
SELECT * FROM pg_stat_statements LIMIT 5;

If it returns data (even just a few rows), the extension is working. If you get a “could not open file” error, you most likely skipped the restart in Step 2.

Detailed Configuration

Once installed, you’re ready to go. But there are a few parameters worth adjusting in postgresql.conf depending on your needs:

# Maximum number of query templates to store (default: 5000)
pg_stat_statements.max = 10000

# TOP: only queries called directly from clients (lighter, sufficient for most cases)
# ALL: includes nested queries inside stored procedures/functions
pg_stat_statements.track = top

# Include utility commands such as VACUUM, ANALYZE, CREATE INDEX...
pg_stat_statements.track_utility = on

# Track planning time separately (PostgreSQL 13+)
pg_stat_statements.track_planning = on

After making changes, reload the config without restarting:

sudo -u postgres psql -c "SELECT pg_reload_conf();"

I typically keep track = top in normal production — it’s lighter and sufficient to catch most issues. I only switch to all when debugging performance inside a specific stored procedure, then switch back afterward.

Regarding pg_stat_statements.max: once the limit is reached, the least-seen queries get evicted from the statistics — actual data isn’t lost, only the tracking record is. If your system handles many distinct query types (microservices with hundreds of API endpoints, for example), consider raising this to 10000–20000.

Inspection and Monitoring

The pg_stat_statements view has many columns. Here are practical queries organized by investigation purpose.

Top 10 Queries by Total Execution Time

SELECT 
    round(total_exec_time::numeric, 2) AS total_ms,
    calls,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct_cpu,
    left(query, 100) AS query_snippet
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

This is the first query I run whenever I get a “database is slow” report. The pct_cpu column immediately reveals the culprits — it’s normal for the top 3 queries to account for 80%+ of total time in a real production system. Teams managing multiple PostgreSQL instances often complement this with Percona Monitoring and Management (PMM), which surfaces these query statistics in pre-built dashboards without manual SQL queries.

Top 10 Slowest Queries (by Average Execution Time)

SELECT
    calls,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    round(total_exec_time::numeric, 2) AS total_ms,
    left(query, 100) AS query_snippet
FROM pg_stat_statements
WHERE calls > 10
ORDER BY mean_exec_time DESC
LIMIT 10;

Filtering by calls > 10 avoids false alarms from admin or maintenance queries that run only once or twice. A query that takes 30 seconds but runs once is far less dangerous than one that averages 50ms but runs 10,000 times. To validate that a fix actually improves things under realistic load, Sysbench lets you reproduce that call volume in a controlled environment.

Top Queries by Block Reads — Identifying I/O Bottlenecks

SELECT
    calls,
    shared_blks_hit,
    shared_blks_read,
    round(
        shared_blks_hit * 100.0 / 
        nullif(shared_blks_hit + shared_blks_read, 0)
    , 2) AS cache_hit_pct,
    left(query, 100) AS query_snippet
FROM pg_stat_statements
WHERE shared_blks_read > 0
ORDER BY shared_blks_read DESC
LIMIT 10;

A cache_hit_pct below 90% signals that the query is reading heavily from disk instead of shared buffers. Common causes: missing indexes forcing full sequential scans, or shared_buffers that’s too small relative to the working set. When tuning shared_buffers alone isn’t enough, pgpool-II’s query cache can intercept repeated identical reads before they reach the buffer pool.

Queries with High Planning Time (PostgreSQL 13+)

SELECT
    calls,
    round(mean_plan_time::numeric, 2) AS plan_ms,
    round(mean_exec_time::numeric, 2) AS exec_ms,
    left(query, 100) AS query_snippet
FROM pg_stat_statements
WHERE mean_plan_time > 1
ORDER BY mean_plan_time DESC
LIMIT 10;

High plan time (above a few milliseconds) is usually caused by queries with many complex JOINs, or stale table statistics that force the planner to do more computation. Running ANALYZE table_name; to update statistics often brings plan time down from tens of milliseconds to under 1ms with just that one command. For queries that remain expensive even with fresh statistics — dense aggregations run repeatedly on the same data — materialized views remove both the planning and execution cost entirely.

Resetting Statistics for Fresh Measurements

-- Reset all stats (after deploying a fix or changing config)
SELECT pg_stat_statements_reset();

-- Reset a specific query only (PostgreSQL 14+)
SELECT pg_stat_statements_reset(0, 0, queryid);
-- queryid is taken from the queryid column in the pg_stat_statements view

Viewing Currently Running Queries in Real Time

To see what’s running right now, combine with pg_stat_activity:

SELECT 
    pid,
    now() - query_start AS duration,
    state,
    left(query, 120) AS query_snippet
FROM pg_stat_activity
WHERE query_start IS NOT NULL
  AND state != 'idle'
ORDER BY duration DESC;

If you spot a query that’s been stuck too long and you’re certain it needs to be stopped:

SELECT pg_cancel_backend(pid);    -- sends a cancel request (graceful, prefer this)
SELECT pg_terminate_backend(pid); -- force terminate if cancel doesn't work

A Practical Workflow for Performance Investigation

The process I use whenever I get a “database is slow” report:

  1. Run the total-time query → identify the top 3 most resource-heavy queries
  2. Retrieve the full query text → run EXPLAIN (ANALYZE, BUFFERS) on each one
  3. Check cache hit ratio → if low, review indexes or increase shared_buffers
  4. Check pg_stat_activity for any blocked queries
  5. Reset statistics → deploy the fix → re-measure after 24 hours to confirm real improvement

When processing pg_stat_statements output offline (exporting CSV from psql and analyzing with a script), I often use the converter at toolcraft.app/en/tools/data/csv-to-json to convert to JSON before feeding it into Python — the convenient part is that it runs entirely in the browser, so there’s no concern about query log data being sent off-site.

Key Operational Considerations

  • Low overhead: roughly 1–5% — small enough to keep enabled continuously in production
  • Queries are automatically normalized: Literal values are replaced with $1, $2… to group queries sharing the same template into a single record. This is by design — 1,000 calls to SELECT * FROM users WHERE id = $1 with 1,000 different IDs still appears as a single statistics row
  • Statistics don’t survive restarts: Data is not persisted across PostgreSQL restarts. If you need historical records, export to a file before scheduled maintenance
  • Access permissions: Superusers see all queries from all users. Regular users can only see their own queries (PostgreSQL 14+)

Share: