Integrating Nginx VTS Module with Prometheus and Grafana: Per-Domain Traffic Monitoring

2 AM, server slowing down — and the question “which domain is eating up bandwidth?”

I remember that production night vividly when timeouts started hitting. I opened Grafana — CPU was fine, RAM had headroom, disk I/O looked normal. But response time had spiked to 8 seconds. The server was hosting 5 different domains on the same Nginx instance, and there was no way to tell which one was causing the problem without sitting there grep-ing through access logs with eyes half shut.

The lesson from that night: system metrics aren’t enough when you need to analyze traffic per virtual host. Nginx’s built-in stub_status module only reports server-wide connection and request counters, which tell you nothing about individual domains in a multi-domain setup. You need the VTS module.

What is Nginx VTS Module?

Nginx Virtual Host Traffic Status (VTS) is a third-party module that provides detailed metrics for each server block. Instead of aggregated totals, you get a breakdown per domain:

  • Request rate and response time per virtual host
  • Inbound/outbound bandwidth per domain
  • HTTP status code distribution (2xx, 4xx, 5xx) per host
  • Upstream response time when using Nginx as a reverse proxy

VTS exposes metrics in Prometheus format — combined with Prometheus + Grafana, it forms a complete Nginx monitoring stack. Before having proper monitoring, I had to SSH into each server to investigate — now I just open the dashboard and see everything at a glance. Especially for multi-domain setups, this is something I wish I’d known years earlier.

Hands-on: Installation and Configuration from Scratch

Step 1: Compile Nginx with VTS Module

VTS is not a built-in module — it needs to be compiled from source. First, check your current Nginx version:

nginx -v
# nginx version: nginx/1.24.0

Install dependencies:

sudo apt update
sudo apt install -y build-essential libpcre3-dev zlib1g-dev libssl-dev libgd-dev git

Download the matching Nginx source version and clone VTS:

cd /tmp
wget https://nginx.org/download/nginx-1.24.0.tar.gz
tar -xzf nginx-1.24.0.tar.gz

git clone https://github.com/vozlt/nginx-module-vts.git

Get the configure arguments of the running Nginx — important: you must preserve all existing modules:

nginx -V 2>&1 | grep "configure arguments"
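Copying a long argument list by hand is error-prone. The following is a minimal sketch, assuming nginx -V prints a single line starting with "configure arguments:". The function names and the /tmp/nginx-module-vts path are illustrative and not part of Nginx or VTS. It only prints the ./configure command, so you can review it before running anything.

```shell
#!/usr/bin/env bash
# Sketch: build a ./configure command from `nginx -V` output read on stdin.
# extract_configure_args and build_configure_cmd are hypothetical helper names.

extract_configure_args() {
    # Keep only the text after "configure arguments: " and drop every other line
    sed -n 's/^configure arguments: //p'
}

build_configure_cmd() {
    # $1: path to the cloned VTS source tree (assumed location, adjust as needed)
    local vts_src="$1"
    local args
    args="$(extract_configure_args)"
    # Print the command instead of executing it, so it can be inspected first
    printf './configure %s --add-module=%s\n' "$args" "$vts_src"
}

# Usage (nginx -V writes to stderr, hence 2>&1):
#   nginx -V 2>&1 | build_configure_cmd /tmp/nginx-module-vts
```

Run the printed command from inside the unpacked Nginx source directory. Quoted values such as --with-cc-opt='...' are printed as nginx -V reports them, so check that they survived intact.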

Compile with VTS added. Append --add-module to your existing arguments; the flags below are only an abbreviated example, so substitute the full list printed by nginx -V:

cd /tmp/nginx-1.24.0

./configure \
    --with-compat \
    --with-http_ssl_module \
    --with-http_v2_module \
    --with-http_gzip_static_module \
    --add-module=../nginx-module-vts

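make
# Only for Nginx that was originally installed from source.
# For an apt-installed Nginx, skip this and copy the binary as described next.
sudo make install

If Nginx was installed via apt, skip make install. Back up the packaged binary, then replace it with the freshly built one: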

sudo cp /usr/sbin/nginx /usr/sbin/nginx.bak
sudo cp /tmp/nginx-1.24.0/objs/nginx /usr/sbin/nginx

# Confirm the module is present
nginx -V 2>&1 | grep vts

Step 2: Configure Nginx to Expose VTS Metrics

Add the following to nginx.conf inside the http block:

http {
    # Enable VTS — required
    vhost_traffic_status_zone;

    # Split metrics by virtual host
    vhost_traffic_status_filter_by_host on;

    server {
        listen 9145;  # Dedicated port for internal metrics
        server_name localhost;

        location /metrics {
            vhost_traffic_status_display;
            vhost_traffic_status_display_format prometheus;

            # Only allow the Prometheus server and localhost
            allow 127.0.0.1;
            allow 10.0.0.0/8;
            deny all;
        }

        location /nginx_status {
            vhost_traffic_status_display;
            vhost_traffic_status_display_format html;
            allow 127.0.0.1;
            deny all;
        }
    }

    # Regular virtual hosts
    server {
        listen 80;
        server_name example.com;
        # ... regular config
    }

    server {
        listen 80;
        server_name blog.example.com;
        # ... regular config
    }
}

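Test the configuration, then restart. A reload is not enough right after swapping the binary, because the running master process is still the old build without VTS:

sudo nginx -t
sudo systemctl restart nginx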

# Verify metrics are being exposed
curl -s http://localhost:9145/metrics | head -20

A correct output will look like this:

# HELP nginx_vts_info Nginx info
# TYPE nginx_vts_info gauge
nginx_vts_info{hostname="prod-server-01",version="1.24.0"} 1
# HELP nginx_vts_server_bytes_total The request/response bytes
# TYPE nginx_vts_server_bytes_total counter
nginx_vts_server_bytes_total{host="example.com",direction="in"} 1234567
nginx_vts_server_bytes_total{host="example.com",direction="out"} 9876543
nginx_vts_server_bytes_total{host="blog.example.com",direction="in"} 456789

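Seeing metrics split by individual host labels means it’s working. A host="*" series is also expected: it is the total across all server zones. If "*" is the only host you see, send a request to each domain first (a host typically appears only after it has served traffic), then double-check the vhost_traffic_status_filter_by_host on directive.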
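For a quick look without Prometheus, the byte counters can be condensed to one line per host and direction. This is a rough sketch that assumes the label layout of the sample output (host and direction labels, value as the last field); bytes_per_host is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list "host direction bytes" from Prometheus text-format input on stdin.
# bytes_per_host is a hypothetical helper name, not part of VTS.

bytes_per_host() {
    awk '
        # Only look at samples of the per-server byte counter
        index($0, "nginx_vts_server_bytes_total{") == 1 {
            # Extract the quoted label values for host and direction
            match($0, /host="[^"]*"/)
            host = substr($0, RSTART + 6, RLENGTH - 7)
            match($0, /direction="[^"]*"/)
            dir = substr($0, RSTART + 11, RLENGTH - 12)
            # The sample value is the last whitespace-separated field
            printf "%s %s %s\n", host, dir, $NF
        }
    ' | LC_ALL=C sort
}

# Usage:
#   curl -s http://localhost:9145/metrics | bytes_per_host
```

If every line starts with "*", the per-host split is not active yet.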

Step 3: Configure Prometheus to Scrape VTS

Add a new job to prometheus.yml:

scrape_configs:
  # ... keep existing jobs as-is

  - job_name: 'nginx-vts'
    static_configs:
      - targets: ['prod-server-01:9145']
        labels:
          server: 'prod-server-01'
          env: 'production'
    metrics_path: '/metrics'
    scrape_interval: 15s

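Reload Prometheus without a full restart. The HTTP endpoint only works if Prometheus was started with --web.enable-lifecycle:

curl -X POST http://localhost:9090/-/reload

# Or, if the systemd unit defines a reload action (typically a SIGHUP)
sudo systemctl reload prometheus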

Go to http://prometheus:9090/targets and verify the nginx-vts target shows a status of UP.

Step 4: Import the Grafana Dashboard

Grafana dashboard ID 14824 is a community dashboard built for Nginx VTS metrics:

  1. Go to Grafana → Dashboards → Import
  2. Enter ID: 14824 → Load
  3. Select your Prometheus datasource → Import

The dashboard immediately shows you:

  • Requests/second broken down per domain
  • Real-time bandwidth in/out
  • HTTP 5xx error rate per host — the first thing I check whenever an alert fires
  • Upstream response time if Nginx is acting as a reverse proxy to an app server

Step 5: Alert on Per-Domain 5xx Spikes

Here’s the PromQL to detect which domain is experiencing elevated errors — the query I use in production:

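# 5xx ratio per domain over 5 minutes (alert when > 5%)
# The denominator uses code="total"; without a code filter it would add the
# per-class series and the total series together and halve the ratio.
sum by (host) (
  rate(nginx_vts_server_requests_total{code="5xx"}[5m])
)
/
sum by (host) (
  rate(nginx_vts_server_requests_total{code="total"}[5m])
) > 0.05

# Or simpler: absolute 5xx rate in requests per second
rate(nginx_vts_server_requests_total{code="5xx"}[5m]) > 0.5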

Add this to Grafana Alerting with a Telegram or email contact point. When 5xx spikes, you get alerted immediately instead of finding out when a customer calls.
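Before the alert is wired up, the error share can be spot-checked straight from the /metrics endpoint. The sketch below rests on two assumptions: that VTS exports a code="total" series next to code="5xx" for each host, and that a lifetime figure is good enough for a sanity check. It reads the cumulative counters since Nginx started, so it is not the 5-minute rate the PromQL computes. error_share_per_host is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: cumulative 5xx share per host from one scrape of the VTS endpoint.
# Assumes series like nginx_vts_server_requests_total{host="...",code="5xx"}
# and {host="...",code="total"}; error_share_per_host is a hypothetical name.

error_share_per_host() {
    awk '
        index($0, "nginx_vts_server_requests_total{") == 1 {
            # Extract the quoted label values for host and code
            match($0, /host="[^"]*"/)
            host = substr($0, RSTART + 6, RLENGTH - 7)
            match($0, /code="[^"]*"/)
            code = substr($0, RSTART + 6, RLENGTH - 7)
            if (code == "5xx")   errors[host] = $NF
            if (code == "total") total[host]  = $NF
        }
        END {
            # Skip hosts without requests to avoid dividing by zero
            for (h in total)
                if (total[h] > 0)
                    printf "%s %.4f\n", h, errors[h] / total[h]
        }
    ' | LC_ALL=C sort
}

# Usage:
#   curl -s http://localhost:9145/metrics | error_share_per_host
```

A value of 0.0500 means 5% of all requests to that host since startup ended in a 5xx. A short, recent spike can hide in this lifetime average, which is why the alert itself should stay on the rate-based query.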

Troubleshooting Common Issues

Metrics endpoint returns 404

# Check if the module was compiled in
nginx -V 2>&1 | grep vts

# Check config syntax
sudo nginx -t

# Check the error log if still getting 404
sudo tail -f /var/log/nginx/error.log

All metrics grouped under host “*”

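Check that vhost_traffic_status_filter_by_host on; is present. Setting it once in the http block applies it to every server block; the module also accepts it at server and location level. Keep in mind that a "*" series always exists as the total across all hosts, and that a domain typically appears only after it has served at least one request.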

Prometheus cannot scrape the target

# Test from the Prometheus server
curl -v http://prod-server-01:9145/metrics

# Check the firewall
sudo ufw status
sudo iptables -L INPUT -n | grep 9145

# Open the port if needed (Prometheus server only)
sudo ufw allow from 10.0.1.50 to any port 9145

Conclusion

After that night of blind debugging, VTS Module became the first thing I install whenever I set up a new Nginx server running multiple domains. The ability to break down traffic per host lets you isolate problems in minutes rather than hours.

The Nginx VTS + Prometheus + Grafana stack is especially valuable when hosting multiple services on a single server, or when you need to prove to a client which domain a traffic spike originated from. This is the kind of visibility that tailing access logs in real time simply cannot give you.

One security note: the /metrics endpoint reveals fairly sensitive traffic pattern data. Always restrict it with an IP whitelist and consider placing it behind an internal VPN — never expose it to the public internet.
