Monitoring Node.js Applications with prom-client: Request Rate, Latency, and Custom Business Metrics in Real Time

Monitoring tutorial - IT technology blog

Have you ever gotten a 2 AM phone call about a slow API with no idea where to even start looking? I’ve been there: SSH-ing into servers one by one, running top and netstat, and still coming up empty. That changed once I integrated prom-client with Grafana. Now I open the dashboard and immediately see the current request rate, which endpoints are slow, and what the error rate is.

The blog already has a post on setting up Prometheus + Grafana for server monitoring (CPU, RAM, disk). This post goes one layer deeper: monitoring at the application level — what your actual Node.js code is doing, which endpoints are bottlenecked, and whether your business logic is running correctly.

3 Ways to Monitor Node.js — Compare Before You Choose

prom-client isn’t always the right answer. Here are 3 common approaches — each exists for a reason.

Option 1: Logs Only + Manual Analysis

Log requests/responses to a file, then use grep, awk, or Graylog to analyze after the fact.

  • Pros: No additional setup required, logs are already there, easy to debug specific issues
  • Cons: No visibility into trends over time, no real-time alerting, manual analysis is time-consuming — especially when incidents happen at 3 AM

Option 2: Commercial APM (Datadog, New Relic, Dynatrace)

Install an agent, everything gets traced automatically, beautiful dashboards out of the box.

  • Pros: Extremely easy to set up, includes distributed tracing and anomaly detection, no infrastructure to manage
  • Cons: High cost (at the time of writing, Datadog starts at $15/host/month, plus $0.10/GB data ingested), vendor lock-in, and custom business metrics are often billed as an extra on top of the base plan

Option 3: prom-client + Prometheus + Grafana (self-hosted)

You expose metrics directly from your code, Prometheus scrapes them on a schedule, and Grafana visualizes and alerts.

  • Pros: No license fees, full control, define metrics exactly as you want, large community, excellent Kubernetes integration
  • Cons: Requires learning PromQL for queries, you manage the Prometheus + Grafana infrastructure yourself

Why Choose prom-client?

If your project already has Prometheus (or you’re planning to add it), prom-client is the most natural choice. Commercial APMs suit large teams with big budgets that need complex distributed tracing. For startups, side projects, or when you want to track specific business metrics your own way, prom-client + Grafana is more than enough and carries no license cost.

prom-client has 4 metric types. Counter only goes up — use it to count requests and errors. Gauge goes up and down — active connections, memory. Histogram captures distributions — request duration. Summary computes client-side quantiles. For a web API, Counter and Histogram are the two you’ll use most.

Integrating prom-client into Express.js — Step by Step

Step 1: Install the Package

npm install prom-client

Step 2: Initialize Metrics in a Separate File

Separate monitoring logic into its own metrics.js file to keep it out of your business code:

// metrics.js
const client = require('prom-client');

const register = new client.Registry();

// Collect Node.js default metrics (memory heap, event loop lag, GC...)
client.collectDefaultMetrics({ register });

// Counter: track total HTTP requests
const httpRequestsTotal = new client.Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'route', 'status_code'],
  registers: [register],
});

// Histogram: distribution of request processing time (latency)
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
  registers: [register],
});

// Business metric: count orders created (example)
const ordersCreatedTotal = new client.Counter({
  name: 'orders_created_total',
  help: 'Total number of orders created',
  labelNames: ['status', 'payment_method'],
  registers: [register],
});

// Gauge: number of currently active users (can go up or down)
const activeUsers = new client.Gauge({
  name: 'active_users_current',
  help: 'Number of currently active users',
  registers: [register],
});

module.exports = { register, httpRequestsTotal, httpRequestDuration, ordersCreatedTotal, activeUsers };

Step 3: Middleware to Automatically Track Every HTTP Request

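// middleware/metricsMiddleware.js
const { httpRequestsTotal, httpRequestDuration } = require('../metrics');

function metricsMiddleware(req, res, next) {
  // hrtime is monotonic, so system clock adjustments cannot skew the duration
  const start = process.hrtime.bigint();

  res.on('finish', () => {
    const duration = Number(process.hrtime.bigint() - start) / 1e9; // ns -> seconds

    // req.route.path is the pattern (/api/users/:id), not the URL (/api/users/123).
    // Inside a mounted router it is relative, so prepend req.baseUrl.
    // Unmatched requests (404s, scanners) share one fixed label: falling back to
    // req.path would create a new time series for every unknown URL.
    const route = req.route ? req.baseUrl + req.route.path : 'unmatched';
    const labels = { method: req.method, route, status_code: res.statusCode };

    httpRequestsTotal.inc(labels);
    httpRequestDuration.observe(labels, duration);
  });

  next();
}

module.exports = metricsMiddleware;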

Step 4: Register Middleware and Expose /metrics

// app.js
const express = require('express');
const { register } = require('./metrics');
const metricsMiddleware = require('./middleware/metricsMiddleware');

const app = express();
app.use(express.json());
app.use(metricsMiddleware);

// Prometheus scrape endpoint — do NOT expose publicly, see notes below
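app.get('/metrics', async (req, res) => {
  try {
    res.set('Content-Type', register.contentType);
    res.end(await register.metrics());
  } catch (err) {
    // Express 4 does not catch rejected promises in async handlers
    res.status(500).end(String(err));
  }
});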

app.get('/api/orders', (req, res) => {
  res.json({ orders: [] });
});

app.listen(3000, () => console.log('Server :3000 | Metrics: :3000/metrics'));
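
The comment above the /metrics route warns against exposing it publicly. One possible way to follow that advice is to serve metrics from a second, internal-only listener instead of the public Express app. The sketch below uses only Node's built-in http module. The helper name createMetricsServer, the port 9464, and the loopback binding are assumptions for illustration; the registry argument is expected to have the same contentType property and async metrics() method used by the route above.

```javascript
const http = require('http');

// Hypothetical helper: an HTTP server that answers GET /metrics and nothing else.
// `registry` is any object exposing `contentType` and an async `metrics()` method.
function createMetricsServer(registry) {
  return http.createServer(async (req, res) => {
    if (req.method !== 'GET' || req.url !== '/metrics') {
      res.writeHead(404).end();
      return;
    }
    try {
      const body = await registry.metrics();
      res.writeHead(200, { 'Content-Type': registry.contentType });
      res.end(body);
    } catch (err) {
      res.writeHead(500).end();
    }
  });
}

// Assumed usage: bind to loopback (or a private interface) so only an
// internal Prometheus can reach it. Port 9464 is an arbitrary choice.
// const { register } = require('./metrics');
// createMetricsServer(register).listen(9464, '127.0.0.1');
```

With this in place, the scrape target in prometheus.yml would point at the internal port rather than at the public one, and the /metrics route can be dropped from app.js.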

Step 5: Track Custom Business Metrics in Route Handlers

This is where prom-client shines compared to generic monitoring — you can track your exact business logic:

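// routes/orders.js
const express = require('express');
const { ordersCreatedTotal, activeUsers } = require('../metrics');
// Hypothetical module: createOrder stands for your own order-creation logic
const { createOrder } = require('../services/orders');

const router = express.Router();

// Label values must come from a small fixed set. req.body is client-controlled,
// so anything unexpected collapses to 'unknown'. The list is illustrative.
const PAYMENT_METHODS = new Set(['card', 'paypal', 'bank_transfer']);
const paymentLabel = (method) => (PAYMENT_METHODS.has(method) ? method : 'unknown');

router.post('/api/orders', async (req, res) => {
  try {
    const order = await createOrder(req.body);
    ordersCreatedTotal.inc({ status: 'success', payment_method: paymentLabel(order.paymentMethod) });
    res.json({ success: true, orderId: order.id });
  } catch (err) {
    ordersCreatedTotal.inc({ status: 'failed', payment_method: paymentLabel(req.body.paymentMethod) });
    res.status(500).json({ error: err.message });
  }
});

router.post('/api/login', async (req, res) => {
  // ... auth logic
  activeUsers.inc();
  res.json({ token: '...' });
});

router.post('/api/logout', (req, res) => {
  // Sessions that expire without a logout never reach this line, so treat the
  // gauge as approximate or set it from your session store instead.
  activeUsers.dec();
  res.json({ success: true });
});

// Mount in app.js with: app.use(require('./routes/orders'));
module.exports = router;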

Configuring Prometheus to Scrape the Node.js App

Open prometheus.yml and add a new job alongside the existing node-exporter job:

scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']

  # New job for the Node.js application
  - job_name: 'nodejs-app'
    static_configs:
      - targets: ['localhost:3000']
    metrics_path: '/metrics'
    scrape_interval: 15s

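Reload the Prometheus config without a restart. If Prometheus was not started with the --web.enable-lifecycle flag, this endpoint is disabled and you can send SIGHUP to the Prometheus process instead. Otherwise: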

curl -X POST http://localhost:9090/-/reload

PromQL Queries for the Grafana Dashboard

With Prometheus scraping data, it’s time to build the dashboard. These 4 panels give you a complete view of API health:

Request Rate (requests/second)

sum(rate(http_requests_total[5m])) by (route, method)
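
As a rough illustration of what rate() computes for each series before sum() adds them up, the sketch below takes the samples of one counter inside a window and returns the per-second increase. It is a simplified model, not the Prometheus implementation: the function name approximateRate and the sample data are made up, and the extrapolation Prometheus applies at the window edges is deliberately left out.

```javascript
// Simplified model of PromQL rate(): per-second increase of a counter over
// the samples in a window. A drop in value is treated as a counter reset
// (for example an app restart), after which counting resumes from zero.
// Assumption: the edge extrapolation done by Prometheus is not modeled.
function approximateRate(samples) {
  if (samples.length < 2) return null; // at least two samples are needed
  let increase = 0;
  for (let i = 1; i < samples.length; i++) {
    const prev = samples[i - 1].value;
    const curr = samples[i].value;
    // After a reset, the whole current value counts as new increase.
    increase += curr >= prev ? curr - prev : curr;
  }
  const seconds = samples[samples.length - 1].t - samples[0].t;
  return seconds > 0 ? increase / seconds : null;
}

// Hypothetical samples scraped every 15 s; the app restarts before t = 45.
const samples = [
  { t: 0, value: 100 },
  { t: 15, value: 130 },
  { t: 30, value: 160 },
  { t: 45, value: 20 },
  { t: 60, value: 50 },
];
console.log(approximateRate(samples)); // 110 requests over 60 s, about 1.83/s
```

The restart in the sample data shows why the counter itself can safely reset to zero: the drop is detected and the request rate stays continuous.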

P95 Latency — The Most Important Metric

histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, route))

P95 means 95% of requests complete within time X. It’s far more informative than the average: a large volume of fast requests pulls the average down and can hide a tail of requests that are genuinely slow.
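
Prometheus never sees individual request durations, only the cumulative bucket counters, so the P95 it reports is an estimate. The sketch below mimics the linear interpolation that histogram_quantile is documented to perform. The function name estimateQuantile and the bucket counts are hypothetical, and it assumes at least one finite bucket plus the +Inf bucket and a quantile greater than 0.

```javascript
// Estimate a quantile from cumulative histogram buckets by linear
// interpolation inside the bucket that contains the target rank.
// `buckets` must be sorted by `le` and end with the +Inf bucket.
function estimateQuantile(q, buckets) {
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;
  const i = buckets.findIndex((b) => b.count >= rank);
  // Rank falls into the +Inf bucket: report the highest finite bound.
  if (buckets[i].le === Infinity) return buckets[buckets.length - 2].le;
  const lowerBound = i === 0 ? 0 : buckets[i - 1].le;
  const countBelow = i === 0 ? 0 : buckets[i - 1].count;
  const inBucket = buckets[i].count - countBelow;
  // Assume observations are spread evenly between the bucket bounds.
  return lowerBound + (buckets[i].le - lowerBound) * ((rank - countBelow) / inBucket);
}

// Hypothetical cumulative counts for 100 requests.
const buckets = [
  { le: 0.1, count: 60 },
  { le: 0.25, count: 85 },
  { le: 0.5, count: 97 },
  { le: 1, count: 100 },
  { le: Infinity, count: 100 },
];
console.log(estimateQuantile(0.95, buckets)); // roughly 0.458 seconds
```

The 95th request lands in the 0.25 to 0.5 bucket, so the estimate can be anywhere in that range depending on the even-spread assumption. If you care about a specific threshold such as 500 ms, it helps to have a bucket boundary exactly at that value.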

Error Rate (% of 5xx errors)

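sum(rate(http_requests_total{status_code=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100

Both sides are wrapped in sum() so the division compares totals. Without it, PromQL matches series label by label, each 5xx series is divided only by itself, and the panel always shows 100.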

Business Metric: Order Success Rate

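sum(rate(orders_created_total{status="success"}[5m])) / sum(rate(orders_created_total[5m])) * 100

The sum() calls aggregate across payment_method and status, so the ratio is successful orders over all orders rather than each series over itself.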

Quick Verification Before Connecting Grafana

# Run the app
node app.js

# Send a few test requests
curl http://localhost:3000/api/orders
curl -X POST http://localhost:3000/api/orders \
  -H "Content-Type: application/json" \
  -d '{"item":"product-1","paymentMethod":"card"}'

# View raw metrics output
curl http://localhost:3000/metrics

If you see output like this, you’re good:

# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",route="/api/orders",status_code="200"} 3
http_requests_total{method="POST",route="/api/orders",status_code="200"} 1

# HELP http_request_duration_seconds Duration of HTTP requests in seconds  
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.005",...} 2
http_request_duration_seconds_bucket{le="0.01",...} 4
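
When reading the _bucket lines, keep in mind that each le ("less than or equal") bucket is cumulative: it counts every observation at or below its bound, not only those between two bounds. The toy model below shows how a handful of durations would produce counts like the ones above. It is not prom-client's implementation, and the function name bucketCounts and the durations are made up for illustration.

```javascript
// Toy model of cumulative histogram buckets: every bucket counts all
// observations less than or equal to its upper bound `le`.
function bucketCounts(upperBounds, observations) {
  const bounds = [...upperBounds, Infinity]; // +Inf catches everything
  return bounds.map((le) => ({
    le,
    count: observations.filter((value) => value <= le).length,
  }));
}

// Hypothetical request durations in seconds.
const durations = [0.003, 0.004, 0.007, 0.009, 0.2];
const result = bucketCounts([0.005, 0.01, 0.025], durations);
for (const { le, count } of result) {
  console.log(`le="${le === Infinity ? '+Inf' : le}" ${count}`);
}
// le="0.005" 2
// le="0.01" 4
// le="0.025" 4
// le="+Inf" 5
```

Counts that never decrease from one bucket to the next are therefore expected, and the +Inf bucket always equals the total number of observations.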

Real-World Notes

  • Don’t use user_id as a label: Labels must have low cardinality. Using user_id or request_id as labels creates millions of time series — Prometheus will run out of memory fast. Method, route, and status_code are safe choices.
  • Protect the /metrics endpoint: Don’t expose it publicly. Use Basic Auth, whitelist internal IPs, or bind the metrics server to a separate port accessible only to internal Prometheus. This endpoint reveals quite a bit about your infrastructure.
  • A scrape_interval of 15s is sufficient: Don’t set it to 5s or lower without a specific reason — it adds unnecessary load on both the app and Prometheus.
  • Test route normalization: With nested Express routers, req.route.path is relative to the router’s mount point, so prepend req.baseUrl to get the full pattern. Test thoroughly to make sure /api/users/:id doesn’t accidentally become /api/users/123 or a bare /:id.
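
To put numbers on the cardinality warning above, here is a back-of-the-envelope calculation. The worst case for one metric is the product of the distinct values of each label, and a histogram multiplies that again by its buckets plus the +Inf, _sum, and _count series. The function name estimateSeries and the label sizes are assumptions chosen for illustration.

```javascript
// Upper bound on the time series one metric can create: the product of the
// distinct values per label. A histogram emits one series per bucket plus
// +Inf, _sum and _count for every label combination.
function estimateSeries(labelCardinalities, histogramBuckets = 0) {
  const combinations = Object.values(labelCardinalities).reduce((a, b) => a * b, 1);
  const seriesPerCombination = histogramBuckets > 0 ? histogramBuckets + 3 : 1;
  return combinations * seriesPerCombination;
}

// Hypothetical API: 4 methods, 20 routes, 6 status codes, 10 histogram buckets.
const safe = estimateSeries({ method: 4, route: 20, status_code: 6 }, 10);
// Same metric with a user_id label and 50,000 users.
const risky = estimateSeries({ method: 4, route: 20, status_code: 6, user_id: 50000 }, 10);

console.log(safe);  // 6240
console.log(risky); // 312000000
```

Real series counts are usually lower because not every combination occurs, but the gap between a few thousand and hundreds of millions shows why a single unbounded label is enough to exhaust Prometheus memory.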

With metrics in Grafana, the next step is setting up alerts — for example, alerting when P95 latency exceeds 500ms, or when error rate stays above 5% for 5 consecutive minutes. Alertmanager has its own dedicated post on the blog; combine it with this dashboard and you have a complete monitoring loop.
