How do you know when your server is having issues?
Back in the day, I’d find out the server was slow the hard way — users would message me to complain first. Then I’d SSH in and run top, df -h, free -m one by one. Three or four servers was manageable — but once you hit ten-plus, you don’t even know where to start.
After setting up Prometheus + Grafana, everything changed. Open the dashboard and it’s all right there: when the CPU spiked, current RAM usage, how much disk space is left — all on a single screen. No more SSHing into individual servers.
This article goes straight to a practical installation on Ubuntu 22.04. Node Exporter for collecting metrics, Grafana for visualization.
How it works — the 30-second overview
Prometheus is a time-series database and scraper rolled into one. Every 15 seconds, it makes HTTP calls to exporters, pulls in metrics, and stores them in its own storage backend. You query with PromQL — Prometheus’s own query language, which you can pick up and use in a few hours.
Node Exporter runs on each server you want to monitor and exposes a /metrics endpoint with 700+ metrics: per-core CPU, RAM, disk I/O, network traffic, open file descriptors, and more.
Grafana acts as the frontend. It connects to Prometheus, queries the data, and renders it into dashboards. Alerts can be sent via email, Slack, or a Telegram webhook.
The data flow:
Node Exporter (port 9100) ← Prometheus scrapes every 15s → Stored in TSDB → Grafana queries → Dashboard
Installing Node Exporter on the servers you want to monitor
Create a dedicated system user — running the exporter as root is bad practice:
# Create system user
sudo useradd --no-create-home --shell /bin/false node_exporter
# Download Node Exporter (check the latest release at github.com/prometheus/node_exporter)
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar xvf node_exporter-1.8.2.linux-amd64.tar.gz
sudo cp node_exporter-1.8.2.linux-amd64/node_exporter /usr/local/bin/
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
Next, create a systemd service so Node Exporter starts automatically on reboot:
sudo nano /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
# Verify it's running
curl http://localhost:9100/metrics | head -20
If the output contains lines like node_cpu_seconds_total{...}, Node Exporter is running correctly.
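Each line in that output follows the Prometheus text exposition format: a metric name, optional {label="value"} pairs, and a numeric value. A minimal Python sketch of how one of those lines breaks down (simplified — it ignores HELP/TYPE comments, timestamps, and escaped quotes, which the full spec allows):

```python
import re

# Matches the simple single-line case of the exposition format, e.g.
#   node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
                     r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$')

def parse_metric_line(line: str):
    """Split one metrics line into (name, labels dict, float value)."""
    m = LINE_RE.match(line.strip())
    if not m:
        raise ValueError(f"unparseable line: {line!r}")
    labels = {}
    if m.group("labels"):
        for pair in m.group("labels").split(","):
            key, _, val = pair.partition("=")
            labels[key.strip()] = val.strip().strip('"')
    return m.group("name"), labels, float(m.group("value"))

name, labels, value = parse_metric_line(
    'node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67')
print(name, labels, value)
```

This is exactly what Prometheus does on every scrape — which is why a plain curl is all you need to debug an exporter.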
Installing Prometheus on the monitoring server
I keep things separate: one dedicated server runs Prometheus + Grafana, while all other servers only have Node Exporter installed. It makes backups easier and keeps monitoring isolated from production workloads.
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.52.0/prometheus-2.52.0.linux-amd64.tar.gz
tar xvf prometheus-2.52.0.linux-amd64.tar.gz
sudo cp prometheus-2.52.0.linux-amd64/{prometheus,promtool} /usr/local/bin/
sudo cp -r prometheus-2.52.0.linux-amd64/{consoles,console_libraries} /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus
Configuring scrape targets
This is the most important part — declaring which servers Prometheus should collect metrics from:
sudo nano /etc/prometheus/prometheus.yml
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node_exporter'
static_configs:
- targets:
- '192.168.1.10:9100' # web server
- '192.168.1.11:9100' # db server
- '192.168.1.12:9100' # app server
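Prometheus ships with promtool, which validates this file before you restart (promtool check config /etc/prometheus/prometheus.yml) — worth running after every edit. The targets themselves must be plain host:port pairs; as an illustration of what that check amounts to (a sketch, not a replacement for promtool), in Python:

```python
def valid_target(target: str) -> bool:
    """Check that a scrape target looks like host:port with a sane port."""
    host, sep, port = target.rpartition(":")
    if not sep or not host:
        return False
    return port.isdigit() and 1 <= int(port) <= 65535

# The three targets from the config above
targets = ["192.168.1.10:9100", "192.168.1.11:9100", "192.168.1.12:9100"]
for t in targets:
    print(t, "ok" if valid_target(t) else "INVALID")
```

A missing port (or a stray scheme like http://) is the most common reason a target silently never appears on the /targets page.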
Create a systemd service for Prometheus:
sudo nano /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/ \
--storage.tsdb.retention.time=30d \
--web.listen-address=0.0.0.0:9090
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now prometheus
# Check whether targets are up
curl http://localhost:9090/api/v1/targets | python3 -m json.tool | grep health
Navigate to http://<monitoring-server>:9090/targets to check the status — targets shown in green are being scraped successfully. Quick note: 30-day retention across 3 servers consumes roughly 2–4 GB of disk; adjust --storage.tsdb.retention.time if disk space is tight.
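That 2–4 GB figure is just arithmetic: samples per day × retention × bytes per sample. A back-of-the-envelope sketch — the series count per node and the ~2 bytes/sample compression ratio are rough assumptions (node_exporter typically exposes one to a few thousand series per host, and Prometheus compresses samples to roughly 1–2 bytes each), so treat the result as an order-of-magnitude estimate:

```python
def estimate_tsdb_gb(targets: int,
                     series_per_target: int = 2000,   # assumption: typical node_exporter host
                     scrape_interval_s: int = 15,
                     retention_days: int = 30,
                     bytes_per_sample: float = 2.0):  # assumption: ~1-2 bytes after compression
    """Rough TSDB disk estimate: samples/day * retention * bytes/sample."""
    samples_per_day = targets * series_per_target * (86400 / scrape_interval_s)
    total_bytes = samples_per_day * retention_days * bytes_per_sample
    return total_bytes / 1e9

print(f"{estimate_tsdb_gb(3):.1f} GB")  # 3 targets with the defaults above
```

Doubling the scrape interval or halving retention cuts the footprint proportionally, which is the easiest lever when disk is tight.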
Installing Grafana
Grafana has an official APT repository, which is cleaner than downloading files manually:
sudo apt-get install -y apt-transport-https software-properties-common wget
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install grafana
sudo systemctl enable --now grafana-server
The default port is 3000. Go to http://<server-ip>:3000, log in with admin/admin, and change your password immediately on first login.
Connecting Grafana to Prometheus
- Go to Connections → Data Sources → Add data source
- Select Prometheus
- URL: http://localhost:9090 (if on the same server) or the IP address of your Prometheus server
- Click Save & Test — you’re done when you see “Successfully queried the Prometheus API”
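If you rebuild servers often, the same data source can be provisioned from a file instead of clicking through the UI — Grafana reads YAML from /etc/grafana/provisioning/datasources/ at startup. A minimal example (adjust url to wherever your Prometheus lives):

```yaml
# /etc/grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
```

Restart grafana-server after dropping the file in place and the data source appears exactly as if you had created it by hand.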
Importing a dashboard in 2 minutes
No need to build from scratch. Grafana has a community dashboard library at grafana.com/grafana/dashboards — ID 1860 (Node Exporter Full) is the most downloaded dashboard with millions of installs, covering nearly everything you need:
- Dashboards → Import
- Enter ID 1860 and click Load
- Select the Prometheus data source you just created
- Import — CPU, RAM, disk, and network panels appear immediately, including per-core and per-disk breakdowns
Commonly used PromQL queries
When you need to build custom panels, these are the queries I use almost every day:
# CPU usage % (5-minute average)
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# RAM usage %
100 - ((node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100)
# Disk usage by mount point
100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes) * 100)
# Inbound network traffic (bytes/s)
rate(node_network_receive_bytes_total{device!="lo"}[5m])
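The rate() in the first query does what you would otherwise do by hand: take two readings of the idle-CPU counter and divide the delta by the elapsed time. A sketch of that arithmetic in Python, with illustrative numbers rather than live data:

```python
def cpu_usage_percent(idle_prev: float, idle_now: float,
                      elapsed_s: float, cores: int) -> float:
    """Mirror: 100 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100

    node_cpu_seconds_total is a per-core counter of seconds spent in each
    mode, so the average idle fraction is delta / elapsed / cores.
    """
    idle_fraction = (idle_now - idle_prev) / elapsed_s / cores
    return round(100 * (1 - idle_fraction), 2)

# Two readings 300s apart on a 4-core box: the summed idle counter grew
# by 960s => 960 / 300 / 4 = 0.8 idle fraction => 20% busy
print(cpu_usage_percent(10_000.0, 10_960.0, 300.0, 4))  # 20.0
```

Seeing it spelled out also explains why counters are preferred over gauges: a missed scrape just widens the delta instead of losing data.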
Setting up an alert when CPU exceeds a threshold
Grafana can create alerts directly from a panel — no Alertmanager needed for simple cases. Example: alert when CPU stays above 80% for 5 consecutive minutes:
- Open the CPU Usage panel → Edit
- Go to the Alert tab → New alert rule
- Condition: WHEN avg() OF query IS ABOVE 80
- For: 5m — wait 5 minutes before triggering, to avoid false positives from brief CPU spikes
- Notification: select a contact point (email, Slack, Telegram webhook)
Wrapping up
This is the exact setup I’m running in production. From start to a fully functional dashboard takes about 30–45 minutes if you’re comfortable with Linux.
The best thing about Prometheus: data is stored as time-series, so when an incident happens, I can rewind and see exactly what RAM looked like at 2:37 AM last night, or pinpoint when the CPU started climbing. No guesswork, no blind spots.
Want to take it further? Alertmanager handles more complex alerting scenarios — grouping, silencing, and routing by team. If you’re running Docker, add cAdvisor to track per-container resource usage.

