Log Management as Systems Scale: A Problem Nobody Wants but Everyone Faces
When I first set up 2–3 servers, I’d SSH directly into each machine and run tail -f /var/log/syslog to debug. That worked fine — until the server count grew to 8, then 15. When production breaks at 2 AM and you’re SSH-ing into each machine hunting for error messages, that’s when you truly need centralized logging.
Graylog is what I brought into production after I’d been burned enough times. After 6 months of real-world use, I’m writing down what’s actually useful — not a copy of the official documentation.
What Is Graylog and Why Not Use ELK Stack?
Straight to the point: Graylog receives logs from all your servers, indexes everything into OpenSearch, and lets you search and create alerts from a single UI. No more opening 10 terminal tabs to grep across individual machines.
The architecture consists of 3 main components:
- Graylog Server: processes, parses, and routes logs
- MongoDB: stores metadata — configuration, users, streams, alerts
- OpenSearch (or Elasticsearch): the full-text search engine — where logs are actually stored and queried
What about ELK Stack? I’ve run both. ELK offers more customization power, but takes around 2–3 days to set up properly. Graylog? Half a day and you’re done. For a small team without dedicated DevOps, that’s a significant difference — especially for alert and stream management, where Graylog clearly wins.
Installing Graylog with Docker Compose
Docker Compose is the fastest way to get Graylog running. I use this approach in both dev and small production environments — it’s stable for workloads under 50GB of logs per day.
Create the docker-compose.yml file:
```yaml
version: '3.8'

services:
  mongodb:
    image: mongo:6.0
    volumes:
      - mongodb_data:/data/db
    restart: unless-stopped

  opensearch:
    image: opensearchproject/opensearch:2.11.0
    environment:
      - discovery.type=single-node
      - DISABLE_SECURITY_PLUGIN=true
      - "OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - opensearch_data:/usr/share/opensearch/data
    restart: unless-stopped

  graylog:
    image: graylog/graylog:5.2
    environment:
      - GRAYLOG_PASSWORD_SECRET=somepasswordpepper1234567890abc
      - GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
      - GRAYLOG_HTTP_EXTERNAL_URI=http://YOUR_SERVER_IP:9000/
      - GRAYLOG_ELASTICSEARCH_HOSTS=http://opensearch:9200
      - GRAYLOG_MONGODB_URI=mongodb://mongodb:27017/graylog
    depends_on:
      - mongodb
      - opensearch
    ports:
      - "9000:9000"        # Web UI
      - "12201:12201/udp"  # GELF UDP input
      - "514:514/udp"      # Syslog UDP input
    volumes:
      - graylog_data:/usr/share/graylog/data
    restart: unless-stopped

volumes:
  mongodb_data:
  opensearch_data:
  graylog_data:
```
Important: The GRAYLOG_ROOT_PASSWORD_SHA2 above is the hash of the string “admin”. Change it before deploying to production:
```bash
echo -n "your_strong_password" | sha256sum | cut -d' ' -f1
```
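If you'd rather generate both secrets in one go, here's a small Python sketch that does the same thing (the 16-character minimum for GRAYLOG_PASSWORD_SECRET is a Graylog requirement; the password string is obviously a placeholder):

```python
import hashlib
import secrets

# Random pepper for GRAYLOG_PASSWORD_SECRET (Graylog requires >= 16 chars;
# token_hex(32) gives 64 hex characters).
password_secret = secrets.token_hex(32)

# SHA-256 hex digest of your admin password for GRAYLOG_ROOT_PASSWORD_SHA2.
root_password_sha2 = hashlib.sha256(b"your_strong_password").hexdigest()

print(f"GRAYLOG_PASSWORD_SECRET={password_secret}")
print(f"GRAYLOG_ROOT_PASSWORD_SHA2={root_password_sha2}")
```

Paste the two printed lines straight into the environment section of the graylog service.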
Start the stack:
```bash
docker compose up -d

# Wait approximately 60–90 seconds for Graylog to fully start
docker compose logs -f graylog
```
The Web UI runs at http://YOUR_SERVER_IP:9000. Log in with admin and the password you just created.
Creating Inputs to Receive Logs
Open System → Inputs in the Web UI. Create these 2 inputs first:
1. Syslog UDP (port 514)
Select Syslog UDP, set the bind address to 0.0.0.0, port 514. This input receives logs directly from rsyslog — fast, simple, no extra agent required.
2. GELF UDP (port 12201)
Select GELF UDP, port 12201. GELF (Graylog Extended Log Format) is a structured JSON format — it supports custom fields and is ideal for application logs from Docker containers or custom-built apps.
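To see what GELF actually looks like on the wire, here's a hand-rolled sketch that sends one uncompressed GELF message over UDP. The field values and the 127.0.0.1 address are placeholders; real applications should use a proper GELF library (e.g. graypy for Python) rather than raw sockets:

```python
import json
import socket

# A minimal GELF 1.1 payload. "version", "host", and "short_message" are
# required; custom fields must be prefixed with an underscore.
message = {
    "version": "1.1",
    "host": "webserver-01",               # placeholder hostname
    "short_message": "User login failed",
    "level": 4,                           # syslog severity: warning
    "_app": "auth-service",               # custom field, note the underscore
}

payload = json.dumps(message).encode("utf-8")

# Replace 127.0.0.1 with your Graylog server's IP.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("127.0.0.1", 12201))
sock.close()
```

The custom `_app` field shows up in Graylog as a searchable field named `app`, which is exactly what makes GELF nicer than plain syslog for application logs.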
Configuring rsyslog on Client Servers
On each Linux server that needs to send logs to Graylog, configure rsyslog:
```bash
sudo nano /etc/rsyslog.d/90-graylog.conf
```
Add a single forwarding rule to the file (a single `@` means UDP; `@@` would mean TCP):

```
# Forward all logs via UDP to Graylog
*.* @GRAYLOG_SERVER_IP:514;RSYSLOG_SyslogProtocol23Format
```
Restart rsyslog and test immediately:
```bash
sudo systemctl restart rsyslog

# Send a test message
logger -t test "Hello from $(hostname)"
```
Open the Graylog Web UI → Search, look for source:your_hostname — if you see the “Hello from…” message, you’re done.
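Incidentally, if your application is written in Python, it can also log to the same Syslog UDP input directly, without going through rsyslog at all. A minimal sketch using the standard library (127.0.0.1 is a placeholder for your Graylog server's IP):

```python
import logging
import logging.handlers

# SysLogHandler defaults to UDP, which matches the Syslog UDP input.
handler = logging.handlers.SysLogHandler(address=("127.0.0.1", 514))
handler.setFormatter(logging.Formatter("myapp: %(message)s"))

logger = logging.getLogger("myapp")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Hello from Python via syslog")
```

Since it's plain UDP, the call never blocks on Graylog being reachable; the flip side is that messages are silently dropped if the server is down.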
Sending Docker Container Logs via GELF
For Docker, add the log driver to the docker-compose.yml of the service you want to monitor:
```yaml
services:
  my_app:
    image: my_app:latest
    logging:
      driver: gelf
      options:
        gelf-address: "udp://GRAYLOG_SERVER_IP:12201"
        tag: "my_app"
```
Creating Streams and Alerts
Streams are my favorite Graylog feature. They route logs by rule into separate “channels” — a dedicated stream for Nginx, one for SSH auth, one for application errors. Incredibly useful when you need to debug a specific service without drowning in the overall log feed.
Create a stream at Streams → Create Stream. Some useful rules:
- Field `source` contains `webserver-01` — filter logs from a specific server
- Field `message` matches regex `ERROR|CRITICAL|FATAL` — catch all error-level events
- Field `facility` equals `auth` — only capture auth logs (SSH, sudo)
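Graylog evaluates stream rules with Java regex, but alternation works the same everywhere, so you can sanity-check a pattern locally before saving the rule. A quick Python check (sample log lines are made up):

```python
import re

# Same alternation as the stream rule: matches anywhere in the message field.
error_pattern = re.compile(r"ERROR|CRITICAL|FATAL")

assert error_pattern.search("2024-05-01 12:00:03 ERROR db timeout")
assert error_pattern.search("kernel: CRITICAL temperature threshold")
assert error_pattern.search("INFO request served in 12ms") is None
```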
Once you have a stream, add an Alert Condition. For example: “send a notification if there are more than 10 ERROR messages within 5 minutes”. Pair it with a Notification to deliver alerts to Slack, email, or a webhook — your choice.
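Graylog evaluates that condition server-side, but the logic is easy to picture: a sliding count of errors over the window. A hypothetical sketch, just to make the semantics concrete (names and structure are mine, not Graylog's):

```python
from collections import deque

# Alert condition: "more than 10 ERROR messages within 5 minutes".
WINDOW_SECONDS = 300
THRESHOLD = 10

_error_times: deque = deque()

def record_error(timestamp: float) -> bool:
    """Record one ERROR event; return True if the alert should fire."""
    _error_times.append(timestamp)
    # Drop events that have fallen out of the 5-minute window.
    while _error_times and timestamp - _error_times[0] > WINDOW_SECONDS:
        _error_times.popleft()
    return len(_error_times) > THRESHOLD
```

Note the strict "more than": the 10th error inside the window does not fire, the 11th does. Keeping that semantic in mind helps when you later tune thresholds.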
I’ll be honest: alert fatigue was a real problem I ran into immediately. Set the threshold too low — alerts fire constantly, and you start ignoring everything. I had to tune it multiple times: raise thresholds, add AND conditions, whitelist certain sources. The lesson learned: start with high thresholds, observe for 1–2 weeks to understand your system’s baseline, then gradually lower them. Your production environment has its own traffic patterns — don’t copy numbers from some blog post.
Takeaways After 6 Months in Production
Graylog genuinely solves the “where are the logs?” problem. Here’s what I noted after six months:
- Incident debugging time dropped from ~30 minutes to ~5 minutes because I can search across all servers in one UI
- Detected 2 SSH brute-force attempts thanks to alerts on failed auth logs
- Retention policy (auto-deleting logs older than 30 days) keeps disk usage stable — no more sudden disk-full surprises
One thing to know upfront: Graylog is fairly RAM-hungry. My single-node cluster needs at least 8GB to run smoothly — OpenSearch alone consumes 2–4GB. For smaller servers, consider Grafana Loki: much lighter, slightly weaker search capabilities, but sufficient for most use cases.
But if you have the budget and are managing 20+ servers — Graylog remains a solid choice. Set it up once, use it for the long haul. The next time something breaks at 2 AM, you’ll suffer a lot less.
