ELK Stack Installation Guide: Centralized Log Management from A-Z for SysAdmins

Monitoring tutorial - IT technology blog

Why should you stop reading logs manually?

Managing a fleet of 10-20 servers while still SSHing into each one to run tail -f is a nightmare. When the system crashes at 2 AM, squinting through dozens of Nginx or application log files with grep only adds to the stress. This is especially true with microservices, where a single request passes through 5-7 different services; tracing errors is nearly impossible without centralized logging.

That is why the ELK Stack has become the gold standard. I used to prefer Graylog because it’s lightweight, but when it comes to high-speed full-text search and in-depth dashboards, ELK remains an unshakeable monument. However, Elasticsearch is notorious for being a “RAM vacuum.” I once tried installing it on a 2GB VPS, and the result was the Kernel OOM Killer “axing” it in less than 60 seconds. This article will help you install it correctly and avoid those resource traps.

The ELK Atomic Trio: What Do They Do?

Understanding the data flow will help you debug faster when the system starts “acting up”:

  • Elasticsearch (The Brain): The heart of data storage. It indexes logs, allowing you to search through millions of lines in just a few milliseconds.
  • Logstash (The Pipe): A versatile filter. It receives raw logs, parses data (such as extracting IPs or User-Agents), and then pushes it to the storage.
  • Kibana (The Face): The visual interface. This is where you build traffic charts, monitor 5xx error rates, or create real-time monitoring dashboards.

In practice, people often add Filebeat (a member of the Beats family) to ship logs from client servers instead of installing the heavyweight Logstash on every node. It is extremely lightweight, consuming only about 10-20MB of RAM.
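As a preview, a minimal filebeat.yml sketch for such a client node might look like the following (paths and the Logstash address are illustrative; fields_under_root lifts the custom type field to the event root so it matches the [type] conditional used in the Logstash pipeline later in this guide):

```yaml
# /etc/filebeat/filebeat.yml - minimal sketch for shipping Nginx access logs
filebeat.inputs:
  - type: filestream
    paths:
      - /var/log/nginx/access.log
    fields:
      type: nginx_access        # consumed by the Logstash [type] conditional
    fields_under_root: true

output.logstash:
  hosts: ["LOGSTASH_IP:5044"]   # replace with your Logstash server's address
```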

Infrastructure Preparation Before Deployment

Don’t try to run ELK on weak hardware if you don’t want your server to hang constantly. My recommended configuration:

  • Operating System: Ubuntu 22.04 LTS (the most stable currently).
  • RAM: Minimum 4GB for a Lab environment, 16GB-32GB for Production.
  • CPU: 2 Cores or more.
  • Disk: SSD is mandatory. ELK’s log writing speed is massive; using an HDD will cause immediate I/O bottlenecks.
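Before installing anything, it is worth a quick pre-flight check against the specs above. A sketch for a typical Linux host:

```shell
# Print RAM, CPU cores, and free disk; eyeball against the recommended specs.
ram_gb=$(( $(grep MemTotal /proc/meminfo | awk '{print $2}') / 1024 / 1024 ))
echo "RAM:   ${ram_gb} GB (need >= 4)"
echo "Cores: $(nproc) (need >= 2)"
df -h / | tail -1 | awk '{print "Disk:  " $4 " free on /"}'

# Elasticsearch also refuses to start if vm.max_map_count is too low.
# The deb package normally raises it, but verify (needs >= 262144):
cat /proc/sys/vm/max_map_count
```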

Step 1: Installing Elasticsearch – Setting Up the Storage

First, import the GPG key and add the official repository from Elastic:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
sudo apt-get install apt-transport-https
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update && sudo apt-get install elasticsearch

Warning: In version 8.x, security is enabled by default. When the installation completes, the post-install output prints a generated password for the elastic superuser. Save it somewhere safe immediately. If you lose it (for example, by clearing the terminal), you can generate a new one with sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic.

Open the /etc/elasticsearch/elasticsearch.yml file and edit the basic parameters for a single node:

network.host: 0.0.0.0
discovery.type: single-node
xpack.security.enabled: true

Enable the service to run automatically on reboot:

sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
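To confirm the node is actually up, query it over HTTPS. This is a sketch assuming the default location where the 8.x deb package writes its generated CA certificate; curl will prompt for the elastic password:

```shell
# Health check: expect a JSON banner ending in "You Know, for Search".
ES_CA=/etc/elasticsearch/certs/http_ca.crt
if [ -f "$ES_CA" ]; then
  curl --cacert "$ES_CA" -u elastic https://localhost:9200
else
  echo "CA certificate not found at $ES_CA - did the elasticsearch install finish?"
fi
```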

Step 2: Installing Kibana – The Data Window

Installing Kibana is quite straightforward as it doesn’t require complex initial configuration:

sudo apt-get install kibana
sudo systemctl enable kibana
sudo systemctl start kibana

To access the interface via a browser, change server.host in the /etc/kibana/kibana.yml file to "0.0.0.0" and restart Kibana with sudo systemctl restart kibana. Then, navigate to http://YOUR_IP_ADDRESS:5601. Kibana will ask for an enrollment token, which you can generate using the following command:

sudo /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana

After pasting the token, Kibana asks for a six-digit verification code, which you can print with:

sudo /usr/share/kibana/bin/kibana-verification-code

Step 3: Configuring Logstash – Smart Pipeline Processing

Logstash helps transform messy logs into structured data. Install it with sudo apt-get install logstash. Then, create a pipeline configuration file at /etc/logstash/conf.d/nginx-log.conf:

input {
  beats { port => 5044 }
}

filter {
  if [type] == "nginx_access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    user => "elastic"
    password => "PASSWORD_FROM_STEP_1"
    ssl_certificate_verification => false
  }
}

Note: The Grok filter is a double-edged sword. A badly written pattern can pin Logstash's CPU. Always test patterns with the Grok Debugger in Kibana's Dev Tools before applying them to production. Once the file is in place, validate the syntax, then enable and start the service:

sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit -f /etc/logstash/conf.d/nginx-log.conf
sudo systemctl enable logstash
sudo systemctl start logstash

Hard-earned Experience: Keeping ELK from Becoming a Burden

After several system crashes, I’ve learned three major lessons:

  1. Control JVM Heap Size: Never let Elasticsearch decide how much RAM to grab. Pin -Xms/-Xmx in the jvm.options file (or, better, a file under /etc/elasticsearch/jvm.options.d/). The standard formula: set it to 50% of total system RAM, but never exceed roughly 32GB, because beyond that the JVM loses compressed object pointers and large heaps actually get slower.
  2. Manage Log Lifecycle (ILM): There was a time my server filled up 500GB of SSD in just two weeks because of excessive debug logs. Configure a Policy to automatically delete or compress old logs after 15-30 days.
  3. Avoid Alert Fatigue: Don’t set Telegram alerts for every 404 error. You will soon be overwhelmed by notifications. Only alert when the 5xx error rate spikes by 15% within 5 minutes.
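Lesson 1 can be scripted. A minimal sketch that computes the heap from /proc/meminfo (the 31GB cap is a slightly conservative reading of the "do not exceed 32GB" rule, staying safely under the compressed-oops threshold; the jvm.options.d path is the deb package's override directory):

```shell
# Heap formula: 50% of total RAM, capped at 31 GB (compressed oops).
ram_mb=$(( $(grep MemTotal /proc/meminfo | awk '{print $2}') / 1024 ))
heap_mb=$(( ram_mb / 2 ))
if [ "$heap_mb" -gt 31744 ]; then heap_mb=31744; fi

# Print the override; on a real host, pipe this into:
#   sudo tee /etc/elasticsearch/jvm.options.d/heap.options
printf -- "-Xms%sm\n-Xmx%sm\n" "$heap_mb" "$heap_mb"
```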

Conclusion

Installing the ELK Stack is just the beginning. The true power lies in how you filter data and build dashboards to see things that are invisible to the naked eye. In the next post, I will guide you on how to set up Filebeat to collect logs from Docker Swarm/K8s clusters into ELK. If you encounter Connection Refused or Authentication Failed errors, please leave a comment below!
