OOM Killer: Why Linux Kills Your MySQL and How to Protect It

Linux tutorial - IT technology blog

When a Service Disappears Without a Trace

Monday morning, you check the dashboard and find MySQL or Redis has been down for who knows how long. The application logs are spotless — not a single error line. You restart the service and everything runs fine, but a few hours later, the same thing happens again.

Many junior sysadmins end up spending an entire day debugging code or reviewing app configs. The real culprit, however, lives at the OS level. Linux quietly terminated your process to save the system. The enforcer of that decision is the OOM Killer (Out of Memory Killer).

This article will help you catch the OOM Killer in the act and equip your critical processes with the protection they need to survive.

Why Does Linux Kill Your Application?

The root cause lies in a mechanism called Memory Overcommit. The Linux kernel allows applications to allocate more RAM than is physically available. It does this because, in practice, applications rarely use all the memory they’ve reserved at the same time.

Imagine a server with 8 GB of RAM where the total memory claimed by all apps adds up to 12 GB. When they all start consuming resources simultaneously, or when one app develops a memory leak, both RAM and Swap get exhausted. To prevent the entire system from locking up in a Kernel Panic, the OOM Killer kicks in. It selects one process as a sacrifice and terminates it to free up memory immediately. If your server is running low on RAM, setting up a Linux SWAP partition can provide a crucial buffer before the OOM Killer ever gets involved.
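You can check which overcommit policy your kernel is using directly from /proc. These sysctls are standard on any Linux system; the default heuristic mode (0) is what makes the scenario above possible:

```shell
# Check the kernel's overcommit policy:
# 0 = heuristic overcommit (default), 1 = always allow, 2 = strict accounting
cat /proc/sys/vm/overcommit_memory

# In strict mode (2), this ratio of RAM counts toward the commit limit:
cat /proc/sys/vm/overcommit_ratio
```

Switching to strict mode (2) makes allocations fail with ENOMEM instead of invoking the OOM Killer, but be aware that many applications handle a failed allocation poorly, so test before relying on it.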

How the Target Is Chosen

Linux doesn’t pick randomly. It calculates a score called oom_score (ranging from 0 to 1000). The higher the score, the more likely that process is to get killed. The factors involved include:

  • Memory footprint: An app consuming 90% of RAM is almost certainly at the top of the hit list.
  • Uptime: on older kernels, long-running processes were scored lower than recently started ones; modern kernels base the score almost entirely on memory usage.
  • Privileges: processes running as root historically received a small score discount (this bonus was removed in kernel 4.17).
  • oom_score_adj setting: This is the variable you can actually control.
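You can read the kernel's current verdict straight from /proc. This sketch ranks the five highest-scoring processes on the box, reading the per-process oom_score and comm files the kernel exposes:

```shell
#!/bin/sh
# Rank processes by oom_score, the kernel's kill-priority estimate.
# Higher score = more likely to be chosen by the OOM Killer.
for p in /proc/[0-9]*; do
  score=$(cat "$p/oom_score" 2>/dev/null) || continue   # process may have exited
  name=$(cat "$p/comm" 2>/dev/null) || continue
  printf '%s\t%s\t%s\n' "$score" "${p#/proc/}" "$name"
done | sort -rn | head -5
```

Run this when memory pressure is building and you will see, in advance, exactly which process the OOM Killer would pick first.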

How to Confirm a Service Was Killed by the OOM Killer

Because the OOM Killer acts from outside the process, applications typically don’t have a chance to write any logs. You need to look for evidence in the system logs instead. Tools like journalctl and dmesg are your best friends here for tracing kernel-level events.

1. Trace It with dmesg

This is the fastest way to check messages from the kernel ring buffer. Run the following command:

dmesg -T | grep -i "out of memory"

The -T flag converts kernel timestamps into human-readable time. If you see a line like “Killed process [PID] (mysqld)”, you’ve found your culprit.

2. Use journalctl

On systemd-based systems such as Ubuntu 22.04 or CentOS 7/8, kernel messages are also captured in the journal. Note that journalctl -xe only shows the most recent entries; to search the full kernel log, run:

journalctl -k | grep -i "out of memory"

A typical kill entry looks like this:

[Oct 25 14:30:02] Out of memory: Killed process 1234 (mysqld) total-vm:2048500kB, anon-rss:1500400kB, oom_score_adj:0

The anon-rss value shows the actual RAM the process was using (around 1.5 GB) at the moment it was killed.
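All figures in that log line are in kB; a quick conversion confirms the 1.5 GB reading (note that decimal GB and binary GiB differ slightly):

```shell
# Convert the anon-rss value from the log line above (kB) to GB and GiB.
awk 'BEGIN { kb = 1500400
             printf "anon-rss: %.2f GB (%.2f GiB)\n", kb/1e6, kb/1048576 }'
# prints: anon-rss: 1.50 GB (1.43 GiB)
```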

How to Grant a Process Immunity from the OOM Killer

If you want to protect your database at all costs, you can intervene through /proc/[PID]/oom_score_adj. This value ranges from -1000 (never kill) to 1000 (kill first).

Temporary Configuration

Assuming MySQL is running with PID 1234, run the following. Note that sudo echo -1000 > /proc/1234/oom_score_adj would fail, because the redirection is performed by your unprivileged shell; pipe through tee instead:

echo -1000 | sudo tee /proc/1234/oom_score_adj

This is quick and easy, but the setting is lost as soon as the service restarts, because the PID changes.
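You can watch the mechanism work without root: any process may raise its own oom_score_adj (volunteering itself as a victim); only lowering the value requires root or CAP_SYS_RESOURCE. A minimal demo:

```shell
# Raising your own oom_score_adj needs no privileges.
echo 500 > /proc/self/oom_score_adj

# Child processes inherit the value, so cat reports 500 as well.
cat /proc/self/oom_score_adj
```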

Persistent Configuration via Systemd

This is the most robust approach. Edit the service’s unit file (for example, mysql.service). If you’re not yet familiar with systemd unit management, the guide on managing processes with systemd covers the essentials. Run:

sudo systemctl edit mysql

Add the following lines:

[Service]
OOMScoreAdjust=-1000

Then apply the change by reloading the daemon:

sudo systemctl daemon-reload
sudo systemctl restart mysql

Practical Advice: Don’t Overdo It

Setting -1000 for every service is a critical mistake. The OOM Killer is the last line of defense for your server. If you prevent it from killing MySQL while MySQL is actively leaking memory, the entire system will freeze. At that point, you won’t even be able to SSH in to fix things remotely.

Instead of just playing defense, take these proactive steps:

  1. Cap application memory: Configure innodb_buffer_pool_size so MySQL uses around 60–70% of total RAM.
  2. Set up Swap: Even though Swap is slower than RAM, it provides a safety buffer that prevents the system from going into shock during sudden memory spikes.
  3. Monitor proactively: Set up alerts for when RAM usage exceeds 85% using Prometheus or Zabbix. The htop and iotop monitoring guide is a good starting point for real-time visibility into memory pressure.
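The check that an 85% alerting rule encodes (step 3) can be sketched with nothing but /proc/meminfo; the MemAvailable field requires kernel 3.14 or later:

```shell
# Percentage of RAM currently in use, the figure an 85% alert would watch.
# MemAvailable is the kernel's estimate of memory usable without swapping.
awk '/^MemTotal:/     { total = $2 }
     /^MemAvailable:/ { avail = $2 }
     END { printf "memory used: %.0f%%\n", (total - avail) * 100 / total }' \
    /proc/meminfo
```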

Summary

The OOM Killer is not a bug — it’s a protective feature. Understanding how to read its logs and tune oom_score_adj gives you much greater control over your system. You may also want to review kernel parameter tuning with sysctl to further harden memory management behavior at the OS level. Next time a process vanishes, don’t jump straight into the code — run dmesg first.
