2 AM and the Nightmare Named “Slow Request”
My phone buzzed incessantly. A notification from the monitoring system read: “Latency spike, widespread 504 Gateway Timeout.” I jolted awake and SSH’d into the server, eyes still half-closed. CPU usage was low, there was plenty of free RAM, and the Nginx and application logs were clean. Everything looked fine on the surface, but the system was crawling.
Back in my early days as a sysadmin, I would have spent an entire afternoon hopelessly restarting services or adding RAM. But that night was different. I needed to look past the operating system’s surface and deep into the kernel, where application logs rarely reach. That’s when eBPF and the BCC tools proved their real-world value.
In this post, I’ll show you how to use eBPF to “spy” on even the smallest system operations without touching a single line of application code.
Quick Start: Install and Run in 3 Minutes
Don’t let the name eBPF intimidate you. Using it via the BCC (BPF Compiler Collection) suite is actually quite simple. On Ubuntu or Debian, just run:
# Install BCC tools and current kernel headers
sudo apt update
sudo apt install bpfcc-tools linux-headers-$(uname -r)
Once installed, the tools are located in /usr/sbin with the -bpfcc suffix. Try execsnoop right away to watch new processes being executed in real time:
sudo execsnoop-bpfcc
Every process started through an exec() call appears instantly, along with its arguments. This is the fastest way to catch misbehaving cronjobs or mysterious scripts silently undermining your system from the inside.
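When the output scrolls too fast to read, a small script can summarize it. Here is a minimal sketch that counts exec events per command name. It assumes the default column layout (PCOMM PID PPID RET ARGS) and command names without spaces; count_execs is a hypothetical helper, and the sample rows are made up for illustration.

```python
from collections import Counter


def count_execs(lines):
    """Count exec events per command name in execsnoop-style text lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row (which starts with PCOMM).
        if len(fields) < 4 or fields[0] == "PCOMM":
            continue
        counts[fields[0]] += 1
    return counts


# Made-up sample rows in the assumed layout: PCOMM PID PPID RET ARGS
sample = """\
PCOMM            PID    PPID   RET ARGS
sh               4101   812      0 /bin/sh -c /opt/backup.sh
gzip             4102   4101     0 /usr/bin/gzip -9 dump.sql
sh               4110   812      0 /bin/sh -c /opt/backup.sh
""".splitlines()

for command, hits in count_execs(sample).most_common():
    print(command, hits)
```

To use it on live data, you could save the tool’s output to a file for a minute and pass the file’s lines to count_execs instead of the sample.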
What exactly is eBPF?
At its core, eBPF is like a JavaScript engine (similar to V8) but running directly inside the Linux kernel. It lets you run tiny monitoring programs the moment an event occurs inside the kernel. Before a program is allowed to run, a strict verifier checks it at load time and rejects anything that could crash or hang the system, such as unbounded loops or out-of-bounds memory access.
Previously, for deep debugging, you had to install kernel modules (highly prone to kernel panics) or use ptrace (which can slow down apps by up to 10x). eBPF addresses both problems: it’s safe, and its overhead is usually low enough for production use.
The Power of BCC
Writing raw eBPF programs, plus all the code needed to load them, is a nightmare. BCC lets us write the user-space side in Python or Lua and embed the kernel side as a small piece of restricted C. BCC then compiles that C into eBPF bytecode and loads it into the kernel for us. Most common system issues already have dedicated tools in the BCC suite; you just need to call them.
In Action: Hunting the Culprit Behind the Lag
Back to our 2 AM story. When execsnoop didn’t report anything unusual, I turned my suspicion toward Disk I/O. Perhaps a process was hogging disk bandwidth, forcing other requests into a queue.
Step 1: Checking Disk Latency with biolatency
sudo biolatency-bpfcc 1 10
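This command prints a block I/O latency histogram once per second, ten times. By default the buckets are in microseconds (usecs); add -m to switch to milliseconds. If nearly all the counts sit in the low microsecond buckets, the disk is fine. But if a noticeable share lands in buckets that correspond to many milliseconds, or even seconds, the disk is very likely struggling or overloaded.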
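Reading the histogram by eye works, but the same judgment can be put into a number. This is a rough sketch that computes the share of I/Os falling into buckets at or above a threshold. It assumes the usual "low -> high : count" row format; slow_share and the 1000-microsecond threshold are my own illustrative choices, and the sample rows are invented.

```python
import re

# Matches histogram rows such as "  128 -> 255   : 60   |****   |"
BUCKET = re.compile(r"^\s*(\d+)\s*->\s*(\d+)\s*:\s*(\d+)")


def slow_share(lines, threshold_usecs=1000):
    """Fraction of I/Os in buckets whose lower bound is >= threshold_usecs."""
    total = 0
    slow = 0
    for line in lines:
        match = BUCKET.match(line)
        if not match:
            continue  # header, blank line, or anything else
        low, _high, count = (int(value) for value in match.groups())
        total += count
        if low >= threshold_usecs:
            slow += count
    return slow / total if total else 0.0


# Invented sample rows in the assumed histogram format
sample = """\
     usecs               : count     distribution
        64 -> 127        : 30       |********************                    |
       128 -> 255        : 60       |****************************************|
      1024 -> 2047       : 8        |*****                                   |
      2048 -> 4095       : 2        |*                                       |
""".splitlines()

print(f"{slow_share(sample):.0%} of I/Os took roughly 1 ms or longer")
```

What counts as “slow” depends on the hardware: a spinning disk lives in the millisecond range even when healthy, so pick the threshold to match your storage.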
Step 2: Finding Files Being Opened Excessively
If you notice slow I/O, use opensnoop to see which files the application is interacting with:
sudo opensnoop-bpfcc
I once discovered a Java app with an infinite loop bug that was opening and closing a config file 2,000 times per second. Application logs were silent, but opensnoop pinpointed the culprit immediately.
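A pattern like that is easy to miss in a fast-scrolling terminal, so it helps to count. The sketch below tallies opens per (command, path) pair. It assumes the default column layout (PID COMM FD ERR PATH) and command names without spaces; hot_files is a hypothetical helper, and the sample rows are made up.

```python
from collections import Counter


def hot_files(lines, top_n=3):
    """Return the most frequently opened (command, path) pairs."""
    counts = Counter()
    for line in lines:
        # Split into at most 5 fields so paths containing spaces stay intact.
        fields = line.split(None, 4)
        # Data rows start with a numeric PID; this also skips the header.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        _pid, comm, _fd, _err, path = fields
        counts[(comm, path.strip())] += 1
    return counts.most_common(top_n)


# Made-up sample rows in the assumed layout: PID COMM FD ERR PATH
sample = """\
PID    COMM               FD ERR PATH
3120   java               41   0 /etc/app/config.yml
3120   java               41   0 /etc/app/config.yml
977    nginx              12   0 /var/log/nginx/access.log
3120   java               41   0 /etc/app/config.yml
""".splitlines()

for (comm, path), hits in hot_files(sample):
    print(hits, comm, path)
```

A file that dominates the list by a wide margin is a good first suspect.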
Step 3: Network Debugging with tcptracer
When you suspect issues with microservices or database connections, tcptracer helps you track every TCP connection:
sudo tcptracer-bpfcc
This tool lists the source IP, destination IP, and ports for each connection as it is established, accepted, or closed. If a connection you expect never shows up, that is a strong hint that something, such as a misconfigured firewall, is blocking it before it can be established.
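Checking for an expected connection can also be scripted. This is a minimal sketch that scans captured output for a connect event to a given destination. It assumes eight whitespace-separated columns (T PID COMM IP SADDR DADDR SPORT DPORT) with "C" marking a connect; saw_connect is a hypothetical helper, and the addresses are invented.

```python
def saw_connect(lines, daddr, dport):
    """True if the output contains a connect (C) event to daddr:dport."""
    for line in lines:
        fields = line.split()
        # Data rows have 8 columns; the header row starts with "T", not "C".
        if len(fields) != 8 or fields[0] != "C":
            continue
        if fields[5] == daddr and fields[7] == str(dport):
            return True
    return False


# Invented sample rows in the assumed layout
sample = """\
Tracing TCP established connections. Ctrl-C to end.
T  PID    COMM             IP SADDR            DADDR            SPORT  DPORT
C  2201   app              4  10.0.0.5         10.0.0.20        51234  5432
X  2201   app              4  10.0.0.5         10.0.0.20        51234  5432
""".splitlines()

print("database reached:", saw_connect(sample, "10.0.0.20", 5432))
print("cache reached:", saw_connect(sample, "10.0.0.30", 6379))
```

A missing connection only tells you that it was never established on this host; you still need to check firewall rules or routing to find out why.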
Advanced: Writing Your Own Monitoring Scripts
Sometimes the built-in tools aren’t enough. Suppose you want an alert whenever someone executes rm -rf inside the /var/www directory. With BCC and Python, this only takes a few lines of code.
A basic BCC script structure looks like this:
from bcc import BPF

# C code that runs inside the kernel.
# The kprobe__ prefix tells BCC to attach this function to do_sys_open.
program = r"""
int kprobe__do_sys_open(struct pt_regs *ctx, int dfd, const char __user *filename) {
    // Check logic and logging goes here
    bpf_trace_printk("do_sys_open called\n");
    return 0;
}
"""

# Compile, load into the kernel, and stream the trace output
b = BPF(text=program)
print("Monitoring... Press Ctrl+C to stop.")
b.trace_print()
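You can hook into almost any kernel function (kprobes) or user-space function (uprobes). The level of customization is massive. Keep in mind that kernel function names are not a stable interface, so a probe that works on one kernel version may need a different symbol on another.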
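For the rm -rf alert mentioned above, the kernel side only needs to deliver the command line; the decision can live in plain Python. Below is a sketch of just that user-space check. It assumes some exec probe (as execsnoop does) hands you the argument list, and it only understands absolute paths, since resolving relative ones would need the process’s working directory. The name is_risky_rm is hypothetical.

```python
import os


def is_risky_rm(argv, root="/var/www"):
    """True if argv looks like a recursive, forced rm on a path under root."""
    if not argv or os.path.basename(argv[0]) != "rm":
        return False
    args = argv[1:]
    long_opts = {a for a in args if a.startswith("--")}
    # Collect the letters of short options such as -rf or -r -f.
    letters = "".join(a[1:] for a in args
                      if a.startswith("-") and not a.startswith("--"))
    recursive = "r" in letters or "R" in letters or "--recursive" in long_opts
    forced = "f" in letters or "--force" in long_opts
    if not (recursive and forced):
        return False
    root = os.path.normpath(root)
    for target in (a for a in args if not a.startswith("-")):
        path = os.path.normpath(target)  # collapses ".." and duplicate slashes
        if path == root or path.startswith(root + os.sep):
            return True
    return False


print(is_risky_rm(["rm", "-rf", "/var/www/site"]))
print(is_risky_rm(["rm", "-rf", "/tmp/cache"]))
```

Keeping this logic in user space keeps the in-kernel program tiny, at the cost of sending every exec event up to Python.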
Pro-tips for Using eBPF in Production
- Kernel Version: BCC requires at least Linux kernel 4.1. However, for the most stable and feature-complete experience, prioritize kernel 4.15 or newer (Ubuntu 18.04 and above meet this standard).
- Overhead: While extremely fast, placing probes on functions called millions of times per second (like processing every network packet) can still overload the server. Choose your “observation points” carefully.
- Permissions: Most tools require root or CAP_SYS_ADMIN privileges because they interact directly with the system kernel.
- Leverage Documentation: The /usr/share/bcc/tools/doc/ directory contains example text files for each tool. Give them a read before starting a difficult debugging session.
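The kernel version and permission tips can be turned into a quick pre-flight check. This is a rough sketch: kernel_at_least and preflight are hypothetical helpers, the 4.15 floor comes from the tip above, and checking for root is a simplification, since a process could hold CAP_SYS_ADMIN without being root.

```python
import os
import platform
import re


def kernel_at_least(release, minimum=(4, 15)):
    """Compare a release string such as '5.15.0-91-generic' to (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def preflight():
    """Return a list of problems that would likely stop BCC tools from running."""
    problems = []
    if not kernel_at_least(platform.release()):
        problems.append("kernel is older than 4.15")
    if os.geteuid() != 0:  # simplification: ignores capabilities
        problems.append("not running as root")
    return problems


print(kernel_at_least("5.15.0-91-generic"))
print(kernel_at_least("4.4.0-210-generic"))
```

On a Linux host, calling preflight() before a debugging session gives you an empty list when both conditions are met.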
Mastering eBPF doesn’t just help you solve “trace-less” incidents; it also deepens your understanding of how Linux operates. If traditional commands like top or iostat aren’t cutting it, don’t hesitate to call on eBPF. Wishing you peaceful on-call nights and no more hidden bugs!

