Customizing Kernel Parameters with sysctl on Linux: Optimizing Server Performance and Security

Linux tutorial - IT technology blog

That night, around 2 AM, my PagerDuty alarm blared, jolting me awake. This time, it wasn’t a deployment error or a complete service outage, but the performance of a critical microservice had suddenly plummeted.

The monitoring charts were wildly erratic, API requests were constantly timing out, yet CPU and RAM usage on the server remained normal. Customers started complaining, and I broke into a sweat, wondering what was happening with the Ubuntu 22.04 4GB RAM server I was managing.

The Actual Problem: Unexplained Server Performance Degradation

This situation persisted, severely impacting user experience. All signs indicated the application was still running, but something at the operating system level seemed to be throttling its processing capability. Monitoring tools like htop and top showed CPU and RAM within normal ranges, but the network metrics were problematic. Specifically, I noticed a surge in failed connections and in latency, which was truly alarming.

Root Cause Analysis: Unoptimized Kernel Parameters

After a few minutes wrestling with logs and commands like netstat, I discovered a large number of connections piled up in TIME_WAIT and CLOSE_WAIT states. This indicated that the Linux kernel was struggling to manage the lifecycle of network connections, especially when the server had to continuously handle many connection and disconnection requests in a short period. It’s possible the number of connections had exceeded default limits, or the kernel’s memory management for sockets was inefficient.
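A quick way to see this kind of pileup is to group sockets by TCP state. Here is a small sketch of the idea (the `tcp_state_summary` name is my own), built on the output of `ss -tan`:

```shell
# Summarize TCP connections by state; pipe `ss -tan` output into it:
#   ss -tan | tcp_state_summary
tcp_state_summary() {
    # The first line of `ss -tan` is a header, so skip NR == 1.
    awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }' | sort
}
```

On the affected server, `ss -tan | tcp_state_summary` makes a TIME-WAIT or CLOSE-WAIT pileup obvious at a glance; the same idea works with `netstat -an` if you adjust the column.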

The problem wasn’t with the application code, but at the operating system level, specifically the Linux kernel parameters. By default, Linux distributions like Ubuntu are configured to operate stably across various hardware types and use cases, from personal desktops to small servers.

However, they are not optimized for any specific scenario. Especially for high-load servers, where every millisecond is precious and the ability to handle thousands of concurrent connections is crucial, default configurations can become a bottleneck.

Kernel parameters are configuration values stored in the kernel’s memory space that can be changed while the system is running. They control everything from how the kernel manages virtual memory, handles the network stack, to filesystem behavior. These parameters are accessed via the virtual filesystem /proc/sys/. For instance, when you want to view the value of net.ipv4.tcp_tw_reuse, you are essentially reading the file /proc/sys/net/ipv4/tcp_tw_reuse.
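The mapping between a sysctl key and its `/proc/sys` file is mechanical: dots become slashes. A tiny sketch (the `sysctl_path` helper is my own naming):

```shell
# A sysctl key maps to a /proc/sys path: dots become slashes.
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr '.' '/')"
}

sysctl_path net.ipv4.tcp_tw_reuse   # /proc/sys/net/ipv4/tcp_tw_reuse
```

On a Linux host, `cat "$(sysctl_path net.ipv4.tcp_tw_reuse)"` and `sysctl -n net.ipv4.tcp_tw_reuse` return the same value.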

Solutions

1. Temporary Changes with the sysctl -w Command

Initially, I needed a quick solution to get the system back up and running and reduce the load on PagerDuty. The sysctl tool was the savior. I decided to try adjusting a few network-related parameters to see if it would improve. To check the current value of a kernel parameter, I used the command:

sysctl net.ipv4.tcp_tw_reuse

To temporarily change a parameter, I used the -w flag:

sudo sysctl -w net.ipv4.tcp_tw_reuse=1

I tried increasing net.core.somaxconn (the maximum number of pending connection requests) and enabling reuse of sockets in TIME_WAIT state via net.ipv4.tcp_tw_reuse. Almost immediately, connections were released faster, timeout errors gradually decreased, and the server started to breathe easier. The load dropped, and customer complaints tapered off. This solution got me through that stressful 2 AM moment.
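When tuning live at 2 AM, it pays to record the old value before every `-w` so each change can be undone. A minimal sketch of the habit (the `rollback_line` helper is my own naming):

```shell
# Print a command that restores a parameter to a previous value;
# append one of these to a rollback script before each `sysctl -w`.
rollback_line() {
    printf 'sysctl -w %s=%s\n' "$1" "$2"
}

# Typical flow on the server (root required for the writes):
#   old=$(sysctl -n net.core.somaxconn)
#   rollback_line net.core.somaxconn "$old" >> /tmp/sysctl-rollback.sh
#   sudo sysctl -w net.core.somaxconn=65535
# Undo later with: sudo sh /tmp/sysctl-rollback.sh
rollback_line net.core.somaxconn 4096
```

Because `sysctl -w` changes are volatile anyway, this rollback script is mainly insurance against making a bad situation worse mid-incident.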

2. Permanent Changes via Configuration Files

However, changes made with sysctl -w are only effective until the server restarts. Once the server reboots, everything will revert to default. For these settings to take permanent effect and not have to be reapplied every time the server starts, I needed to edit the configuration file /etc/sysctl.conf or add separate configuration files to the /etc/sysctl.d/ directory.

Best Practice: Managing sysctl Configuration with /etc/sysctl.d/

I usually prefer creating a separate file in the /etc/sysctl.d/ directory. This approach lets me manage each group of parameters independently and roll back when necessary, without touching the system’s original sysctl.conf file. Filenames conventionally start with two digits that set the load order (e.g., 99-custom-performance.conf); files are read in lexical order, so a later (higher-numbered) file overrides earlier ones for any key they both set.

For example, I created the file /etc/sysctl.d/99-custom-performance.conf:

sudo nano /etc/sysctl.d/99-custom-performance.conf

And added the following lines:

# Allow reuse of TIME_WAIT sockets for new outbound connections when safe.
# Helps the system quickly free up ports stuck in TIME_WAIT state,
# preventing ephemeral port exhaustion when the server opens many
# short-lived outgoing connections.
net.ipv4.tcp_tw_reuse = 1

# Increase the maximum connection backlog (default is 4096 on modern kernels, 128 on older ones).
# Very important for servers that need to accept many new connections quickly.
# If the value is too low, new connections might be rejected.
net.core.somaxconn = 65535

# Increase the maximum number of file descriptors the system can open.
# Each socket is a file descriptor; increasing this limit helps avoid "Too many open files" errors.
fs.file-max = 1000000

# Increase the default and maximum receive/send buffers for TCP sockets.
# Helps improve data transfer performance over the network, especially with high-speed connections.
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304

# Port range that outbound connections can use.
# Expands the port range so the server has more available ports for outgoing connections,
# avoiding port conflicts in applications with many outbound connections.
net.ipv4.ip_local_port_range = 1024 65535

# Adjust vm.swappiness to prioritize RAM cache over swap.
# A lower value reduces the system writing data to disk (swap), thereby improving performance.
# A value of 0 tells the kernel to avoid swapping as much as possible, though it does not disable swap outright. For servers with sufficient RAM, a value of 10 is a good choice.
vm.swappiness = 10

# Enable SYN cookies to protect against SYN Flood attacks.
# When the system is under attack, SYN cookies help continue processing connections without using too many resources.
net.ipv4.tcp_syncookies = 1

# Reduce the wait time for FIN-WAIT-2 sockets.
# Helps quickly free up network resources after a connection ends.
net.ipv4.tcp_fin_timeout = 30

# Inactive time (seconds) before sending keepalive packets.
# Helps detect and close dead connections earlier.
net.ipv4.tcp_keepalive_time = 600

# Force the system to panic on a kernel oops, useful for early diagnosis of kernel issues.
# Important in production environments to quickly identify and fix kernel errors.
kernel.panic_on_oops = 1

# Disable IPv6 if not used.
# Helps simplify the network stack and minimize unnecessary security risks.
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

After saving the file, to apply the changes without rebooting the server, I used the command:

sudo sysctl --system

This command reads all configuration files in the /etc/sysctl.d/ directory (along with /run/sysctl.d/ and /usr/lib/sysctl.d/) plus the /etc/sysctl.conf file if it exists, then applies the parameters to the running kernel. You can verify an applied value with sysctl <parameter> or sysctl -a | grep <parameter>.
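To double-check that every key in the drop-in file actually took effect, I like to compare the file against the running kernel. A sketch of the parsing half (`parse_conf` is a hypothetical helper of mine):

```shell
# Extract "key=value" pairs from a sysctl drop-in file, stripping
# comments, surrounding whitespace, and blank lines.
parse_conf() {
    sed -e 's/#.*//' \
        -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' \
        -e '/^$/d' -e '/=/!d' "$1"
}

# Compare each expected value with the live kernel value, e.g.:
#   parse_conf /etc/sysctl.d/99-custom-performance.conf |
#   while IFS='=' read -r key want; do
#       have=$(sysctl -n "$key" 2>/dev/null)
#       [ "$have" = "$want" ] || echo "MISMATCH $key: want=$want have=$have"
#   done
# Caveat: multi-value keys (like ip_local_port_range) are printed by
# sysctl with tabs, so they may need normalizing before comparing.
```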

Personal Experience and Benefits

After many trials and adjustments in various environments, I realized that creating separate configuration files in /etc/sysctl.d/ is the most effective management method. It not only helps me easily control each group of parameters but also allows for quick rollbacks if issues arise.

On the production Ubuntu 22.04 server with 4GB RAM that I manage, this significantly reduced request processing times, from several hundred milliseconds to tens of milliseconds during peak hours. Timeout incidents almost ceased, the system operated much more smoothly, allowing me to get better sleep, no longer worrying about midnight calls.

Optimizing kernel parameters isn’t just about network performance. It also affects how the system manages memory, I/O, and even security. Understanding and adjusting these parameters is a crucial part of professional Linux system administration.

Monitoring Performance After Changes

After applying any changes, I always use tools like netstat, ss, sar, iostat to monitor system performance. This helps me determine if the changes yield the desired results and if there are any side effects. It’s important to have objective data for evaluation.

# View all current sysctl parameters
sysctl -a

# Filter parameters related to TCP/IP
sysctl -a | grep "net.ipv4.tcp"

# Count the number of connections in TIME_WAIT state
sudo netstat -an | grep -i "time_wait" | wc -l

# View summary information about sockets
sudo ss -s

# View CPU, I/O, Network information (5 samples, 1-second interval)
sar -u 1 5  # CPU utilization
sar -n DEV 1 5 # Network interface statistics
iostat -xz 1 5 # CPU and disk I/O statistics

Important Notes Before Optimization

  • Backup Configuration: Before making any major changes, always back up the /etc/sysctl.conf file and files in /etc/sysctl.d/. A good backup will help you easily restore to the previous state if issues arise.
  • Test in Staging Environment: Every kernel parameter change can have a far-reaching impact on the system. I recommend thoroughly testing in a staging or development environment with a configuration similar to production. This helps avoid unnecessary risks when applying directly to the main system.
  • Change Gradually: Never change too many parameters at once without clearly understanding their effects. Make changes incrementally, closely monitor system performance, and evaluate the impact of each change.
  • Read Documentation: Always consult official Linux kernel documentation or reputable sources to better understand each parameter you wish to adjust. Deep knowledge of them will help you make more optimal decisions.
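The backup step above can be scripted so a rollback target always exists. A minimal sketch, assuming the default config locations (the `backup_sysctl` name is my own):

```shell
# Copy sysctl config files and a dump of the running values into one
# directory, so there is always something to restore or diff against.
backup_sysctl() {
    dest=$1
    mkdir -p "$dest"
    cp -a /etc/sysctl.conf "$dest"/ 2>/dev/null || true
    cp -a /etc/sysctl.d "$dest"/ 2>/dev/null || true
    # Capture runtime values too: handy for diffing after changes apply.
    sysctl -a > "$dest/runtime-values.txt" 2>/dev/null || true
}

backup_sysctl "$HOME/sysctl-backup-$(date +%Y%m%d)"
```

After tuning, a second run into a new directory plus a `diff` of the two `runtime-values.txt` files shows exactly what changed.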

Adjusting kernel parameters with sysctl is a powerful technique to optimize Linux server performance and enhance security.

From a late-night troubleshooting experience, I realized that understanding how the kernel works and knowing how to fine-tune it can save a system in the most critical situations. This is an indispensable skill for any IT engineer managing production Linux systems, helping you not only solve problems but also build a robust, stable system that brings higher value to the business.
