Pinning Processes to CPU Cores: How to Use taskset and numactl to Maximize Linux Performance – ITFROMZERO

Why You Shouldn’t Let Linux Decide Which CPU Runs Your Process

By default, the Linux scheduler is incredibly smart. It automatically distributes workloads to idle CPUs. However, this flexibility can sometimes be detrimental to low-latency systems or heavy workloads.

While working as a sysadmin for a fintech exchange, I once dealt with a strange issue. CPU load was only around 30%, but latency occasionally spiked from 2ms to 100ms. After a deep dive with perf, I discovered the scheduler was constantly migrating the process between cores. Every migration landed it on a core whose L1/L2 caches held none of its data, so the CPU had to fetch everything again from the shared L3 cache or RAM, and performance plummeted.

The primary solution is CPU affinity. With taskset and numactl, you can restrict a process to specific cores. This improves cache locality and reduces contention for those cores.

Installing the Tools

Most distributions like Ubuntu or CentOS come with taskset pre-installed as part of the util-linux package. However, numactl usually needs to be installed manually because it serves more complex server architectures.

Installation on Ubuntu/Debian:

sudo apt update && sudo apt install numactl -y

Installation on RHEL/CentOS/AlmaLinux:

sudo yum install numactl -y

Using taskset for Basic CPU Affinity Management

taskset is the most lightweight tool for setting CPU affinity via core IDs or CPU masks.
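A CPU mask is just a bit field in which bit N stands for core N. As a rough illustration, the sketch below converts a CPU list into the equivalent hex mask. The helper name cpu_list_to_mask is made up for this example (it is not part of util-linux), and the plain shell arithmetic only covers machines with fewer than about 62 logical CPUs.

```shell
# Convert a taskset-style CPU list (e.g. "0,2-3") into the equivalent hex mask.
# cpu_list_to_mask is a hypothetical helper name, not a util-linux command.
cpu_list_to_mask() {
    mask=0
    old_ifs=$IFS
    IFS=,                          # split the list on commas
    for part in $1; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;   # a range such as 2-3
            *)   first=$part; last=$part ;;             # a single core
        esac
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            mask=$(( mask | (1 << cpu) ))               # set bit number "cpu"
            cpu=$(( cpu + 1 ))
        done
    done
    IFS=$old_ifs
    printf '0x%x\n' "$mask"
}

cpu_list_to_mask 0,1     # prints 0x3
cpu_list_to_mask 2       # prints 0x4
cpu_list_to_mask 0,2-3   # prints 0xd
```

With a mask in hand, taskset 0x3 python3 heavy_script.py should behave the same as the -c 0,1 form. The list form is usually easier to read, so the mask is mostly useful when another tool or a config file expects one.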

Pinning a New Process to a Specific Core

If you want to run a heavy Python script on core 0 and core 1, use the -c (cpu-list) parameter:

taskset -c 0,1 python3 heavy_script.py

Pinning an Already Running Process

You don’t need to restart the application to apply changes. As long as you have the PID (Process ID), you can “lock” it to the desired core. For example, forcing PID 1234 to run exclusively on core 2:

taskset -p -c 2 1234
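One caveat for multi-threaded applications: affinity is a per-thread setting, and taskset -p on its own changes only the task whose ID you pass. The sketch below lists a process's threads and then uses the -a (--all-tasks) flag to pin all of them. The function names are hypothetical wrappers, and 1234 is the same placeholder PID as above.

```shell
# List the thread IDs (TIDs) of a process; each one carries its own affinity mask.
list_threads() {
    ls "/proc/$1/task"
}

# Pin every existing thread of a PID to a CPU list.
# -a (--all-tasks) applies the change to all threads, not just the main one.
pin_all_threads() {
    taskset -a -p -c "$1" "$2"
}

# Example with the placeholder PID from the article:
#   list_threads 1234
#   pin_all_threads 2 1234
list_threads "$$"   # threads of the current shell (normally just one)
```

Threads created after the change inherit the affinity of the thread that spawns them, so the -a form matters mainly for processes that are already running with a thread pool.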

Checking the Current Status

To see which cores a process is currently allowed to run on, use the following command:

taskset -cp 1234
# Output: pid 1234's current affinity list: 2
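If taskset is not available (in a minimal container, for example), the same information can be read from the kernel's /proc interface. This is a small sketch; allowed_cpus is a hypothetical helper name, and it assumes /proc is mounted and exposes the Cpus_allowed_list field.

```shell
# Print the CPU list a process is allowed to run on, as reported by the kernel.
# This is the same list that "taskset -cp PID" shows.
allowed_cpus() {
    sed -n 's/^Cpus_allowed_list:[[:space:]]*//p' "/proc/$1/status"
}

allowed_cpus "$$"    # affinity of the current shell, e.g. 0-7 on an 8-CPU machine
```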

Optimizing Multi-Socket Systems with numactl

On servers with 2 or 4 physical CPUs, the concept of NUMA (Non-Uniform Memory Access) is vital. Each CPU manages its own region of RAM (local memory). If CPU 0 has to fetch data from CPU 1’s RAM region, the request must cross the inter-socket link, and memory performance can drop by roughly 30-50%, depending on the hardware.

numactl allows you to bind a process to both CPU cores and the corresponding RAM region for maximum speed.

Viewing the NUMA Architecture

Check how many nodes your server has before configuring:

numactl --hardware

This command lists the cores and memory (total and free) of each node, plus a table of relative distances between nodes.
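If numactl is not installed yet, the kernel exposes the same core-to-node mapping through sysfs. The sketch below is one way to read it; list_numa_nodes is a hypothetical helper name, and on machines or containers that expose no NUMA information it simply prints nothing.

```shell
# Print each NUMA node together with the CPUs that belong to it.
list_numa_nodes() {
    for node in /sys/devices/system/node/node[0-9]*; do
        [ -d "$node" ] || continue    # glob did not match: no NUMA info exposed
        printf '%s: cpus %s\n' "${node##*/}" "$(cat "$node/cpulist")"
    done
}

list_numa_nodes
```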

Running Databases Optimized for NUMA

When running MySQL or MongoDB on large servers, force them to use cores and memory on the same node to avoid remote-memory latency:

numactl --cpunodebind=0 --membind=0 /usr/bin/mongod --config /etc/mongod.conf

Where:

--cpunodebind=0: Only run on cores belonging to Node 0.
--membind=0: Only allocate RAM from Node 0.

If you’re worried about Node 0 running out of RAM and causing a crash, replace --membind=0 with --preferred=0. The kernel will allocate from Node 0 first but still allow “borrowing” RAM from other nodes when necessary.
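To make the difference between the two memory policies concrete, here is a dry-run sketch that only prints the numactl command line for each mode instead of launching anything. The helper numa_launch_cmd and its "strict"/"preferred" mode names are invented for this example; the numactl flags themselves are the ones discussed above.

```shell
# Print (not run) a numactl command line for the chosen memory policy.
# Usage: numa_launch_cmd NODE strict|preferred COMMAND [ARGS...]
numa_launch_cmd() {
    node=$1
    mode=$2
    shift 2
    case $mode in
        strict)    mem="--membind=$node" ;;     # allocate only from this node
        preferred) mem="--preferred=$node" ;;   # try this node first, then fall back
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "numactl --cpunodebind=$node $mem $*"
}

numa_launch_cmd 0 strict    /usr/bin/mongod --config /etc/mongod.conf
numa_launch_cmd 0 preferred /usr/bin/mongod --config /etc/mongod.conf
```

Once the real process is running, numastat -p with its PID (numastat ships in the numactl package) shows how much of its memory actually sits on each node, which is a quick way to check that the policy took effect.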

How to Monitor Processes in Practice

Don’t just run the command and forget it. You need to verify if the process is actually staying on the specified core.

Using htop

Open htop, press F2 (Setup), and select Columns. Find PROCESSOR and press F5 to add it to the monitoring table. You will immediately see the core ID that each process is occupying.

Using the ps Command

Quickly check using the ps command with the psr column:

ps -o pid,psr,comm -p 1234

Real-time Monitoring

To observe process movement every second, use watch:

watch -n 1 "ps -o pid,psr,comm -p 1234"
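Watching the screen works for a quick look, but a tally is easier to reason about. The sketch below samples a process several times and counts how often it was seen on each core. Both function names are hypothetical; the core number comes from field 39 of /proc/PID/stat, which is the value ps reports in its psr column. The fractional sleep assumes GNU coreutils or BusyBox.

```shell
# Print the core a PID last ran on (field 39 of /proc/PID/stat).
last_cpu() {
    stat=$(cat "/proc/$1/stat") || return 1
    # Strip "pid (comm) " first; comm may itself contain spaces or parentheses.
    set -- ${stat##*') '}
    echo "${37}"    # field 39 overall is the 37th field after pid and comm
}

# Sample a PID N times and count how many samples landed on each core.
sample_cores() {
    pid=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        last_cpu "$pid"
        sleep 0.2
        i=$(( i + 1 ))
    done | sort -n | uniq -c
}

sample_cores "$$" 5    # each output line: <sample count> <core id>
```

A correctly pinned process should show a single line (or only the cores in its affinity list). Several different cores in the output suggest the pinning did not apply to the task you sampled.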

Hard-Earned Lessons from the Field

After optimizing many systems, I’ve gathered three important takeaways:

Avoid CPU 0: On many systems this core handles a large share of hardware interrupts (network cards, disks) and kernel housekeeping. Forcing heavy applications onto core 0 can easily cause system bottlenecks.
Combine with isolcpus: To give an application near-exclusive use of a core, add the isolcpus parameter to the kernel command line in the GRUB configuration. This keeps the scheduler from placing ordinary processes on that core; only tasks you pin there explicitly (plus some per-CPU kernel threads) will run on it.
Understand Hyper-threading: Logical CPUs 0 and 1 might share a single physical core, and the numbering varies between machines. If you want a real speed boost, select logical CPUs that sit on different physical cores.
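Two of these points can be checked from sysfs before you pick cores. The sketch below is a minimal example; the helper names are made up, and the isolcpus=2,3 value in the comment is only an assumed choice of cores. On systems that do not expose these files the helpers print nothing.

```shell
# Read a sysfs file if it exists and is readable.
read_sysfs() {
    [ -r "$1" ] && cat "$1"
}

# Logical CPUs that share a physical core with the given CPU (Hyper-Threading siblings).
siblings_of() {
    read_sysfs "/sys/devices/system/cpu/cpu$1/topology/thread_siblings_list"
}

# CPUs currently removed from general scheduling via isolcpus (blank if none).
isolated_cpus() {
    read_sysfs /sys/devices/system/cpu/isolated
}

echo "cpu0 shares a physical core with: $(siblings_of 0)"
echo "isolated cpus: $(isolated_cpus)"

# To isolate cores 2 and 3 (an assumed choice), /etc/default/grub would contain
# something like:
#   GRUB_CMDLINE_LINUX="isolcpus=2,3"
# followed by regenerating the GRUB config (update-grub on Debian/Ubuntu) and a reboot.
```

If siblings_of 0 prints something like 0,4 or 0-1, those two logical CPUs are the same physical core, so pinning two busy workers to them gives far less than twice the throughput.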

Used carefully, CPU affinity makes your applications' latency noticeably more stable and predictable. Try applying it to Redis, Nginx, or data processing workers, and measure before and after to see the difference on your own hardware.