Mastering User Namespaces on Linux: Preventing Privilege Escalation Risks

Linux tutorial - IT technology blog

The Problem: The Cost of Root Privilege Abuse

When I first started as a sysadmin, I had a very bad habit: whenever a service hit a permission error, I would immediately reach for sudo or run it directly as the root user. At the time, I didn’t fully understand how to configure sudo and the sudoers file for safe permission management. That lasted until the day one of my Docker containers was hit by a remote code execution vulnerability. Because I was running Docker with its default root privileges, the attacker escaped the container and took control of the physical server.

This is a textbook violation of the Principle of Least Privilege. When you run Nginx, a database, or a container as root, you are leaving the system’s doors wide open. A single exploitable bug in the code can hand an attacker the highest operating system privileges, enough to wipe data or install crypto-mining malware.

Why Are Traditional Root Privileges Dangerous?

By default, the Linux kernel does not distinguish between “root inside the container” and “root on the host”. When a process has a User ID (UID) of 0, the kernel treats it as having full authority over hardware and sensitive syscalls. One way to mitigate this is to use Linux capabilities, which grant applications fine-grained permissions without full root.

Isolation techniques like chroot only change the process’s view of the file system, not the actual privileges of the user. If the process is still UID 0, it has a path to “Container Escape” attacks. For example, CVE-2019-5736 allowed an attacker with root inside a container to overwrite the host’s runc binary.

The Solution: How Do User Namespaces (UserNS) Work?

User Namespaces allow us to map a range of UIDs from the external environment to a different range of UIDs inside the namespace.

Imagine you are the “king” (UID 0) in your small kingdom. However, when you step out into the real world, you are just an ordinary citizen (UID 100000). If a coup occurs in your kingdom, the rebels only have power within that small scope. They cannot threaten the main system because they have no privileges outside.
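To make the mapping concrete, here is a minimal sketch of the translation the kernel performs. It assumes the three-column format of /proc/<pid>/uid_map (first ID inside the namespace, first ID outside, length of the range). The function name inside_to_host_uid and the sample map are my own illustration, not part of any tool.

```shell
# Translate a UID seen inside a user namespace to the UID the host sees.
# Each uid_map line has three fields: <first inside ID> <first outside ID> <length>.
# The map file defaults to the current process's own map.
inside_to_host_uid() {
    inside_uid=$1
    map_file=${2:-/proc/self/uid_map}
    awk -v uid="$inside_uid" '
        uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
        END { if (!found) exit 1 }   # unmapped IDs produce no output
    ' "$map_file"
}

# Hypothetical map: namespace UIDs 0..65535 are host UIDs 100000..165535.
sample_map=$(mktemp)
printf '0 100000 65536\n' > "$sample_map"
inside_to_host_uid 0 "$sample_map"     # the "king": prints 100000
inside_to_host_uid 33 "$sample_map"    # prints 100033
rm -f "$sample_map"
```

Any UID outside the mapped range has no host counterpart, so the function prints nothing and returns a non-zero status.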

Checking System Readiness

Distributions like Ubuntu 22.04, CentOS 8+, or AlmaLinux usually have this feature enabled by default. Try typing the following command:

unshare --user --map-root-user whoami

If the screen displays root without you needing to use sudo, your system is ready for UserNS.
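If the command fails instead, the sketch below may help narrow down why. It reads two kernel settings: user.max_user_namespaces, and kernel.unprivileged_userns_clone, which as far as I know exists only on Debian-patched kernels. The function name userns_status is hypothetical, and an "enabled" result is only a hint, since other mechanisms such as AppArmor or seccomp profiles can still block namespace creation.

```shell
# Report whether the kernel settings allow unprivileged user namespaces.
# proc_sys defaults to /proc/sys; the parameter exists only so the function
# can be pointed at a different directory for testing.
userns_status() {
    proc_sys=${1:-/proc/sys}
    max_file=$proc_sys/user/max_user_namespaces
    clone_file=$proc_sys/kernel/unprivileged_userns_clone

    if [ ! -r "$max_file" ]; then
        echo "unsupported: kernel exposes no user namespace limit"
        return 1
    fi
    if [ "$(cat "$max_file")" -eq 0 ]; then
        echo "disabled: user.max_user_namespaces is 0"
        return 1
    fi
    # This knob is distribution-specific; its absence is not an error.
    if [ -r "$clone_file" ] && [ "$(cat "$clone_file")" -eq 0 ]; then
        echo "disabled: kernel.unprivileged_userns_clone is 0"
        return 1
    fi
    echo "enabled"
}

userns_status || true
```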

Manual User Namespaces Configuration

To understand the core mechanics, we will manually set up an isolated environment without using automated tools.

1. Setting Up Subordinate UIDs

Linux manages mapping through two configuration files: /etc/subuid and /etc/subgid. Each line has the form user:start:count and grants that user a range of subordinate IDs.

# Grant ID range to user 'vinh'
sudo usermod --add-subuids 100000-165535 vinh
sudo usermod --add-subgids 100000-165535 vinh

In this example, the user vinh is allowed to manage 65,536 external IDs, starting from ID 100,000. This is a standard number to ensure no overlap with other system users.
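To double-check what was granted, a small helper like the one below can print each range with its first ID, last ID, and size. It is a sketch that assumes the name:first_id:count line format of /etc/subuid and /etc/subgid. The function name subid_ranges and the sample file are illustrative only.

```shell
# Print "<first ID> <last ID> <count>" for every range granted to a user.
# Lines in /etc/subuid and /etc/subgid have the form name:first_id:count.
subid_ranges() {
    user=$1
    file=${2:-/etc/subuid}
    awk -F: -v user="$user" '$1 == user { print $2, $2 + $3 - 1, $3 }' "$file"
}

# Sample file mirroring the usermod command above.
sample=$(mktemp)
printf 'vinh:100000:65536\n' > "$sample"
subid_ranges vinh "$sample"    # prints: 100000 165535 65536
rm -f "$sample"
```

On a real system, run subid_ranges vinh for the UID ranges and subid_ranges vinh /etc/subgid for the GID ranges.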

2. Initializing an Isolated Space with unshare

Now, let’s create a new shell where you have virtual root privileges:

unshare --user --map-root-user --mount bash

Check your identity inside the shell using the id command:

uid=0(root) gid=0(root) groups=0(root)

Don’t let the number 0 fool you. Try creating a file in /tmp, then check it from another terminal outside the namespace. You will see that the file is actually owned by your regular user (for example UID 1000), definitely not the system’s root.

Applications: Running Rootless Docker and Podman

In practice, I always encourage everyone to switch to Rootless Mode for container services to maximize security on production Linux servers.

Podman: The Optimal Choice for UserNS

Podman supports UserNS more smoothly than Docker thanks to its daemonless architecture. When you run an Nginx container using Podman under a regular user:

podman run -d --name web-secure -p 8080:80 nginx

The Nginx process needs root privileges to bind to port 80 inside the container. Thanks to UserNS, it has those privileges there, but on the host machine it is just a regular user process. If an attacker compromises the container, they cannot read the host’s /etc/shadow file or interfere with the kernel. (If you are just starting out, you might want to learn how to install and configure Nginx on Ubuntu first.)

Practical Experience and Considerations

1. Handling Volume Permission Errors

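The most common error when using UserNS is incorrect volume mount permissions. Files on the host belong to your own UID (say 1000), which appears as UID 0 inside the container. A containerized process running as any other user maps to a subordinate UID on the host, so it is denied read/write access to those files.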

Tip: With Podman, add the :U suffix to the mount parameter. It will automatically perform a chown for the ID range in your namespace:

podman run -v ./data:/data:U nginx

2. Fixing newuidmap Errors

If you encounter the message newuidmap: write to uid_map failed, check the /etc/subuid file immediately. This error usually occurs because no ID range has been declared for your user, or because the ranges assigned to different users overlap.
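Spotting overlaps by eye gets tedious once several users have ranges, so here is a rough sketch that scans a subuid-style file for them. It assumes the name:first_id:count format and treats two ranges as overlapping when each one starts before the other ends. The function name find_subid_overlaps and the sample entries are made up for illustration.

```shell
# List pairs of entries in a subuid-style file whose ID ranges overlap.
# Ranges [a, a+n) and [b, b+m) overlap when a < b+m and b < a+n.
find_subid_overlaps() {
    file=${1:-/etc/subuid}
    awk -F: '
        NF == 3 {
            start = $2 + 0; size = $3 + 0      # force numeric comparison
            for (i = 1; i <= n; i++)
                if (start < first[i] + count[i] && first[i] < start + size)
                    print name[i] " overlaps " $1
            n++
            name[n] = $1; first[n] = start; count[n] = size
        }
    ' "$file"
}

# Hypothetical file: "build" falls inside the range of "vinh".
sample=$(mktemp)
printf 'vinh:100000:65536\nanna:165536:65536\nbuild:150000:1000\n' > "$sample"
find_subid_overlaps "$sample"    # prints: vinh overlaps build
rm -f "$sample"
```

Empty output means no two entries share an ID. The same check works for /etc/subgid by passing that path as the argument.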

3. Don’t Forget Resource Limits

UserNS only handles privilege isolation. A compromised process can still consume all of the server’s RAM. Monitor system resources with tools such as htop and iotop, and combine UserNS with cgroups to cap what each container can use. One handles blocking unauthorized access, while the other prevents resource exhaustion.
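With Podman, one way to apply such limits is through the --memory, --cpus, and --pids-limit flags of podman run. The wrapper below is only a sketch: the name run_limited and the specific values (256m, 0.5, 100) are arbitrary choices of mine, and in rootless mode these limits generally require cgroups v2 with the relevant controllers delegated to your user.

```shell
# Start a detached container with hard resource caps.
# run_limited is a hypothetical helper; tune the values to your workload.
run_limited() {
    podman run -d \
        --memory 256m \
        --cpus 0.5 \
        --pids-limit 100 \
        "$@"
}

# Usage (not executed here):
#   run_limited --name web-limited -p 8081:80 nginx
```

Everything after the function name is passed straight to podman run, so the usual options such as --name, -p, and -v still work.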

Conclusion

Getting used to User Namespaces can be a bit confusing at first due to the ID mapping concept. However, it is an extremely solid layer of armor for your server. Start by migrating your containers to Rootless mode. You will feel much more at peace knowing that even if an application has a bug, the main system remains secure.
