Mastering lsof on Linux: Pro Tips for Finding Processes Occupying Ports and Fixing ‘Too Many Open Files’ Errors

Linux tutorial - IT technology blog

From the “Everything is a file” philosophy to practical issues on servers

Linux enthusiasts are no strangers to the mantra: “Everything is a file.” From text files and directories to network cards or sockets – everything is managed by the kernel via File Descriptors (FD).

However, life isn’t always a dream. You’ll soon encounter scenarios where the server reports a Too many open files error, or disk space remains full even after deleting files. Back when I was managing a system for an e-commerce platform on CentOS 7, I got hit by a Java app leaking connections. It exhausted all 1024 default FDs in just 15 minutes, paralyzing the entire system. At that moment, lsof (LiSt Open Files) was the lifesaver that helped me pin down the exact PID causing the chaos.

Comparing lsof with other system monitoring tools

Many newcomers often only use netstat or ss when inspecting ports. However, each tool has its own strengths that you should clearly distinguish.

lsof vs netstat/ss

  • netstat/ss: Specialized in network statistics: they show which ports are open, but mapping a socket back to its process (PID) requires the -p flag and often root, and the output stops at the socket level.
  • lsof: Inspects sockets from a file descriptor perspective. Beyond the port, it lists every other file the process holds open, including the .so libraries mapped into memory.

lsof vs fuser

  • fuser: Quickly identifies (and, with -k, kills) the processes accessing a specific file or mount point.
  • lsof: Provides a more detailed view of each open file, with columns such as User, FD, Type, Device, and Size/Off.

The upside of lsof is its extremely powerful filtering by user, port, or mount point. The downside? It’s quite resource-intensive if the system has millions of files open simultaneously because it has to scan the kernel’s /proc directory.

The “Too Many Open Files” error: Don’t underestimate default configurations

This error occurs when a process tries to open more files than the allowed limit. You need to distinguish between two types of limits:

  1. Soft Limit: The limit actually enforced on the process. An unprivileged user can lower it, or raise it back up to (but never beyond) the hard limit.
  2. Hard Limit: The ceiling for the soft limit, which only root can raise.

The cause is usually buggy application code (DB connections never being closed) or default Linux limits set too low. The default of 1024 open files per process is far too meager for modern web servers running Nginx or Redis.
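You can see both limits, and the soft/hard distinction in action, straight from your shell. A minimal sketch (the subshell keeps the change from affecting your session):

```shell
# Soft (-S) vs hard (-H) limit on open files for the current shell
echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"

# A non-root user may lower the soft limit freely (done here in a
# subshell so the change does not stick), but can never exceed the hard limit
( ulimit -Sn 512; echo "soft limit in subshell: $(ulimit -Sn)" )
```

Trying `ulimit -Hn` with a value above the current hard limit in the same way will fail unless you are root.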

5 Pro tips for using lsof in real-world troubleshooting

Here are the “pocket” commands I use most frequently in production environments.

1. Tracking down processes occupying ports instantly

When starting Docker and hitting an Address already in use error, use this combo immediately:

# Inspect process occupying port 80
sudo lsof -i :80

# Find TCP connections in LISTEN state
sudo lsof -nP -iTCP -sTCP:LISTEN

Here, -n skips DNS lookups and -P skips port-name resolution (showing :80 instead of :http), so the command returns almost instantly.

2. Checking files opened by a specific User

Useful when you suspect an account is running scripts stealthily:

sudo lsof -u dev_user

3. Finding “Ghost Files” – Deleted but space not recovered

A classic scenario: You delete a 50GB log file using the rm command, but df -h still shows the disk is 100% full. The reason is that a process is still holding onto that file.

# Find open files whose link count is 0 (deleted but still held by a process)
sudo lsof +L1

Simply restart the service or kill that PID, and the space will be immediately reclaimed.
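You can reproduce this "ghost file" state safely, without root, using a shell file descriptor. A small demonstration, relying only on Linux's /proc:

```shell
# Open a file on FD 3, then delete it: the directory entry is gone,
# but the kernel keeps the inode alive while the FD is open
tmp=$(mktemp)
exec 3> "$tmp"
rm "$tmp"

# /proc shows the FD target marked "(deleted)" --
# a link count of 0 is exactly what `lsof +L1` keys on
readlink /proc/$$/fd/3

# Closing the FD is what actually frees the space
exec 3>&-
```

This is why restarting the service works: it closes every FD the old process held.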

4. Inspecting files in a directory currently in use

When you umount a disk and get a device is busy warning:

sudo lsof +D /data/storage

5. Counting FDs per process

To find the culprit causing system errors, I use this aggregated command:

sudo lsof | awk '{print $1, $2}' | sort | uniq -c | sort -rn | head -n 10

The result will list the Top 10 processes consuming the most files.
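Keep in mind that lsof counts every open file, including memory-mapped libraries, so its totals overstate true FD usage. For a stricter per-process count you can read /proc directly; a sketch:

```shell
# Rank processes by the number of entries in /proc/<pid>/fd
# (run as root to see every process; unreadable ones are skipped)
for d in /proc/[0-9]*/fd; do
  pid=${d#/proc/}; pid=${pid%/fd}
  n=$(ls "$d" 2>/dev/null | wc -l)
  [ "$n" -gt 0 ] && echo "$n $pid"
done | sort -rn | head -n 10
```

Each entry in /proc/PID/fd is a real file descriptor, so this count is the one that gets compared against the nofile limit.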

Thoroughly resolving the ‘Too Many Open Files’ error

Once the “disease” is diagnosed, increase the system limits. Note: don’t rely on ulimit alone, as it only affects the current shell session and is lost when you close the terminal.

Step 1: Verify current status

# Check the current user's limits
ulimit -n

# Inspect the limits of a specific PID
cat /proc/[PID]/limits | grep "Max open files"
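The /proc view is the reliable one for a running daemon, whose limit may differ from your shell’s. As a quick sanity check, for your own shell the two views agree, since child processes inherit their parent’s limits:

```shell
# ulimit reports the shell's soft limit...
ulimit -Sn

# ...and /proc/self (here, the grep process, which inherited
# the shell's limits) reports the same value in column 4
grep "Max open files" /proc/self/limits
```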

Step 2: Permanent configuration

Edit the /etc/security/limits.conf file and add the following lines:

*    soft    nofile    65535
*    hard    nofile    65535
root soft    nofile    65535
root hard    nofile    65535

Step 3: Notes for Systemd

On Ubuntu 22.04 or AlmaLinux, services started by Systemd do not go through a PAM login session, so they ignore limits.conf. You must set the limit in the service file itself:

[Service]
... 
LimitNOFILE=65535
...

Then, remember to run systemctl daemon-reload and restart the service for the changes to take effect.
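Rather than editing the packaged unit file (which an upgrade can overwrite), a drop-in override is safer. A sketch, assuming a service named myapp — substitute your own service name:

```shell
# Create a drop-in override instead of editing the vendor unit
# ("myapp" is a placeholder -- use your actual service name)
sudo mkdir -p /etc/systemd/system/myapp.service.d
sudo tee /etc/systemd/system/myapp.service.d/limits.conf <<'EOF'
[Service]
LimitNOFILE=65535
EOF
sudo systemctl daemon-reload
sudo systemctl restart myapp
```

`sudo systemctl edit myapp` achieves the same thing interactively, creating the drop-in directory for you.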

Conclusion

When managing systems, don’t fear errors; fear not knowing where the error is. lsof isn’t just a command for viewing files; it’s a tool for you to understand how the kernel handles data streams. In the process of optimizing servers, I’ve learned a lesson: always monitor Open Files metrics alongside CPU/RAM. Don’t wait until customers complain to look for the lsof command.
