How to Use sosreport on CentOS Stream 9: Collect and Analyze System Information for Professional Troubleshooting

CentOS tutorial - IT technology blog

When Your Server Crashes, What Do You Do First?

Server goes down at 2 AM. You SSH in and start running commands one by one: journalctl, dmesg, netstat, df -h, top… each in its own terminal window, copy-pasting output into a text file to send to a colleague or the support team. I’ve done this countless times, and something always gets missed at exactly the wrong moment.

Back when CentOS 8 reached EOL, I had to urgently migrate 5 servers to Rocky Linux within a week. One server had a networking issue after reboot — I spent nearly 2 hours manually running commands to collect logs before finding the root cause: a NetworkManager profile had been corrupted during the migration. That incident is what finally pushed me to properly learn sosreport, and I realized just how much time I had been wasting.

Three Ways to Collect System Information — and What Really Sets Them Apart

Option 1: Running Commands Manually One by One

Everyone starts here — writing your own scripts or running commands one by one as needed.

# Manual collection — time-consuming, easy to miss things
journalctl -xe > /tmp/journal.log
dmesg > /tmp/dmesg.log
ip addr show > /tmp/network.log
df -h > /tmp/disk.log
cat /etc/os-release > /tmp/os.log
ss -tlnp > /tmp/ports.log
# ... dozens more commands depending on the issue

Pros: No additional installation required, flexible in choosing exactly what you need, lightweight output.

Cons: Easy to miss things you haven’t thought of, time-consuming when the system is already struggling, non-standardized output makes it hard to share with vendors or technical teams.
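
If you do stay with manual collection, a small wrapper at least makes the output predictable. The sketch below is hypothetical: the function name collect_cmd and the one-file-per-command layout are my own illustrative assumptions, not part of any tool. It runs one command and saves stdout and stderr to a file named after that command.

```bash
#!/bin/bash
# Hypothetical helper: run one command and store its output in a directory,
# in a file named after the command line (spaces and slashes become "_").
collect_cmd() {
  local out_dir=$1; shift
  local name
  name=$(printf '%s' "$*" | tr ' /' '__')
  mkdir -p "$out_dir"
  # Keep stderr too: a failing command is itself useful evidence.
  "$@" > "$out_dir/$name" 2>&1
}

# Example usage (the output directory is an assumption):
#   out=/tmp/manual-$(date +%Y%m%d-%H%M)
#   collect_cmd "$out" ip addr show
#   collect_cmd "$out" df -h
#   collect_cmd "$out" ss -tlnp
```

This keeps the flexibility of manual commands, but it still only captures what you remembered to ask for.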

Option 2: sosreport — Red Hat’s Official Tool

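sosreport wraps the entire process into a single command, following Red Hat’s official standards. In sos 4.x, the version CentOS Stream 9 provides, the command is spelled sos report; sosreport is the older name for the same tool: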

sudo sos report

Pros: Collects 200+ types of information via plugins, standardized output as a tarball, accepted directly by Red Hat support, with dedicated plugins for Apache, Docker, KVM, Kubernetes…

Cons: Requires root privileges, takes 3–10 minutes to run, produces large output files (typically 50–200 MB), and is not suitable for continuous monitoring.

Option 3: Monitoring Stacks (Prometheus, Datadog…)

These aren’t incident debugging tools — they’re proactive alerting layers to catch problems before everything falls apart.

Pros: Visual dashboards, automatic alerts, historical data retention over time.

Cons: Cannot collect detailed logs or system configurations, requires prior setup, not a tool for sharing when debugging a specific incident.

Comparing the Options and Choosing the Right Approach

The right approach depends on the situation. Here’s the breakdown I keep coming back to:

  • Debugging an unexpected incident to send to Red Hat support or a vendor: sosreport — this is exactly the scenario it was designed for.
  • Only need 1–2 quick metrics: Manual commands — faster, no need to wait for sos to run.
  • Monitoring performance in real time: A monitoring stack (Prometheus + Grafana).
  • Auditing a system before major changes (migration, kernel update): sosreport — create a snapshot for before/after comparison.
  • Troubleshooting after a kernel or critical package update: sosreport — you need full context to identify regressions.

The real strength of sosreport: it collects things you didn’t know you needed. When debugging the networking issue I mentioned earlier, what revealed the root cause wasn’t ip addr or journalctl — it was a config file inside /etc/NetworkManager/system-connections/ that sosreport had copied wholesale into the tarball.

Deploying sosreport on CentOS Stream 9

Step 1: Installation

CentOS Stream 9 ships sos in its standard repositories, so no extra repository setup is needed:

sudo dnf install sos -y
sos --version
# sos version 4.x.x

Step 2: Running a Basic sosreport

sudo sos report

Before collecting, the tool asks a few questions (the exact prompts depend on the sos version):

Please enter your first initial and last name [root]: hieu
Please enter the case id that you are generating this report for: CASE-2026-001

If you don’t have a case ID (self-troubleshooting), just press Enter to skip. After 3–5 minutes, the file is created at:

/var/tmp/sosreport-hostname-2026-06-20-xxxxxxx.tar.xz
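
Before sending the archive anywhere, it is worth checking that it was not truncated or corrupted. The sketch below assumes that sos left a checksum file next to the archive, named like the archive plus a .sha256 suffix, with the hex digest as its first field; confirm that on your system before relying on it. The function name verify_sos_archive is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: compare an archive against its sidecar checksum file.
# Assumption: "<archive>.sha256" exists and starts with the hex digest.
verify_sos_archive() {
  local archive=$1 expected actual
  if [ ! -r "$archive.sha256" ]; then
    echo "no checksum file for $archive" >&2
    return 2
  fi
  # First whitespace-separated field is the digest; ignore anything after it.
  read -r expected _ < "$archive.sha256" || true
  actual=$(sha256sum "$archive" | cut -d' ' -f1)
  [ "$expected" = "$actual" ]
}

# Example usage:
#   verify_sos_archive /var/tmp/sosreport-hostname-2026-06-20-xxxxxxx.tar.xz \
#     && echo "checksum OK"
```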

Step 3: Customizing What Gets Collected

You don’t always need a full report. Here are some practical options:

# List available plugins
sudo sos report --list-plugins

# Only collect networking and kernel info (much faster)
sudo sos report --only-plugins networking,networkmanager,kernel,logs

# Skip time-consuming plugins you don't need (use the names shown by --list-plugins)
sudo sos report --skip-plugins rpm,yum,docker

# Run non-interactively — useful in scripts
sudo sos report --batch --label "post-migration-$(hostname)"

# Specify output directory
sudo sos report --tmp-dir /data/sos-reports/

# Limit log file size collected (MB)
sudo sos report --log-size 20

# Only collect rotated/archived logs newer than this date (format: YYYYMMDD[HHMMSS])
sudo sos report --since 20260619000000
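
The sos man page documents --since as taking a compact YYYYMMDD[HHMMSS] timestamp, which is awkward to type under pressure. A minimal sketch of a converter, assuming GNU date (the default on CentOS Stream); the function name to_sos_since is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: turn a human-readable timestamp into YYYYMMDDHHMMSS.
# Relies on GNU date's -d option to parse the input string.
to_sos_since() {
  date -d "$1" +%Y%m%d%H%M%S
}

# Example usage:
#   sudo sos report --batch --since "$(to_sos_since '2026-06-19 00:00:00')"
```

GNU date also accepts relative expressions such as "yesterday" or "2 hours ago", which is often what you actually want during an incident.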

When debugging networking issues after a migration, I usually run this leaner command — faster than a full report while still capturing everything needed:

sudo sos report \
  --only-plugins networking,networkmanager,firewalld,kernel,logs \
  --batch \
  --label "net-debug-$(hostname)-$(date +%Y%m%d)"

Step 4: Analyzing the Output

The tarball is well-organized — extract it and everything is immediately visible:

cd /var/tmp/
# Assumes exactly one archive matches; name the file explicitly if there are several
tar -xJf sosreport-*.tar.xz
ls -la sosreport-*/
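
When several archives pile up in /var/tmp, extracting by explicit file name into a separate directory is less error-prone than relying on a glob. A minimal sketch, assuming GNU tar (which detects xz compression on its own when reading); the function name extract_report is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: extract one archive into DEST and print the path of
# the report directory it contains.
extract_report() {
  local archive=$1 dest=${2:-.} top
  mkdir -p "$dest" || return 1
  # The first path component of the first member is the report directory.
  # Note: this lists the whole archive once before extracting it.
  top=$(tar -tf "$archive" | awk -F/ 'NR==1 {print $1}')
  tar -xf "$archive" -C "$dest" || return 1
  printf '%s\n' "$dest/$top"
}

# Example usage:
#   report=$(extract_report /var/tmp/sosreport-hostname-2026-06-20-xxxxxxx.tar.xz /data/extracted)
#   ls -la "$report"
```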

Typical directory structure:

sosreport-hostname-2026-06-20-xxxxxxx/
├── etc/                    # Config files the plugins collected from /etc (not a full copy)
│   ├── systemd/
│   ├── NetworkManager/
│   └── sysconfig/
├── var/log/                # Log files
│   ├── messages
│   └── audit/audit.log
├── proc/                   # /proc snapshot at time of collection
├── sos_commands/           # Output from each command run by sos
│   ├── networking/
│   │   ├── ip_addr
│   │   ├── ip_route
│   │   └── ss_-tlnp
│   └── kernel/
│       ├── dmesg
│       └── uname
└── sos_logs/               # sos's own logs during collection

Quickly find information after extracting:

# Network state at the time of collection (file names mirror the command sos ran, so glob)
cat sosreport-*/sos_commands/networking/ip*addr*

# Find errors in the messages log
grep -i "error\|failed\|critical" sosreport-*/var/log/messages | tail -50

# View NetworkManager connection configurations
ls sosreport-*/etc/NetworkManager/system-connections/

# Find OOM killer or kernel panic events
grep -r "Out of memory\|Killed process\|kernel panic" sosreport-*/var/log/

# View kernel version and architecture
cat sosreport-*/sos_commands/kernel/uname*
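
The individual checks above can be rolled into one first-pass summary. This is only a sketch under assumptions: the function name triage_report is hypothetical, and it expects the var/log/messages and etc/NetworkManager paths shown in the directory tree to be present in the extracted report.

```bash
#!/bin/bash
# Hypothetical helper: print a quick summary of an extracted report directory.
triage_report() {
  local report=$1
  local messages="$report/var/log/messages"
  if [ -f "$messages" ]; then
    # grep -c prints the number of matching lines (0 if there are none).
    echo "error-like lines: $(grep -ciE 'error|failed|critical' "$messages")"
    echo "OOM events: $(grep -cE 'Out of memory|Killed process' "$messages")"
  else
    echo "no var/log/messages in $report"
  fi
  echo "NetworkManager profiles:"
  ls "$report/etc/NetworkManager/system-connections/" 2>/dev/null \
    || echo "  (none collected)"
}

# Example usage:
#   triage_report /var/tmp/sosreport-hostname-2026-06-20-xxxxxxx
```

The counts are only a pointer for where to read next; a high number of error-like lines is normal on a busy server.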

Step 5: Collecting from Multiple Servers at Once with sos collect

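Managing a cluster or multiple servers? sos collect handles bulk collection over SSH. It needs working SSH access (normally key-based) from the machine you run it on to every node listed: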

sudo sos collect \
  --nodes web1,web2,web3,db1 \
  --ssh-user admin \
  --only-plugins networking,logs,kernel
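
Typing a node list by hand gets error-prone beyond a handful of servers. A minimal sketch that builds the comma-separated value for --nodes from a text file with one hostname per line; the function name nodes_from_file and the file path in the usage example are hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: join hostnames from a file into "a,b,c".
# Blank lines and lines starting with "#" are skipped.
nodes_from_file() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | paste -sd, -
}

# Example usage (the file path is an assumption):
#   sudo sos collect --nodes "$(nodes_from_file /etc/sos-nodes.txt)" \
#     --ssh-user admin --only-plugins networking,logs,kernel
```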

Integrating a Script into Your Incident Response Workflow

This is a script I recommend every team set up in advance — run it immediately when a serious incident occurs:

#!/bin/bash
# /usr/local/sbin/emergency-sos.sh
# Run as root (for example via sudo): sos needs root, and the log is written under /var/log.

set -o pipefail   # report the exit status of sos, not of tee

LABEL="incident-$(date +%Y%m%d-%H%M)"
OUTPUT_DIR="/data/sos-reports"

mkdir -p "$OUTPUT_DIR"
echo "[$(date)] Collecting sosreport for: $LABEL"

sos report \
  --batch \
  --label "$LABEL" \
  --tmp-dir "$OUTPUT_DIR" \
  --log-size 50 \
  2>&1 | tee "/var/log/sos-${LABEL}.log"
status=$?

echo "[$(date)] Done (exit status $status):"
ls -lh "$OUTPUT_DIR"/sosreport-*"$LABEL"* 2>/dev/null
exit "$status"

During the CentOS 8 → Rocky Linux migration, I ran this script before and after each server to capture comparison snapshots. What saved me from hours of debugging wasn’t some complex tool — it was simply having the right information at the right time.
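
For the before/after comparison, a recursive diff of the collected configuration is usually the fastest first look. A minimal sketch, assuming both archives have already been extracted; the function name compare_reports and the directory names in the usage example are hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: list which collected config files differ between two
# extracted reports. Exit status follows diff: 0 = identical, 1 = differences.
compare_reports() {
  local before=$1 after=$2
  # -r: recurse into subdirectories, -q: only name the files that differ
  diff -rq "$before/etc" "$after/etc"
}

# Example usage (directory names are assumptions):
#   compare_reports /data/extracted/sosreport-web1-before /data/extracted/sosreport-web1-after
#   # then inspect one file in detail:
#   diff -u before/etc/NetworkManager/NetworkManager.conf after/etc/NetworkManager/NetworkManager.conf
```

Lines starting with "Only in" point at files that appeared or disappeared, which is exactly the kind of change a migration tends to introduce.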

Common Issues and How to Handle Them

sosreport Takes Too Long or Hangs

The rpm plugin is particularly slow on servers with many packages — it can take 5–10 minutes on its own. Skip it if you don’t need it:

sudo sos report --skip-plugins rpm,yum

# Run with verbose output to see which plugin is currently running
# (--batch matters here: without it sos waits for a keypress)
sudo sos report --batch -v

Output File Is Too Large

# Limit log size and use only specific plugins
sudo sos report --log-size 10 --only-plugins networking,logs,kernel
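
A large report can also fill the disk it is written to, which is the last thing a struggling server needs. A minimal pre-flight sketch; the function name has_free_mb and the 500 MB threshold in the usage example are my assumptions, not sos defaults:

```bash
#!/bin/bash
# Hypothetical helper: succeed only if DIR has at least NEED_MB megabytes free.
has_free_mb() {
  local dir=$1 need_mb=$2 avail_kb
  # df -Pk: POSIX output format with 1024-byte blocks; column 4 is "Available".
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $((need_mb * 1024)) ]
}

# Example usage:
#   has_free_mb /var/tmp 500 || echo "less than 500 MB free in /var/tmp" >&2
```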

Insufficient Permissions to Collect Some Information

# Use sudo rather than a bare su -c
sudo sos report        # Correct
su -c "sos report"     # Inherits your user's environment and may be missing variables; sudo is better
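
In a wrapper script it helps to fail fast with a clear message instead of letting sos stop partway with a permissions error. A minimal sketch; the function name require_root is hypothetical, and the optional argument exists only so the check can be exercised without actually being root:

```bash
#!/bin/bash
# Hypothetical helper: return non-zero unless running as root.
# An explicit UID may be passed as $1; by default the current UID is used.
require_root() {
  local uid=${1:-$(id -u)}
  if [ "$uid" -ne 0 ]; then
    echo "sos report needs root; re-run with sudo" >&2
    return 1
  fi
}

# Example usage at the top of a script:
#   require_root || exit 1
```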

Practical Takeaways

sosreport doesn’t replace the need to understand Linux; it helps you collect enough information to start understanding the problem. Instead of running 30 commands and worrying about what you’ve missed, a single sudo sos report gathers the relevant system state in a few minutes.

My key takeaway: run sosreport early — don’t wait until you’ve already restarted services before thinking about it. A snapshot captured while the incident is still intact is what has value — a reboot or service restart wipes out a lot of error state, and you may never find the real root cause.
