Installing and Configuring Chrony on CentOS Stream 9: Setting Up an NTP Server and Client for Accurate Time Synchronization – ITFROMZERO

2 AM, Mismatched Logs, and Alarms Going Off

It happened on a Friday night. I was asleep when I got a call from the dev team: “Hey, the transaction logs on the DB server and app server are off by nearly 5 minutes — we can’t tell when the error actually happened.”

I opened my laptop to check, and sure enough — the DB server was keeping correct time, while the app server was running 4 minutes and 37 seconds behind. Enough to cause SSL certificate handshake failures on some services, Kerberos authentication timeouts, and worst of all, distributed tracing became completely useless when timestamps weren’t in sync across nodes.

That server was running old ntpd, the NTP pool had been unreachable for weeks, and no monitoring had raised an alarm. After that, I switched the entire fleet to Chrony. We still have a few servers running CentOS 7 at work, and migrating them to AlmaLinux is something I’ve already dealt with — but whether it’s CentOS 7 or CentOS Stream 9, Chrony is the right time sync solution right now.

Why Does Server Clock Drift?

Neither the hardware clock (RTC) nor the system clock keeps perfect time. A typical oscillator gains or loses up to a few seconds per day, and without synchronization that adds up to minutes within weeks. Virtualized environments (KVM, VMware) make it worse: when a VM is paused, resumed, or live-migrated, the guest clock can jump erratically.
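To put drift rates in perspective, here is a small sketch that converts a frequency error in parts per million (ppm), the unit Chrony records in its drift file, into seconds per day. The 20 ppm and 100 ppm inputs are illustrative assumptions, not measurements from the servers in this story.

```shell
# Convert a frequency error in ppm to seconds gained or lost per day.
# 1 ppm = 1 microsecond of error per elapsed second; a day has 86400 seconds.
ppm_to_seconds_per_day() {
  awk -v ppm="$1" 'BEGIN { printf "%.3f\n", ppm * 86400 / 1000000 }'
}

ppm_to_seconds_per_day 20    # hypothetical 20 ppm error  -> 1.728
ppm_to_seconds_per_day 100   # hypothetical 100 ppm error -> 8.640
```

At 100 ppm, an unsynchronized clock would be off by more than four minutes after a month.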

Real-world problems I’ve experienced when server time drifts:

TLS/SSL handshake failures: Certificates are rejected as not yet valid or already expired because the server time is wrong
Kerberos auth timeout: Kerberos tolerates a maximum clock skew of 5 minutes by default; exceed that and all authentication fails
False replication lag alerts: Timestamp mismatches between master and slave cause monitoring to report replication delay when there’s actually none
Useless log forensics: When an incident occurs, you can’t correlate logs from multiple servers if timestamps aren’t in sync
Cron jobs running at the wrong time: The backup job runs during peak hours instead of 3 AM as configured
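As a quick worked example of the Kerberos limit, the sketch below converts the 4 minutes 37 seconds from the incident into seconds and compares it with a 300-second tolerance. The function names are made up for illustration, and 300 seconds is an assumption about the configured skew limit.

```shell
# Convert "minutes seconds" to total seconds.
to_seconds() {
  echo $(( $1 * 60 + $2 ))
}

# Succeed (exit 0) if the skew is within the tolerance, fail otherwise.
# The tolerance defaults to 300 s, i.e. the 5-minute limit mentioned above.
skew_within_tolerance() {
  skew_value=$1
  skew_limit=${2:-300}
  [ "$skew_value" -le "$skew_limit" ]
}

skew=$(to_seconds 4 37)
if skew_within_tolerance "$skew"; then
  echo "skew ${skew}s: within tolerance, margin $(( 300 - skew ))s"
else
  echo "skew ${skew}s: exceeds tolerance"
fi
```

The app server was only 23 seconds away from the point where every Kerberos login would have failed.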

Root Cause Analysis: Legacy ntpd and Why Chrony Is the Complete Replacement

CentOS Stream 9 (and RHEL 9) have dropped ntpd entirely, shipping only Chrony by default. Yet many admins still try to install ntpd from third-party repos out of habit — that’s a mistake worth avoiding.

Chrony outperforms ntpd in two key areas:

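Faster synchronization after boot: With iburst and makestep, Chrony takes several measurements within seconds of starting and steps the clock immediately if the offset is large, instead of slewing gradually. ntpd by default steps only when the offset exceeds 128 ms, and it exits without correcting anything if the offset exceeds 1000 seconds unless it was started with -g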
Works well in unstable environments: Laptops powering on/off, VM suspend/resume, intermittent network connections — Chrony handles all of these much better by tracking drift in real time

Check which services are currently running before doing anything:

systemctl status chronyd ntpd 2>/dev/null
timedatectl status

If both are enabled, you have a conflict — disable ntpd first.
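The same check can be scripted. This is a minimal sketch: the function name and its verdict strings are hypothetical, and it expects the strings that `systemctl is-enabled` prints for each unit.

```shell
# Decide what to do from the enablement state of each daemon.
# $1 = state of chronyd, $2 = state of ntpd (as printed by `systemctl is-enabled`).
time_daemon_verdict() {
  chronyd_state=$1
  ntpd_state=$2
  if [ "$chronyd_state" = "enabled" ] && [ "$ntpd_state" = "enabled" ]; then
    echo "conflict"        # both would fight over the clock: disable ntpd
  elif [ "$chronyd_state" = "enabled" ]; then
    echo "chronyd-only"
  elif [ "$ntpd_state" = "enabled" ]; then
    echo "ntpd-only"
  else
    echo "none"            # nothing is keeping time on this host
  fi
}

time_daemon_verdict enabled enabled   # prints "conflict"

# On a live host:
#   time_daemon_verdict "$(systemctl is-enabled chronyd 2>/dev/null)" \
#                       "$(systemctl is-enabled ntpd 2>/dev/null)"
```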

Solutions: Installing and Configuring Chrony

Step 1: Install Chrony (usually pre-installed on CentOS Stream 9)

dnf install -y chrony
systemctl enable --now chronyd
systemctl status chronyd

If the server is running ntpd, stop and remove it first:

systemctl stop ntpd
systemctl disable ntpd
systemctl mask ntpd  # Prevent other services from re-enabling it
dnf remove -y ntp

Step 2: Configure a Basic NTP Client

The main configuration file is at /etc/chrony.conf. By default, CentOS Stream 9 uses the 2.centos.pool.ntp.org pool. If your server is located in Vietnam or Japan, switch to the Asia pool to reduce latency:

cp -a /etc/chrony.conf /etc/chrony.conf.bak  # Keep the distro default for reference
cat > /etc/chrony.conf << 'EOF'
# Use the nearest NTP pools to reduce latency
pool 0.asia.pool.ntp.org iburst
pool 1.asia.pool.ntp.org iburst
pool 2.asia.pool.ntp.org iburst
pool 3.asia.pool.ntp.org iburst

# Save drift file to track clock offset over time
driftfile /var/lib/chrony/drift

# Step the clock if the offset exceeds 1 second, but only during the first 3 updates after startup
makestep 1.0 3

# Periodically sync the hardware clock (RTC)
rtcsync

logdir /var/log/chrony
EOF

systemctl restart chronyd
chronyc sources -v

The output of chronyc sources looks like this when working normally:

MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 103.145.x.x                   2   6   377    51   +123us[ +456us] +/- 5ms
^+ time.cloudflare.com            3   6   377    52   -234us[ -789us] +/- 8ms

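The ^* marker shows the source chronyd is currently synchronized to; ^+ marks other acceptable sources that are combined with it and can take over if it fails. If every line shows ^?, no source has answered yet: check that outbound UDP port 123 is allowed and that DNS can resolve the pool names.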
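For scripted checks, the output can be parsed directly. The sketch below counts server sources by state and decodes the Reach column, which is an octal bitmask of the last 8 polls (377 means all 8 succeeded). The function names are hypothetical, and the sample text is the output shown above.

```shell
# Count server sources ("^" lines) in a given state from `chronyc sources` output on stdin.
# $1 is the state character: * (selected), + (combined), ? (unusable).
count_sources_in_state() {
  awk -v s="$1" 'substr($0, 1, 1) == "^" && substr($0, 2, 1) == s { n++ } END { print n + 0 }'
}

# Decode the octal Reach value into the number of successful polls out of the last 8.
reach_successes() {
  value=$(( 0$1 ))   # the leading 0 makes the shell read the digits as octal
  count=0
  while [ "$value" -gt 0 ]; do
    count=$(( count + (value & 1) ))
    value=$(( value >> 1 ))
  done
  echo "$count"
}

sample_sources() {
cat <<'EOF'
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 103.145.x.x                   2   6   377    51   +123us[ +456us] +/- 5ms
^+ time.cloudflare.com            3   6   377    52   -234us[ -789us] +/- 8ms
EOF
}

sample_sources | count_sources_in_state '*'   # prints 1
reach_successes 377                           # prints 8

# On a live host: chronyc sources | count_sources_in_state '*'
```

A result of 0 selected sources is a reasonable condition to alert on.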

Step 3: Set Up an Internal NTP Server (Critical for Enterprise Environments)

Instead of letting 50 servers each reach out to the internet to sync NTP, you should set up 1-2 internal NTP servers and have the rest sync to them. The benefits are clear: reduced external traffic, the entire infrastructure shares a single time source, and log correlation is accurate when incidents occur.

On the NTP server machine (for example, IP 192.168.1.10), add allow directives to the config:

cat > /etc/chrony.conf << 'EOF'
# Sync from external pools
pool 0.asia.pool.ntp.org iburst
pool 1.asia.pool.ntp.org iburst
pool 2.asia.pool.ntp.org iburst

driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony

# Allow internal subnets to query NTP from this server
allow 192.168.1.0/24
allow 10.0.0.0/8

# If internet is lost, still serve time to clients based on the local clock
# stratum 10 = "not very confident but still better than nothing"
local stratum 10
EOF

systemctl restart chronyd

# Open NTP firewall port
firewall-cmd --permanent --add-service=ntp
firewall-cmd --reload

# Check which clients are syncing to this server
chronyc clients
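To see which client addresses an allow directive would cover, here is a small IPv4-only sketch that tests whether an address falls inside a CIDR block. The function names are hypothetical and this is not how chronyd itself implements access checks; it only illustrates the subnet arithmetic.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit 0) if address $1 is inside CIDR block $2, e.g. 192.168.1.0/24.
ip_in_cidr() {
  addr=$1
  net=${2%/*}
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

ip_in_cidr 192.168.1.25 192.168.1.0/24 && echo "192.168.1.25 is covered by allow 192.168.1.0/24"
ip_in_cidr 172.16.0.1 10.0.0.0/8 || echo "172.16.0.1 is not covered by allow 10.0.0.0/8"
```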

On all other servers (NTP clients), update the config to point to the internal NTP server:

cat > /etc/chrony.conf << 'EOF'
# Point to the internal NTP server
server 192.168.1.10 iburst prefer
# Backup NTP server if available (chrony.conf only accepts comments on their own line)
server 192.168.1.11 iburst

driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
EOF

systemctl restart chronyd
sleep 10
chronyc tracking
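The fields worth watching in that output are "System time" (the current offset) and "Leap status". A minimal parsing sketch follows; the function names are hypothetical and the sample values are invented for illustration.

```shell
# Print the signed system-clock offset in seconds from `chronyc tracking` output on stdin.
# Negative means the local clock is behind NTP time ("slow"), positive means ahead ("fast").
tracking_offset() {
  awk -F': ' '/^System time/ {
    split($2, f, " ")
    sign = (f[3] == "slow") ? "-" : ""
    print sign f[1]
    exit
  }'
}

# Print the leap status field, for example "Normal" or "Not synchronised".
tracking_leap_status() {
  awk -F': ' '/^Leap status/ { print $2; exit }'
}

sample_tracking() {
cat <<'EOF'
Stratum         : 3
System time     : 0.000006523 seconds slow of NTP time
Last offset     : -0.000006747 seconds
Leap status     : Normal
EOF
}

sample_tracking | tracking_offset        # prints -0.000006523
sample_tracking | tracking_leap_status   # prints Normal

# On a live host: chronyc tracking | tracking_offset
```

A leap status of "Not synchronised" after the 10-second wait usually means the client cannot reach 192.168.1.10.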

Quick Troubleshooting for Common Issues

# Check detailed sync status
chronyc tracking

# Force immediate sync (useful after changing config)
chronyc makestep

# View logs in real time
journalctl -u chronyd -f

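“No NTP sources selected” error: the sources are unreachable, usually because UDP port 123 is blocked. For clients of the internal NTP server, run this check on the server to confirm its firewall accepts NTP; for external pools, also make sure the network firewall allows outbound UDP 123: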

firewall-cmd --list-services | grep ntp
# If not listed, open it:
firewall-cmd --permanent --add-service=ntp && firewall-cmd --reload

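SELinux blocking chronyd: common after changing the log or drift file path. restorecon resets labels on the default paths; a custom path also needs a matching semanage fcontext rule before restorecon can label it correctly:

ausearch -m avc -ts recent | grep chrony
restorecon -Rv /var/lib/chrony /var/log/chrony /etc/chrony.conf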

Best Practices: A Production-Ready Setup

After getting woken up by calls in the middle of the night more than once, I put together this checklist to make sure it never happens again:

Set up at least 2 internal NTP servers, one primary and one backup. Chrony clients fail over automatically when the primary goes down
Internal NTP servers should sync from at least 3 external sources, so Chrony can outvote and discard one that reports bad time
Enable rtcsync — sync the hardware clock so the server doesn’t start off drifted after a reboot
Monitor offset with Prometheus: alert on node_exporter's node_timex_offset_seconds metric well before the offset reaches 1 second, giving you enough time to act before it becomes an incident
Add it to your Ansible playbook — every newly deployed server gets the correct chrony config from day one, no manual setup required
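Before wiring up a real alert rule, the offset check can be tested by hand against node_exporter's text output. This is a rough sketch: the function names are hypothetical, the sample metric values are invented, and the 1-second limit is the threshold from the checklist above.

```shell
# Read Prometheus text-format metrics on stdin and print the value of one metric.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Succeed (exit 0) if the absolute offset $1 is at or below the limit $2 (both in seconds).
offset_ok() {
  awk -v o="$1" -v max="$2" 'BEGIN { o = o + 0; if (o < 0) o = -o; exit !(o <= max + 0) }'
}

sample_metrics() {
cat <<'EOF'
# TYPE node_timex_offset_seconds gauge
node_timex_offset_seconds 0.000215
node_timex_sync_status 1
EOF
}

offset=$(sample_metrics | metric_value node_timex_offset_seconds)
if offset_ok "$offset" 1; then
  echo "offset ${offset}s is within the 1s limit"
else
  echo "offset ${offset}s exceeds the 1s limit"
fi

# On a live host (assuming node_exporter listens on its default port 9100):
#   curl -s http://localhost:9100/metrics | metric_value node_timex_offset_seconds
```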

Here’s the Ansible task I’m currently using — simple but gets the job done:

- name: Install chrony
  dnf:
    name: chrony
    state: present

- name: Deploy chrony config
  template:
    src: chrony.conf.j2
    dest: /etc/chrony.conf
    owner: root
    group: root
    mode: '0644'
  # Requires a handler named "restart chronyd" to be defined in the play or role
  notify: restart chronyd

- name: Enable chronyd
  systemd:
    name: chronyd
    state: started
    enabled: yes

- name: Open NTP port (NTP server nodes only)
  firewalld:
    service: ntp
    permanent: yes
    immediate: yes  # Also apply to the running firewall, not only the saved config
    state: enabled
  when: inventory_hostname in groups['ntp_servers']

Since applying this setup to the entire fleet — including servers still waiting to be migrated to AlmaLinux — I haven’t gotten a 2 AM call about time sync again. And when a real incident does happen, logs from all servers correlate accurately, saving hours of unnecessary debugging.