Installing and Configuring NTP Server on Linux: Fixing Time Sync Issues in Production

Linux tutorial - IT technology blog

At 2 AM, your phone buzzes — a Grafana alert reporting that log timestamps across 3 servers are mismatched by up to 5 minutes. TLS certificates start throwing errors, cronjobs run at the wrong time, and database replica lag spikes because binlog timestamps are out of phase. All because time sync broke.

This is a real situation I encountered while managing a 15-server cluster. The lesson learned: NTP is not a “set it and forget it” thing — it needs proper configuration and continuous monitoring.

Why Does Time Sync Matter So Much?

Many sysadmins treat NTP as optional — “if the server runs, it’s fine.” But clock drift silently undermines your system in ways that are hard to predict:

  • Log mismatches: When tracing an incident and server A’s logs are 3 minutes ahead of server B’s, you can’t reconstruct the flow
  • TLS certificate errors: TLS validates each certificate's validity window (notBefore/notAfter) against the local clock, so a badly skewed clock makes valid certificates look expired or not yet valid
  • Unstable distributed systems: Kafka, Cassandra, and etcd all depend on reasonably accurate clocks, for example for timestamp-based write ordering in Cassandra and time-based retention in Kafka
  • Authentication failures: Kerberos tickets (5-minute default skew tolerance), JWTs, and OAuth tokens all have short validity windows
  • Cronjobs running at the wrong time: Backups kicking off during peak hours because the server clock is wrong

Early in my sysadmin career, I once spent an entire afternoon chasing down authentication failures between two microservices. I went through the logs over and over and found nothing unusual. Turned out the JWT token was being rejected because the source and destination servers were exactly 8 minutes apart. Since then, I’ve made it a habit: when a new server comes up, the first thing I check is NTP.

chrony vs ntpd — Which One Should You Use?

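chrony has been the default NTP implementation on RHEL and CentOS since version 7, and it replaced ntpd as Ubuntu's recommended NTP server in 18.04 (a stock Ubuntu install runs the lighter systemd-timesyncd client until you install chrony). The practical reasons: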

  • Syncs faster after boot — typically done in seconds, compared to ntpd which can take several minutes
  • Handles virtualized environments better — VMs tend to experience more clock drift than bare metal
  • Small footprint — process memory under 5MB, suitable even for 512MB RAM VPS instances

ntpd still works fine where it is installed, but RHEL 8 and later no longer ship it, so chrony is what you’ll encounter daily in production.

Installing chrony

On Ubuntu/Debian

# Install
sudo apt update && sudo apt install -y chrony

# Enable and start (the unit is named chrony on Debian/Ubuntu)
sudo systemctl enable --now chrony

# Check status
sudo systemctl status chrony

On CentOS/RHEL/Rocky Linux

# Install (dnf for RHEL 8+, yum for CentOS 7)
sudo dnf install -y chrony

# Enable and start
sudo systemctl enable --now chronyd

A note before you begin: some older systems run both ntpd and chronyd simultaneously — these two daemons conflict and produce unpredictable results. Clean up first:

# Check if ntpd is running (on Debian/Ubuntu the unit is called ntp)
sudo systemctl status ntpd

# If it is, stop and disable it
sudo systemctl stop ntpd
sudo systemctl disable ntpd
sudo systemctl mask ntpd  # Prevent it from starting again
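
The same check can be scripted so it covers the usual suspects in one pass. This is a minimal sketch, not a definitive tool: the function name conflicting_time_daemons is made up for illustration, and the unit names (ntpd, ntp, systemd-timesyncd) are assumptions that depend on your distribution.

```shell
#!/bin/sh
# Sketch: print every time-sync unit that would conflict with chrony
# and is currently active. Unit names are assumptions; adjust as needed.
conflicting_time_daemons() {
    for unit in ntpd ntp systemd-timesyncd; do
        # is-active --quiet returns 0 only when the unit is running
        if systemctl is-active --quiet "$unit" 2>/dev/null; then
            echo "$unit"
        fi
    done
}

# Example: any output here means something else is also steering the clock
# conflicting_time_daemons
```

If the function prints nothing, chrony has the clock to itself.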

Detailed Configuration

The main config file is: /etc/chrony.conf (Ubuntu: /etc/chrony/chrony.conf).

Basic NTP Client Configuration

# NTP servers — use pools instead of single servers for redundancy
pool 0.pool.ntp.org iburst
pool 1.pool.ntp.org iburst
pool 2.pool.ntp.org iburst
pool 3.pool.ntp.org iburst

# If running on cloud, prefer the provider's NTP server (lower latency)
# AWS:   server 169.254.169.123 prefer iburst
# GCP:   server metadata.google.internal prefer iburst

# Drift file — stores clock drift info for faster sync after reboot
driftfile /var/lib/chrony/drift

# Step the clock if the offset exceeds 1 second, but only during the first 3 clock updates after start
makestep 1.0 3

# Sync hardware clock (RTC)
rtcsync

# Log directory
logdir /var/log/chrony
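
Distribution config files tend to be mostly comments, so it helps to see only the directives that are actually in effect. The helper below is a small sketch (the name chrony_active_directives is hypothetical); it relies on chrony treating lines that start with #, %, ; or ! as comments.

```shell
#!/bin/sh
# Sketch: print only the effective directives of a chrony config file,
# skipping blank lines and comment lines (first non-space char is # % ; or !).
chrony_active_directives() {
    grep -Ev '^[[:space:]]*([#%;!]|$)' "$1"
}

# Example:
# chrony_active_directives /etc/chrony.conf
```

Running it after an edit is a quick way to confirm that, for example, an old pool line was really commented out.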

Internal NTP Server Configuration for a Full Cluster

Managing a multi-server cluster? The recommended architecture is:

  1. 1–2 servers sync with public NTP (Internet)
  2. All remaining servers sync with the internal NTP server

This reduces outbound traffic and avoids dependency on public network latency. On the NTP master server, append the following to /etc/chrony.conf:

# Allow clients in this subnet to sync time
allow 192.168.1.0/24
allow 10.0.0.0/8

# If internet connection is lost, still serve time to clients using local clock
local stratum 10

On client servers, replace the public pools with the internal NTP address:

# Comment out or remove the pool.ntp.org lines
# pool 0.pool.ntp.org iburst

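# Sync only from the internal NTP servers (primary first, then backup)
# Keep comments on their own line: text after a directive is parsed as options
server 192.168.1.10 iburst prefer
server 192.168.1.11 iburst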

makestep 1.0 3
driftfile /var/lib/chrony/drift
rtcsync
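
When the same client config has to go onto many machines, generating it from a list of addresses avoids copy-paste mistakes. A minimal sketch, assuming the first address is the primary; gen_chrony_client_conf is an illustrative name, not part of chrony.

```shell
#!/bin/sh
# Sketch: print chrony client directives for a list of internal NTP servers.
# The first address gets "prefer"; the rest act as backups.
gen_chrony_client_conf() {
    first=1
    for addr in "$@"; do
        if [ "$first" -eq 1 ]; then
            echo "server $addr iburst prefer"
            first=0
        else
            echo "server $addr iburst"
        fi
    done
    echo "makestep 1.0 3"
    echo "driftfile /var/lib/chrony/drift"
    echo "rtcsync"
}

# Example:
# gen_chrony_client_conf 192.168.1.10 192.168.1.11
```

The output can be reviewed and then written to the config file by whatever deployment tooling you already use.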

After updating the config, restart the service:

sudo systemctl restart chronyd

Firewall — The Most Commonly Forgotten Step

NTP uses UDP port 123. If you’re setting up an internal NTP server, open the firewall so clients can connect:

# Ubuntu with ufw
sudo ufw allow 123/udp

# CentOS/RHEL with firewalld
sudo firewall-cmd --permanent --add-service=ntp
sudo firewall-cmd --reload
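
An open firewall does not help if nothing is listening: chronyd only opens the NTP server port once an allow directive is in place. The sketch below checks for a socket on port 123 by reading `ss -H -lun` output from stdin; ntp_port_listening is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read `ss -H -lun` output on stdin and succeed if any socket
# is bound to port 123 (NTP).
ntp_port_listening() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /:123$/) found = 1 }
         END { exit !found }'
}

# Example (run on the internal NTP server):
# ss -H -lun | ntp_port_listening && echo "listening on UDP 123"
```

If this fails on the server, fix the chrony config first and look at the firewall second.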

Checking & Monitoring

Verifying Sync Status Right After Installation

# View the list of NTP sources and their status
chronyc sources -v

Sample output:

MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* ntp1.example.com             2   6   377    43    +12us[ +34us] +/-  456us
^+ ntp2.example.com             2   6   377    44    -8us[ +12us] +/-  234us

What the leading characters mean:

  • * = Currently selected source (best)
  • + = Good source, being combined
  • ? = Unreachable — debug immediately
  • x = Time may be faulty — this source is being ignored
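
For scripts and health checks, the state column can be read mechanically. This sketch prints the name of the source currently marked with *; selected_source is an illustrative name, and the parsing assumes the default `chronyc sources` layout shown in the sample output.

```shell
#!/bin/sh
# Sketch: read `chronyc sources` output on stdin and print the name of the
# currently selected source (state "*"). Exits non-zero if none is selected.
# The first character of column 1 is the mode (^ server, = peer, # local clock).
selected_source() {
    awk '$1 ~ /^[#=^]\*$/ { print $2; found = 1 } END { exit !found }'
}

# Example:
# chronyc sources | selected_source
```

A non-zero exit status means chrony has not settled on any source yet.
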

# View tracking info: current offset and stratum
chronyc tracking

Sample output:

Reference ID    : C0A80101 (192.168.1.1)
Stratum         : 3
System time     : 0.000012345 seconds fast of NTP time
Last offset     : +0.000008234 seconds
RMS offset      : 0.000015678 seconds
Frequency       : 12.345 ppm fast
Leap status     : Normal

The key figure to watch is System time (the offset). Under 100ms is acceptable, under 10ms is good. Running distributed databases like etcd or Cassandra? Aim tighter still, ideally low single-digit milliseconds: Cassandra, for example, resolves conflicting writes by timestamp, so skew between nodes can silently pick the wrong value.
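
The 10ms and 100ms bands above translate directly into a small classifier. This is only a sketch of that rule of thumb; classify_offset is a made-up name and the labels are arbitrary.

```shell
#!/bin/sh
# Sketch: classify an offset given in seconds using the 10ms / 100ms bands.
classify_offset() {
    awk -v o="$1" 'BEGIN {
        if (o < 0) o = -o            # only the magnitude matters
        if (o < 0.010)      print "good"
        else if (o < 0.100) print "acceptable"
        else                print "too high"
    }'
}

# Example:
# classify_offset "$(chronyc tracking | awk '/^System time/ {print $4}')"
```

awk does the comparison because plain shell arithmetic cannot handle fractional seconds.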

Offset Check and Alert Script

I often use this small script in a cron job or integrated into Nagios/Zabbix:

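#!/bin/bash
# /usr/local/bin/check-ntp-offset.sh
# Nagios-style exit codes: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN

THRESHOLD=0.5  # seconds; alert if offset exceeds 500ms

# "System time : 0.000012345 seconds fast of NTP time" -> field 4 is the offset.
# chronyc prints the magnitude and uses "fast"/"slow" for the direction.
offset=$(chronyc tracking 2>/dev/null | awk '/^System time/ {print $4}')

# Empty output means chronyd is down or chronyc failed; never report OK here
if [ -z "$offset" ]; then
    echo "UNKNOWN: could not read offset from chronyc tracking"
    exit 3
fi

# awk handles the floating-point comparison, so bc is not required
if awk -v o="$offset" -v t="$THRESHOLD" 'BEGIN { exit !(o > t) }'; then
    echo "CRITICAL: NTP offset ${offset}s exceeds threshold ${THRESHOLD}s"
    exit 2
else
    echo "OK: NTP offset is ${offset}s"
    exit 0
fi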

Make the script executable and run it once to test:

chmod +x /usr/local/bin/check-ntp-offset.sh
/usr/local/bin/check-ntp-offset.sh

Debugging When chrony Won’t Sync

If chronyc sources shows nothing but ? marks, that’s the clearest sign of a problem. Two causes account for 90% of cases: a firewall blocking UDP 123, or DNS failing to resolve. Check them in order:

# Watch logs in real time
sudo journalctl -u chronyd -f

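# Test reachability of an NTP server
# (ntpdate is deprecated and may not be installed; as an alternative,
#  chronyd -Q 'server 0.pool.ntp.org iburst' prints the offset without setting the clock)
ntpdate -q 0.pool.ntp.org 2>&1 | head -5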

# Check DNS resolution
nslookup 0.pool.ntp.org

# Force an immediate sync (use when offset is large and needs a quick fix)
sudo chronyc makestep
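
The "nothing but ? marks" condition can also be detected automatically, which is handy in a health check. A minimal sketch, assuming the default `chronyc sources` layout; all_sources_unreachable is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read `chronyc sources` output on stdin; succeed if at least one
# source is listed and every one of them is in the "?" state.
all_sources_unreachable() {
    awk '$1 ~ /^[#=^].$/ { total++; if (substr($1, 2, 1) == "?") bad++ }
         END { exit !(total > 0 && bad == total) }'
}

# Example:
# chronyc sources | all_sources_unreachable && echo "check firewall and DNS"
```

It deliberately fails when no sources are listed at all, since that points to a config problem rather than a network one.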

NTP Setup Checklist for a New Server

  1. Install chrony, check for and disable ntpd if it’s running
  2. Configure the appropriate NTP pool (prefer your cloud provider’s NTP if on cloud)
  3. For clusters: set up 1–2 internal NTP servers, and have all other servers sync from them
  4. Open UDP 123 in the firewall on the internal NTP server
  5. Verify with chronyc sources -v and chronyc tracking
  6. Add monitoring to alert when offset exceeds your threshold

Next time you onboard a new server into a cluster, check NTP from the very start — don’t wait until 2 AM to find out the hard way, like I did.
