How to Use rsync for Efficient File Synchronization on Linux

Linux tutorial - IT technology blog

The Real Problem with File Backup and Sync

I used to use cp -r to back up the web directory every night. Seemed fine at first, but after a few weeks I realized the problem: every backup copied several gigabytes of data in full, even when only a few dozen files had actually changed. The result was a cron job that ran for 30–40 minutes, disk I/O spiked hard, and the server crawled right during peak hours.

The issue is that cp can’t tell which files have changed — it just copies everything regardless. rsync was built specifically to solve this.

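rsync (Remote Sync) avoids redundant work in two ways. It first skips every file whose size and modification time already match at the destination. For files that did change, its delta-transfer algorithm can send only the differing blocks when syncing over a network (local copies transfer changed files whole by default). On the Ubuntu 22.04 production server with 4GB RAM that I manage, switching to rsync cut backup time from 35 minutes down to 2–3 minutes for the same amount of data.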

Is rsync Already Installed? Check Before Installing

You might not need to install anything at all. Most modern Linux distros ship with rsync out of the box — do a quick check:

rsync --version

If you see output like rsync version 3.2.x, you’re good to go. If not:

# Ubuntu / Debian
sudo apt install rsync

# RHEL / AlmaLinux / Rocky
sudo dnf install rsync

# Arch Linux
sudo pacman -S rsync

For remote sync over SSH, the destination machine also needs rsync. Check it too:

ssh user@remote-server "rsync --version"

Basic Syntax and Important Flags

The syntax looks simple enough:

rsync [options] source destination

But the key is knowing which flags to use. Here are the ones I reach for most often, with practical explanations:

  • -a — archive mode: preserves permissions, owner, timestamps, symlinks, and recurses into subdirectories
  • -v — verbose: shows which files are being synced
  • -z — compress data during transfer (useful on slow connections)
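  • -P — shorthand for --partial --progress: shows per-file progress and keeps partially transferred files so an interrupted transfer can pick up again on the next run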
  • --delete — removes files at the destination that no longer exist at the source
  • --dry-run or -n — simulates the run without making any actual changes
  • --exclude — skips files or directories matching a pattern

Practical Use Cases

1. Syncing a Local Directory to Another Local Path

The simplest case — backing up from one drive to another on the same machine:

# Sync /var/www/html to /mnt/backup/www
rsync -av /var/www/html/ /mnt/backup/www/

# Note the trailing slash on the source path:
# /var/www/html/  → copies the CONTENTS inside the directory
# /var/www/html   → copies the html directory ITSELF into the destination

The trailing / on the source path is the most common gotcha for rsync beginners. Always do a --dry-run first to be safe:

rsync -av --dry-run /var/www/html/ /mnt/backup/www/
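If the trailing-slash rule still feels abstract, a small experiment makes it concrete. The sketch below uses made-up temporary directories (with_slash and no_slash are just names I picked) and runs the same copy both ways.

```shell
#!/bin/bash
# Throwaway directories; names are illustrative only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/html" "$work/with_slash" "$work/no_slash"
echo "hello" > "$work/src/html/index.html"

# Trailing slash: the CONTENTS of html/ land directly in the destination.
rsync -a "$work/src/html/" "$work/with_slash/"

# No trailing slash: the html directory itself is created inside the destination.
rsync -a "$work/src/html" "$work/no_slash/"

# Compare where index.html ended up in each case.
(cd "$work" && find with_slash no_slash -type f)
```

The first copy produces with_slash/index.html, the second produces no_slash/html/index.html.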

2. Syncing to a Remote Server over SSH

My most-used scenario — pushing backups to another server:

# Push from local to remote (user@remote is a placeholder for your own login and host)
rsync -avz -P /var/www/html/ user@remote:/backup/www/

# Pull from remote to local
rsync -avz -P user@remote:/var/www/html/ /local/backup/www/

# Specify SSH key and custom port
rsync -avz -e "ssh -i ~/.ssh/id_rsa -p 2222" \
  /var/www/html/ user@remote:/backup/www/

3. Mirroring with Deletion

When you want the destination to be an exact mirror of the source — files deleted from the source get deleted from the destination too:

rsync -av --delete /var/www/html/ /mnt/backup/www/

Be careful with --delete: if you accidentally swap source and destination, you’ll wipe out your original data. Always test with --dry-run first.
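Here is roughly what such a preview looks like. This is a sketch with invented file names in a temporary directory: combining -n (dry run) with -i (itemize changes) lists every planned action, including deletions, without touching anything.

```shell
#!/bin/bash
# Throwaway directories; file names are illustrative only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src" "$work/dst"
echo "keep" > "$work/src/keep.txt"
echo "stale" > "$work/dst/stale.txt"   # exists only at the destination

# -n (dry run) plus -i (itemize changes) lists planned actions without doing them.
preview=$(rsync -a -n -i --delete "$work/src/" "$work/dst/")
echo "$preview"

# The stale file is still there because nothing was actually deleted.
ls "$work/dst"
```

Lines marked with "deleting" in the preview are the ones to read twice before running the command for real.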

4. Excluding Files and Directories

# Skip cache, log, and temp directories
rsync -av \
  --exclude='cache/' \
  --exclude='*.log' \
  --exclude='.git/' \
  --exclude='node_modules/' \
  /var/www/myapp/ user@remote:/backup/myapp/

# Use an exclude list file (cleaner when you have many patterns)
cat > /etc/rsync-exclude.txt << 'EOF'
cache/
*.log
*.tmp
.git/
node_modules/
EOF

rsync -av --exclude-from='/etc/rsync-exclude.txt' \
  /var/www/myapp/ user@remote:/backup/myapp/
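Exclude patterns are easy to get subtly wrong, so it can be worth testing a list locally before pointing it at a real backup. The sketch below builds a tiny fake project (all file names are invented) and checks what survives the filter.

```shell
#!/bin/bash
# Tiny fake project in a throwaway directory; names are illustrative only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/cache" "$work/src/node_modules/pkg" "$work/dst"
echo "<?php" > "$work/src/index.php"
echo "log line" > "$work/src/app.log"
echo "cached" > "$work/src/cache/page.html"
echo "module" > "$work/src/node_modules/pkg/index.js"

# A shortened version of the exclude list from above.
cat > "$work/exclude.txt" << 'EOF'
cache/
*.log
node_modules/
EOF

rsync -a --exclude-from="$work/exclude.txt" "$work/src/" "$work/dst/"

# List what actually made it to the destination.
copied=$(cd "$work/dst" && find . -type f | sort)
echo "$copied"
```

Only index.php should come through; the log file and both excluded directories stay behind.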

5. Automated Backup with Cron

For rsync to run unattended, SSH can't prompt for a password. Using an SSH key is the simplest approach:

# Generate an SSH key without a passphrase (run as root here, because the backup script reads /root/.ssh/backup_key)
ssh-keygen -t ed25519 -f ~/.ssh/backup_key -N ""

# Copy the public key to the remote server
ssh-copy-id -i ~/.ssh/backup_key.pub user@remote-server

Then create a backup script:

cat > /usr/local/bin/backup-web.sh << 'EOF'
#!/bin/bash
LOG_FILE="/var/log/rsync-backup.log"

# Log a message with a fresh timestamp, so start and end times differ
log() {
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" >> "$LOG_FILE"
}

log "Starting backup..."

# user@remote-server is a placeholder. accept-new (OpenSSH 7.6+) trusts a
# host key on first contact but refuses a changed one, unlike "no".
rsync -az --delete \
  -e "ssh -i /root/.ssh/backup_key -o StrictHostKeyChecking=accept-new" \
  --exclude-from='/etc/rsync-exclude.txt' \
  /var/www/html/ \
  user@remote-server:/backup/www/ \
  >> "$LOG_FILE" 2>&1

EXIT_CODE=$?
if [ $EXIT_CODE -eq 0 ]; then
  log "Backup completed successfully"
else
  log "Backup FAILED with exit code $EXIT_CODE"
fi

# Pass rsync's status on so cron or a monitoring wrapper can see failures
exit $EXIT_CODE
EOF

chmod +x /usr/local/bin/backup-web.sh

Add it to crontab to run every night at 2 AM:

crontab -e
# Add the following line:
0 2 * * * /usr/local/bin/backup-web.sh
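One thing worth guarding against with a nightly job is a slow run overlapping with the next one. A common approach, assuming the flock utility from util-linux is installed, is to wrap the script in a lock. The lock file path below is my own choice, not something the setup above requires, and the runnable part only demonstrates what flock -n does.

```shell
#!/bin/bash
# Crontab line (shown as a comment because it is not a shell command):
#   0 2 * * * flock -n /var/lock/backup-web.lock /usr/local/bin/backup-web.sh

# What -n does: a second flock on a lock that is already held gives up immediately.
lock=$(mktemp)
trap 'rm -f "$lock"' EXIT

flock -n "$lock" sleep 2 &      # first "job" holds the lock for 2 seconds
sleep 0.5                       # give it time to acquire the lock

if flock -n "$lock" true; then
  second="ran"
else
  second="skipped"
fi
wait
echo "Second job: $second"
```

With the crontab variant, a backup that is still running at 2 AM the next night causes the new run to exit instead of piling up.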

How to Verify Your Backup Is Working

Exit Codes — Don't Ignore Them

After each run, rsync returns an exit code. Zero means success; anything else means something went wrong:

  • 0 — success
  • 1 — syntax error
  • 11 — I/O error reading or writing files
  • 23 — partial transfer failure (usually a permissions issue)
  • 30 — timeout in data send/receive

The backup script above already captures the exit code. For a quick manual check:

rsync -av /source/ /destination/
echo "Exit code: $?"
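To make log messages more readable, the codes listed above can be turned into text with a small helper. This is only a sketch: describe_rsync_exit is a name I made up, the wording follows the EXIT VALUES section of the rsync man page, and I added code 24 because it shows up often on busy directories.

```shell
#!/bin/bash
# Hypothetical helper: translate common rsync exit codes into short descriptions.
describe_rsync_exit() {
  case "$1" in
    0)  echo "success" ;;
    1)  echo "syntax or usage error" ;;
    11) echo "error in file I/O" ;;
    23) echo "partial transfer due to error" ;;
    24) echo "partial transfer due to vanished source files" ;;
    30) echo "timeout in data send/receive" ;;
    *)  echo "other error (see EXIT VALUES in man rsync)" ;;
  esac
}

# Example: describe a few codes.
for code in 0 23 30 99; do
  echo "$code: $(describe_rsync_exit "$code")"
done
```

In a backup script you would call it right after rsync finishes, for example describe_rsync_exit "$EXIT_CODE".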

Comparing Source and Destination

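Want to confirm both directories are in sync? Run rsync again with --dry-run. If no file names appear between the "sending incremental file list" header and the closing summary, rsync found nothing to transfer. Keep in mind that by default it compares only size and modification time; add -c (--checksum) to compare file contents as well, at the cost of reading every file: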

rsync -av --dry-run /source/ /destination/
# No files listed → directories are fully in sync

Monitoring Logs

# Watch backup log in real time
tail -f /var/log/rsync-backup.log

# View the last 50 lines
tail -n 50 /var/log/rsync-backup.log

# Filter for failed backup runs only
grep "FAILED" /var/log/rsync-backup.log
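For a quick health summary, the same grep approach can count outcomes. The sketch below runs against a made-up sample log that mimics the message format of the backup script; the timestamps and the exit code in it are invented for illustration.

```shell
#!/bin/bash
# Sample log with invented entries in the backup script's message format.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" << 'EOF'
[2024-01-01 02:00:01] Starting backup...
[2024-01-01 02:02:40] Backup completed successfully
[2024-01-02 02:00:01] Starting backup...
[2024-01-02 02:00:09] Backup FAILED with exit code 23
EOF

# Count outcomes and pull out the most recent failure.
ok=$(grep -c "completed successfully" "$log")
failed=$(grep -c "FAILED" "$log")
last_failure=$(grep "FAILED" "$log" | tail -n 1)

echo "successful: $ok, failed: $failed"
echo "most recent failure: $last_failure"
```

Pointing the same commands at /var/log/rsync-backup.log gives the real numbers.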

Measuring Speed and Efficiency

Add --stats to get a summary table after the run — the numbers are often surprisingly impressive:

rsync -av --stats /var/www/html/ /mnt/backup/www/

# Sample output:
# Number of files: 1,234
# Number of created files: 5
# Number of deleted files: 2
# Number of regular files transferred: 12
# Total file size: 2.45G bytes
# Total transferred file size: 4.23M bytes  ← only 4MB instead of 2.45GB!
# Speedup is 593.93

The "Speedup" line says it all. rsync computes it as the total size of the files divided by the bytes actually sent and received, so a value of 593 means only about 1/593rd of the data (roughly 0.17%) went over the wire compared to a full copy.
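The percentage is just the reciprocal of the speedup figure. As a small worked example (speedup_to_percent is a helper name I made up), the conversion looks like this:

```shell
#!/bin/bash
# Hypothetical helper: convert rsync's speedup figure into the share of data sent.
speedup_to_percent() {
  awk -v s="$1" 'BEGIN { printf "%.2f", 100 / s }'
}

pct=$(speedup_to_percent 593.93)
echo "Share of data transferred: ${pct}%"
```

For the sample run above this comes out at 0.17%.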

Common Errors and How to Fix Them

Permission denied: The user running rsync can't read some of the source files or can't write to the destination directory. Either run with sudo or adjust ownership and permissions on the affected directory.

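"skipping non-regular file": rsync hit a symlink, socket, or device file and wasn't told how to handle it. This typically shows up when running without -a, which includes -l for symlinks and -D for devices and special files. If you don't need those files, adding --exclude="*.sock" silences the message for sockets.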

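Slow transfer on a weak network connection: Enable -z for compression. It won't help with files that are already compressed, like jpg, mp4, or zip: rsync skips compressing many such suffixes by default (the list can be adjusted with --skip-compress), and on a fast link dropping -z entirely is often quicker because it saves CPU time.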

Once you're comfortable with rsync, the next worthwhile step is exploring --backup to keep old versions before overwriting, or combining it with hardlinks for incremental backups — multiple historical snapshots without significant extra disk usage.
