KVM Live Migration with virsh: Move VMs Between Hosts Without Downtime

Virtualization tutorial - IT technology blog

When you need live migration — and why simply shutting down the VM isn’t always an option

There are situations where you must move a VM without shutting it down: the physical host needs a RAM upgrade, a kernel update requires a reboot, or CPU load on one host is maxed out while another sits idle. Shutting the VM down, transferring it, then booting it back up sounds straightforward — but for a database server or web server actively serving traffic, even 30 seconds of downtime is a real problem.

KVM live migration was built to handle exactly this. The mechanism is quite elegant: RAM state is incrementally copied to the destination host while the VM keeps running normally. In the final stage, the source VM pauses for just a few milliseconds to sync the remaining dirty pages. Users connected to the VM barely notice — ping loss is typically just 1–2 packets.

I run a homelab with Proxmox VE managing 12 VMs and containers — it’s my playground for testing everything before pushing to production. Proxmox has a convenient live migration UI, but under the hood it’s just calling virsh migrate. This guide focuses on raw KVM + libvirt so you understand each step rather than just clicking a button.

There are two types of live migration worth distinguishing upfront:

  • Shared storage migration: VM disk lives on NFS/Ceph shared between hosts → only RAM state is transferred, fast and minimally disruptive to the VM.
  • Block migration (non-shared): Both RAM and disk are transferred → significantly slower but requires no shared storage setup.
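To see which type applies to an existing VM, list its block devices and check where the disk path actually lives; if it sits on a shared mount, plain live migration is enough. (The VM name and path below are examples matching the rest of this guide.)

```shell
# List the VM's disks and their backing file paths
virsh domblklist testvm

# Check which filesystem that path is on (look for an nfs/ceph mount)
df -hT /var/lib/libvirt/images/shared/testvm.qcow2
```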

Preparing the environment

Hardware and network requirements

Before touching any configuration, verify these prerequisites — missing any one of them will cause migration to fail immediately:

  • Both hosts have KVM/libvirt installed and running
  • Same CPU vendor — Intel to Intel, AMD to AMD. Cross-vendor migration (Intel/AMD) has theoretical workarounds but fails in practice
  • Network connectivity between the two hosts, specifically TCP port 16509 (libvirtd) and the port range 49152–49215 (QEMU migration channels)
  • Bridge network must share the same name on both hosts — if host1 uses br0 but host2 uses virbr0, the VM will lose network connectivity after migration

# Check the bridge network name on each host
ip link show type bridge

Installing required packages

Run on both hosts:

# Ubuntu/Debian
sudo apt install -y qemu-kvm libvirt-daemon-system libvirt-clients \
    virtinst bridge-utils

# RHEL/CentOS/Rocky Linux
sudo dnf install -y qemu-kvm libvirt virt-install bridge-utils
sudo systemctl enable --now libvirtd
# Confirm libvirtd is running
sudo systemctl status libvirtd
sudo virsh list --all
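Beyond the packages, it is worth confirming that hardware virtualization is enabled and that both hosts report the same CPU vendor — run these on each host and compare:

```shell
# Count of CPUs with VT-x/AMD-V flags; 0 means virtualization is disabled in BIOS/UEFI
egrep -c '(vmx|svm)' /proc/cpuinfo

# CPU vendor — must match on both hosts (GenuineIntel or AuthenticAMD)
grep -m1 vendor_id /proc/cpuinfo

# Confirm the kvm kernel modules are loaded
lsmod | grep kvm
```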

Detailed configuration for live migration

Step 1 — Enable libvirtd for remote connections

By default, libvirtd only listens on a local socket. You need to enable TCP transport for migration to work between two hosts.

Edit /etc/libvirt/libvirtd.conf on both hosts:

sudo nano /etc/libvirt/libvirtd.conf

Add or uncomment the following lines:

# Enable TCP transport
listen_tcp = 1
listen_addr = "0.0.0.0"
tcp_port = "16509"

# Disable auth for internal testing
# (production: use sasl or TLS)
auth_tcp = "none"

Enable the --listen flag for the daemon:

# Ubuntu/Debian
sudo sed -i 's/#LIBVIRTD_ARGS=""/LIBVIRTD_ARGS="--listen"/' /etc/default/libvirtd

# RHEL/CentOS
sudo sed -i 's/#LIBVIRTD_ARGS="--listen"/LIBVIRTD_ARGS="--listen"/' /etc/sysconfig/libvirtd

sudo systemctl restart libvirtd

Note: on newer distributions libvirtd is socket-activated by systemd, and the --listen flag conflicts with socket activation. On those systems, skip the sed commands and enable the TCP socket unit instead:

sudo systemctl enable --now libvirtd-tcp.socket

# Confirm the port is open
ss -tlnp | grep 16509
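If a host firewall is active, the ports listed in the prerequisites must be opened on both hosts as well — sketches for ufw (Ubuntu/Debian) and firewalld (RHEL family):

```shell
# Ubuntu/Debian with ufw: libvirtd TCP plus the QEMU migration port range
sudo ufw allow 16509/tcp
sudo ufw allow 49152:49215/tcp

# RHEL/CentOS with firewalld
sudo firewall-cmd --permanent --add-port=16509/tcp
sudo firewall-cmd --permanent --add-port=49152-49215/tcp
sudo firewall-cmd --reload
```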

Step 2 — SSH key-based authentication

Virsh migration over SSH requires a passwordless connection. Run from the source host:

# Generate a dedicated SSH key for migration
ssh-keygen -t ed25519 -f ~/.ssh/id_migration -N ""

# Copy to the destination host (192.168.1.102)
ssh-copy-id -i ~/.ssh/id_migration.pub [email protected]

# Test the connection and list VMs on the destination
ssh -i ~/.ssh/id_migration [email protected] "virsh list --all"
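virsh itself has no -i flag, so for qemu+ssh to pick up the dedicated key, one option is an entry in ~/.ssh/config on the source host (a sketch, assuming the same IP and user as above):

```shell
# Make every SSH connection to host2 use the migration key automatically
cat >> ~/.ssh/config <<'EOF'
Host 192.168.1.102
    User root
    IdentityFile ~/.ssh/id_migration
EOF
chmod 600 ~/.ssh/config
```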

Step 3 — Set up NFS shared storage

NFS shared storage is the foundation of this entire setup. VM disks must reside on an NFS mount that is mounted at the same path on both hosts — get this wrong and migration will fail with a disk-not-found error.

Install and export on the NFS server (can be a third host, or one of the two KVM hosts):

sudo apt install -y nfs-kernel-server
sudo mkdir -p /srv/kvm-shared

# Export the directory to the LAN
echo "/srv/kvm-shared 192.168.1.0/24(rw,sync,no_root_squash,no_subtree_check)" | \
    sudo tee -a /etc/exports
sudo exportfs -a
sudo systemctl restart nfs-kernel-server
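Before mounting, you can confirm the export is actually visible from the KVM hosts:

```shell
# Run on each KVM host; should list /srv/kvm-shared exported to 192.168.1.0/24
showmount -e 192.168.1.100
```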

Mount on both KVM hosts:

sudo apt install -y nfs-common
sudo mkdir -p /var/lib/libvirt/images/shared

# Mount NFS (192.168.1.100 is the NFS server IP)
sudo mount -t nfs 192.168.1.100:/srv/kvm-shared /var/lib/libvirt/images/shared

# Add to /etc/fstab for automatic mount after reboot
echo "192.168.1.100:/srv/kvm-shared /var/lib/libvirt/images/shared nfs defaults,_netdev 0 0" | \
    sudo tee -a /etc/fstab
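Optionally, register the shared directory with libvirt as a storage pool on both hosts, so virsh and virt-install can refer to it by name (the pool name kvm-shared is my choice, not required):

```shell
# Define a directory-backed storage pool over the NFS mount
virsh pool-define-as kvm-shared dir --target /var/lib/libvirt/images/shared
virsh pool-start kvm-shared
virsh pool-autostart kvm-shared

# Confirm it is active and lists the shared disks
virsh pool-list --all
virsh vol-list kvm-shared
```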

Create a new VM with its disk placed on the NFS share from the start:

sudo virt-install \
  --name testvm \
  --ram 2048 \
  --vcpus 2 \
  --disk path=/var/lib/libvirt/images/shared/testvm.qcow2,size=20 \
  --os-variant ubuntu22.04 \
  --network bridge=br0 \
  --graphics none \
  --location /var/lib/libvirt/images/ubuntu-22.04-live-server-amd64.iso,kernel=casper/vmlinuz,initrd=casper/initrd

Performing live migration with virsh

Assume the source host is host1 (192.168.1.101) and the destination is host2 (192.168.1.102). The VM named testvm is running on host1.

Migrate over SSH

# Basic syntax
virsh migrate --live testvm qemu+ssh://[email protected]/system

# Add --verbose to see progress
virsh migrate --live --verbose testvm qemu+ssh://[email protected]/system

Migrate over TCP transport (faster on LAN)

virsh migrate --live --verbose \
  testvm \
  qemu+tcp://192.168.1.102/system
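By default, virsh migrate leaves the domain defined on the source host and only transient on the destination. Once a test migration checks out, you will usually want --persistent and --undefinesource so the VM's definition follows it:

```shell
# Persist the domain on host2 and remove its definition from host1
virsh migrate --live --verbose \
  --persistent \
  --undefinesource \
  testvm \
  qemu+ssh://[email protected]/system
```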

Block migration without shared storage

Use --copy-storage-all to transfer the disk as well. A 20GB disk over a gigabit LAN takes roughly 3–5 minutes; the VM stays running but I/O performance will degrade during this period:

virsh migrate --live \
  --copy-storage-all \
  --verbose \
  testvm \
  qemu+ssh://[email protected]/system

Tuning the final downtime window

At the end of migration, the VM must pause briefly to sync the remaining dirty pages. QEMU's default maximum downtime is 300 ms — if the VM has a heavy workload, raising it slightly helps the migration converge instead of copying dirty pages indefinitely. Note that migrate-setmaxdowntime is a separate virsh command, not a flag of virsh migrate; run it against the domain while the migration is in progress:

# Terminal 1: start the migration
virsh migrate --live --verbose testvm qemu+ssh://[email protected]/system

# Terminal 2: allow up to 500ms downtime during the final stage
virsh migrate-setmaxdowntime testvm 500
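If the workload dirties memory faster than the network can copy it even with a larger downtime budget, QEMU's auto-converge feature progressively throttles the guest's vCPUs until the migration can catch up:

```shell
# Throttle guest CPU time so dirty-page generation slows down enough to converge
virsh migrate --live --verbose \
  --auto-converge \
  testvm \
  qemu+ssh://[email protected]/system
```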

Verification and monitoring

Confirm the VM migrated successfully

# On host1 — testvm should no longer be here
virsh list --all

# On host2 — testvm should be running
virsh list --all
# Expected:
# Id   Name     State
# 1    testvm   running

Monitor migration progress in real time

# Run this on the source host while migration is in progress
watch -n 1 "virsh domjobinfo testvm"

# Output will show:
# Job type:         Unbounded
# Time elapsed:     3421 ms
# Data processed:   768 MiB
# Data remaining:   64 MiB
# Memory processed: 768 MiB
# Memory remaining: 64 MiB
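If a migration stalls or you need to back out, it can be cancelled from the source host; the VM keeps running there as if nothing happened:

```shell
# Cancel the in-flight migration job for testvm
virsh domjobabort testvm
```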

Verify network connectivity is uninterrupted

# Get the VM's IP address
virsh domifaddr testvm

# Continuously ping from an external machine while migration runs
ping -i 0.2 <testvm-IP>
# A successful migration should show no more than 1-2 packet losses

Troubleshooting common errors

# View libvirt logs on the source host
sudo tail -f /var/log/libvirt/libvirtd.log

# QEMU log for a specific VM
sudo tail -f /var/log/libvirt/qemu/testvm.log

A few errors I’ve run into while testing in my homelab:

  • “Unable to allow access for disk path”: The VM disk is on local storage, not NFS. You need to move the disk to the NFS share first (shut down the VM, copy the file, virsh edit to update the path, then start it again).
  • “authentication failed”: Check that auth_tcp = "none" is set in libvirtd.conf and that you’ve restarted the daemon.
  • Migration timeout: The VM has a heavy I/O workload generating dirty pages faster than the migration can copy them. Try increasing --migrate-setmaxdowntime or reducing the VM’s load before migrating.
  • VM loses network after migration: Bridge names differ between the two hosts. Standardizing the bridge network name across both hosts is the definitive fix.
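The disk-relocation fix from the first bullet can be sketched as follows (paths assume the NFS mount set up earlier; virsh edit opens the domain XML so you can point the disk source at the new path):

```shell
# Move a local disk onto the NFS share (requires the VM to be shut down)
virsh shutdown testvm
cp /var/lib/libvirt/images/testvm.qcow2 /var/lib/libvirt/images/shared/
virsh edit testvm    # update <source file=.../> to the shared path
virsh start testvm
```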
