Configuring VRF on Linux: A ‘Divide and Conquer’ Solution for Routing Tables

Network tutorial - IT technology blog

When the Default Routing Table Becomes a Barrier

Have you ever had two customers who each need a VPN connection, but both use the 192.168.1.0/24 IP range? By default, every interface on a Linux host, from eth0 to tunnels, forwards packets using the same main routing table. When address ranges overlap, the server has no way to tell which customer a return packet belongs to.

Previously, I often used Policy-Based Routing (PBR) combined with ip rule. However, as systems scale to 50 or 100 customers, managing hundreds of rules becomes a true nightmare. Debugging routing issues at that point is extremely time-consuming.

VRF (Virtual Routing and Forwarding) emerged to solve this problem once and for all. It creates independent virtual routing entities. Think of VRF like VLANs at Layer 2, but applied to the routing table at Layer 3.

Practical Comparison: PBR and VRF

To choose the right tool, we need to clearly understand the differences between the two most common approaches today:

1. Policy-Based Routing (PBR)

  • Mechanism: Incoming packets are inspected (source IP, port, etc.) and assigned to a corresponding routing table.
  • Weakness: ip rule configurations can easily overlap, and they do not isolate traffic at the interface level. Rules are evaluated in priority order on each lookup, so a long rule list adds processing overhead and makes mistakes harder to spot.

2. VRF (Virtual Routing and Forwarding)

  • Mechanism: You hard-bind an interface to a “VRF device.” Each VRF owns its own distinct routing table, completely separated from the rest of the system.
  • Strength: Strong Layer 3 isolation. A process can run inside a VRF without any knowledge of the other tenants' routes.
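
To make the comparison concrete, here is a rough sketch that only prints the per-tenant commands for each approach. The PBR lines show one common pattern, not the only one, and the interface, table, and prefix values are illustrative assumptions.

```shell
#!/bin/sh
# Print, without executing, the per-tenant commands for each approach.
# Interface, table, and prefix values are illustrative assumptions.

pbr_commands() {   # usage: pbr_commands <iface> <table> <prefix>
  echo "ip rule add iif $1 lookup $2"       # steer traffic arriving on the interface
  echo "ip route add $3 dev $1 table $2"    # copy the connected route by hand
}

vrf_commands() {   # usage: vrf_commands <vrf> <iface> <table>
  echo "ip link add $1 type vrf table $3"   # create the VRF and name its table
  echo "ip link set dev $2 master $1"       # connected routes follow the interface
}

pbr_commands eth1 10 192.168.1.0/24
vrf_commands vrf-a eth1 10
```

With PBR, every extra route or exception tends to need another hand-written rule, while the VRF binding stays a one-time step per interface.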

Steps to Deploy VRF in a Real-World Environment

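You need Linux kernel version 4.3 or higher to get started (IPv6 support for VRF arrived in 4.4), plus an iproute2 version that understands the vrf link type. Popular distributions like Ubuntu 22.04 or RHEL 9 now offer excellent support. In this example, I will create two VRFs: vrf-a for Tenant A and vrf-b for Tenant B.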

Step 1: Enable the VRF Module

First, load the VRF kernel module and confirm that it is present:

sudo modprobe vrf
# Confirm the module is loaded
lsmod | grep vrf
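
If lsmod prints nothing, VRF may be compiled into the kernel rather than built as a module. The sketch below reads the kernel build config to tell the two cases apart. The helper name vrf_support is hypothetical, and the default config path is an assumption that varies by distribution.

```shell
#!/bin/sh
# Report how the running kernel provides VRF: builtin, module, or unknown.
# Assumption: the build config lives at /boot/config-<release>; pass another
# path as the first argument if your distribution stores it elsewhere.
vrf_support() {
  cfg=${1:-/boot/config-$(uname -r)}
  if grep -qs '^CONFIG_NET_VRF=y' "$cfg"; then
    echo builtin
  elif grep -qs '^CONFIG_NET_VRF=m' "$cfg"; then
    echo module
  else
    echo unknown
  fi
}

vrf_support

# To load the module on every boot (systemd-based systems), one option is:
#   echo vrf | sudo tee /etc/modules-load.d/vrf.conf
```

If modprobe cannot find the module at all, some distributions ship it in a separate extra-modules package, so check your distribution's kernel packaging.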

Step 2: Initialize VRF Devices

Each VRF in Linux acts as a “parent” virtual interface. You must assign a unique Table ID to each VRF to identify its routing table.

# Create VRF for Tenant A (Table 10)
sudo ip link add vrf-a type vrf table 10
sudo ip link set vrf-a up

# Create VRF for Tenant B (Table 20)
sudo ip link add vrf-b type vrf table 20
sudo ip link set vrf-b up
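
When picking Table IDs, stay away from the tables Linux reserves for itself: 253 (default), 254 (main), and 255 (local). A small guard like the one below can catch this before a VRF is created. The function name valid_vrf_table is hypothetical.

```shell
#!/bin/sh
# Return success only if the argument is usable as a VRF table ID.
valid_vrf_table() {
  case $1 in
    ''|*[!0-9]*|0*) return 1 ;;   # empty, non-numeric, zero, or leading zero
    253|254|255)    return 1 ;;   # reserved: default, main, local
  esac
  [ "$1" -le 4294967295 ]         # table IDs are 32-bit values
}

for id in 10 20 254; do
  if valid_vrf_table "$id"; then
    echo "table $id: ok"
  else
    echo "table $id: reserved or invalid"
  fi
done
```

Once the devices exist, ip vrf show or ip -d link show type vrf lists each VRF with its table, which is a quick way to spot a duplicate ID.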

Step 3: Assign Interfaces and Handle IP Overlaps

This is the most important part. Suppose eth1 connects to Tenant A and eth2 connects to Tenant B. When calculating subnets for multiple tenants, I often use an IP Subnet Calculator to avoid errors with network and broadcast parameters.

# Configuration for Tenant A
sudo ip link set dev eth1 master vrf-a
sudo ip addr add 192.168.1.1/24 dev eth1
sudo ip link set eth1 up

# Configuration for Tenant B (Using overlapping IP ranges works perfectly)
sudo ip link set dev eth2 master vrf-b
sudo ip addr add 192.168.1.1/24 dev eth2
sudo ip link set eth2 up
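
With dozens of tenants, Steps 2 and 3 are worth wrapping in a function. The following is a minimal sketch, not a production script: add_tenant and run are hypothetical helper names, and DRY_RUN defaults to 1 here so that the commands are printed for review instead of being executed.

```shell
#!/bin/sh
# Provision one tenant: create the VRF, bind the interface, assign the address.
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    sudo "$@"
  fi
}

add_tenant() {   # usage: add_tenant <vrf> <table> <iface> <cidr>
  run ip link add "$1" type vrf table "$2"
  run ip link set "$1" up
  run ip link set dev "$3" master "$1"   # bind first, as in the steps above
  run ip addr add "$4" dev "$3"
  run ip link set "$3" up
}

add_tenant vrf-a 10 eth1 192.168.1.1/24
add_tenant vrf-b 20 eth2 192.168.1.1/24
```

Run it once as is to review the output, then with DRY_RUN=0 to apply the changes.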

Step 4: Verify Isolation

At this point, the standard ip route command will not display the tenants’ IP ranges. To view a specific routing table, you must specify the VRF name:

# Check routing for Tenant A
ip route show vrf vrf-a

# Or view directly via Table ID
ip route show table 10
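
The isolation check can also be scripted. This sketch assumes the vrf-a setup from the previous steps; prefix_absent is a hypothetical helper that succeeds only when a prefix is missing from the route listing it reads on stdin.

```shell
#!/bin/sh
# Succeed when the given prefix does NOT appear as a destination on stdin.
prefix_absent() {
  awk -v p="$1" '$1 == p { found = 1 } END { exit found + 0 }'
}

# Only run the live checks when vrf-a from the steps above exists.
if ip link show dev vrf-a >/dev/null 2>&1; then
  ip route show table main | prefix_absent 192.168.1.0/24 \
    && echo "main table: tenant prefix not visible"
  ip route get 192.168.1.100 vrf vrf-a   # which route would Tenant A traffic take?
  ip -br addr show vrf vrf-a             # interfaces bound to vrf-a
fi
```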

How to Run Applications Inside a VRF

Many people complain that ping fails after configuration. This is because a command started without a VRF context performs its lookups in the default VRF, meaning the main routing table, where the tenant routes no longer exist.

Executing the Ping Command

To test Tenant A’s connectivity, you must force the ping command through the corresponding VRF:

# Ping Tenant A's IP 192.168.1.100
ping -I vrf-a 192.168.1.100
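
Because both tenants use the same range, it can help to probe the same address through each VRF in turn. The sketch below does that with ping -I; check_tenants is a hypothetical helper name, and the one-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Ping one address through each listed VRF and report the result per tenant.
check_tenants() {   # usage: check_tenants <ip> <vrf>...
  target="$1"
  shift
  for vrf in "$@"; do
    if ping -c 1 -W 1 -I "$vrf" "$target" >/dev/null 2>&1; then
      echo "$vrf: $target reachable"
    else
      echo "$vrf: $target unreachable"
    fi
  done
}
```

For example, check_tenants 192.168.1.100 vrf-a vrf-b reports the two tenants separately, even though the address is identical.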

Running Services (Nginx, SSH, Python)

The ip vrf exec mechanism is extremely powerful. It allows you to launch a process entirely within the context of that VRF.

# Run a web server serving only Tenant B
sudo ip vrf exec vrf-b python3 -m http.server 8080

As a result, only clients connecting from eth2 can access port 8080. Clients from Tenant A are completely blocked, even if they share the same IP range.
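
To confirm that a process really landed in the intended VRF, iproute2 offers ip vrf identify and ip vrf pids. The wrapper below is a small sketch around the first one; vrf_of_pid is a hypothetical name. Note that ip vrf exec needs root privileges and, as far as I know, a cgroup v2 hierarchy, so treat that as an assumption to verify on your system.

```shell
#!/bin/sh
# Print the VRF a process is bound to, or "default" when it is in none.
vrf_of_pid() {
  name=$(ip vrf identify "$1" 2>/dev/null)
  echo "${name:-default}"
}

# Example session (requires the setup above):
#   sudo ip vrf exec vrf-b python3 -m http.server 8080 &
#   ip vrf pids vrf-b       # lists the processes bound to vrf-b
#   vrf_of_pid "$$"         # VRF of the current shell
```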

Hard-won Lessons When Using VRF

Through real-world deployment, I’ve identified three points that require special attention:

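  1. Localhost and Global Listeners: Services within a VRF sometimes cannot connect to 127.0.0.1, because the loopback address lives in the default VRF; a common workaround is to add 127.0.0.1/8 to the VRF device itself. Separately, setting net.ipv4.tcp_l3mdev_accept = 1 lets a service running in the default VRF accept connections arriving in any VRF, so it does not need to be bound to each one manually. Keep in mind that this deliberately relaxes isolation for that service.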
  2. DNS Resolution: If the DNS server resides in the main routing table, processes within the VRF will not be able to resolve domain names. You should place a separate DNS server in each VRF or set up Inter-VRF routing.
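  3. Iptables Traps: Packets passing through a VRF can traverse the PREROUTING and POSTROUTING chains twice, once with the physical interface (for example eth1) and once with the VRF device (for example vrf-a) as the matching interface. Double-check firewall rules that match on -i or -o to prevent packets from being dropped unexpectedly.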
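
Before changing any of these settings, it helps to see what the kernel is currently doing. The sketch below reads the l3mdev switches from /proc. The function name l3mdev_status is hypothetical, and the commented commands are examples for the vrf-a setup in this article, not required steps.

```shell
#!/bin/sh
# Show the l3mdev accept switches.
# 1 = sockets in the default VRF also accept traffic arriving in any VRF.
# 0 = a socket only serves the VRF it is bound to.
l3mdev_status() {
  root=${1:-/proc/sys/net/ipv4}
  for key in tcp_l3mdev_accept udp_l3mdev_accept; do
    if [ -r "$root/$key" ]; then
      echo "$key=$(cat "$root/$key")"
    else
      echo "$key=unavailable"
    fi
  done
}

l3mdev_status

# Possible follow-ups, depending on what you need:
#   sudo sysctl -w net.ipv4.tcp_l3mdev_accept=1    # global TCP listeners
#   sudo ip addr add 127.0.0.1/8 dev vrf-a         # loopback inside vrf-a
#   sudo iptables -A INPUT -i vrf-a -p tcp --dport 8080 -j ACCEPT
```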

Conclusion

VRF is a lightweight alternative to spinning up virtual machines or containers just to separate routing tables. This approach makes your infrastructure more transparent, secure, and much easier to scale. If you are building a gateway or VPN for multiple customers, VRF is well worth adopting to simplify your network management.
