Deploying Kubernetes on vSphere with VMware Tanzu: Turning Virtualization into Professional Container Infrastructure

VMware tutorial - IT technology blog

Breaking Down Barriers Between Sysadmins and Developers

Infrastructure admins are likely no strangers to the scenario where Operations (Ops) and Development (Dev) teams seem to speak different languages. Back when I managed an 8-host ESXi cluster running on Dell R740s, Ops just wanted to keep vCenter stable and monitor every Virtual Machine (VM). Meanwhile, Devs were constantly demanding Kubernetes (K8s) to push code faster.

Every time the Dev team needed a Cluster, I had to manually create each VM, install the OS, and then configure K8s by hand. This approach was extremely time-consuming and made resource management difficult. When system errors occurred, tracing issues between the virtualization layer and the container layer was a nightmare. VMware Tanzu was created to end this struggle. It transforms vSphere into a platform that runs both VMs and Containers side-by-side on a single management interface.

VMware Tanzu: When ESXi Becomes a Worker Node

In reality, VMware Tanzu is not just an additional piece of software. It is a solution suite that turns vSphere into a native Kubernetes infrastructure. Instead of running K8s on top of VMs in a traditional “nested” fashion, Tanzu integrates directly into the ESXi kernel via an agent called a Spherelet.

When you activate Tanzu, the Workload Management menu appears in vCenter. At this point, K8s Clusters are treated like standard resource objects. The real selling point here is that you can use vMotion, DRS, and High Availability (HA) for containerized applications without complex configurations.

Prerequisites: Don’t Let Your Infrastructure Fall Behind

Deploying Tanzu in an enterprise environment requires careful preparation. Based on my practical experience operating an 8-host cluster with about 200 VMs, I recommend the following minimum specifications:

  • vSphere License: You need vSphere 7.0 Update 1 or higher. The license must be a vSphere with Tanzu edition (Basic, Standard, or Advanced).
  • Network: This is often the most confusing part. If you don’t have the budget for NSX-T, use vSphere Distributed Switch (VDS). Combine this with an external Load Balancer like HAProxy to save costs.
  • Storage: A Storage Policy is mandatory. I encourage using vSAN to take full advantage of the vSphere Container Storage Interface (CSI) features.
  • Resource Overhead: Each Supervisor Cluster deploys three control plane VMs; at the small sizing, each consumes roughly 16GB of RAM and 4 vCPUs. Ensure your hosts keep at least 20% free resources to absorb failovers.

4 Practical Deployment Steps

Step 1: Activate Workload Management

First, go to vCenter, navigate to Workload Management, and select Get Started. The system will scan for eligible physical clusters.

In the Networking section, if you choose VDS, you need to prepare three separate IP ranges: one for Management (five consecutive IPs), one for Workload (preferably a /23 or /22), and a Virtual IP (VIP) range for the Load Balancer.
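The /23-versus-/22 recommendation comes down to simple address math; here is a quick sanity check in plain shell (no vSphere dependencies involved):

```shell
# Usable IPv4 host addresses for a prefix length: 2^(32 - prefix) - 2
for prefix in 23 22; do
  hosts=$(( (1 << (32 - prefix)) - 2 ))
  echo "/${prefix} -> ${hosts} usable addresses"
done
# /23 -> 510 usable addresses
# /22 -> 1022 usable addresses
```

Node VMs draw from the Workload range and Services of type LoadBalancer draw from the VIP range, so undersizing either is painful to fix after the fact.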

Step 2: Configure the HAProxy Appliance

Since K8s requires a single point of access (the API Server), HAProxy acts as the gateway. After deploying the HAProxy OVA file, pay close attention to the Data Plane API configuration. This API allows vCenter to automatically push load balancing configuration to HAProxy every time a Dev creates a new Service of type LoadBalancer in K8s.
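For orientation, on a generic HAProxy 2.x install the Data Plane API is wired up as a `program` section in haproxy.cfg; the Tanzu appliance generates something similar from the values you enter during OVA deployment. The paths, port, and credentials below are illustrative, not the appliance's exact file:

```
# haproxy.cfg fragment -- illustrative sketch, assuming HAProxy 2.x master-worker mode
global
  master-worker

userlist controller
  user client insecure-password FixMeSecret   # the account vCenter uses to call the API

program api
  command /usr/local/bin/dataplaneapi --host 0.0.0.0 --port 5556 \
    --haproxy-bin /usr/local/sbin/haproxy \
    --reload-cmd "systemctl reload haproxy" \
    --userlist controller
  no option start-on-reload
```

vCenter authenticates against this userlist (port 5556 is the appliance default) to create frontends and backends on your behalf, which is why getting these credentials right during OVA deployment matters.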

Step 3: Set Up Namespaces and Quotas

Namespaces in Tanzu function much like Resource Pools. You can restrict the Dev-01 team to using a maximum of 64GB of RAM and 2TB of Storage.

# Log in to the Cluster using the specialized VMware kubectl tool
kubectl vsphere login --server=10.10.20.50 --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify
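Under the hood, the limits you set in the vCenter namespace UI are enforced as standard Kubernetes ResourceQuota objects inside the Supervisor namespace. A hand-written equivalent of the Dev-01 example might look like this (the object name is illustrative; vCenter generates its own):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-01-limits        # illustrative name
  namespace: dev-01
spec:
  hard:
    limits.memory: 64Gi      # matches the 64GB RAM cap set in vCenter
    requests.storage: 2Ti    # matches the 2TB storage cap
```

Seeing the quota this way helps when a Dev asks why a Pod is stuck Pending: `kubectl describe resourcequota` in the namespace shows exactly which limit was hit.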

Step 4: Initialize the Tanzu Kubernetes Grid (TKG) Cluster

This is where the power of automation shines. Instead of manual installation, you simply send a YAML file to vCenter. The system will automatically clone VM templates and configure the network.

Example configuration file for a Production environment:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkg-prod-cluster
  namespace: production-space
spec:
  topology:
    controlPlane:
      count: 3 # Ensure High Availability for the control plane
      class: best-effort-medium
      storageClass: vsan-default-policy
    workers:
      count: 5 # Can be easily scaled up later
      class: best-effort-large
      storageClass: vsan-default-policy
  distribution:
    version: v1.21

It takes only about 10-15 minutes for vCenter to spin up a fully compliant enterprise K8s Cluster.
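Submitting the manifest is a single apply against the Supervisor namespace. A minimal sketch, assuming the file name below and a prior `kubectl vsphere login` (the guard simply skips the calls where kubectl is not installed):

```shell
manifest=tkg-prod-cluster.yaml   # the YAML from Step 4, saved locally

if command -v kubectl >/dev/null 2>&1; then
  # Push the desired state; vCenter clones the node VMs from templates.
  kubectl apply -f "$manifest"
  # Check progress; the cluster phase moves from "creating" to "running".
  kubectl get tanzukubernetescluster tkg-prod-cluster -n production-space
else
  echo "kubectl not found; run this from a workstation with the vSphere plugin"
fi
```

Scaling later is the same workflow: edit the `workers.count` field in the file and re-apply, and vCenter reconciles the VM count for you.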

“Battle-Tested” Operational Lessons

After over a year of running Tanzu in a live environment, I have a few notes for you:

  1. Monitor Orphaned Volumes: K8s tends to create Persistent Volumes (PV) very quickly. Sometimes Devs delete a Cluster but the volumes remain attached to vSAN. Check the Cloud Native Storage section regularly to avoid running out of disk space.
  2. DNS is the most common failure point: 90% of failed deployments are due to DNS being unable to resolve the vCenter or ESXi hostnames. Ensure you have full A and PTR records before starting.
  3. Control Plane Anti-Affinity: With an 8-host cluster, I rely on DRS anti-affinity rules so that the Control Plane VMs never reside on the same physical host, preventing a single host failure from taking down the entire Cluster.
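The orphaned-volume check from lesson 1 can be scripted. The sample listing below is illustrative lab output; the same `awk` filter works on a live `kubectl get pv --no-headers` listing, where column 5 is STATUS:

```shell
# Illustrative sample of `kubectl get pv --no-headers` output (two volumes):
pv_list='pvc-1111 20Gi RWO Delete Bound dev-01/db-data vsan-default-policy
pvc-2222 100Gi RWO Delete Released dev-01/old-data vsan-default-policy'

# Volumes in Released state no longer have a claim but still occupy vSAN space.
echo "$pv_list" | awk '$5 == "Released" { print $1, $2 }'
# -> pvc-2222 100Gi

# In production: kubectl get pv --no-headers | awk '$5 == "Released"'
```

Running this on a schedule and alerting on non-empty output catches the leak long before vSAN capacity becomes a problem.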

Conclusion

Integrating Kubernetes into vSphere is not just about chasing new technology. It truly helps administrators maintain control over the infrastructure while meeting the speed demands of the Dev team. Once everything is running smoothly, managing hundreds of containers becomes as effortless as managing a few VMs used to be.
