Automating VMware vSphere with Ansible: From ‘Manual Clicks’ to Infrastructure as Code

VMware tutorial - IT technology blog

Stop Clicking Around in vCenter

I currently manage an 8-host ESXi cluster with over 150 VMs. Previously, whenever the Dev team requested 10-20 VMs for testing, I’d dread it. Repeating the steps—selecting a template, naming, adjusting RAM/CPU, and manual IP configuration—was incredibly time-consuming and prone to errors.

After switching to Infrastructure as Code (IaC) with Ansible, everything changed. I don’t need to install agents on VMs; I just write simple YAML files. The biggest advantage is idempotency: you can run a playbook as many times as you want, and Ansible only makes changes when the actual state doesn’t match the declared one.

In this article, I’ll guide you through setting up “one-click VM provisioning,” reducing deployment time from an hour to less than two minutes.

Why Ansible is the Top Choice for VMware

Real-world needs go beyond just creating VMs. We also need to manage their lifecycle: upgrading RAM during spikes, adding disks, or decommissioning them after a project ends. Ansible covers this with dedicated modules in the community.vmware collection that call the vSphere API.

You only need to define the “Desired State.” For example: “I need 5 Ubuntu VMs, 2 CPUs, 4GB RAM.” Ansible automatically compares this with vCenter. If any are missing, it creates them; if the configuration is wrong, it fixes it. This approach ensures infrastructure consistency, eliminating snowflake server configurations.
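
To make the lifecycle idea concrete, here is a minimal sketch of three desired-state tasks: resizing a VM, adding a disk, and decommissioning it. The host name, VM name, datacenter, datastore, and the decommission flag are all illustrative placeholders, and the password is read from the VMWARE_PASSWORD environment variable as an assumption. Check the parameters against the community.vmware version you have installed.

```yaml
---
- name: Lifecycle sketch (resize, add disk, decommission)
  hosts: localhost
  gather_facts: false
  vars:
    # All values below are placeholders, not taken from a real environment.
    vcenter_host: "vcenter.example.com"
    vcenter_user: "ansible-svc@vsphere.local"
    vcenter_pass: "{{ lookup('ansible.builtin.env', 'VMWARE_PASSWORD') }}"
    target_dc: "Example-Datacenter"
    target_vm: "web-app-test-01"
    decommission: false

  tasks:
    - name: Ensure the VM has 8 GB RAM and 4 vCPUs
      # A running VM may need hot-add enabled (or a power-off) for this to apply.
      community.vmware.vmware_guest:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        datacenter: "{{ target_dc }}"
        name: "{{ target_vm }}"
        state: present
        hardware:
          memory_mb: 8192
          num_cpus: 4

    - name: Ensure a second 50 GB thin-provisioned disk exists
      community.vmware.vmware_guest_disk:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        datacenter: "{{ target_dc }}"
        name: "{{ target_vm }}"
        disk:
          - size_gb: 50
            type: thin
            datastore: "Example-Datastore"
            scsi_controller: 0
            unit_number: 1
            state: present

    - name: Remove the VM once the project ends
      community.vmware.vmware_guest:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        datacenter: "{{ target_dc }}"
        name: "{{ target_vm }}"
        state: absent
        force: true  # lets the removal proceed even if the VM is powered on
      when: decommission | bool
```

Running it a second time with nothing changed should report every task as OK, which is the idempotency described earlier.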

Environment Setup

To let Ansible “talk” to vCenter, you need a Control Node (usually a Linux machine such as Ubuntu or CentOS). Ensure this machine can reach vCenter over HTTPS (TCP port 443).
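
A quick way to confirm that reachability is a tiny playbook built on the built-in wait_for module. This is only a sketch: the file name check_vcenter.yml is hypothetical, and the host name is the example one used later in this article.

```yaml
---
# check_vcenter.yml (hypothetical file name)
- name: Check that the control node can reach vCenter
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_host: "vcenter.company.com"

  tasks:
    - name: Wait up to 10 seconds for TCP 443 to answer
      ansible.builtin.wait_for:
        host: "{{ vcenter_host }}"
        port: 443
        state: started
        timeout: 10
```

If the task times out, look at firewalls or routing between the Control Node and vCenter before touching any VMware module.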

1. Install Supporting Libraries

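Ansible uses the pyVmomi Python library to interact with the vSphere API. The pyVim helper package ships inside pyvmomi, so there is nothing extra to install (the separate pyvim package on PyPI is an unrelated text editor):

pip install pyvmomi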

2. Install VMware Collection

VMware modules are now bundled into a dedicated collection. Install it using this command:

ansible-galaxy collection install community.vmware
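
If you keep your playbooks in Git, it can help to record the collection in a requirements file so teammates install the same dependency. A minimal sketch, assuming the conventional file name requirements.yml:

```yaml
# requirements.yml (any file name works with the -r option)
collections:
  - name: community.vmware
```

Install everything listed in it with: ansible-galaxy collection install -r requirements.yml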

Security Note: Don’t use the built-in vCenter SSO administrator account. Create a dedicated Service Account (e.g., ansible-svc) and assign it a role with only the privileges needed to create and delete VMs.

Writing a Playbook for Automated VM Creation

Here is the most common scenario: Cloning a VM from an existing template and configuring a static IP immediately.

Sample Playbook File (deploy_vm.yml)

---
- name: Deploy VM automatically on vSphere
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_host: "vcenter.company.com"
    vcenter_user: "ansible-svc@vsphere.local"  # placeholder: use your dedicated service account
    vcenter_pass: "Password-Should-Use-Ansible-Vault"

  tasks:
    - name: Clone VM from Template and configure hardware
      community.vmware.vmware_guest:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        validate_certs: false
        datacenter: "Hanoi-Datacenter"
        cluster: "Prod-Cluster"
        datastore: "PureStorage-LUN01"
        name: "web-app-prod-01"
        template: "ubuntu-22.04-gold-image"
        state: poweredon
        networks:
          - name: "VLAN-10-Web"
            ip: "10.10.10.50"
            netmask: "255.255.255.0"
            gateway: "10.10.10.1"
        hardware:
          memory_mb: 4096
          num_cpus: 2
        wait_for_ip_address: true
      delegate_to: localhost
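
The password placeholder in the playbook is meant to be replaced by an Ansible Vault variable. One way to wire that up is sketched below; the file name vault.yml and the variable name vault_vcenter_pass are assumptions of mine, not something the modules require.

```yaml
---
# Top of deploy_vm.yml, reworked to read the password from a vaulted file.
# Assumes a file vault.yml next to the playbook that contains one line:
#   vault_vcenter_pass: "<the real password>"
# and that it was encrypted with: ansible-vault encrypt vault.yml
- name: Deploy VM automatically on vSphere
  hosts: localhost
  gather_facts: false
  vars_files:
    - vault.yml
  vars:
    vcenter_host: "vcenter.company.com"
    vcenter_pass: "{{ vault_vcenter_pass }}"

  tasks:
    - name: Confirm the vaulted password was loaded (the value is never printed)
      ansible.builtin.assert:
        that:
          - vcenter_pass | length > 0
        quiet: true
      no_log: true
```

Run it with ansible-playbook deploy_vm.yml --ask-vault-pass so Ansible can decrypt the file at runtime.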

Key Points to Remember:

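  • delegate_to: localhost: Runs the module on the control machine, which then calls the vCenter API, rather than attempting to SSH into a remote host. With hosts: localhost it is redundant but harmless; it becomes crucial once the play targets other inventory hosts.
  • wait_for_ip_address: The Playbook will pause until VMware Tools (or open-vm-tools) inside the guest reports an IP, so the template must have it installed. This prevents subsequent software installation tasks from failing due to connection issues.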
  • state: poweredon: Ensures the VM is ready to go immediately after cloning.
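
To hand the new VM over to later configuration tasks, the clone result can be registered and reused. The fragment below is a sketch of follow-on tasks, assuming register: deploy_result has been added to the clone task and that the module returns the address under instance.ipv4; the group name new_vms is hypothetical.

```yaml
# Follow-on tasks, placed after the clone task in deploy_vm.yml.
# Assumes the clone task has `register: deploy_result` added to it.
- name: Wait until SSH answers on the new VM
  ansible.builtin.wait_for:
    host: "{{ deploy_result.instance.ipv4 }}"
    port: 22
    delay: 5
    timeout: 300

- name: Add the new VM to an in-memory inventory group
  ansible.builtin.add_host:
    name: "{{ deploy_result.instance.ipv4 }}"
    groups: new_vms
```

A second play with hosts: new_vms can then install software over SSH in the same run.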

Optimizing for Scale with Loops

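If your boss asks for 5 VMs at once, don’t copy-paste the code 5 times. Use a loop (the loop keyword, which Ansible recommends over the older with_items) to turn your Playbook into a production line. The connection and placement parameters stay the same as in deploy_vm.yml; only the name and IP change per item:

    - name: Create bulk VMs for the project
      community.vmware.vmware_guest:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        datacenter: "Hanoi-Datacenter"
        cluster: "Prod-Cluster"
        template: "ubuntu-22.04-gold-image"
        name: "{{ item.name }}"
        state: poweredon
        networks:
          - name: "VM Network"
            ip: "{{ item.ip }}"
            netmask: "255.255.255.0"  # required whenever a static ip is set
      loop:
        - { name: 'worker-01', ip: '10.10.10.61' }
        - { name: 'worker-02', ip: '10.10.10.62' }
        - { name: 'worker-03', ip: '10.10.10.63' }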
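
Once the list grows, it reads better as a variable than inline under the task. The sketch below moves it into a vm_list variable (a name I made up) and uses loop_control to keep the console output short. The password lookup from the VMWARE_PASSWORD environment variable is also an assumption; the datacenter, cluster, and template names are the ones from the earlier example.

```yaml
---
- name: Create VMs from a list
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_host: "vcenter.company.com"
    vcenter_user: "ansible-svc@vsphere.local"  # placeholder account
    vcenter_pass: "{{ lookup('ansible.builtin.env', 'VMWARE_PASSWORD') }}"
    vm_netmask: "255.255.255.0"
    vm_list:
      - { name: "worker-01", ip: "10.10.10.61" }
      - { name: "worker-02", ip: "10.10.10.62" }
      - { name: "worker-03", ip: "10.10.10.63" }

  tasks:
    - name: Clone each VM in vm_list
      community.vmware.vmware_guest:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        datacenter: "Hanoi-Datacenter"
        cluster: "Prod-Cluster"
        template: "ubuntu-22.04-gold-image"
        name: "{{ item.name }}"
        state: poweredon
        networks:
          - name: "VM Network"
            ip: "{{ item.ip }}"
            netmask: "{{ vm_netmask }}"
      loop: "{{ vm_list }}"
      loop_control:
        label: "{{ item.name }}"  # print only the VM name per iteration
```

Adding a sixth VM is then a one-line change to vm_list, and rerunning the play leaves the existing five untouched.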

Testing and Troubleshooting

When running the ansible-playbook deploy_vm.yml command, pay attention to the output colors. Yellow (Changed) means the system was modified. Green (OK) means everything is already compliant; no action was taken.

If you encounter an error (red), rerun with the -vvv flag. Ansible then prints the exact module arguments it used and the full error or traceback returned from the vCenter call. Usually, errors stem from Datastore or Resource Pool names that don’t exactly match what’s shown in the vCenter UI.
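
When a name mismatch is the suspect, it can be quicker to ask vCenter for the exact names than to compare them by eye. A minimal sketch using the collection's vmware_datastore_info module; the account name and the environment-variable password lookup are placeholders of mine.

```yaml
---
- name: List the datastore names vCenter actually reports
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_host: "vcenter.company.com"
    vcenter_user: "ansible-svc@vsphere.local"  # placeholder account
    vcenter_pass: "{{ lookup('ansible.builtin.env', 'VMWARE_PASSWORD') }}"

  tasks:
    - name: Gather datastore info for the datacenter
      community.vmware.vmware_datastore_info:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        validate_certs: false  # mirrors deploy_vm.yml; prefer true with a trusted certificate
        datacenter: "Hanoi-Datacenter"
      register: ds_info

    - name: Print the names for copy and paste
      ansible.builtin.debug:
        msg: "{{ ds_info.datastores | map(attribute='name') | list }}"
```

Copying a name straight from this output into the playbook rules out case and whitespace typos.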

Additionally, I often use the info modules to audit the entire infrastructure: vmware_vm_info lists every VM in vCenter, and vmware_guest_info returns the detailed hardware facts for a single VM. It takes just 30 seconds to export a report listing 150 VMs with their RAM/CPU configs, instead of counting them manually in the vSphere Client.
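
A report like that could be produced along these lines. This is a sketch under a few assumptions: all VMs live in one datacenter, the returned keys (virtual_machines, uuid, guest_name, instance.hw_name, instance.hw_processor_count, instance.hw_memtotal_mb) match your collection version, and the output file name vm_report.csv is arbitrary.

```yaml
---
- name: Audit VM hardware across vCenter
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_host: "vcenter.company.com"
    vcenter_user: "ansible-svc@vsphere.local"  # placeholder account
    vcenter_pass: "{{ lookup('ansible.builtin.env', 'VMWARE_PASSWORD') }}"

  tasks:
    - name: List all VMs (templates excluded)
      community.vmware.vmware_vm_info:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        vm_type: vm
      register: all_vms

    - name: Collect hardware facts for each VM
      community.vmware.vmware_guest_info:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        datacenter: "Hanoi-Datacenter"
        uuid: "{{ item.uuid }}"
      loop: "{{ all_vms.virtual_machines }}"
      loop_control:
        label: "{{ item.guest_name }}"
      register: vm_details

    - name: Write a CSV report next to the playbook
      ansible.builtin.copy:
        dest: "./vm_report.csv"
        mode: "0644"
        content: |
          name,cpus,memory_mb
          {% for r in vm_details.results %}
          {{ r.instance.hw_name }},{{ r.instance.hw_processor_count }},{{ r.instance.hw_memtotal_mb }}
          {% endfor %}
```

The per-VM loop is the slow part; for very large inventories it may be worth filtering all_vms.virtual_machines first.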

Conclusion

Automating VMware isn’t just about saving effort. It completely eliminates “silly” human errors. Once you have a standard set of Playbooks, managing 8 hosts or 80 hosts becomes equally effortless.

My advice: Start small, like scheduling VM power on/off tasks. Once you’re comfortable, build more complex flows like automatically expanding hard drives when disks are full. Good luck mastering your infrastructure with code!
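
As a starting point for that first exercise, here is a minimal power-state sketch. The file name power.yml, the desired_state variable, and the VM list are hypothetical; shutdown-guest relies on VMware Tools running inside the guest.

```yaml
---
# power.yml (hypothetical file name)
- name: Set the power state for a group of test VMs
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_host: "vcenter.company.com"
    vcenter_user: "ansible-svc@vsphere.local"  # placeholder account
    vcenter_pass: "{{ lookup('ansible.builtin.env', 'VMWARE_PASSWORD') }}"
    desired_state: "shutdown-guest"  # or "powered-on"
    test_vms:
      - worker-01
      - worker-02
      - worker-03

  tasks:
    - name: Apply the desired power state
      community.vmware.vmware_guest_powerstate:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        name: "{{ item }}"
        state: "{{ desired_state }}"
      loop: "{{ test_vms }}"
```

Scheduling can then be left to cron on the Control Node, for example running ansible-playbook power.yml -e desired_state=shutdown-guest every evening and the powered-on variant every morning.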
