When Clicking Through the Console Is No Longer Enough
Early last year, our team ran into a painful incident: the staging environment was accidentally deleted, and it took nearly 3 days to rebuild from scratch. Not because the infrastructure was complex, but because nobody knew exactly what had been configured — everyone had clicked through things their own way, with no clear documentation. That’s when I started taking Infrastructure as Code seriously.
Six months later, production is running smoothly — and I have plenty to share.
Analysis: Why ClickOps Is a Dangerous Trap
The AWS Console is intuitive and gives instant results — anyone just starting out finds it convenient. The price you pay comes later, and it’s usually steep:
- Configuration drift: Dev, staging, and production gradually diverge without anyone noticing. I once discovered a security group in prod had a rule opening port 22 from 0.0.0.0/0 — someone had done a “quick test” and forgotten to close it, and it had been sitting there for nearly 2 months.
- Not reproducible: Want to spin up a new environment identical to production? The answer depends entirely on the memory of whoever set it up originally.
- Slow disaster recovery: If a region goes down and you need to migrate, how long will it take? With ClickOps, the answer is usually “nobody knows”.
- No meaningful audit trail: CloudTrail records who changed what, but not why — and more importantly, it doesn’t allow easy rollbacks.
The core problem: your infrastructure lives in people’s heads, not in code.
IaC Options — and Why Terraform Wins
Before committing to Terraform, I tried the alternatives:
AWS CloudFormation
Native to AWS, no extra installation required. But the YAML/JSON syntax is extremely verbose — a simple stack can easily run 300+ lines. More importantly: you’re completely locked in to AWS.
Ansible
Great for configuration management (installing packages, editing config files on existing servers), but not the right tool for provisioning infrastructure. Ansible lacks the concept of “state” — it has no way to track what resources currently exist.
Pulumi
Writing infrastructure in real Python/TypeScript sounds appealing. But the learning curve is steeper, the ecosystem is smaller, and I spent far more time hunting for concrete examples compared to Terraform.
Terraform
Multi-cloud (AWS, GCP, Azure…), HCL syntax is more readable than CloudFormation, excellent state management. The community is massive — nearly every AWS resource has ready-made examples on the Terraform Registry.
Installing Terraform and Initial Setup
Installing on Ubuntu/Debian — I use HashiCorp’s official apt repo to make future updates easy:
wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | \
sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] \
https://apt.releases.hashicorp.com $(lsb_release -cs) main" | \
sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
terraform --version
Next are AWS credentials. If you haven’t created an IAM user yet, now’s the time — follow least-privilege and grant only the permissions you actually need:
aws configure
# AWS Access Key ID: AKIA...
# AWS Secret Access Key: ...
# Default region name: ap-northeast-1
# Default output format: json
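Before handing those credentials to Terraform, it’s worth a quick sanity check that they actually work. STS will tell you exactly which identity the CLI is using (the account ID and user name below are placeholders):

```shell
# Verify the configured credentials by asking STS who you are
aws sts get-caller-identity
# {
#     "UserId": "AIDA...",
#     "Account": "123456789012",
#     "Arn": "arn:aws:iam::123456789012:user/terraform"
# }
```

If this errors out, fix your credentials now: Terraform’s own error messages for bad credentials are far less direct.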
Your First Terraform Config: Creating an EC2 Instance
Here’s the most minimal config to get started — an EC2 instance in the Tokyo region, just enough for Terraform to understand what you want:
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = "ap-northeast-1" # Tokyo
}
resource "aws_instance" "web_server" {
ami = "ami-0d52744d6551d851e" # Ubuntu 22.04 LTS Tokyo
instance_type = "t3.micro"
tags = {
Name = "web-server-production"
Environment = "production"
ManagedBy = "terraform"
}
}
output "instance_ip" {
value = aws_instance.web_server.public_ip
}
The basic workflow is just 3 commands — the same 3 commands I’ve used every day for 6 months straight:
# Initialize — download providers locally
terraform init
# Preview changes (does NOT apply them)
terraform plan
# Apply changes
terraform apply
The output of terraform plan clearly shows which resources will be created (+), destroyed (-), or modified (~). I always review the plan carefully before applying — it’s the most critical safety net.
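Once an apply has finished, the output block declared earlier can be read back at any time without opening the console. Assuming the config above has been applied:

```shell
# Print a single output value
terraform output instance_ip
# Print all outputs as JSON — handy for piping into scripts
terraform output -json
```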
State File — The Most Important Part Most Tutorials Skip
Terraform stores infrastructure state in a file called terraform.tfstate. This file is the “source of truth” that tells Terraform which resources it’s currently managing.
The most common mistake: storing the state file locally. The consequences:
- No one else on the team can run Terraform
- Lose your machine = lose your state = Terraform no longer knows what it’s managing
- Two people running terraform apply simultaneously can corrupt the state — and nothing prevents that from happening
The solution: a remote backend with S3 + DynamoDB for state locking:
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "production/terraform.tfstate"
region = "ap-northeast-1"
encrypt = true
dynamodb_table = "terraform-state-lock"
}
}
Create the S3 bucket and DynamoDB table first (a one-time manual setup):
aws s3 mb s3://my-terraform-state-bucket --region ap-northeast-1
aws s3api put-bucket-versioning \
--bucket my-terraform-state-bucket \
--versioning-configuration Status=Enabled
aws dynamodb create-table \
--table-name terraform-state-lock \
--attribute-definitions AttributeName=LockID,AttributeType=S \
--key-schema AttributeName=LockID,KeyType=HASH \
--billing-mode PAY_PER_REQUEST \
--region ap-northeast-1
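With the bucket and table in place, pointing an existing local state at the new backend takes one command. Terraform detects the backend change and offers to copy the state across:

```shell
# Re-initialize and migrate the existing local state to S3
terraform init -migrate-state
```

After the migration succeeds, the local terraform.tfstate is no longer the source of truth — delete it (or at least stop committing it) to avoid confusion later.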
Organizing Terraform Code for Real-World Projects
After several refactors, here’s the structure I’m currently using — simple but scalable when you need to add more environments:
infrastructure/
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── terraform.tfvars
│ └── production/
│ ├── main.tf
│ ├── variables.tf
│ └── terraform.tfvars
└── modules/
├── vpc/
├── ec2/
└── rds/
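With that layout, an environment’s main.tf mostly just wires modules together. A sketch of what environments/production/main.tf might look like — the module input names (cidr_block, instance_type, environment) are illustrative, not fixed by the structure above:

```hcl
# environments/production/main.tf — composes the shared modules
module "vpc" {
  source     = "../../modules/vpc"
  cidr_block = "10.0.0.0/16" # illustrative input
}

module "web" {
  source        = "../../modules/ec2"
  instance_type = var.instance_type
  environment   = var.environment
}
```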
Variables help avoid hardcoding and make it easy to differentiate between dev and production without duplicating code:
# variables.tf
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t3.micro"
}
variable "environment" {
description = "Environment name"
type = string
}
# terraform.tfvars (production)
instance_type = "t3.medium"
environment = "production"
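The variables only pay off once resource definitions reference them instead of literals. Here’s the EC2 config from earlier rewritten against var.* (a sketch; the AMI would also typically become a variable or data source):

```hcl
# main.tf — same instance, now parameterized per environment
resource "aws_instance" "web_server" {
  ami           = "ami-0d52744d6551d851e" # Ubuntu 22.04 LTS Tokyo
  instance_type = var.instance_type       # t3.micro in dev, t3.medium in prod

  tags = {
    Name        = "web-server-${var.environment}"
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}
```

Run terraform plan in environments/dev and environments/production and the only differences should be the values coming from each terraform.tfvars.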
Lessons After 6 Months in Production
Here’s what I learned — most of it the hard way:
- Run terraform fmt and terraform validate before every commit — integrate them into a pre-commit hook or CI to prevent pushing syntax errors.
- Never edit the state file manually — if you need to intervene, use terraform state mv or terraform state rm.
- Tag all resources: ManagedBy = "terraform", Environment, Project. Three months later when you glance at the AWS Console, you’ll immediately know which resources are managed by what.
- Review plans in CI/CD: Set up GitHub Actions to auto-run terraform plan on every PR. The output is posted as a PR comment — reviewers see the infrastructure changes before merging, no manual coordination needed.
- Separate state by environment: Keep dev/terraform.tfstate and production/terraform.tfstate isolated — one mistyped command won’t take down production.
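The first lesson — fmt and validate before every commit — fits in a tiny git hook. A minimal sketch for .git/hooks/pre-commit (make it executable with chmod +x; terraform validate assumes the directory has already been init’d):

```shell
#!/bin/sh
# Block the commit if formatting or validation fails
terraform fmt -check -recursive || {
  echo "Run 'terraform fmt -recursive' first." >&2
  exit 1
}
terraform validate || exit 1
```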
One small but handy trick: whenever I need to debug Terraform output or inspect JSON from terraform output -json, I paste it into toolcraft.app/en/tools/developer/json-formatter to quickly format it for readability — much faster than opening an IDE just to look at a JSON snippet.
Where Should You Start?
If you’re running anything on the cloud without IaC, you’re silently accumulating technical debt every single day. Terraform should be set up from the very beginning — not after you already have 50 hand-clicked resources that you’d then have to painstakingly import.
The easiest way to start: pick one simple resource from your current project — an S3 bucket, a security group — and write Terraform for it. The feeling of running terraform destroy then terraform apply and watching everything rebuild identically will convince you better than any article ever could.
