Infrastructure as Code
Terraform
Handbook
A comprehensive reference for engineers and platform teams — covering HCL syntax, providers, resources, state, modules, multi-environment patterns, and production deployment workflows with Terraform v1.9+.
AWS · Azure · GCP
Modules & Reuse
State Management
Multi-Environment
CI/CD Pipelines
OpenTofu Compatible
Terraform is an open-source Infrastructure as Code (IaC) tool created by HashiCorp. It lets you define cloud infrastructure — servers, databases, networks, DNS records, IAM roles, and more — in declarative configuration files using HCL (HashiCorp Configuration Language). You describe what you want; Terraform figures out how to create, update, or destroy it.
Instead of clicking through the AWS console or writing imperative shell scripts, you write .tf files that become a single source of truth for your infrastructure. Those files live in version control, get reviewed in pull requests, and can be applied consistently across environments.
Declarative
You describe the desired end state. Terraform computes and applies the delta — no need to write step-by-step imperative scripts.
Cloud-Agnostic
One tool, 3,000+ providers: AWS, Azure, GCP, Kubernetes, GitHub, Datadog, Cloudflare, and any REST API with a custom provider.
State-Aware
Terraform tracks what it has deployed in a state file, so it always knows the current vs. desired configuration — enabling safe incremental changes.
🔀
OpenTofu: HashiCorp changed Terraform's licence to BUSL 1.1 in 2023. OpenTofu is the open-source, MPL-2.0 fork maintained under the Linux Foundation. It forked from the Terraform 1.5 line and remains largely compatible: the HCL and provider code in this handbook should work in both tools, but features added to either tool after the fork can differ, so check the release notes before relying on a newer feature.
Reproducibility
Spin up identical infrastructure for dev, staging, and production. No more "works in staging but not prod" caused by manual drift. Every environment is defined in code.
Version Control & Audit
Infrastructure changes go through the same PR review, Git history, and audit trail as application code. Know who changed what, when, and why.
Automation & Speed
Provisioning a full environment that would take hours of clicking takes minutes via CI/CD. Engineers stop waiting on ops tickets.
Safe Change Management
terraform plan shows a human-readable diff before any change is applied. Teams review the plan in PRs, eliminating surprises during applies.
Cost Visibility
With tools like Infracost, you can estimate the monthly cost of infrastructure changes directly in PRs before resources are created.
Disaster Recovery
Rebuild your entire infrastructure from code after an incident. No scrambling through documentation — the code is the runbook.
💡
IaC vs. ClickOps: "ClickOps" — manually configuring resources in cloud consoles — doesn't scale. It's error-prone, undocumented, and impossible to reproduce reliably. Terraform replaces ClickOps with versioned, reviewable, automated infrastructure management.
✅ Use Terraform when…
- Provisioning cloud resources (VMs, databases, networks, DNS)
- Managing infrastructure across multiple cloud providers
- Running multiple environments (dev / staging / prod)
- Working in a team that needs reproducible, reviewable infra changes
- Setting up Kubernetes clusters, Helm releases, namespaces
- Automating account-level resources (IAM, billing alerts, S3 buckets)
- Disaster recovery and infrastructure rebuilding
❌ Prefer another tool when…
- Deploying application code (use GitHub Actions, ArgoCD, Helm charts)
- Configuring OS-level software on VMs (use Ansible, Chef)
- Real-time or event-driven infrastructure (use AWS Lambda triggers)
- One-off, ephemeral scripts you'll never run again
- Simple single-account, single-region greenfield projects where CDK or Pulumi feels more natural
Terraform vs. the Alternatives
| Tool | Language | Approach | Best For |
| Terraform / OpenTofu | HCL | Declarative IaC | Multi-cloud, teams, repeatable infra |
| AWS CDK | TS / Python / Java | Imperative, generates CloudFormation | AWS-only, code-first teams |
| Pulumi | TS / Python / Go | Declarative, general-purpose languages | Developers preferring real languages over HCL |
| AWS CloudFormation | JSON / YAML | Declarative | AWS-only, deep AWS service support |
| Ansible | YAML | Imperative config mgmt | OS config, app deployment on existing VMs |
| Crossplane | Kubernetes CRDs | GitOps / K8s-native | Platform teams running everything through K8s |
Write HCL Configuration
You define resources, variables, and outputs in .tf files. This is your desired end state — the "what" you want to exist in the cloud.
terraform init
Downloads provider plugins (e.g., the AWS provider), initializes the backend for remote state, and prepares the working directory. Run it once per project, and again whenever providers, modules, or the backend configuration change.
terraform plan
Terraform reads current state (from the state file) and queries the cloud API, then computes a diff: what needs to be created, updated, or destroyed. Shows this as a human-readable plan. Nothing changes yet.
terraform apply
Executes the plan. Terraform calls provider APIs to create, modify, or delete real infrastructure. Updates the state file to reflect the new reality. Prompts for confirmation (or pass -auto-approve in CI).
terraform destroy (when needed)
Tears down all resources managed by the configuration. Useful for cleaning up dev environments or decommissioning services entirely.
📡
The State File: Terraform stores a mapping of your configuration to real-world resource IDs in terraform.tfstate. This is Terraform's memory — it's how it knows that aws_instance.web maps to i-0abc1234. In teams, always store state remotely (S3, Terraform Cloud, Azure Blob) — never commit it to Git.
# macOS (Homebrew): install Terraform
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
# Or install OpenTofu (open-source fork)
brew install opentofu
# Verify installation
terraform version
# → Terraform v1.9.x
# Ubuntu / Debian (apt)
wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | \
sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] \
https://apt.releases.hashicorp.com $(lsb_release -cs) main" | \
sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
# Windows (winget)
winget install HashiCorp.Terraform
Recommended: tfenv (version manager)
# Install tfenv — manage multiple Terraform versions per project
brew install tfenv
# Pin version in .terraform-version file (committed to repo)
echo "1.9.5" > .terraform-version
tfenv install # installs pinned version
tfenv use # activates it
Terraform will load all .tf files in a directory. Convention splits configuration into purpose-named files, not because Terraform requires it, but because it makes navigation and review far easier.
my-project/
├── environments/
│   ├── dev.tfvars
│   ├── staging.tfvars
│   └── prod.tfvars
├── modules/
│   └── vpc/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── main.tf
├── variables.tf
├── outputs.tf
├── providers.tf
├── locals.tf
├── versions.tf
├── .terraform-version
├── .gitignore
└── terraform.tfvars
⚠️
Never commit to Git: .terraform/ (provider binaries), *.tfstate and *.tfstate.backup (contain secrets), and *.tfvars files containing sensitive values. Add these to .gitignore immediately.
HCL is a declarative language with a simple block-based syntax. It is human-readable and supports expressions, loops, conditionals, and functions.
# ── Block structure: type "label" "name" { arguments } ──
resource "aws_instance" "web_server" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  # Nested blocks have no = sign
  root_block_device {
    volume_size = 20
  }

  # Map arguments such as tags DO take an = sign
  tags = {
    Name        = "web-server"
    Environment = "production"
  }
}
# ── Strings ──
variable "region" {
default = "us-east-1" # double quotes only
}
# ── String interpolation ──
locals {
bucket_name = "my-app-${var.environment}-assets"
}
# ── Multi-line strings (heredoc) ──
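resource "aws_iam_policy" "example" {
  # <<- strips the leading indentation, so the JSON can be indented with the block
  policy = <<-EOT
    {
      "Version": "2012-10-17",
      "Statement": [{ "Effect": "Allow", "Action": "s3:ListAllMyBuckets", "Resource": "*" }]
    }
  EOT
}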
# ── Lists and Maps ──
locals {
availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
tags = {
Project = "my-app"
ManagedBy = "Terraform"
}
}
# ── Booleans and Numbers ──
resource "aws_db_instance" "db" {
multi_az = true
storage_encrypted = true
allocated_storage = 20
port = 5432
}
# ── Conditional expression (ternary) ──
locals {
instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
}
# ── for_each loop (creates one resource per set element or map entry) ──
resource "aws_s3_bucket" "buckets" {
for_each = toset(["logs", "assets", "backups"])
bucket = "my-app-${each.key}"
}
# ── count (creates N copies of a resource) ──
resource "aws_instance" "workers" {
count = 3
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
tags = { Name = "worker-${count.index}" }
}
# ── Dynamic blocks (generate nested blocks programmatically) ──
resource "aws_security_group" "sg" {
name = "my-sg"
dynamic "ingress" {
for_each = var.allowed_ports
content {
from_port = ingress.value
to_port = ingress.value
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
}
Every cloud or service Terraform talks to requires a provider — a plugin that knows how to authenticate and translate HCL resource declarations into API calls. Providers are downloaded during terraform init from the Terraform Registry.
# Always pin provider versions to avoid unexpected upgrades
terraform {
required_version = ">= 1.9.0" # minimum Terraform CLI version
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0" # allows 5.x, not 6.x
}
random = {
source = "hashicorp/random"
version = ">= 3.4"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.0"
}
}
}
# AWS provider — uses environment variables for auth
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or an IAM role
provider "aws" {
region = var.aws_region # pass region as a variable, not hardcoded
# Optional: tag every resource created by this provider
default_tags {
tags = {
ManagedBy = "Terraform"
Project = var.project_name
Environment = var.environment
}
}
}
# Multiple provider configurations — e.g., multi-region
provider "aws" {
alias = "us_west"
region = "us-west-2"
}
# Use the aliased provider for specific resources
resource "aws_s3_bucket" "west_bucket" {
provider = aws.us_west
bucket = "my-west-backup-bucket"
}
# Remote backend — REQUIRED for team use (store state in S3)
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "environments/prod/terraform.tfstate"
region = "us-east-1"
encrypt = true # encrypt state at rest
dynamodb_table = "terraform-state-lock" # prevent concurrent applies (Terraform 1.10+ can instead lock in S3 itself with use_lockfile = true)
}
}
A resource block declares one infrastructure object. The type string (aws_instance, google_sql_database) tells Terraform which provider resource to manage. The name (web_server) is a local label used to reference it within HCL.
# ── Syntax: resource "TYPE" "LOCAL_NAME" { ... } ──
# TYPE = provider_resourcetype (e.g. aws_s3_bucket)
# LOCAL_NAME = arbitrary name for this config (e.g. "app_assets")
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = { Name = "main-vpc" }
}
# ── Referencing attributes from another resource ──
# Format: resource_type.local_name.attribute
resource "aws_subnet" "public" {
vpc_id = aws_vpc.main.id # ← cross-resource reference
cidr_block = "10.0.1.0/24"
availability_zone = "us-east-1a"
# Terraform automatically creates this AFTER aws_vpc.main
# because of the implicit dependency on aws_vpc.main.id
}
# ── Explicit dependency (when implicit isn't enough) ──
resource "aws_instance" "app" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
subnet_id = aws_subnet.public.id
# Explicitly wait for an S3 bucket policy before creating the EC2
# even though there's no attribute reference between them
depends_on = [aws_s3_bucket_policy.app_policy]
}
# ── Lifecycle rules — control create/update/destroy behaviour ──
resource "aws_db_instance" "postgres" {
identifier = "my-app-db"
engine = "postgres"
engine_version = "15"
instance_class = "db.t3.micro"
allocated_storage = 20
skip_final_snapshot = false
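lifecycle {
# Create the replacement BEFORE destroying the old resource.
# This needs non-colliding names (e.g. identifier_prefix rather than a fixed identifier).
create_before_destroy = true
# Ignore changes to these attributes (e.g., a password rotated outside Terraform)
ignore_changes = [password]
# Fail any plan that would destroy this resource, including a replacement,
# so in practice choose either this or create_before_destroy for a given resource
prevent_destroy = true
}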
}
# ── Basic variable declaration ──
variable "aws_region" {
description = "AWS region to deploy into"
type = string # string | number | bool
default = "us-east-1"
}
variable "environment" {
description = "Deployment environment"
type = string
# No default = required. Terraform will prompt if not provided.
# Validation — enforce valid values
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
# ── Collection types ──
variable "allowed_ports" {
type = list(number)
default = [80, 443]
}
variable "instance_config" {
type = object({
instance_type = string
disk_size_gb = number
enable_monitoring = bool
})
default = {
instance_type = "t3.micro"
disk_size_gb = 20
enable_monitoring = false
}
}
# ── Sensitive variable — value masked in plan/apply output ──
variable "db_password" {
description = "Database admin password"
type = string
sensitive = true # redacted in plan/apply output, but still stored in plaintext in state
}
# Outputs are displayed after `terraform apply`
# and accessible by other modules via module.name.output_name
output "web_server_ip" {
description = "Public IP of the web server EC2 instance"
value = aws_instance.web.public_ip
}
output "db_endpoint" {
description = "RDS connection endpoint"
value = aws_db_instance.postgres.endpoint
sensitive = true # mask in CLI output (still stored in state)
}
output "subnet_ids" {
description = "List of all public subnet IDs"
value = [for s in aws_subnet.public : s.id]
}
Supplying Variable Values
| Method | Precedence | Use Case |
| default in variable block | Lowest | Sensible defaults for optional config |
| TF_VAR_name environment variable | ↑ | Secrets injected by CI/CD, never stored in files |
| terraform.tfvars file | ↑ | Local defaults committed to repo (no secrets) |
| *.auto.tfvars file | ↑ | Automatically loaded, in lexical order of filename |
| -var-file=prod.tfvars and -var="key=value" CLI flags | Highest | Explicit environment targeting in CI and one-off overrides; applied in the order given, so the last one wins |
# locals{} computes derived values once — reference as local.name
# Use them to avoid repeating complex expressions throughout your config
locals {
# String manipulation
name_prefix = "${var.project}-${var.environment}"
bucket_name = "${local.name_prefix}-assets-${random_id.suffix.hex}"
# Common tags merged from a base map + resource-specific tags
common_tags = merge(
{
Project = var.project
Environment = var.environment
ManagedBy = "Terraform"
},
var.extra_tags
)
# Conditional value
is_production = var.environment == "prod"
instance_type = local.is_production ? "t3.large" : "t3.micro"
# List comprehension — transform each AZ into a CIDR block
subnet_cidrs = [for i, az in var.availability_zones :
cidrsubnet(var.vpc_cidr, 8, i)
]
# Map comprehension — build resource map from list
subnet_map = {
for az in var.availability_zones :
az => cidrsubnet(var.vpc_cidr, 8, index(var.availability_zones, az))
}
}
# ── Useful built-in functions ──
locals {
# String functions
lower_name = lower(var.project_name) # → "myproject"
trimmed = trimspace(" hello ") # → "hello"
replaced = replace(var.name, " ", "-") # spaces → hyphens
# Collection functions
sorted_azs = sort(data.aws_availability_zones.available.names)
first_az = element(local.sorted_azs, 0)
flat_list = flatten([[1,2],[3,4]]) # → [1,2,3,4]
unique_items = toset(["a", "b", "a"]) # → {"a","b"}
# Encoding functions
b64_script = base64encode(file("scripts/init.sh"))
json_policy = jsonencode({ Version = "2012-10-17" })
# Numeric functions
max_capacity = max(2, var.min_count, 10)
subnet_count = length(var.availability_zones)
}
Data sources let you query existing resources — ones Terraform didn't create, or ones managed by a different state file — and use their attributes in your configuration. They are read-only; Terraform never modifies them.
# ── Syntax: data "TYPE" "NAME" { filters } ──
# Reference as: data.TYPE.NAME.attribute
# Look up the latest Ubuntu 22.04 AMI — no hardcoding AMI IDs!
data "aws_ami" "ubuntu" {
most_recent = true
owners = ["099720109477"] # Canonical's AWS account ID
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
# Use the AMI in a resource. Note: when a newer image is published,
# the next plan will propose replacing the instance.
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id # ← resolved at plan time
instance_type = "t3.micro"
}
# ── Read a VPC created outside this Terraform config ──
data "aws_vpc" "existing" {
filter {
name = "tag:Name"
values = ["production-vpc"]
}
}
# ── Read a secret from AWS Secrets Manager ──
data "aws_secretsmanager_secret_version" "db_creds" {
secret_id = "prod/myapp/db-credentials"
}
locals {
db_creds = jsondecode(data.aws_secretsmanager_secret_version.db_creds.secret_string)
}
# ── Read remote Terraform state (cross-stack references) ──
data "terraform_remote_state" "networking" {
backend = "s3"
config = {
bucket = "my-terraform-state"
key = "networking/terraform.tfstate"
region = "us-east-1"
}
}
# Now use outputs from the networking stack
resource "aws_instance" "app" {
subnet_id = data.terraform_remote_state.networking.outputs.public_subnet_id
}
What State Contains
- Mapping of resource addresses to real-world IDs
- Dependency graph between resources
- Metadata for providers and modules
- All attribute values — including secrets
Remote State Backends
s3: AWS S3 with DynamoDB locking (most common)
azurerm: Azure Blob Storage
gcs: Google Cloud Storage
remote, or the cloud block: HCP Terraform (formerly Terraform Cloud), HashiCorp managed
pg: PostgreSQL
# View all resources tracked in state
terraform state list
# Inspect detailed state for a specific resource
terraform state show aws_instance.web
# Move a resource to a new address (rename without destroy/recreate)
terraform state mv aws_instance.old_name aws_instance.new_name
# Remove a resource from state WITHOUT destroying the real resource
# Useful when you want to "forget" a resource (e.g., hand it off)
terraform state rm aws_s3_bucket.old_bucket
# Import an existing real resource into Terraform state
# Lets you adopt infra not created by Terraform
terraform import aws_s3_bucket.assets my-existing-bucket-name
# Pull remote state to local (for inspection)
terraform state pull > current_state.json
# Replace a specific resource (force destroy + recreate)
terraform apply -replace=aws_instance.web
# Refresh state — sync state with real-world (detect drift)
terraform apply -refresh-only
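The `state mv` and `import` commands above also have declarative counterparts that show up in `terraform plan` and can be reviewed in a PR: `moved` blocks (Terraform 1.1+) and `import` blocks (Terraform 1.5+). The following is a minimal sketch that reuses the addresses from the commands above; the resource body is a placeholder, not a complete bucket configuration.

```hcl
# Declarative form of `terraform state mv`: rename without destroy/recreate.
# The configuration must also declare resource "aws_instance" "new_name".
moved {
  from = aws_instance.old_name
  to   = aws_instance.new_name
}

# Declarative form of `terraform import`: adopt an existing bucket.
import {
  to = aws_s3_bucket.assets
  id = "my-existing-bucket-name"
}

# The resource block that the imported bucket is bound to.
resource "aws_s3_bucket" "assets" {
  bucket = "my-existing-bucket-name"
}
```

Once the change has been applied, the `moved` and `import` blocks can usually be removed in a follow-up commit.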
🔐
State Security: The state file contains plaintext copies of all resource attributes, including passwords, private keys, and connection strings. Always encrypt your remote backend (S3: encrypt = true), restrict IAM access to the state bucket, and never commit *.tfstate to version control.
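As one possible way to apply this advice, the sketch below provisions a hardened S3 bucket and a DynamoDB lock table for state. The bucket and table names mirror the backend example earlier in this handbook and are assumptions; such resources would normally live in a small, separate bootstrap configuration rather than in the stack whose state they hold.

```hcl
# Assumed name: matches the bucket used in the backend "s3" example.
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state-bucket"

  lifecycle {
    prevent_destroy = true # losing this bucket means losing all state
  }
}

# Keep old state versions so a bad write can be rolled back.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt every object at rest by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# State must never be publicly readable.
resource "aws_s3_bucket_public_access_block" "tf_state" {
  bucket                  = aws_s3_bucket.tf_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Lock table: the S3 backend expects a string hash key named LockID.
resource "aws_dynamodb_table" "tf_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```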
Modules are directories of .tf files that encapsulate a logical unit of infrastructure (a VPC, an ECS cluster, a monitoring stack). They accept inputs (variables), create resources, and expose outputs. Modules enable DRY infrastructure: write once, instantiate many times across environments.
# ── module block syntax ──
# source can be: local path, Terraform Registry, Git URL, S3
# Local module
module "vpc" {
source = "./modules/vpc" # relative path to module directory
# Pass values to the module's variables
vpc_cidr = "10.0.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b"]
environment = var.environment
}
# Public Terraform Registry module (community-maintained terraform-aws-modules organisation)
module "rds" {
source = "terraform-aws-modules/rds/aws"
version = "~> 6.0" # always pin module versions!
identifier = "my-app-db"
engine = "postgres"
engine_version = "15"
instance_class = "db.t3.micro"
allocated_storage = 20
db_subnet_group_name = module.vpc.db_subnet_group_name
vpc_security_group_ids = [module.vpc.database_sg_id]
}
# Git-sourced module — pin to a tag for stability
module "monitoring" {
source = "git::https://github.com/my-org/tf-modules.git//monitoring?ref=v2.3.1"
}
# Access module outputs with module.name.output_name
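resource "aws_route53_record" "app" {
  zone_id = var.route53_zone_id # assumed variable: ID of the hosted zone for example.com
  name    = "app.example.com"
  type    = "A"
  ttl     = 300
  records = [module.vpc.nat_gateway_ip] # ← module output
}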
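To make the inputs, resources, and outputs contract concrete, here is a reduced sketch of what the local `./modules/vpc` module might contain. The variable names match the arguments passed in the `module "vpc"` block above; the resource name `this` and the `vpc_id` output are illustrative assumptions, and a real module would also define the subnets, NAT gateway, and the other outputs referenced above.

```hcl
# modules/vpc/variables.tf: inputs the caller sets in the module block
variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string
}

variable "availability_zones" {
  description = "Availability zones to spread subnets across"
  type        = list(string)
}

variable "environment" {
  description = "Deployment environment"
  type        = string
}

# modules/vpc/main.tf: resources the module creates
resource "aws_vpc" "this" {
  cidr_block = var.vpc_cidr
  tags       = { Name = "${var.environment}-vpc" }
}

# modules/vpc/outputs.tf: values the caller reads as module.vpc.<output_name>
output "vpc_id" {
  description = "ID of the VPC created by this module"
  value       = aws_vpc.this.id
}
```

Only declared outputs are visible to the caller; resources inside the module cannot be referenced directly from outside it.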
Workspaces allow you to have multiple independent state files for the same configuration, typically one per environment. Each workspace is isolated: running terraform apply in the dev workspace only changes dev resources.
# List available workspaces (* = current)
terraform workspace list
# * default
# dev
# staging
# prod
# Create a new workspace (this also switches to it)
terraform workspace new staging
# Switch to an existing workspace
terraform workspace select prod
# Current workspace name is available in HCL as terraform.workspace
# Use it to vary configuration per environment:
locals {
# Look up settings from a map keyed by workspace name
env_config = {
dev = {
instance_type = "t3.micro"
min_capacity = 1
max_capacity = 2
multi_az = false
}
prod = {
instance_type = "t3.large"
min_capacity = 3
max_capacity = 10
multi_az = true
}
}
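# terraform.workspace is a built-in string: "dev", "prod", etc.
# Indexing fails for any workspace without an entry in the map
# (here "default" and "staging"), so add one entry per workspace you use.
config = local.env_config[terraform.workspace]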
}
resource "aws_instance" "app" {
instance_type = local.config.instance_type
}
💡
Workspaces vs. separate directories: For small teams, workspaces work well. For larger organisations, many teams prefer separate directory structures per environment (envs/dev/, envs/prod/) for stronger isolation and explicit per-environment configuration. Both are valid patterns.
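For comparison, a directory-per-environment layout might look like the sketch below: each environment has its own root configuration with its own backend key and calls the same shared module. The `envs/prod/` path, bucket name, and CIDR values are illustrative assumptions.

```hcl
# envs/prod/main.tf (hypothetical layout; envs/dev/main.tf would mirror it)
terraform {
  backend "s3" {
    bucket = "my-terraform-state-bucket"
    key    = "environments/prod/terraform.tfstate" # one state file per environment
    region = "us-east-1"
  }
}

provider "aws" {
  region = "us-east-1"
}

# Same module as every other environment, with prod-specific inputs.
module "vpc" {
  source = "../../modules/vpc"

  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  environment        = "prod"
}
```

The trade-off is some duplication between environment directories in exchange for explicit, independently reviewable configuration and state.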
| Command | Description | Common Flags |
| terraform init | Initialise working directory: downloads providers, configures backend | -upgrade (update providers) · -reconfigure (reinitialise backend) |
| terraform validate | Check configuration syntax and internal consistency (no API calls) | -json (machine-readable output) |
| terraform fmt | Auto-format .tf files to canonical HCL style | -check (exit non-zero if not formatted, for CI) · -recursive |
| terraform plan | Compute and show the execution plan (no changes applied) | -out=plan.tfplan (save plan) · -var-file=prod.tfvars · -target=resource |
| terraform apply | Apply the plan and make real infrastructure changes | plan.tfplan (apply saved plan) · -auto-approve (skip confirmation) · -parallelism=10 |
| terraform destroy | Destroy all resources in state | -auto-approve · -target=resource (destroy specific resource) |
| terraform output | Print output values from state | -json (structured output) · -raw output_name (single value) |
| terraform state list | List all resources tracked in state | (none) |
| terraform state show | Show all attributes of a tracked resource | resource address, e.g. aws_instance.web |
| terraform import | Import existing infrastructure into state | resource_address real_resource_id |
| terraform taint | Mark resource for replacement on next apply | Deprecated: use terraform apply -replace instead |
| terraform graph | Output dependency graph in DOT format (visualise with Graphviz) | (none) |
| terraform console | Interactive REPL for evaluating HCL expressions and functions | (none) |
| terraform providers lock | Update the dependency lock file (.terraform.lock.hcl) | -platform=linux_amd64 (cross-platform locks) |
# ════════════════════════════════════════════════════
# DATA SOURCES — query existing / external resources
# ════════════════════════════════════════════════════
# Dynamically fetch available AZs in the current region
data "aws_availability_zones" "available" {
state = "available"
}
# Always resolve the latest Ubuntu LTS AMI rather than hardcoding
data "aws_ami" "ubuntu" {
most_recent = true
owners = ["099720109477"]
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
}
}
# ════════════════════════════════════
# COMPUTED LOCALS
# ════════════════════════════════════
locals {
name_prefix = "${var.project}-${var.environment}"
# Use only the first 2 AZs for subnets (cost-effective for non-prod)
azs = slice(data.aws_availability_zones.available.names, 0, 2)
# Standard tags applied to every resource via default_tags in provider
tags = {
Project = var.project
Environment = var.environment
ManagedBy = "Terraform"
Repository = "github.com/my-org/infra"
}
}
# ════════════════════════════════════
# NETWORKING
# ════════════════════════════════════
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true # required for private DNS resolution
enable_dns_support = true
tags = { Name = "${local.name_prefix}-vpc" }
}
# One public subnet per AZ (for the load balancer)
resource "aws_subnet" "public" {
for_each = toset(local.azs)
vpc_id = aws_vpc.main.id
# cidrsubnet(base, newbits, netnum) — carve /24 from the /16
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, index(local.azs, each.key))
availability_zone = each.key
map_public_ip_on_launch = true
tags = { Name = "${local.name_prefix}-public-${each.key}" }
}
# Internet gateway — allows VPC to talk to the internet
resource "aws_internet_gateway" "igw" {
vpc_id = aws_vpc.main.id
tags = { Name = "${local.name_prefix}-igw" }
}
# Route table — send 0.0.0.0/0 traffic out through the IGW
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
}
# Associate the route table with each public subnet
resource "aws_route_table_association" "public" {
for_each = aws_subnet.public
subnet_id = each.value.id
route_table_id = aws_route_table.public.id
}
# ════════════════════════════════════
# SECURITY GROUPS
# ════════════════════════════════════
# ALB security group: accept HTTPS (443) from anywhere
resource "aws_security_group" "alb" {
name = "${local.name_prefix}-alb-sg"
vpc_id = aws_vpc.main.id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1" # -1 = all protocols
cidr_blocks = ["0.0.0.0/0"]
}
}
# App security group — only accept traffic from the ALB
resource "aws_security_group" "app" {
name = "${local.name_prefix}-app-sg"
vpc_id = aws_vpc.main.id
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
# Reference the ALB SG — only ALB can send traffic to app
security_groups = [aws_security_group.alb.id]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
# ════════════════════════════════════
# EC2 INSTANCES (Auto Scaling Group)
# ════════════════════════════════════
# Launch template — defines what each EC2 looks like
resource "aws_launch_template" "app" {
name_prefix = "${local.name_prefix}-"
image_id = data.aws_ami.ubuntu.id # from data source above
instance_type = var.instance_type
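# User data: bootstrap script, base64-encoded.
# NOTE: nginx listens on port 80 by default, while the app security group and
# target group in this example use port 8080 with a /health check; align them before deploying.
# The closing EOF marker must sit on its own line, so the ")" goes on the next line.
user_data = base64encode(<<-EOF
#!/bin/bash
apt-get update -y
apt-get install -y nginx
systemctl enable --now nginx
echo "Hello from ${local.name_prefix}" > /var/www/html/index.html
EOF
)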
network_interfaces {
associate_public_ip_address = true
security_groups = [aws_security_group.app.id]
}
lifecycle {
# Create a new launch template version before destroying the old
create_before_destroy = true
}
}
# Auto Scaling Group — manages the fleet of EC2 instances
resource "aws_autoscaling_group" "app" {
name = "${local.name_prefix}-asg"
min_size = var.min_capacity
max_size = var.max_capacity
desired_capacity = var.min_capacity
vpc_zone_identifier = [for s in aws_subnet.public : s.id]
launch_template {
id = aws_launch_template.app.id
version = "$Latest"
}
# Auto-register instances with the ALB target group
target_group_arns = [aws_lb_target_group.app.arn]
# Replace instances one-by-one during rolling updates
instance_refresh {
strategy = "Rolling"
preferences {
min_healthy_percentage = 50
}
}
}
# ════════════════════════════════════
# LOAD BALANCER
# ════════════════════════════════════
resource "aws_lb" "app" {
name = "${local.name_prefix}-alb"
internal = false # internet-facing
load_balancer_type = "application"
security_groups = [aws_security_group.alb.id]
subnets = [for s in aws_subnet.public : s.id]
}
resource "aws_lb_target_group" "app" {
name = "${local.name_prefix}-tg"
port = 8080
protocol = "HTTP"
vpc_id = aws_vpc.main.id
health_check {
path = "/health"
healthy_threshold = 2
unhealthy_threshold = 3
interval = 30
}
}
resource "aws_lb_listener" "https" {
load_balancer_arn = aws_lb.app.arn
port = "443"
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-TLS13-1-2-2021-06"
certificate_arn = var.acm_certificate_arn
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.app.arn
}
}
# ════════════════════════════════════
# S3 BUCKET (asset storage)
# ════════════════════════════════════
resource "random_id" "bucket_suffix" {
byte_length = 4 # generates an 8-char hex suffix for unique naming
}
resource "aws_s3_bucket" "assets" {
# Bucket names must be globally unique across all AWS accounts
bucket = "${local.name_prefix}-assets-${random_id.bucket_suffix.hex}"
}
# Block ALL public access — assets served via CloudFront, not direct S3
resource "aws_s3_bucket_public_access_block" "assets" {
bucket = aws_s3_bucket.assets.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
# Enable versioning — protects against accidental deletes
resource "aws_s3_bucket_versioning" "assets" {
bucket = aws_s3_bucket.assets.id
versioning_configuration {
status = "Enabled"
}
}
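The example above reads several input variables without declaring them. A minimal sketch of the matching declarations follows; the names come from the `var.*` references in the example, while the descriptions, defaults, and the `alb_dns_name` output are assumptions added for illustration.

```hcl
# variables.tf: inputs referenced by the example above
variable "project" {
  description = "Project name used in resource names and tags"
  type        = string
}

variable "environment" {
  description = "Deployment environment"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type for the launch template"
  type        = string
  default     = "t3.micro"
}

variable "min_capacity" {
  description = "Minimum (and initial desired) number of instances"
  type        = number
  default     = 1
}

variable "max_capacity" {
  description = "Maximum number of instances"
  type        = number
  default     = 2
}

variable "acm_certificate_arn" {
  description = "ARN of the ACM certificate attached to the HTTPS listener"
  type        = string
}

# outputs.tf: hypothetical output exposing the load balancer address
output "alb_dns_name" {
  description = "Public DNS name of the application load balancer"
  value       = aws_lb.app.dns_name
}
```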
# Every module should declare its inputs here
# with types, descriptions, and validation where relevant
variable "domain_name" {
description = "Domain name for the static site (e.g. docs.example.com)"
type = string
}
variable "acm_certificate_arn" {
description = "ACM certificate ARN (must be in us-east-1 for CloudFront)"
type = string
}
variable "price_class" {
description = "CloudFront price class — controls which edge locations are used"
type = string
default = "PriceClass_100" # US/EU only — cheapest
validation {
condition = contains(["PriceClass_100", "PriceClass_200", "PriceClass_All"], var.price_class)
error_message = "Must be PriceClass_100, PriceClass_200, or PriceClass_All."
}
}
variable "tags" {
description = "Additional tags to apply to all resources in this module"
type = map(string)
default = {}
}
locals {
s3_origin_id = "S3-${var.domain_name}"
}
# S3 bucket to store the static files
resource "aws_s3_bucket" "site" {
# Replace dots with hyphens: dots in a bucket name break the wildcard TLS
# certificate used for virtual-hosted-style HTTPS access to S3
bucket = replace(var.domain_name, ".", "-")
tags = var.tags
}
# Origin Access Control — restricts S3 to only CloudFront requests
resource "aws_cloudfront_origin_access_control" "site" {
name = var.domain_name
origin_access_control_origin_type = "s3"
signing_behavior = "always"
signing_protocol = "sigv4"
}
# CloudFront distribution in front of S3
resource "aws_cloudfront_distribution" "site" {
enabled = true
is_ipv6_enabled = true
default_root_object = "index.html"
aliases = [var.domain_name]
price_class = var.price_class
tags = var.tags
origin {
domain_name = aws_s3_bucket.site.bucket_regional_domain_name
origin_id = local.s3_origin_id
origin_access_control_id = aws_cloudfront_origin_access_control.site.id
}
default_cache_behavior {
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
target_origin_id = local.s3_origin_id
viewer_protocol_policy = "redirect-to-https" # force HTTPS
# forwarded_values is the legacy form; newer configurations typically set cache_policy_id instead
forwarded_values {
query_string = false
cookies { forward = "none" }
}
min_ttl = 0
default_ttl = 86400 # 24 hours
max_ttl = 31536000 # 1 year
}
viewer_certificate {
acm_certificate_arn = var.acm_certificate_arn
ssl_support_method = "sni-only"
minimum_protocol_version = "TLSv1.2_2021"
}
restrictions {
geo_restriction { restriction_type = "none" }
}
}
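Origin Access Control only signs CloudFront's requests; the bucket still needs a policy that allows the distribution to read objects. The module as shown does not include one, so the following is a sketch of how it could be added, assuming read-only access to all objects is what is wanted. The `site` names on the data source and policy resource are illustrative.

```hcl
# Allow only this CloudFront distribution to read objects from the bucket.
data "aws_iam_policy_document" "site" {
  statement {
    sid       = "AllowCloudFrontRead"
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.site.arn}/*"]

    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    # Restrict the service principal to requests made on behalf of this distribution.
    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [aws_cloudfront_distribution.site.arn]
    }
  }
}

resource "aws_s3_bucket_policy" "site" {
  bucket = aws_s3_bucket.site.id
  policy = data.aws_iam_policy_document.site.json
}
```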
# Expose the values callers need, e.g., to create DNS records
output "cloudfront_domain_name" {
  description = "CloudFront distribution domain; set as CNAME in DNS"
  value       = aws_cloudfront_distribution.site.domain_name
}
output "bucket_name" {
  description = "S3 bucket name; use this to sync your static files"
  value       = aws_s3_bucket.site.bucket
}
output "distribution_id" {
  description = "CloudFront distribution ID; use to trigger cache invalidations"
  value       = aws_cloudfront_distribution.site.id
}
# environments/dev.tfvars (non-prod: small + cheap)
environment   = "dev"
aws_region    = "us-east-1"
instance_type = "t3.micro"
min_capacity  = 1
max_capacity  = 2
multi_az_rds  = false
db_instance   = "db.t3.micro"
enable_waf    = false
# environments/prod.tfvars (production: resilient + monitored)
environment   = "prod"
aws_region    = "us-east-1"
instance_type = "t3.large"
min_capacity  = 3
max_capacity  = 20
multi_az_rds  = true
db_instance   = "db.r6g.large"
enable_waf    = true
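Values in a `.tfvars` file only take effect if the root module declares matching variables. The sketch below shows what a few of those declarations might look like. It assumes a `variables.tf` file in the root module; the descriptions, defaults, and the validation rule are illustrative assumptions, and only variable names that appear in the two files above are used.

```hcl
# variables.tf (hypothetical sketch; names mirror the keys in the .tfvars files)
variable "environment" {
  description = "Deployment environment name"
  type        = string

  # Assumption: only the two environments shown above exist
  validation {
    condition     = contains(["dev", "prod"], var.environment)
    error_message = "Must be dev or prod."
  }
}

variable "min_capacity" {
  description = "Minimum number of instances in the autoscaling group"
  type        = number
}

variable "multi_az_rds" {
  description = "Whether the database runs in multiple availability zones"
  type        = bool
  default     = false
}

variable "enable_waf" {
  description = "Whether to create WAF resources"
  type        = bool
  default     = false
}
```

Declaring explicit types lets Terraform reject a malformed value such as `min_capacity = "three"` during plan rather than partway through an apply.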
# ── Deploy to dev ──
terraform workspace select dev
terraform plan -var-file="environments/dev.tfvars" -out=dev.tfplan
terraform apply dev.tfplan
# ── Deploy to prod ──
terraform workspace select prod
terraform plan -var-file="environments/prod.tfvars" -out=prod.tfplan
terraform apply prod.tfplan
# ── Secrets: never in .tfvars — inject via environment variables ──
export TF_VAR_db_password="$(aws secretsmanager get-secret-value \
--secret-id prod/myapp/db-password --query SecretString --output text)"
terraform apply -var-file="environments/prod.tfvars"
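For the `TF_VAR_db_password` export above to reach the configuration, a variable named `db_password` has to be declared. A minimal sketch, assuming the variable lives in the root module; the description text is illustrative.

```hcl
# Receives its value from the TF_VAR_db_password environment variable.
variable "db_password" {
  description = "Database master password, injected at runtime" # illustrative wording
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}
```

Marking the variable `sensitive` hides it in CLI output only. The value is still written to the state file, which is one more reason to keep state in an encrypted remote backend.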
| Practice | Severity | Why |
| Always use remote state with locking | MUST | Prevents two engineers from running apply at the same time and corrupting state |
| Pin provider and module versions | MUST | Unpinned dependencies pick up breaking upstream releases without warning |
| Never commit *.tfstate or secrets to Git | MUST | State contains plaintext secrets; Git history is permanent |
| Run terraform fmt -check -recursive in CI | MUST | Consistent formatting keeps PR diffs readable |
| Save plans and apply the saved plan in CI | MUST | Prevents plan/apply drift: what you reviewed is exactly what runs |
| Use for_each over count for resources | SHOULD | count addresses instances by integer index, so removing an item shifts every later index and Terraform replaces those instances. for_each uses stable keys |
| Tag every resource with environment and project | SHOULD | Enables cost allocation, filtering, and resource lifecycle management |
| Use prevent_destroy = true on critical resources | SHOULD | Protects databases, state buckets, and other hard-to-replace infrastructure |
| Keep modules small and single-purpose | SHOULD | Large modules are hard to reuse and slow to plan |
| Don't use -target in production | AVOID | Partial applies skip parts of the dependency graph and can leave state inconsistent with the configuration |
| Avoid null_resource and local-exec provisioners | AVOID | They run arbitrary scripts that Terraform cannot track, so results are hard to reproduce and not idempotent |
| Don't hardcode region, account ID, or AMI IDs | AVOID | Use variables and data sources instead; this keeps the configuration portable across regions and accounts |
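Several of the practices in the table can be illustrated in one short configuration. The sketch below pins versions, configures a locking remote backend, and uses `for_each` with `prevent_destroy`. The bucket names, the DynamoDB table name, the state key, and the `bucket_names` variable are all hypothetical placeholders, and the version constraints are examples rather than recommendations.

```hcl
terraform {
  required_version = "~> 1.9" # pin the CLI to the 1.x line, 1.9 or later

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # example constraint; pin to the major version you have tested
    }
  }

  # Remote state with locking (all names below are hypothetical)
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "myapp/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-terraform-locks" # lock table prevents concurrent applies
    encrypt        = true
  }
}

variable "bucket_names" {
  description = "Logical names of the buckets to create (hypothetical example)"
  type        = set(string)
  default     = ["logs", "assets", "backups"]
}

# for_each keys each instance by name, e.g. aws_s3_bucket.this["logs"],
# so removing "assets" from the set leaves the other two untouched.
resource "aws_s3_bucket" "this" {
  for_each = var.bucket_names
  bucket   = "example-${each.key}"

  lifecycle {
    prevent_destroy = true # any plan that would destroy this bucket fails
  }
}
```

With `count` and a list, the same removal would shift `this[2]` to `this[1]`, and Terraform would plan changes to buckets that were never meant to be touched.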
🔬
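Use terraform validate and tflint: validate catches syntax errors and internal inconsistencies (undeclared variables, wrong argument types) before plan, without contacting any provider API. tflint (an open-source linter) with its AWS ruleset plugin catches provider-specific issues such as invalid instance types, deprecated arguments, and missing required tags, which Terraform itself won't report until plan or apply.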
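tflint reads its settings from a `.tflint.hcl` file in the working directory. The sketch below shows one possible configuration that enables the AWS ruleset and a tag check matching the tagging practice in the table. The plugin version is an example placeholder; replace it with a release you have verified.

```hcl
# .tflint.hcl (illustrative sketch)
plugin "aws" {
  enabled = true
  version = "0.32.0" # example version; pin to a release you have verified
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

# Flag taggable AWS resources that lack the listed tags
rule "aws_resource_missing_tags" {
  enabled = true
  tags    = ["environment", "project"]
}
```

Running `tflint --init` once downloads the plugin, after which `tflint` can run alongside `terraform validate` in CI.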
name: Terraform
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
permissions:
  contents: read
  pull-requests: write # post plan output as PR comment
  id-token: write      # for OIDC auth with AWS (no static keys!)
env:
  TF_VERSION: "1.9.5"
  AWS_REGION: "us-east-1"
jobs:
  terraform:
    name: Terraform Plan / Apply
    runs-on: ubuntu-latest
    environment: ${{ github.ref == 'refs/heads/main' && 'production' || 'preview' }}
    steps:
      # 1. Checkout code
      - uses: actions/checkout@v4
      # 2. Authenticate to AWS via OIDC; no long-lived credentials stored in GitHub
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region: ${{ env.AWS_REGION }}
      # 3. Install Terraform CLI
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}
      # 4. Format check; fails if code isn't formatted
      - name: Terraform Format
        run: terraform fmt -check -recursive
      # 5. Download providers, configure backend (prod on push to main, dev on PRs)
      - name: Terraform Init
        run: terraform init -backend-config="environments/${{ github.event_name == 'push' && 'prod' || 'dev' }}.backend.hcl"
      # 6. Validate HCL syntax
      - name: Terraform Validate
        run: terraform validate
      # 7. Compute plan and save it, using the same environment as the backend
      - name: Terraform Plan
        id: plan
        shell: bash
        run: |
          # Without pipefail the step would report tee's exit code, hiding a failed plan
          set -o pipefail
          terraform plan \
            -var-file="environments/${{ github.event_name == 'push' && 'prod' || 'dev' }}.tfvars" \
            -out=tf.plan \
            -no-color 2>&1 | tee plan_output.txt
        continue-on-error: true # post plan even on failure
      # 8. Comment plan on the PR for human review
      - name: Post Plan to PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const plan = require('fs').readFileSync('plan_output.txt', 'utf8');
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan\n\`\`\`\n${plan.slice(0, 60000)}\n\`\`\``
            });
      # 9. Fail the job if the plan failed (it was allowed to continue only so it could be posted)
      - name: Terraform Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1
      # 10. Apply ONLY on merge to main; applies the exact plan saved in step 7
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -auto-approve tf.plan
🔑
Use OIDC, not access keys: Configure your AWS IAM role to trust GitHub's OIDC provider. This eliminates the need for long-lived AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY secrets stored in GitHub — a major security improvement. The credentials are short-lived and scoped to the workflow run.
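The trust relationship described above can itself be managed in Terraform. The sketch below is one way to write it, assuming the GitHub OIDC provider has already been registered in the AWS account and that the workflow runs from a repository called `example-org/example-repo`. Both that repository name and the `github_assume` label are hypothetical; the role name matches the ARN used in the workflow.

```hcl
# Look up the GitHub OIDC provider (assumed to exist in the account already)
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "github_assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    # Token must be issued for AWS STS
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only workflows from this repository may assume the role (hypothetical repo)
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/example-repo:*"]
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name               = "GitHubActionsRole"
  assume_role_policy = data.aws_iam_policy_document.github_assume.json
}
```

The `sub` condition matters most: without it, any GitHub repository could request credentials for this role. Tightening the pattern to a single branch or environment narrows the scope further.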
Import Existing Resource
# 1. Write the resource block in HCL
# 2. Run import to link it to state
terraform import \
aws_s3_bucket.my_bucket \
existing-bucket-name
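Since Terraform 1.5 the same import can also be declared in configuration with an `import` block, which makes it visible in plan output and reviewable in a PR. A minimal sketch using the same resource address and bucket name as the command above:

```hcl
# Declarative alternative to the CLI command (Terraform 1.5+)
import {
  to = aws_s3_bucket.my_bucket
  id = "existing-bucket-name"
}

resource "aws_s3_bucket" "my_bucket" {
  bucket = "existing-bucket-name"
}
```

The import happens on the next `terraform apply`. Once the resource is in state, the `import` block can be removed.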
Force Replace a Resource
# Destroy + recreate a specific resource
terraform apply \
-replace="aws_instance.web" \
-var-file="prod.tfvars"
Targeted Plan/Apply
# Only plan/apply specific resource
# ⚠ Avoid in production — use for debugging
terraform plan \
-target=aws_security_group.app
terraform apply \
-target=module.rds
Read an Output Value
# Print all outputs
terraform output
# Print a single output (raw, no quotes)
terraform output -raw web_server_ip
# JSON for scripting
terraform output -json | jq .db_endpoint.value
Unlock State After Crash
# If apply crashed, state may be locked
# Get lock ID from the error message
terraform force-unlock LOCK_ID
Test HCL Expressions
# Interactive REPL — great for testing
# functions before putting them in config
terraform console
> cidrsubnet("10.0.0.0/16", 8, 2)
"10.0.2.0/24"
> lower("My-App")
"my-app"
Useful Built-in Functions Quick Reference
| Function | Example | Result |
| cidrsubnet(base, bits, num) | cidrsubnet("10.0.0.0/16", 8, 1) | "10.0.1.0/24" |
| toset(list) | toset(["a","b","a"]) | {"a","b"} |
| merge(map, map) | merge({a=1},{b=2}) | {a=1,b=2} |
| flatten(list_of_lists) | flatten([[1,2],[3]]) | [1,2,3] |
| contains(list, val) | contains(["a","b"], "a") | true |
| lookup(map, key, default) | lookup({a=1}, "b", 0) | 0 |
| jsonencode(value) | jsonencode({key="val"}) | "{\"key\":\"val\"}" |
| file(path) | file("scripts/init.sh") | File content as string |
| base64encode(str) | base64encode("hello") | "aGVsbG8=" |
| length(coll) | length(["a","b","c"]) | 3 |
| element(list, idx) | element(["a","b","c"], 0) | "a" |
| slice(list, from, to) | slice(["a","b","c"], 0, 2) | ["a","b"] |
| try(expr, fallback) | try(var.optional.field, "default") | value, or fallback if the expression errors |
| coalesce(a, b, c) | coalesce("", null, "x") | "x" (first argument that is neither null nor empty) |
| formatdate(fmt, ts) | formatdate("YYYY-MM", timestamp()) | Current UTC year and month, e.g. "2024-07" |
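These functions are usually combined inside `locals` rather than called on their own. The sketch below shows a few typical combinations; every local name and literal value is a hypothetical example, and the expressions can be checked one at a time in `terraform console`.

```hcl
locals {
  # One /24 per index: ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
  subnet_cidrs = [for i in range(3) : cidrsubnet("10.0.0.0/16", 8, i)]

  # Later maps win on key conflicts; here the keys are distinct
  common_tags = merge({ project = "myapp" }, { environment = "dev" })

  # Falls back to "t3.small" because "staging" is not a key in the map
  instance_type = lookup({ dev = "t3.micro", prod = "t3.large" }, "staging", "t3.small")

  # Skips the empty string and returns "fallback"
  display_name = coalesce("", "fallback")
}
```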
📚
Further Reading: Official docs at developer.hashicorp.com/terraform. Explore the Terraform Registry (registry.terraform.io) for community providers and modules. For linting, use tflint. For cost estimation in PRs, use Infracost. For security scanning, use Checkov or Trivy.