Terraform Handbook — Reference Guide

🌍

What is Terraform?

Infrastructure as Code — declare, version, and automate your cloud

Terraform is an Infrastructure as Code (IaC) tool created by HashiCorp. It was open-source until its 2023 licence change and has been source-available since. It lets you define cloud infrastructure (servers, databases, networks, DNS records, IAM roles, and more) in declarative configuration files using HCL (HashiCorp Configuration Language). You describe what you want; Terraform figures out how to create, update, or destroy it.

Instead of clicking through the AWS console or writing imperative shell scripts, you write .tf files that become a single source of truth for your infrastructure. Those files live in version control, get reviewed in pull requests, and can be applied consistently across environments.

Declarative

You describe the desired end state. Terraform computes and applies the delta — no need to write step-by-step imperative scripts.

Cloud-Agnostic

One tool, 3,000+ providers: AWS, Azure, GCP, Kubernetes, GitHub, Datadog, Cloudflare, and any REST API with a custom provider.

State-Aware

Terraform records what it has deployed in a state file, so it can compare the last-known real-world state with the desired configuration and apply only the difference, which keeps incremental changes safe.

🔀

OpenTofu: HashiCorp changed Terraform's licence to BUSL 1.1 in 2023. OpenTofu is the open-source, MPL-2.0 fork hosted by the Linux Foundation. It is designed as a drop-in replacement for Terraform: the HCL and provider code covered in this handbook should work in both tools, although features added after the fork can differ between them.

✅

Why Use Terraform?

The business and engineering case for IaC

Reproducibility

Spin up identical infrastructure for dev, staging, and production. No more "works in staging but not prod" caused by manual drift. Every environment is defined in code.

Version Control & Audit

Infrastructure changes go through the same PR review, Git history, and audit trail as application code. Know who changed what, when, and why.

Automation & Speed

Provisioning a full environment that would take hours of clicking takes minutes via CI/CD. Engineers stop waiting on ops tickets.

Safe Change Management

terraform plan shows a human-readable diff before any change is applied. Teams review the plan in PRs, which greatly reduces surprises during applies.

Cost Visibility

With tools like Infracost, you can estimate the monthly cost of infrastructure changes directly in PRs before resources are created.

Disaster Recovery

Rebuild your entire infrastructure from code after an incident. No scrambling through documentation — the code is the runbook.

💡

IaC vs. ClickOps: "ClickOps" (manually configuring resources in cloud consoles) doesn't scale. It's error-prone, undocumented, and hard to reproduce reliably. Terraform replaces ClickOps with versioned, reviewable, automated infrastructure management.

🕐

When to Use Terraform

Right tool for the right job

✅ Use Terraform when…

Provisioning cloud resources (VMs, databases, networks, DNS)
Managing infrastructure across multiple cloud providers
Running multiple environments (dev / staging / prod)
Working in a team that needs reproducible, reviewable infra changes
Setting up Kubernetes clusters, Helm releases, namespaces
Automating account-level resources (IAM, billing alerts, S3 buckets)
Disaster recovery and infrastructure rebuilding

❌ Prefer another tool when…

Deploying application code (use GitHub Actions, ArgoCD, Helm charts)
Configuring OS-level software on VMs (use Ansible, Chef)
Real-time or event-driven infrastructure (use AWS Lambda triggers)
One-off, ephemeral scripts you'll never run again
Simple single-account, single-region greenfield projects where the AWS CDK or Pulumi feels more natural

Terraform vs. the Alternatives

Tool	Language	Approach	Best For
Terraform / OpenTofu	HCL	Declarative IaC	Multi-cloud, teams, repeatable infra
AWS CDK	TS / Python / Java	Imperative, generates CloudFormation	AWS-only, code-first teams
Pulumi	TS / Python / Go	Declarative, general-purpose languages	Developers preferring real languages over HCL
AWS CloudFormation	JSON / YAML	Declarative	AWS-only, deep AWS service support
Ansible	YAML	Imperative config mgmt	OS config, app deployment on existing VMs
Crossplane	Kubernetes CRDs	GitOps / K8s-native	Platform teams running everything through K8s

⚙️

How Terraform Works

The core loop: Write → Init → Plan → Apply → Destroy

1. Write HCL Configuration

You define resources, variables, and outputs in .tf files. This is your desired end state: the "what" you want to exist in the cloud.

2. terraform init

Downloads provider plugins (e.g., the AWS provider) and any referenced modules, initializes the backend for remote state, and prepares the working directory. Run it once per working directory, and again whenever providers, modules, or the backend configuration change.

3. terraform plan

Terraform reads the current state (from the state file), refreshes it against the cloud API, and computes a diff: what needs to be created, updated, or destroyed. The result is shown as a human-readable plan. Nothing changes yet.

4. terraform apply

Executes the plan. Terraform calls provider APIs to create, modify, or delete real infrastructure, then updates the state file to reflect the new reality. It prompts for confirmation unless you pass -auto-approve (typical in CI).

5. terraform destroy (when needed)

Tears down all resources managed by the configuration. Useful for cleaning up dev environments or decommissioning services entirely.
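
The loop above can be tried without any cloud credentials. The following is a minimal sketch, assuming the hashicorp/random provider; the resource name demo and the output name pet_name are illustrative and not part of this handbook's sample project.

hcl

terraform {
  required_providers {
    random = {
      source  = "hashicorp/random"
      version = ">= 3.4"
    }
  }
}

# Generates a random two-word name; nothing is created in any cloud account
resource "random_pet" "demo" {
  length = 2
}

# Printed at the end of terraform apply
output "pet_name" {
  value = random_pet.demo.id
}

Saving this as main.tf in an empty directory and running terraform init, terraform plan, terraform apply, and terraform destroy in turn should exercise each step of the loop.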

📡

The State File: Terraform stores a mapping of your configuration to real-world resource IDs in terraform.tfstate. This is Terraform's memory — it's how it knows that aws_instance.web maps to i-0abc1234. In teams, always store state remotely (S3, Terraform Cloud, Azure Blob) — never commit it to Git.

📦

Installation

Get Terraform running on macOS, Linux, and Windows

bash — macOS (Homebrew)

# Install Terraform
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Or install OpenTofu (open-source fork)
brew install opentofu

# Verify installation
terraform version
# → Terraform v1.9.x

bash — Linux (apt)

# Download HashiCorp's signing key; discard tee's echo of the binary key
wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | \
  sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] \
  https://apt.releases.hashicorp.com $(lsb_release -cs) main" | \
  sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform

bash — Windows (winget)

winget install HashiCorp.Terraform

Recommended: tfenv (version manager)

bash

# Install tfenv — manage multiple Terraform versions per project
brew install tfenv

# Pin version in .terraform-version file (committed to repo)
echo "1.9.5" > .terraform-version

tfenv install   # installs pinned version
tfenv use       # activates it

📁

Project Structure

How to organise Terraform files for maintainability

Terraform will load all .tf files in a directory. Convention splits configuration into purpose-named files, not because Terraform requires it, but because it makes navigation and review far easier.

my-project/
├── environments/           # per-env variable files
│   ├── dev.tfvars
│   ├── staging.tfvars
│   └── prod.tfvars
├── modules/                # reusable, shared modules
│   └── vpc/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── main.tf                 # resources + module calls
├── variables.tf            # all input variable declarations
├── outputs.tf              # values exposed after apply
├── providers.tf            # provider + backend configuration
├── locals.tf               # computed local values
├── versions.tf             # required_providers + terraform block
├── .terraform-version      # pins Terraform version (tfenv)
├── .gitignore              # exclude .terraform/ and *.tfstate
└── terraform.tfvars        # default variable values (NOT secrets)

⚠️

Never commit to Git: .terraform/ (provider binaries), *.tfstate and *.tfstate.backup (contain secrets), and *.tfvars files containing sensitive values. Add these to .gitignore immediately.

📝

HCL Syntax Basics

HashiCorp Configuration Language — the building blocks

HCL is a declarative language with a simple block-based syntax. It is human-readable and supports expressions, loops, conditionals, and functions.

hcl — syntax fundamentals

# ── Block structure: type "label" "name" { arguments } ──
resource "aws_instance" "web_server" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  # Map arguments such as tags are assigned with =
  tags = {
    Name        = "web-server"
    Environment = "production"
  }

  # Nested blocks have no = sign
  root_block_device {
    volume_size = 20
  }
}

# ── Strings ──
variable "region" {
  default = "us-east-1"   # double quotes only
}

# ── String interpolation ──
locals {
  bucket_name = "my-app-${var.environment}-assets"
}

# ── Multi-line strings (heredoc) ──
resource "aws_iam_policy" "example" {
  policy = <<-EOT
    {
      "Version": "2012-10-17",
      "Statement": [{ "Effect": "Allow" }]
    }
  EOT
}

# ── Lists and Maps ──
locals {
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  tags = {
    Project     = "my-app"
    ManagedBy   = "Terraform"
  }
}

# ── Booleans and Numbers ──
resource "aws_db_instance" "db" {
  multi_az              = true
  storage_encrypted     = true
  allocated_storage     = 20
  port                  = 5432
}

# ── Conditional expression (ternary) ──
locals {
  instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
}

# ── for_each loop (creates one resource per map entry or set element) ──
resource "aws_s3_bucket" "buckets" {
  for_each = toset(["logs", "assets", "backups"])

  bucket = "my-app-${each.key}"
}

# ── count (creates N copies of a resource) ──
resource "aws_instance" "workers" {
  count         = 3
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"
  tags = { Name = "worker-${count.index}" }
}

# ── Dynamic blocks (generate nested blocks programmatically) ──
resource "aws_security_group" "sg" {
  name = "my-sg"

  dynamic "ingress" {
    for_each = var.allowed_ports
    content {
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }
}
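
Conditionals and count are often combined to make a resource optional. A small sketch, assuming a hypothetical boolean variable enable_bastion and resource name bastion, neither of which appears elsewhere in this handbook:

hcl

variable "enable_bastion" {
  type    = bool
  default = false
}

# count = 0 creates nothing; count = 1 creates a single instance
resource "aws_instance" "bastion" {
  count         = var.enable_bastion ? 1 : 0
  ami           = "ami-0c55b159cbfafe1f0"   # placeholder AMI ID reused from the examples above
  instance_type = "t3.micro"
}

# With count set, the resource becomes a list; one() returns its only element, or null when empty
output "bastion_id" {
  value = one(aws_instance.bastion[*].id)
}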

🔌

Providers

Plugins that translate HCL into cloud API calls

Every cloud or service Terraform talks to requires a provider — a plugin that knows how to authenticate and translate HCL resource declarations into API calls. Providers are downloaded during terraform init from the Terraform Registry.

hcl — versions.tf

# Always pin provider versions to avoid unexpected upgrades
terraform {
  required_version = ">= 1.9.0"   # minimum Terraform CLI version

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"        # allows 5.x, not 6.x
    }
    random = {
      source  = "hashicorp/random"
      version = ">= 3.4"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"
    }
  }
}

hcl — providers.tf

# AWS provider — uses environment variables for auth
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or an IAM role
provider "aws" {
  region = var.aws_region   # pass region as a variable, not hardcoded

  # Optional: tag every resource created by this provider
  default_tags {
    tags = {
      ManagedBy   = "Terraform"
      Project     = var.project_name
      Environment = var.environment
    }
  }
}

# Multiple provider configurations — e.g., multi-region
provider "aws" {
  alias  = "us_west"
  region = "us-west-2"
}

# Use the aliased provider for specific resources
resource "aws_s3_bucket" "west_bucket" {
  provider = aws.us_west
  bucket   = "my-west-backup-bucket"
}

# Remote backend — REQUIRED for team use (store state in S3)
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "environments/prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true                    # encrypt state at rest
    dynamodb_table = "terraform-state-lock"  # lock table: prevents concurrent applies (newer releases also support S3-native locking via use_lockfile)
  }
}
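
An aliased provider configuration is not picked up automatically inside child modules; it has to be handed over through the providers meta-argument. A sketch, assuming a hypothetical local module at ./modules/backup that uses a single aws provider:

hcl

module "west_backup" {
  source = "./modules/backup"   # hypothetical module path

  # Map the module's default "aws" provider to the aliased us-west-2 configuration
  providers = {
    aws = aws.us_west
  }
}

Every aws resource inside the module would then be created in us-west-2 without the module itself knowing about the alias.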

🧱

Resources

The fundamental building blocks — each maps to a real cloud object

A resource block declares one infrastructure object. The type string (aws_instance, google_sql_database) tells Terraform which provider resource to manage. The name (web_server) is a local label used to reference it within HCL.

hcl — resource syntax and references

# ── Syntax: resource "TYPE" "LOCAL_NAME" { ... } ──
# TYPE    = provider_resourcetype (e.g. aws_s3_bucket)
# LOCAL_NAME = arbitrary name for this config (e.g. "app_assets")

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = { Name = "main-vpc" }
}

# ── Referencing attributes from another resource ──
# Format: resource_type.local_name.attribute
resource "aws_subnet" "public" {
  vpc_id            = aws_vpc.main.id            # ← cross-resource reference
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a"

  # Terraform automatically creates this AFTER aws_vpc.main
  # because of the implicit dependency on aws_vpc.main.id
}

# ── Explicit dependency (when implicit isn't enough) ──
resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.public.id

  # Explicitly wait for an S3 bucket policy before creating the EC2
  # even though there's no attribute reference between them
  depends_on = [aws_s3_bucket_policy.app_policy]
}

# ── Lifecycle rules — control create/update/destroy behaviour ──
resource "aws_db_instance" "postgres" {
  identifier          = "my-app-db"
  engine              = "postgres"
  engine_version      = "15"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  skip_final_snapshot = false

  lifecycle {
    # Create the replacement BEFORE destroying the old resource. Both must
    # coexist briefly, so fixed unique names (like identifier above) can collide.
    create_before_destroy = true

    # Ignore changes to these attributes (e.g., rotated outside Terraform)
    ignore_changes        = [password]

    # Reject any plan that would destroy this resource, replacements included
    prevent_destroy       = true
  }
}
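
Since Terraform 1.2, the lifecycle block can also hold precondition and postcondition checks that stop a plan or apply when an assumption does not hold. A hedged sketch that reuses the data.aws_ami.ubuntu data source referenced above; the resource name checked is illustrative:

hcl

resource "aws_instance" "checked" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  lifecycle {
    # Fail before creating anything if the looked-up image has the wrong architecture
    precondition {
      condition     = data.aws_ami.ubuntu.architecture == "x86_64"
      error_message = "The selected AMI must be for the x86_64 architecture."
    }
  }
}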

📊

Variables & Outputs

Parameterise configuration and expose values after apply

hcl — variables.tf

# ── Basic variable declaration ──
variable "aws_region" {
  description = "AWS region to deploy into"
  type        = string              # string | number | bool
  default     = "us-east-1"
}

variable "environment" {
  description = "Deployment environment"
  type        = string
  # No default = required. Terraform will prompt if not provided.

  # Validation — enforce valid values
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

# ── Collection types ──
variable "allowed_ports" {
  type    = list(number)
  default = [80, 443]
}

variable "instance_config" {
  type = object({
    instance_type = string
    disk_size_gb  = number
    enable_monitoring = bool
  })
  default = {
    instance_type     = "t3.micro"
    disk_size_gb      = 20
    enable_monitoring = false
  }
}

# ── Sensitive variable — value masked in plan/apply output ──
variable "db_password" {
  description = "Database admin password"
  type        = string
  sensitive   = true    # redacted in plan/apply output, but still stored in state
}

hcl — outputs.tf

# Outputs are displayed after `terraform apply`
# and accessible by other modules via module.name.output_name

output "web_server_ip" {
  description = "Public IP of the web server EC2 instance"
  value       = aws_instance.web.public_ip
}

output "db_endpoint" {
  description = "RDS connection endpoint"
  value       = aws_db_instance.postgres.endpoint
  sensitive   = true   # mask in CLI output (still stored in state)
}

output "subnet_ids" {
  description = "List of all public subnet IDs"
  value       = [for s in aws_subnet.public : s.id]
}

Supplying Variable Values

Method	Precedence	Use Case
`default` in variable block	Lowest	Sensible defaults for optional config
`TF_VAR_name` environment variable	↑	Secrets injected by CI/CD, never stored in files
`terraform.tfvars` file	↑	Local defaults committed to repo (no secrets)
`*.auto.tfvars` file	↑	Automatically loaded in lexical filename order, per-environment overrides
`-var-file=prod.tfvars` CLI flag	Highest	Explicit environment targeting in CI
`-var="key=value"` CLI flag	Highest	One-off overrides; among CLI flags, the last one given wins
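
A variable file is plain HCL assignments, one per declared variable. A sketch of what environments/dev.tfvars might contain for the variables declared in variables.tf above; the values are illustrative:

hcl

# environments/dev.tfvars
aws_region    = "us-east-1"
environment   = "dev"             # must satisfy the validation rule: dev, staging, or prod
allowed_ports = [80, 443, 8080]

instance_config = {
  instance_type     = "t3.micro"
  disk_size_gb      = 20
  enable_monitoring = false
}

# db_password is deliberately absent; supply it outside the file,
# for example through the TF_VAR_db_password environment variable

It would be selected with terraform plan -var-file=environments/dev.tfvars.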

🔡

Locals & Expressions

Computed values and built-in functions

hcl — locals.tf

# locals{} computes derived values once — reference as local.name
# Use them to avoid repeating complex expressions throughout your config
locals {
  # String manipulation
  name_prefix    = "${var.project}-${var.environment}"
  bucket_name    = "${local.name_prefix}-assets-${random_id.suffix.hex}"

  # Common tags merged from a base map + resource-specific tags
  common_tags = merge(
    {
      Project     = var.project
      Environment = var.environment
      ManagedBy   = "Terraform"
    },
    var.extra_tags
  )

  # Conditional value
  is_production = var.environment == "prod"
  instance_type = local.is_production ? "t3.large" : "t3.micro"

  # List comprehension — transform each AZ into a CIDR block
  subnet_cidrs = [for i, az in var.availability_zones :
    cidrsubnet(var.vpc_cidr, 8, i)
  ]

  # Map comprehension — build resource map from list
  subnet_map = {
    for az in var.availability_zones :
    az => cidrsubnet(var.vpc_cidr, 8, index(var.availability_zones, az))
  }
}

# ── Useful built-in functions ──
locals {
  # String functions
  lower_name   = lower(var.project_name)           # e.g. "MyProject" → "myproject"
  trimmed      = trimspace("  hello  ")             # → "hello"
  replaced     = replace(var.name, " ", "-")       # spaces → hyphens

  # Collection functions
  sorted_azs   = sort(data.aws_availability_zones.available.names)
  first_az     = element(local.sorted_azs, 0)
  flat_list    = flatten([[1,2],[3,4]])              # → [1,2,3,4]
  unique_items = toset(["a", "b", "a"])             # → {"a","b"}

  # Encoding functions
  b64_script   = base64encode(file("${path.module}/scripts/init.sh"))   # path.module keeps the path valid inside modules
  json_policy  = jsonencode({ Version = "2012-10-17" })

  # Numeric functions
  max_capacity = max(2, var.min_count, 10)
  subnet_count = length(var.availability_zones)
}
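
cidrsubnet(prefix, newbits, netnum) lengthens the prefix by newbits and selects network number netnum of that size. A worked sketch, assuming a 10.0.0.0/16 VPC range and three availability zones; the local names are illustrative:

hcl

locals {
  example_cidr = "10.0.0.0/16"
  example_azs  = ["us-east-1a", "us-east-1b", "us-east-1c"]

  # /16 + 8 new bits = /24 networks, numbered by list index
  example_subnets = [for i, az in local.example_azs :
    cidrsubnet(local.example_cidr, 8, i)
  ]
  # → ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]

  # /16 + 4 new bits = /20 networks; netnum 1 is the second block
  example_wide = cidrsubnet(local.example_cidr, 4, 1)   # → "10.0.16.0/20"
}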

🔍

Data Sources

Read existing infrastructure without managing it

Data sources let you query existing resources — ones Terraform didn't create, or ones managed by a different state file — and use their attributes in your configuration. They are read-only; Terraform never modifies them.

hcl — data sources

# ── Syntax: data "TYPE" "NAME" { filters } ──
# Reference as: data.TYPE.NAME.attribute

# Look up the latest Ubuntu 22.04 AMI — no hardcoding AMI IDs!
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]   # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Use the AMI in a resource — always up-to-date
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id   # ← resolved at plan time
  instance_type = "t3.micro"
}

# ── Read a VPC created outside this Terraform config ──
data "aws_vpc" "existing" {
  filter {
    name   = "tag:Name"
    values = ["production-vpc"]
  }
}

# ── Read a secret from AWS Secrets Manager ──
data "aws_secretsmanager_secret_version" "db_creds" {
  secret_id = "prod/myapp/db-credentials"
}

locals {
  db_creds = jsondecode(data.aws_secretsmanager_secret_version.db_creds.secret_string)
}

# ── Read remote Terraform state (cross-stack references) ──
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

# Now use outputs from the networking stack
resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  subnet_id     = data.terraform_remote_state.networking.outputs.public_subnet_id
}
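
The data.aws_vpc.existing lookup above is consumed like any other reference. A sketch that places a new subnet inside that VPC; the resource name extra and the CIDR block are assumptions, and the range would need to fall inside the real VPC's address space:

hcl

resource "aws_subnet" "extra" {
  vpc_id     = data.aws_vpc.existing.id   # ID discovered by the tag filter
  cidr_block = "10.0.42.0/24"             # assumed free range inside the existing VPC

  tags = { Name = "extra-subnet" }
}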

💾

State Management

Terraform's memory — remote backends, locking, and drift

What State Contains

Mapping of resource addresses to real-world IDs
Dependency graph between resources
Metadata for providers and modules
All attribute values — including secrets

Remote State Backends

s3 — AWS S3 + DynamoDB locking (most common)
azurerm — Azure Blob Storage
gcs — Google Cloud Storage
remote / cloud block — Terraform Cloud (HCP Terraform), HashiCorp managed
pg — PostgreSQL

bash — state manipulation commands

# View all resources tracked in state
terraform state list

# Inspect detailed state for a specific resource
terraform state show aws_instance.web

# Move a resource to a new address (rename without destroy/recreate)
terraform state mv aws_instance.old_name aws_instance.new_name

# Remove a resource from state WITHOUT destroying the real resource
# Useful when you want to "forget" a resource (e.g., hand it off)
terraform state rm aws_s3_bucket.old_bucket

# Import an existing real resource into Terraform state
# Lets you adopt infra not created by Terraform
terraform import aws_s3_bucket.assets my-existing-bucket-name

# Pull remote state to local (for inspection)
terraform state pull > current_state.json

# Replace a specific resource (force destroy + recreate)
terraform apply -replace=aws_instance.web

# Refresh state — sync state with real-world (detect drift)
terraform apply -refresh-only
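
Several of these state commands have declarative counterparts that live in configuration and appear in terraform plan: moved blocks (Terraform 1.1+), import blocks (1.5+), and removed blocks (1.7+). A sketch reusing the addresses from the commands above:

hcl

# Counterpart of: terraform state mv aws_instance.old_name aws_instance.new_name
moved {
  from = aws_instance.old_name
  to   = aws_instance.new_name
}

# Counterpart of: terraform import aws_s3_bucket.assets my-existing-bucket-name
import {
  to = aws_s3_bucket.assets
  id = "my-existing-bucket-name"
}

# Counterpart of: terraform state rm aws_s3_bucket.old_bucket
removed {
  from = aws_s3_bucket.old_bucket

  lifecycle {
    destroy = false   # forget the bucket without deleting it
  }
}

The import block expects a matching resource "aws_s3_bucket" "assets" block in the configuration, and the removed block expects the old resource block to have been deleted from it.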

🔐

State Security: The state file contains plaintext copies of all resource attributes, including passwords, private keys, and connection strings. Always encrypt your remote backend (S3: encrypt = true), restrict IAM access to the state bucket, and never commit *.tfstate to version control.

📦

Modules

Reusable, composable infrastructure components

Modules are directories of .tf files that encapsulate a logical unit of infrastructure (a VPC, an ECS cluster, a monitoring stack). They accept inputs (variables), create resources, and expose outputs. Modules enable DRY infrastructure: write once, instantiate many times across environments.

hcl — calling a module

# ── module block syntax ──
# source can be: local path, Terraform Registry, Git URL, S3

# Local module
module "vpc" {
  source = "./modules/vpc"     # relative path to module directory

  # Pass values to the module's variables
  vpc_cidr             = "10.0.0.0/16"
  availability_zones   = ["us-east-1a", "us-east-1b"]
  environment          = var.environment
}

# Public Terraform Registry module (from the community-maintained terraform-aws-modules organisation)
module "rds" {
  source  = "terraform-aws-modules/rds/aws"
  version = "~> 6.0"             # always pin module versions!

  identifier            = "my-app-db"
  engine                = "postgres"
  engine_version        = "15"
  instance_class        = "db.t3.micro"
  allocated_storage     = 20
  db_subnet_group_name  = module.vpc.db_subnet_group_name
  vpc_security_group_ids = [module.vpc.database_sg_id]
}

# Git-sourced module — pin to a tag for stability
module "monitoring" {
  source = "git::https://github.com/my-org/tf-modules.git//monitoring?ref=v2.3.1"
}

# Access module outputs with module.name.output_name
resource "aws_route53_record" "app" {
  zone_id = var.zone_id                   # hypothetical variable: hosted zone ID
  name    = "app.example.com"
  type    = "A"
  ttl     = 300
  records = [module.vpc.nat_gateway_ip]   # ← module output
}
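
For module.vpc.nat_gateway_ip to resolve, the module has to declare a matching output. A sketch of what modules/vpc/outputs.tf might contain, assuming the module defines a resource named aws_nat_gateway.main, which this handbook does not show:

hcl

# modules/vpc/outputs.tf
output "nat_gateway_ip" {
  description = "Public IP address of the NAT gateway"
  value       = aws_nat_gateway.main.public_ip   # assumed resource name inside the module
}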

🗂️

Workspaces

Multiple state files from one configuration directory

Workspaces allow you to have multiple independent state files for the same configuration, typically one per environment. Each workspace is isolated: terraform apply in the dev workspace only changes dev resources.

bash — workspace commands

# List available workspaces (* = current)
terraform workspace list
# * default
#   dev
#   staging
#   prod

# Create a new workspace (this also switches to it)
terraform workspace new staging

# Switch to an existing workspace
terraform workspace select prod

# Current workspace name is available in HCL as terraform.workspace
# Use it to vary configuration per environment:

hcl — using terraform.workspace in config

locals {
  # Look up settings from a map keyed by workspace name
  env_config = {
    dev = {
      instance_type   = "t3.micro"
      min_capacity    = 1
      max_capacity    = 2
      multi_az        = false
    }
    prod = {
      instance_type   = "t3.large"
      min_capacity    = 3
      max_capacity    = 10
      multi_az        = true
    }
  }

  # terraform.workspace is a built-in string — "dev", "prod", etc.
  config = local.env_config[terraform.workspace]
}

resource "aws_instance" "app" {
  instance_type = local.config.instance_type
}
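
Indexing the map directly fails for any workspace without an entry, such as default or staging in the listing above. One way to guard against that, sketched here with the built-in lookup function and an assumed fallback to the dev settings (the local name safe_config is illustrative):

hcl

locals {
  # Fall back to the dev settings when the current workspace has no entry
  safe_config = lookup(local.env_config, terraform.workspace, local.env_config["dev"])
}

Whether a silent fallback or a hard failure is preferable depends on the team; failing fast can be the safer choice for production workspaces.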

💡

Workspaces vs. separate directories: For small teams, workspaces work well. For larger organisations, many teams prefer separate directory structures per environment (envs/dev/, envs/prod/) for stronger isolation and explicit per-environment configuration. Both are valid patterns.
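
In the separate-directory pattern, each environment directory is its own root module with its own backend key, and it calls the shared modules with explicit values. A sketch of a hypothetical envs/prod/main.tf, assuming the modules/vpc module from the project structure above:

hcl

# envs/prod/main.tf
module "vpc" {
  source = "../../modules/vpc"   # shared module, relative to envs/prod/

  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  environment        = "prod"
}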

💻

Core CLI Commands

The complete Terraform command reference

Command	Description	Common Flags
terraform init	Initialise working directory — downloads providers, configures backend	`-upgrade` update providers · `-reconfigure` reinitialise backend
terraform validate	Check configuration syntax and internal consistency (no API calls)	`-json` machine-readable output
terraform fmt	Auto-format `.tf` files to canonical HCL style	`-check` exit non-zero if not formatted (CI) · `-recursive`
terraform plan	Compute and show the execution plan (no changes applied)	`-out=plan.tfplan` save plan · `-var-file=prod.tfvars` · `-target=resource`
terraform apply	Apply the plan — make real infrastructure changes	`plan.tfplan` apply saved plan · `-auto-approve` skip confirmation · `-parallelism=10`
terraform destroy	Destroy all resources in state	`-auto-approve` · `-target=resource` destroy specific resource
terraform output	Print output values from state	`-json` structured output · `-raw output_name` single value
terraform state list	List all resources tracked in state	—
terraform state show	Show all attributes of a tracked resource	`aws_instance.web`
terraform import	Import existing infrastructure into state	`resource_address real_resource_id`
terraform taint	Mark resource for replacement on next apply (v1.x: use `-replace` flag instead)	Deprecated — use `apply -replace`
terraform graph	Output dependency graph in DOT format (visualise with Graphviz)	—
terraform console	Interactive REPL for evaluating HCL expressions and functions	—
terraform providers lock	Update the dependency lock file (`.terraform.lock.hcl`)	`-platform=linux_amd64` cross-platform locks

☁️

Sample: AWS Web App Infrastructure

A fully commented, production-style example — VPC, EC2, ALB, RDS, S3

hcl — main.tf (annotated)

# ════════════════════════════════════════════════════
# DATA SOURCES — query existing / external resources
# ════════════════════════════════════════════════════

# Dynamically fetch available AZs in the current region
data "aws_availability_zones" "available" {
  state = "available"
}

# Always resolve the latest Ubuntu LTS AMI rather than hardcoding
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

# ════════════════════════════════════
# COMPUTED LOCALS
# ════════════════════════════════════

locals {
  name_prefix = "${var.project}-${var.environment}"

  # Use only the first 2 AZs for subnets (cost-effective for non-prod)
  azs = slice(data.aws_availability_zones.available.names, 0, 2)

  # Standard tags applied to every resource via default_tags in provider
  tags = {
    Project     = var.project
    Environment = var.environment
    ManagedBy   = "Terraform"
    Repository  = "github.com/my-org/infra"
  }
}

# ════════════════════════════════════
# NETWORKING
# ════════════════════════════════════

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true   # required for private DNS resolution
  enable_dns_support   = true
  tags                 = { Name = "${local.name_prefix}-vpc" }
}

# One public subnet per AZ (for the load balancer)
resource "aws_subnet" "public" {
  for_each = toset(local.azs)

  vpc_id                  = aws_vpc.main.id
  # cidrsubnet(base, newbits, netnum) — carve /24 from the /16
  cidr_block              = cidrsubnet(aws_vpc.main.cidr_block, 8, index(local.azs, each.key))
  availability_zone       = each.key
  map_public_ip_on_launch = true
  tags                    = { Name = "${local.name_prefix}-public-${each.key}" }
}

# Internet gateway — allows VPC to talk to the internet
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
  tags   = { Name = "${local.name_prefix}-igw" }
}

# Route table — send 0.0.0.0/0 traffic out through the IGW
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

# Associate the route table with each public subnet
resource "aws_route_table_association" "public" {
  for_each       = aws_subnet.public
  subnet_id      = each.value.id
  route_table_id = aws_route_table.public.id
}

# ════════════════════════════════════
# SECURITY GROUPS
# ════════════════════════════════════

# ALB security group: accepts HTTPS (443) from anywhere
resource "aws_security_group" "alb" {
  name   = "${local.name_prefix}-alb-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"          # -1 = all protocols
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# App security group — only accept traffic from the ALB
resource "aws_security_group" "app" {
  name   = "${local.name_prefix}-app-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    # Reference the ALB SG — only ALB can send traffic to app
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# ════════════════════════════════════
# EC2 INSTANCES (Auto Scaling Group)
# ════════════════════════════════════

# Launch template — defines what each EC2 looks like
resource "aws_launch_template" "app" {
  name_prefix   = "${local.name_prefix}-"
  image_id      = data.aws_ami.ubuntu.id      # from data source above
  instance_type = var.instance_type

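  # User data: bootstrap script, base64-encoded as the launch template requires.
  # Stock nginx listens on port 80; the target group below expects port 8080 and
  # a /health path, so a real image needs a matching server configuration.
  user_data = base64encode(<<-EOF
    #!/bin/bash
    apt-get update -y
    apt-get install -y nginx
    systemctl enable --now nginx
    echo "Hello from ${local.name_prefix}" > /var/www/html/index.html
  EOF
  )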

  network_interfaces {
    associate_public_ip_address = true
    security_groups             = [aws_security_group.app.id]
  }

  lifecycle {
    # Create a new launch template version before destroying the old
    create_before_destroy = true
  }
}

# Auto Scaling Group — manages the fleet of EC2 instances
resource "aws_autoscaling_group" "app" {
  name                = "${local.name_prefix}-asg"
  min_size            = var.min_capacity
  max_size            = var.max_capacity
  desired_capacity    = var.min_capacity
  vpc_zone_identifier = [for s in aws_subnet.public : s.id]

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }

  # Auto-register instances with the ALB target group
  target_group_arns = [aws_lb_target_group.app.arn]

  # Replace instances in rolling batches, keeping at least 50% healthy
  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
  }
}

# ════════════════════════════════════
# LOAD BALANCER
# ════════════════════════════════════

resource "aws_lb" "app" {
  name               = "${local.name_prefix}-alb"
  internal           = false     # internet-facing
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = [for s in aws_subnet.public : s.id]
}

resource "aws_lb_target_group" "app" {
  name     = "${local.name_prefix}-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 30
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.acm_certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

# ════════════════════════════════════
# S3 BUCKET (asset storage)
# ════════════════════════════════════

resource "random_id" "bucket_suffix" {
  byte_length = 4    # generates an 8-char hex suffix for unique naming
}

resource "aws_s3_bucket" "assets" {
  # Bucket names must be globally unique across all AWS accounts
  bucket = "${local.name_prefix}-assets-${random_id.bucket_suffix.hex}"
}

# Block ALL public access — assets served via CloudFront, not direct S3
resource "aws_s3_bucket_public_access_block" "assets" {
  bucket                  = aws_s3_bucket.assets.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Enable versioning — protects against accidental deletes
resource "aws_s3_bucket_versioning" "assets" {
  bucket = aws_s3_bucket.assets.id
  versioning_configuration {
    status = "Enabled"
  }
}

🧩

Sample: Writing a Custom Module

Encapsulate an S3 + CloudFront static site into a reusable module

hcl — modules/static-site/variables.tf

# Every module should declare its inputs here
# with types, descriptions, and validation where relevant

variable "domain_name" {
  description = "Domain name for the static site (e.g. docs.example.com)"
  type        = string
}

variable "acm_certificate_arn" {
  description = "ACM certificate ARN (must be in us-east-1 for CloudFront)"
  type        = string
}

variable "price_class" {
  description = "CloudFront price class — controls which edge locations are used"
  type        = string
  default     = "PriceClass_100"   # US/EU only — cheapest

  validation {
    condition     = contains(["PriceClass_100", "PriceClass_200", "PriceClass_All"], var.price_class)
    error_message = "Must be PriceClass_100, PriceClass_200, or PriceClass_All."
  }
}

variable "tags" {
  description = "Additional tags to apply to all resources in this module"
  type        = map(string)
  default     = {}
}

hcl — modules/static-site/main.tf

locals {
  s3_origin_id = "S3-${var.domain_name}"
}

# S3 bucket to store the static files
resource "aws_s3_bucket" "site" {
  # Replace dots with hyphens: dotted bucket names fail TLS validation against
  # S3's wildcard certificate on virtual-hosted-style HTTPS endpoints
  bucket = replace(var.domain_name, ".", "-")
  tags   = var.tags
}

# Origin Access Control — restricts S3 to only CloudFront requests
resource "aws_cloudfront_origin_access_control" "site" {
  name                              = var.domain_name
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

# CloudFront distribution in front of S3
resource "aws_cloudfront_distribution" "site" {
  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  aliases             = [var.domain_name]
  price_class         = var.price_class
  tags                = var.tags

  origin {
    domain_name              = aws_s3_bucket.site.bucket_regional_domain_name
    origin_id                = local.s3_origin_id
    origin_access_control_id = aws_cloudfront_origin_access_control.site.id
  }

  default_cache_behavior {
    allowed_methods  = ["GET", "HEAD"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = local.s3_origin_id
    viewer_protocol_policy = "redirect-to-https"   # force HTTPS

    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }

    min_ttl     = 0
    default_ttl = 86400     # 24 hours
    max_ttl     = 31536000  # 1 year
  }

  viewer_certificate {
    acm_certificate_arn      = var.acm_certificate_arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }
}
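
The Origin Access Control above only makes CloudFront sign its requests; the bucket still needs a policy that lets the distribution read objects. A minimal sketch of that policy, assuming it sits in the same main.tf (the `site` names for the data source and policy resource are this example's own choice, not something the handbook defines):

```hcl
# Allow only this CloudFront distribution to read objects from the bucket.
data "aws_iam_policy_document" "site" {
  statement {
    sid       = "AllowCloudFrontReadOnly"
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.site.arn}/*"]

    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    # Scope the grant to this distribution, not every distribution in AWS
    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [aws_cloudfront_distribution.site.arn]
    }
  }
}

resource "aws_s3_bucket_policy" "site" {
  bucket = aws_s3_bucket.site.id
  policy = data.aws_iam_policy_document.site.json
}
```

Without a grant like this, requests through CloudFront would typically come back as 403 Access Denied, because the bucket itself stays private.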

hcl — modules/static-site/outputs.tf

# Expose the values callers need — e.g., to create DNS records
output "cloudfront_domain_name" {
  description = "CloudFront distribution domain — set as CNAME in DNS"
  value       = aws_cloudfront_distribution.site.domain_name
}

output "bucket_name" {
  description = "S3 bucket name — use this to sync your static files"
  value       = aws_s3_bucket.site.bucket
}

output "distribution_id" {
  description = "CloudFront distribution ID — use to trigger cache invalidations"
  value       = aws_cloudfront_distribution.site.id
}
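
A caller consumes the module through a `module` block and reads its outputs as `module.<name>.<output>`. The sketch below assumes the module lives at `./modules/static-site` relative to the root configuration; the module label `docs_site`, the domain, and the tag values are illustrative placeholders:

```hcl
module "docs_site" {
  source = "./modules/static-site" # assumed path; adjust to your repo layout

  domain_name         = "docs.example.com"      # placeholder domain
  acm_certificate_arn = var.acm_certificate_arn # must be a us-east-1 certificate
  price_class         = "PriceClass_100"

  tags = {
    Environment = "dev"
    Project     = "docs"
  }
}

# Surface the module's output so it can be wired into a DNS record
output "docs_site_cdn_domain" {
  description = "Point the site's CNAME or alias record at this domain"
  value       = module.docs_site.cloudfront_domain_name
}
```

Passing an invalid `price_class` here would fail at plan time with the error message from the module's validation block.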

🌿

Sample: Multi-Environment Setup

Dev, staging, and prod from one config with .tfvars files

hcl — environments/dev.tfvars

# Non-prod: small + cheap
environment    = "dev"
aws_region     = "us-east-1"
instance_type  = "t3.micro"
min_capacity   = 1
max_capacity   = 2
multi_az_rds   = false
db_instance    = "db.t3.micro"
enable_waf     = false

hcl — environments/prod.tfvars

# Production: resilient + monitored
environment    = "prod"
aws_region     = "us-east-1"
instance_type  = "t3.large"
min_capacity   = 3
max_capacity   = 20
multi_az_rds   = true
db_instance    = "db.r6g.large"
enable_waf     = true

bash — applying to different environments

# ── Deploy to dev ──
terraform workspace select dev
terraform plan  -var-file="environments/dev.tfvars"  -out=dev.tfplan
terraform apply dev.tfplan

# ── Deploy to prod ──
terraform workspace select prod
terraform plan  -var-file="environments/prod.tfvars"  -out=prod.tfplan
terraform apply prod.tfplan

# ── Secrets: never in .tfvars — inject via environment variables ──
export TF_VAR_db_password="$(aws secretsmanager get-secret-value \
  --secret-id prod/myapp/db-password --query SecretString --output text)"
terraform apply -var-file="environments/prod.tfvars"
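
For the `TF_VAR_db_password` export and the boolean switches in the .tfvars files to take effect, the root module needs matching variable declarations. A hedged sketch, assuming these variables are not already declared elsewhere in your configuration; `waf_web_acl_arn` is a hypothetical input added only to show how a flag like `enable_waf` can toggle a resource:

```hcl
variable "db_password" {
  description = "Database password, supplied via the TF_VAR_db_password environment variable"
  type        = string
  sensitive   = true # redacted in plan/apply output (still stored in state)
}

variable "enable_waf" {
  description = "Attach a WAF web ACL to the load balancer"
  type        = bool
  default     = false
}

# Hypothetical input: ARN of an existing WAFv2 web ACL
variable "waf_web_acl_arn" {
  description = "ARN of the WAFv2 web ACL to attach when enable_waf is true"
  type        = string
  default     = null
}

# count = 0 in dev (enable_waf = false), 1 in prod (enable_waf = true)
resource "aws_wafv2_web_acl_association" "alb" {
  count        = var.enable_waf ? 1 : 0
  resource_arn = aws_lb.app.arn
  web_acl_arn  = var.waf_web_acl_arn
}
```

The same `count = condition ? 1 : 0` pattern fits any resource that should exist in some environments only.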

⭐

Best Practices

Conventions that keep Terraform configs maintainable at scale

Practice	Severity	Why
Always use remote state with locking	MUST	Prevents two engineers from running `apply` at the same time and corrupting state
Pin provider and module versions	MUST	Unpinned deps cause silent breaking changes when upstream releases
Never commit `*.tfstate` or secrets to Git	MUST	State contains plaintext secrets; Git history is permanent
Run `terraform fmt -check -recursive` in CI	MUST	Fails the build on unformatted code; consistent formatting keeps PR diffs readable
Save plans and apply the saved plan in CI	MUST	Prevents plan/apply drift — what you reviewed is exactly what runs
Use `for_each` over `count` for resources	SHOULD	`count` addresses instances by integer index, so removing item 0 shifts every later index and Terraform plans to change or recreate those resources. `for_each` addresses instances by stable keys
Tag every resource with environment and project	SHOULD	Enables cost allocation, filtering, and resource lifecycle management
Use `prevent_destroy = true` on critical resources	SHOULD	Protects databases, state buckets, and other non-replaceable infra
Keep modules small and single-purpose	SHOULD	Large modules are hard to reuse and slow to plan
Don't use `-target` in production	AVOID	Partial applies skip resources outside the target's dependency chain, so real infrastructure drifts from the configuration
Avoid `null_resource` and local-exec provisioners	AVOID	They run arbitrary scripts that Terraform cannot track in state, so results are hard to reproduce and idempotency is not guaranteed
Don't hardcode region, account ID, or AMI IDs	AVOID	Use variables and data sources — enables portability across regions/accounts
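
Several of these practices fit in a few lines of configuration. The following is a minimal sketch, not a recommended baseline: the version constraints, the `bucket_names` variable, and the `example-` bucket prefix are illustrative assumptions.

```hcl
terraform {
  # Pin the CLI and provider to a known-compatible range
  required_version = "~> 1.9"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

variable "bucket_names" {
  description = "Logical names of the buckets to create (illustrative)"
  type        = set(string)
  default     = ["logs", "assets", "backups"]
}

# for_each keys each instance by name: removing "logs" from the set
# affects only aws_s3_bucket.this["logs"], not the other buckets
resource "aws_s3_bucket" "this" {
  for_each = var.bucket_names
  bucket   = "example-${each.key}" # placeholder; real names must be globally unique

  tags = {
    Environment = "dev"
    Project     = "example"
  }

  lifecycle {
    # Any plan that would destroy this bucket fails with an error
    prevent_destroy = true
  }
}
```

With `count` and a list instead, the buckets would be addressed as `aws_s3_bucket.this[0]`, `[1]`, `[2]`, and deleting the first list element would shift the other two addresses.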

🔬

Use terraform validate and tflint: validate catches syntax errors and internal inconsistencies (undeclared variables, wrong argument types) before plan. tflint (an open-source linter) with its AWS ruleset plugin catches provider-specific issues like invalid instance types, deprecated arguments, and missing required tags, which Terraform itself won't report until apply, if at all.
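
tflint reads its settings from a `.tflint.hcl` file in the working directory. A small sketch of such a file, assuming the AWS ruleset plugin; the plugin version shown is illustrative, so check the ruleset's releases for a current one, and the required tag names are this example's assumption:

```hcl
# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.32.0" # illustrative; pin to a release you have verified
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

# Flag taggable AWS resources that lack these tags
rule "aws_resource_missing_tags" {
  enabled = true
  tags    = ["Environment", "Project"]
}
```

Running `tflint --init` downloads the plugin, after which `tflint` lints the current directory.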

🚀

CI/CD Integration

Automate plan review and apply in GitHub Actions

yaml — .github/workflows/terraform.yml

name: Terraform

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

permissions:
  contents: read
  pull-requests: write    # post plan output as PR comment
  id-token: write         # for OIDC auth with AWS (no static keys!)

env:
  TF_VERSION: "1.9.5"
  AWS_REGION:  "us-east-1"

jobs:
  terraform:
    name: Terraform Plan / Apply
    runs-on: ubuntu-latest
    environment: ${{ github.ref == 'refs/heads/main' && 'production' || 'preview' }}

    steps:
      # 1. Checkout code
      - uses: actions/checkout@v4

      # 2. Authenticate to AWS via OIDC — no long-lived credentials stored in GitHub
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region:     ${{ env.AWS_REGION }}

      # 3. Install Terraform CLI
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      # 4. Format check — fails if code isn't formatted
      - name: Terraform Format
        run: terraform fmt -check -recursive

      # 5. Download providers, configure backend
      - name: Terraform Init
        run: terraform init -backend-config="environments/${{ github.event_name == 'push' && 'prod' || 'dev' }}.backend.hcl"

      # 6. Validate syntax and internal consistency (no provider API calls)
      - name: Terraform Validate
        run: terraform validate

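      # 7. Compute plan and save it (var-file matches the backend chosen in step 5)
      - name: Terraform Plan
        id: plan
        run: |
          set -o pipefail   # keep terraform's exit code; otherwise tee masks a failed plan
          terraform plan \
            -var-file="environments/${{ github.event_name == 'push' && 'prod' || 'dev' }}.tfvars" \
            -out=tf.plan \
            -no-color 2>&1 | tee plan_output.txt
        continue-on-error: true    # post plan even on failure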

      # 8. Comment plan on the PR for human review
      - name: Post Plan to PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const plan = require('fs').readFileSync('plan_output.txt', 'utf8');
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan\n\`\`\`\n${plan.slice(0, 60000)}\n\`\`\``
            });

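      # Fail the job if the plan failed (the plan step itself continues on error)
      - name: Fail on Plan Error
        if: steps.plan.outcome == 'failure'
        run: exit 1

      # 9. Apply ONLY on merge to main, and only the exact saved plan
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push' && steps.plan.outcome == 'success'
        run: terraform apply tf.plan   # a saved plan applies without prompting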

🔑

Use OIDC, not access keys: Configure your AWS IAM role to trust GitHub's OIDC provider. This eliminates the need for long-lived AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY secrets stored in GitHub — a major security improvement. The credentials are short-lived and scoped to the workflow run.
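
The trust relationship itself can be managed in Terraform. The sketch below assumes the GitHub OIDC identity provider already exists in the account and is passed in through a hypothetical `github_oidc_provider_arn` variable; `example-org/example-repo` is a placeholder for your repository, and the role name mirrors the one used in the workflow above:

```hcl
# Hypothetical input: ARN of the existing IAM OIDC provider for
# token.actions.githubusercontent.com
variable "github_oidc_provider_arn" {
  description = "ARN of the IAM OIDC provider for GitHub Actions"
  type        = string
}

data "aws_iam_policy_document" "github_actions_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.github_oidc_provider_arn]
    }

    # Token must be issued for AWS STS
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only workflows from this repository may assume the role
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/example-repo:*"] # placeholder repository
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name               = "GitHubActionsRole"
  assume_role_policy = data.aws_iam_policy_document.github_actions_trust.json
}
```

Tightening the `sub` pattern (for example to a single branch or GitHub environment) narrows which workflow runs can obtain credentials; the role still needs permission policies attached for whatever Terraform manages.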

🗒️

Cheat Sheet

Quick-reference snippets for daily Terraform work

Import Existing Resource

# 1. Write the resource block in HCL
# 2. Run import to link it to state
terraform import \
  aws_s3_bucket.my_bucket \
  existing-bucket-name
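
Terraform 1.5 and later also support a declarative `import` block, which lets the import show up in `terraform plan` and go through code review. A minimal sketch using the same address and bucket name as the command above:

```hcl
# Declarative equivalent of the CLI import (Terraform >= 1.5)
import {
  to = aws_s3_bucket.my_bucket
  id = "existing-bucket-name"
}

resource "aws_s3_bucket" "my_bucket" {
  bucket = "existing-bucket-name"
}
```

After a successful apply the `import` block can be removed; the resource stays in state.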

Force Replace a Resource

# Destroy + recreate a specific resource
terraform apply \
  -replace="aws_instance.web" \
  -var-file="prod.tfvars"

Targeted Plan/Apply

# Only plan/apply specific resource
# ⚠ Avoid in production — use for debugging
terraform plan \
  -target=aws_security_group.app
terraform apply \
  -target=module.rds

Read an Output Value

# Print all outputs
terraform output

# Print a single output (raw, no quotes)
terraform output -raw web_server_ip

# JSON for scripting
terraform output -json | jq .db_endpoint.value

Unlock State After Crash

# If apply crashed, the state lock may not have been released
# First confirm no other run is still in progress,
# then take the lock ID from the error message
terraform force-unlock LOCK_ID

Test HCL Expressions

# Interactive REPL — great for testing
# functions before putting them in config
terraform console

> cidrsubnet("10.0.0.0/16", 8, 2)
"10.0.2.0/24"
> lower("My-App")
"my-app"

Useful Built-in Functions Quick Reference

Function	Example	Result
`cidrsubnet(base, bits, num)`	`cidrsubnet("10.0.0.0/16", 8, 1)`	`"10.0.1.0/24"`
`toset(list)`	`toset(["a","b","a"])`	`toset(["a","b"])`
`merge(map, map)`	`merge({a=1},{b=2})`	`{a=1,b=2}`
`flatten(list_of_lists)`	`flatten([[1,2],[3]])`	`[1,2,3]`
`contains(list, val)`	`contains(["a","b"], "a")`	`true`
`lookup(map, key, default)`	`lookup({a=1}, "b", 0)`	`0`
`jsonencode(value)`	`jsonencode({key="val"})`	`"{\"key\":\"val\"}"`
`file(path)`	`file("scripts/init.sh")`	File content as string
`base64encode(str)`	`base64encode("hello")`	`"aGVsbG8="`
`length(coll)`	`length(["a","b","c"])`	`3`
`element(list, idx)`	`element(["a","b","c"], 0)`	`"a"`
`slice(list, from, to)`	`slice(["a","b","c"], 0, 2)`	`["a","b"]`
`try(expr, fallback)`	`try(var.optional.field, "default")`	value or fallback if error
`coalesce(a, b, c)`	`coalesce("", null, "x")`	`"x"` (first value that is neither null nor an empty string)
`formatdate(fmt, ts)`	`formatdate("YYYY-MM", timestamp())`	Current UTC year and month, e.g. `"2024-07"`
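
These functions are usually combined inside `locals`. A short sketch with illustrative names and values (none of the locals below come from the handbook's sample project):

```hcl
locals {
  # Three /24 subnets carved out of one /16:
  # ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
  subnet_cidrs = [for i in range(3) : cidrsubnet("10.0.0.0/16", 8, i)]

  # Later maps win on key conflicts; here the keys are distinct
  common_tags = merge(
    { Project = "demo" },
    { Environment = "dev" },
  )

  # Falls back to "t3.small" because "staging" is not a key in the map
  instance_type = lookup(
    { dev = "t3.micro", prod = "t3.large" },
    "staging",
    "t3.small",
  )
}

output "subnet_cidrs" {
  value = local.subnet_cidrs
}
```

Each expression can be tried first in `terraform console` before it goes into configuration.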

📚

Further Reading: Official docs at developer.hashicorp.com/terraform. Explore the Terraform Registry (registry.terraform.io) for community providers and modules. For linting, use tflint. For cost estimation in PRs, use Infracost. For security scanning, use Checkov or Trivy.