Cloud Networking
From IP fundamentals and CIDR masks to VPC architecture, Kubernetes traffic flows, and hardened access control. Covers bastion hosts, private subnets, security groups, and the principle of minimum exposure.
IP Addresses & Notation
An IPv4 address is a 32-bit number written as four octets (0–255) separated by dots. Every device on a network needs a unique IP to send and receive traffic. Cloud resources — VMs, containers, load balancers — all get IPs assigned from the address ranges you define.
An IP address in binary
The address 192.168.1.10 is actually four 8-bit numbers. Each bit position doubles in value: 128, 64, 32, 16, 8, 4, 2, 1.
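As a quick illustration of the bit positions above, the following minimal sketch prints an address as four 8-bit groups. The helper name `to_binary_octets` is made up for this example.

```python
def to_binary_octets(address: str) -> str:
    """Render a dotted-quad IPv4 address as four 8-bit binary groups."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    # Each octet is zero-padded to 8 bits: 128, 64, 32, 16, 8, 4, 2, 1.
    return ".".join(f"{octet:08b}" for octet in octets)


print(to_binary_octets("192.168.1.10"))
# 11000000.10101000.00000001.00001010
```

Reading the first group back: 128 + 64 = 192.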
Private vs Public IP Ranges
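RFC 1918 defines three blocks of IPv4 addresses reserved for private (non-routable) use. Cloud VPCs are almost always built from these. Traffic from private IPs cannot reach the internet directly: it must go through a NAT gateway or internet gateway. The table also lists the loopback and link-local ranges for comparison; they are special-purpose ranges, not part of RFC 1918.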
| Range | CIDR | Addresses | Typical Use |
|---|---|---|---|
| Class A Private | 10.0.0.0/8 | 16,777,216 | Large VPCs, enterprise networks. Most common in cloud. |
| Class B Private | 172.16.0.0/12 | 1,048,576 | Docker default bridge network, mid-size VPCs. |
| Class C Private | 192.168.0.0/16 | 65,536 | Home/office routers, small networks, dev environments. |
| Loopback | 127.0.0.0/8 | 16,777,216 | Local machine only. 127.0.0.1 = "this device". |
| Link-local | 169.254.0.0/16 | 65,536 | Cloud instance metadata endpoint (e.g., AWS 169.254.169.254). |
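To check which of these ranges an address falls into, a small sketch with Python's standard `ipaddress` module is enough. The range names are copied from the table; the `classify` helper and the fallback label are assumptions of this example.

```python
import ipaddress

# Ranges copied from the table above.
RANGES = {
    "Class A Private": ipaddress.ip_network("10.0.0.0/8"),
    "Class B Private": ipaddress.ip_network("172.16.0.0/12"),
    "Class C Private": ipaddress.ip_network("192.168.0.0/16"),
    "Loopback": ipaddress.ip_network("127.0.0.0/8"),
    "Link-local": ipaddress.ip_network("169.254.0.0/16"),
}


def classify(address: str) -> str:
    """Return the name of the table range containing the address, if any."""
    ip = ipaddress.ip_address(address)
    for name, network in RANGES.items():
        if ip in network:
            return name
    return "Not in the table"


for candidate in ("10.1.2.3", "172.17.0.2", "172.32.0.1", "169.254.169.254"):
    print(candidate, "->", classify(candidate))
```

Note that 172.32.0.1 falls just outside 172.16.0.0/12, which ends at 172.31.255.255.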
An instance queries 169.254.169.254 to fetch its own identity, IAM role credentials, and bootstrap data. This is how EC2 instances get AWS credentials without hardcoding them. It's also a famous SSRF attack target. Security groups do not filter traffic to this address, so require IMDSv2 with a low hop limit and block the endpoint from containers that don't need it.
Subnets & CIDR Masks
CIDR (Classless Inter-Domain Routing) notation expresses an IP range as a base address plus a prefix length — the number of fixed bits in the network portion. The remaining bits define host addresses. 10.0.1.0/24 means the first 24 bits are fixed (the network), leaving 8 bits for hosts (256 addresses, 254 usable).
10.0.1.0 = network address (not assignable)
10.0.1.1 – 10.0.1.254 = 254 usable host addresses
10.0.1.255 = broadcast address (not assignable)
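The same breakdown can be reproduced with the standard `ipaddress` module. This is only a sketch of the classic (non-cloud) arithmetic for the 10.0.1.0/24 example used above.

```python
import ipaddress

net = ipaddress.ip_network("10.0.1.0/24")
hosts = list(net.hosts())  # excludes the network and broadcast addresses

print(net.prefixlen, net.netmask)        # 24 255.255.255.0
print(net.num_addresses)                 # 256
print(net.network_address)               # 10.0.1.0
print(hosts[0], hosts[-1], len(hosts))   # 10.0.1.1 10.0.1.254 254
print(net.broadcast_address)             # 10.0.1.255
```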
CIDR Reference Table
| CIDR | Subnet Mask | Total IPs | Usable Hosts | Typical Use |
|---|---|---|---|---|
| /8 | 255.0.0.0 | 16,777,216 | 16,777,214 | Entire VPC allocation (very large) |
| /16 | 255.255.0.0 | 65,536 | 65,534 | Standard VPC size (AWS default) |
| /20 | 255.255.240.0 | 4,096 | 4,091* | Large subnet (AWS default per-AZ) |
| /24 | 255.255.255.0 | 256 | 251* | Standard subnet, most common |
| /27 | 255.255.255.224 | 32 | 27* | Small subnet for specific tiers |
| /28 | 255.255.255.240 | 16 | 11* | Tiny subnet: NAT GW, bastion |
| /32 | 255.255.255.255 | 1 | 1 | Single host: security group rules |
* AWS reserves 5 IPs per subnet: network address, VPC router, DNS, future use, broadcast.
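The starred figures follow from simple arithmetic: total addresses are 2 to the power of the host bits, minus the five addresses named in the footnote. A minimal sketch, with helper names invented for this example:

```python
AWS_RESERVED_PER_SUBNET = 5  # figure taken from the footnote above


def total_ips(prefix_len: int) -> int:
    """Total addresses in an IPv4 block with the given prefix length."""
    return 2 ** (32 - prefix_len)


def aws_usable_ips(prefix_len: int) -> int:
    """Addresses left for your own resources in an AWS subnet."""
    return max(total_ips(prefix_len) - AWS_RESERVED_PER_SUBNET, 0)


for prefix in (20, 24, 27, 28):
    print(f"/{prefix}: {total_ips(prefix)} total, {aws_usable_ips(prefix)} usable")
# /20: 4096 total, 4091 usable
# /24: 256 total, 251 usable
# /27: 32 total, 27 usable
# /28: 16 total, 11 usable
```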
How to Read a Subnet Mask
A subnet mask has all 1s in the network portion and all 0s in the host portion. Bitwise AND of any IP with the mask gives the network address. This is how routers know whether a destination is local (same subnet) or remote (needs routing).
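The AND operation is easy to try by hand. The sketch below converts the address and mask to integers, ANDs them, and uses the result to decide whether two hosts share a subnet. The function names are illustrative only.

```python
import ipaddress


def network_address(ip: str, mask: str) -> str:
    """Bitwise AND of an IPv4 address with a dotted-quad subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


def same_subnet(a: str, b: str, mask: str) -> bool:
    """Two hosts are local to each other if they share a network address."""
    return network_address(a, mask) == network_address(b, mask)


print(network_address("10.0.1.77", "255.255.255.0"))            # 10.0.1.0
print(network_address("10.0.17.5", "255.255.240.0"))            # 10.0.16.0
print(same_subnet("10.0.1.77", "10.0.1.200", "255.255.255.0"))  # True
print(same_subnet("10.0.1.77", "10.0.2.5", "255.255.255.0"))    # False
```

A `False` result is the case where the sender hands the packet to its default gateway instead of delivering it directly.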
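Size subnets generously: when in doubt, choose /25 (128 IPs), not /26 (64). In AWS, a subnet's CIDR cannot be changed after creation, and GCP and Azure allow only limited resizing, so plan as if the range were permanent. Avoid overlap with on-premises ranges if you'll need VPN or Direct Connect in the future.
How Packets Travel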
A network packet travels from source to destination through a layered decision process. At each hop, a device examines the destination IP and decides: deliver locally (Layer 2 / ARP) or forward to the next hop (Layer 3 routing). Understanding this is the foundation for cloud VPC design.
When source and destination are in the same subnet, the OS uses ARP (Address Resolution Protocol) to discover the destination's MAC address, then sends the frame directly. No router involved. In cloud VPCs, the hypervisor handles this — virtual NICs communicate without leaving the host in many implementations.
When subnets differ, the OS forwards the packet to its default gateway (the router's IP on that subnet, e.g. 10.0.1.1). The router reads the destination IP, looks up its routing table, and forwards to the next hop. This repeats until the packet reaches its destination or is dropped.
Network Address Translation allows private IPs to reach the internet. The NAT gateway replaces the packet's private source IP with its own public IP, maintains a translation table, and rewrites the return traffic. The internet sees only the NAT gateway's IP — internal topology is hidden.
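The translation table can be pictured with a toy model. This is a deliberately simplified sketch, not how any cloud NAT gateway is implemented: the `NatGateway` class, its port allocation scheme, and the documentation-range IPs are all assumptions of the example.

```python
from itertools import count


class NatGateway:
    """Toy source-NAT: maps each outbound flow to a port on one public IP."""

    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self._ports = count(1024)   # next free public port
        self._table = {}            # public port -> (private ip, private port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the private source to the gateway's public IP and port."""
        public_port = next(self._ports)
        self._table[public_port] = (src_ip, src_port)
        return (self.public_ip, public_port, dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite return traffic back to the private host, or drop it."""
        if dst_ip != self.public_ip or dst_port not in self._table:
            return None  # unsolicited traffic: no table entry, so it is dropped
        private_ip, private_port = self._table[dst_port]
        return (src_ip, src_port, private_ip, private_port)


nat = NatGateway("203.0.113.10")
sent = nat.outbound("10.0.2.15", 51000, "198.51.100.20", 443)
reply = nat.inbound("198.51.100.20", 443, sent[0], sent[1])
print(sent)   # ('203.0.113.10', 1024, '198.51.100.20', 443)
print(reply)  # ('198.51.100.20', 443, '10.0.2.15', 51000)
```

The remote server only ever sees 203.0.113.10, and a packet arriving on a port with no table entry is discarded, which is why the internet cannot initiate connections inward.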
Before any packet is sent, the OS resolves the hostname to an IP via DNS. In an AWS VPC, the built-in resolver is reachable at 169.254.169.253 or at the VPC base address plus two (for example, 10.0.0.2 in a 10.0.0.0/16 VPC). Kubernetes has its own internal DNS (CoreDNS) that resolves service names to cluster IPs.
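A short sketch of where the in-VPC resolver lives, assuming the AWS convention that it sits at the VPC base address plus two. The helper name `vpc_resolver` is hypothetical.

```python
import ipaddress


def vpc_resolver(vpc_cidr: str) -> ipaddress.IPv4Address:
    """Resolver address under the assumed AWS convention: VPC base + 2."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return vpc.network_address + 2


print(vpc_resolver("10.0.0.0/16"))    # 10.0.0.2
print(vpc_resolver("172.31.0.0/16"))  # 172.31.0.2
```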
TCP Connection: The Three-Way Handshake
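In brief, the client sends SYN, the server answers SYN-ACK, and the client replies ACK; only then can data flow. The sketch below triggers a real handshake over the loopback interface with Python's `socket` module. The kernel performs the three steps inside `connect()`, so the code shows when the handshake happens rather than the individual packets.

```python
import socket

# Listen on an ephemeral loopback port; the kernel answers SYNs on our behalf.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.settimeout(5)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

# connect() sends SYN, waits for SYN-ACK, then sends ACK.
client = socket.create_connection(("127.0.0.1", port), timeout=5)

# accept() hands over the connection that the handshake already established.
conn, peer = server.accept()
handshake_ok = peer == client.getsockname()
print("connected:", handshake_ok)

for sock in (conn, client, server):
    sock.close()
```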
VPC Architecture
A Virtual Private Cloud (VPC) is a logically isolated network you define in the cloud. You choose the CIDR block, divide it into subnets across availability zones, and control all routing and access rules. Think of it as your private data center network, defined entirely in software.
Public subnet: has a route to the Internet Gateway. Resources here can have public IPs. Use for: load balancers, NAT gateways, bastion hosts. Never put databases or app servers here.
Private subnet: routes outbound traffic through a NAT gateway (can reach the internet, but the internet cannot reach in). Use for: app servers, EKS/GKE nodes, microservices. The right home for compute.
Isolated subnet: no route to the internet at all, not even NAT. Only reachable from within the VPC. Use for: databases, secrets stores, HSMs. If a DB doesn't need internet, remove the route.
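What separates the three subnet types is the default route. The sketch below classifies a subnet from its route table, calling the three cases public, private, and isolated. It assumes route targets are identified by AWS-style `igw-` and `nat-` ID prefixes; the IDs shown are placeholders.

```python
def classify_subnet(routes: dict) -> str:
    """Classify a subnet by where its default route (0.0.0.0/0) points.

    `routes` maps destination CIDR -> target ID (placeholder IDs below).
    """
    default_target = routes.get("0.0.0.0/0")
    if default_target is None:
        return "isolated"   # no internet route at all
    if default_target.startswith("igw-"):
        return "public"     # route to the Internet Gateway
    if default_target.startswith("nat-"):
        return "private"    # outbound only, via NAT
    return "unknown"


print(classify_subnet({"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}))  # public
print(classify_subnet({"10.0.0.0/16": "local", "0.0.0.0/0": "nat-example"}))  # private
print(classify_subnet({"10.0.0.0/16": "local"}))                              # isolated
```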
Ingress & Egress
Ingress is traffic flowing into a resource. Egress is traffic flowing out. The distinction matters enormously for security rules: you control what can reach your resources (ingress) and what your resources can reach (egress). In cloud networking, these are configured at the security group, NACL, and routing layer.
Ingress:
- Traffic arriving at your resource
- Examples: user HTTP request, SSH connection, DB query from app
- Security groups check this for ALLOW rules
- Load balancer → app server is ingress to the app server
- SYN packet that starts a TCP connection
- Rate-limited by WAF, DDoS protection

Egress:
- Traffic leaving your resource
- Examples: app calling external API, downloading packages, DNS queries
- Often overlooked: attackers exploit unrestricted egress for C2/exfil
- Goes through NAT gateway if resource is in private subnet
- Filter via firewall, proxy, or DNS allowlisting
- Lock down to known destinations (egress allowlist)
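An egress allowlist boils down to a membership check against known destination ranges. A minimal sketch, with a made-up allowlist (the VPC range plus one documentation-range block standing in for a vendor API):

```python
import ipaddress

# Hypothetical allowlist: the VPC itself plus one external vendor range.
EGRESS_ALLOWLIST = [
    ipaddress.ip_network("10.0.0.0/16"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def egress_allowed(destination: str) -> bool:
    """Permit outbound traffic only to destinations inside an allowlisted CIDR."""
    dst = ipaddress.ip_address(destination)
    return any(dst in network for network in EGRESS_ALLOWLIST)


print(egress_allowed("198.51.100.20"))  # True
print(egress_allowed("203.0.113.66"))   # False
```

In practice this check is enforced by a firewall, proxy, or security group egress rule rather than application code; the sketch only shows the decision being made.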
Internet Gateway vs NAT Gateway
| Component | Direction | Who Has a Public IP | Use For |
|---|---|---|---|
| Internet Gateway (IGW) | Bidirectional | The resource itself must have a public IP | Public subnets — load balancers, bastion hosts that need inbound from internet |
| NAT Gateway | Outbound only | The NAT gateway has the public IP; private resources stay private | Private subnet resources that need to call the internet (e.g., download packages, call APIs) |
| VPC Endpoint (AWS) | Internal only | No public IP involved — traffic stays on AWS backbone | Access S3, DynamoDB, SSM, STS without internet routing. Cheaper + private. |
| PrivateLink | Internal only | No public IP; interface endpoint in your VPC | Access other VPCs or partner services privately across the AWS network. |
Route Tables & Gateways
Every subnet in a VPC is associated with a route table — a list of destination CIDR → target mappings. When a packet leaves a resource, the VPC looks up the most specific matching route. The "local" route (your VPC CIDR) is always present and non-deletable.
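"Most specific matching route" means longest-prefix match. The sketch below shows the lookup on a small, invented route table; the target names are placeholders, not real resource IDs.

```python
import ipaddress

# Hypothetical route table: destination CIDR -> target.
ROUTE_TABLE = {
    "10.0.0.0/16": "local",
    "10.1.0.0/16": "pcx-example",   # placeholder peering connection
    "0.0.0.0/0": "nat-example",     # placeholder NAT gateway
}


def lookup(destination: str, table: dict = ROUTE_TABLE):
    """Return the target of the most specific route containing the destination."""
    dst = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in table.items():
        network = ipaddress.ip_network(cidr)
        if dst in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the packet is dropped
    # Longest prefix wins: /16 beats /0 for addresses inside the VPC.
    return max(matches)[1]


print(lookup("10.0.3.4"))  # local
print(lookup("10.1.9.9"))  # pcx-example
print(lookup("8.8.8.8"))   # nat-example
```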
VPC Peering: direct routing between two VPCs (same or different accounts/regions). Add routes in both VPCs pointing to each other's CIDR via the peering connection. Non-transitive: A↔B and B↔C does not give A↔C. Use Transit Gateway for hub-and-spoke topologies.
Transit Gateway: central routing hub connecting multiple VPCs, on-premises networks (via VPN/DX), and accounts. Transitive routing: any spoke can reach any other. Centralize egress, inspection, and routing policy. Far simpler than a mesh of peering connections.
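The difference in transitivity can be shown with a toy reachability check. This sketch ignores route tables and assumes every attached VPC is allowed to talk to every other; the function names and VPC labels are invented.

```python
def reachable_via_peering(peerings: set, src: str, dst: str) -> bool:
    """Peering is non-transitive: only a direct connection counts."""
    return frozenset((src, dst)) in peerings


def reachable_via_transit_gateway(attachments: set, src: str, dst: str) -> bool:
    """A hub routes between any two attached spokes."""
    return src in attachments and dst in attachments


peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}
print(reachable_via_peering(peerings, "A", "B"))  # True
print(reachable_via_peering(peerings, "A", "C"))  # False: no A-C peering exists

attachments = {"A", "B", "C"}
print(reachable_via_transit_gateway(attachments, "A", "C"))  # True
```

With peering, full connectivity between n VPCs needs n(n-1)/2 connections; with a hub it needs n attachments.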
Security Groups & NACLs
AWS (and equivalents in Azure/GCP) provides two firewall mechanisms at different layers. Security Groups are stateful, virtual firewalls attached to individual resources (ENIs). Network ACLs are stateless, subnet-level guards applied before security groups. They complement each other.
| Dimension | Security Group | Network ACL (NACL) |
|---|---|---|
| Level | Instance / ENI (virtual NIC) | Subnet perimeter |
| Statefulness | Stateful — return traffic auto-allowed | Stateless — must allow both directions explicitly |
| Default behavior | Deny all inbound, allow all outbound | Allow all inbound and outbound (by default) |
| Rules | Allow rules only (no explicit deny) | Allow and Deny rules, evaluated in number order |
| Rule evaluation | All rules evaluated; most permissive wins | First matching rule wins (numbered, stop-on-match) |
| Source/Dest | IP CIDR or another Security Group ID | IP CIDR only |
| Primary use | Instance-level micro-segmentation, app port control | Subnet-to-subnet blocking, emergency deny |
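The two evaluation models in the table can be contrasted in a few lines. This is a simplified sketch that matches on source CIDR and a single port only, and does not model statefulness; the `Rule` type and function names are assumptions of the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    cidr: str
    port: int
    action: str = "allow"  # security groups only ever use "allow"
    number: int = 0        # only meaningful for NACL rules


def _matches(rule: Rule, src_ip: str, port: int) -> bool:
    return rule.port == port and ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.cidr)


def security_group_allows(rules, src_ip: str, port: int) -> bool:
    """All rules are considered; any matching allow admits the traffic."""
    return any(_matches(rule, src_ip, port) for rule in rules)


def nacl_allows(rules, src_ip: str, port: int) -> bool:
    """Rules are checked in number order; the first match decides."""
    for rule in sorted(rules, key=lambda r: r.number):
        if _matches(rule, src_ip, port):
            return rule.action == "allow"
    return False  # nothing matched: implicit deny


sg = [Rule("10.0.1.0/24", 5432)]
nacl = [
    Rule("10.0.1.66/32", 5432, action="deny", number=90),   # emergency deny
    Rule("10.0.1.0/24", 5432, action="allow", number=100),
]
print(security_group_allows(sg, "10.0.1.66", 5432))  # True
print(nacl_allows(nacl, "10.0.1.66", 5432))          # False: rule 90 matches first
print(nacl_allows(nacl, "10.0.1.9", 5432))           # True
```

The example also shows why an emergency deny belongs in the NACL: a security group has no way to express "everyone in this range except one host".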
Security Group Best Practices
For temporary access, allow only the single source IP that needs it (/32) and remove the rule immediately after.
Kubernetes Network Model
Kubernetes imposes a networking model with three requirements: every Pod gets its own unique IP, all Pods can communicate with all other Pods without NAT, and nodes can communicate with all Pods. This flat model is implemented by the CNI (Container Network Interface) plugin — Cilium, Calico, Flannel, or the cloud-native option (Amazon VPC CNI, GKE Dataplane V2).
Amazon VPC CNI: pods get real VPC IP addresses (from your subnet CIDR). No overlay network: pod-to-pod traffic uses VPC routing directly. Security Groups can be applied per-pod. Requires enough IPs in your subnets, so plan for max_pods_per_node × node_count free IPs.
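The max_pods_per_node × node_count rule is worth checking against real subnet sizes before a cluster grows. A rough sketch, assuming AWS subnets (five reserved addresses each) and ignoring IPs consumed by the nodes themselves and by other resources:

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # assumed AWS behaviour


def pod_ip_demand(max_pods_per_node: int, node_count: int) -> int:
    """Worst-case number of pod IPs the cluster can request."""
    return max_pods_per_node * node_count


def subnet_capacity(cidrs) -> int:
    """Assignable addresses across the given subnets."""
    return sum(
        ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET
        for cidr in cidrs
    )


def has_headroom(cidrs, max_pods_per_node: int, node_count: int) -> bool:
    return subnet_capacity(cidrs) >= pod_ip_demand(max_pods_per_node, node_count)


print(pod_ip_demand(30, 10))                                # 300
print(subnet_capacity(["10.0.1.0/24"]))                     # 251
print(has_headroom(["10.0.1.0/24"], 30, 10))                # False
print(has_headroom(["10.0.1.0/24", "10.0.2.0/24"], 30, 10)) # True
```

A single /24 runs out at ten nodes of 30 pods each, which is one reason node subnets are often sized at /20 or larger.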
Cilium and Calico: CNI plugins that can create a virtual overlay network. Pods get IPs from a pod CIDR (separate from the VPC CIDR). Cilium uses eBPF for kernel-level enforcement, which generally gives it a fast dataplane. Both support Kubernetes NetworkPolicy and extended policy (for example, CiliumNetworkPolicy).
By default, Kubernetes relies on kube-proxy (iptables rules) to handle Service IP routing. Cilium in eBPF mode replaces kube-proxy entirely with kernel-level load balancing: faster, more observable, and able to enforce network policies at the same layer. Recommended for production EKS clusters.
Kubernetes Services & DNS
Pods are ephemeral — they die and respawn with new IPs. A Service provides a stable virtual IP (ClusterIP) and DNS name. Traffic to the service is load-balanced across healthy matching pods. CoreDNS resolves my-service.my-namespace.svc.cluster.local to the ClusterIP.
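The DNS name follows a fixed pattern, which the sketch below spells out. It assumes the common default cluster domain `cluster.local`; clusters can be configured with a different one.

```python
def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name for a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("my-service", "my-namespace"))
# my-service.my-namespace.svc.cluster.local
```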
| Service Type | Reachable From | How | Use For |
|---|---|---|---|
| ClusterIP | Inside cluster only | Virtual IP in cluster CIDR (e.g. 172.20.x.x) | Internal service-to-service communication. Default type. |
| NodePort | Node IP + port (30000-32767) | Forwards node's port to pod | Dev/testing only. Exposes a port on every node — don't use in prod. |
| LoadBalancer | Internet or VPC (depends on annotation) | Provisions cloud LB (ALB, NLB, GLB) | External access. Use with internal: true annotation for VPC-only LBs. |
| ExternalName | Inside cluster only | CNAME alias to external DNS name | Abstract external dependencies (e.g., RDS endpoint) behind a service name. |
| Headless | Inside cluster only | DNS returns pod IPs directly, no ClusterIP | StatefulSets, databases with per-pod addressing (e.g., Kafka, Cassandra). |
Kubernetes Ingress & External Access
An Ingress resource defines HTTP/HTTPS routing rules — which hostname and path routes to which Service. The Ingress Controller (nginx, AWS ALB Controller, Traefik, Istio Gateway) watches Ingress resources and programs the actual load balancer or reverse proxy.
Set scheme: internal on the ALB annotation. The load balancer gets a private IP in your VPC subnets. Only reachable from within the VPC, VPN, or Direct Connect. Used for inter-service APIs or admin tools that should never be internet-accessible.
A service mesh such as Istio adds a sidecar proxy to every pod for mTLS, traffic shaping, retries, and observability. Use Gateway and VirtualService resources instead of Ingress. Provides deep L7 control: canary deployments, circuit breaking, and identity-aware authorization across all service traffic.
Kubernetes Network Policies
By default, all pods in a Kubernetes cluster can communicate with all other pods. NetworkPolicy resources restrict this — defining exactly which pods can talk to which other pods, on which ports. Without network policies, a compromised pod has unrestricted access to every other pod in the cluster.
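A common starting point is a default-deny policy per namespace, followed by narrow allow rules. The sketch below builds both manifests as Python dictionaries and prints one as JSON, which `kubectl apply -f` accepts alongside YAML. The policy names, the `app` label key, and the example values are assumptions; enforcement also depends on the CNI plugin supporting NetworkPolicy.

```python
import json


def default_deny(namespace: str) -> dict:
    """Select every pod in the namespace and allow no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_ingress(namespace: str, target_app: str, source_app: str, port: int) -> dict:
    """Allow pods labelled app=source_app to reach app=target_app on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{source_app}-to-{target_app}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": {"app": source_app}}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


deny = default_deny("prod")
allow = allow_ingress("prod", "backend", "frontend", 8080)
print(json.dumps(allow, indent=2))
```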
Least-Port Principle
Every open port and permitted IP range is a potential attack surface. The principle of least access applied to networking: open only the exact ports needed for the exact source ranges that need them. Anything else is denied by default. Document every exception.
Port Allowlisting Framework
| Port | Protocol | Allowed From | Justification Required |
|---|---|---|---|
| 443 | HTTPS | 0.0.0.0/0 (load balancer only) | Public web traffic: yes, restricted to LB SG |
| 80 | HTTP | 0.0.0.0/0 (load balancer only) | Redirect to HTTPS: yes, restricted to LB SG |
| 22 | SSH | Bastion SG only | Admin access: via bastion, never from internet |
| 5432 | PostgreSQL | App server SG only | DB queries: restricted to app tier SG |
| 6379 | Redis | App server SG only | Cache: restricted to app tier SG |
| 3306 | MySQL | App server SG only | DB queries: never to internet |
| 8080/8443 | HTTP/S internal | LB SG or specific SG | App port: never directly internet-exposed |
| 3389 | RDP | BLOCKED (use SSM) | Never expose RDP to internet or VPC without PAM |
| 0-65535 | ALL | NEVER 0.0.0.0/0 | Never allow all ports from anywhere |
IP Allowlisting
Where to apply it:
- Admin portals: restrict to office IP ranges + VPN (/32 or /24)
- CI/CD pipelines: restrict to known runner IPs
- Monitoring agents: restrict to monitoring CIDR
- Cross-account: restrict to specific account's VPC CIDR
- Webhook receivers: restrict to vendor's published CIDR list (GitHub, Stripe, etc.)

Common mistakes:
- 0.0.0.0/0 on non-HTTP ports: instant scanning target
- Dev rules left in prod: audit SGs quarterly
- Wide CIDR for "internal": use SG references instead of subnets
- No egress rules: restrict outbound, not just inbound
- Shared SGs across tiers: each tier gets its own SG
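A periodic audit for the first mistake, 0.0.0.0/0 on non-web ports, is simple to automate. The sketch below works on plain dictionaries with invented field names; a real audit would read rules from the cloud provider's API.

```python
WEB_PORTS = {80, 443}
ANYWHERE = "0.0.0.0/0"


def audit_ingress(rules) -> list:
    """Flag rules that expose anything other than 80/443 to the whole internet."""
    findings = []
    for rule in rules:
        if rule["source"] != ANYWHERE:
            continue
        ports = range(rule["from_port"], rule["to_port"] + 1)
        if any(port not in WEB_PORTS for port in ports):
            findings.append(
                f"{rule['name']}: ports {rule['from_port']}-{rule['to_port']} open to {ANYWHERE}"
            )
    return findings


# Hypothetical rule set for illustration.
rules = [
    {"name": "lb-https", "source": ANYWHERE, "from_port": 443, "to_port": 443},
    {"name": "ssh-anywhere", "source": ANYWHERE, "from_port": 22, "to_port": 22},
    {"name": "db-from-app", "source": "sg-app", "from_port": 5432, "to_port": 5432},
    {"name": "all-open", "source": ANYWHERE, "from_port": 0, "to_port": 65535},
]
for finding in audit_ingress(rules):
    print(finding)
# ssh-anywhere: ports 22-22 open to 0.0.0.0/0
# all-open: ports 0-65535 open to 0.0.0.0/0
```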
Bastion Hosts
A bastion host (jump server) is a hardened, internet-facing instance whose sole purpose is to be the single entry point for administrative SSH/RDP access into your private network. Instead of exposing all your servers to SSH from the internet, you expose only the bastion — a small, auditable attack surface. All admin access flows through it.
Bastion Hardening Checklist
- Minimal OS — no unnecessary packages or services
- SSH key-based auth only — disable password authentication
- MFA on SSH login (e.g., google-authenticator PAM module)
- Restrict inbound SSH to known IPs only (/32 of your office/VPN)
- No internet egress: bastion doesn't need outbound internet
- Automatic session logging (audit trail of all commands)
- No persistent local credentials — use IAM roles or certificate auth
- Auto-patch via SSM or unattended-upgrades

Alternatives that avoid a bastion entirely:
- AWS SSM Session Manager: browser/CLI shell into private instances with no SSH port open at all. Uses IAM auth, full audit log, no bastion needed.
- Teleport: open-source bastion replacement with SSO, RBAC, session recording, and K8s access proxy.
- Boundary (HashiCorp): dynamic credential injection, just-in-time access, identity-based tunneling.
- ZTNA / Client VPN: give engineers network-level access to the VPC, then connect directly. No jump host needed.
Private-Only Architecture
The gold standard for sensitive workloads: no resources have public IPs, no inbound traffic from the internet, all access is through VPN or private endpoints. Traffic between resources and AWS services (S3, STS, ECR) goes through VPC Endpoints — never touching the internet.
- com.amazonaws.region.s3: S3 Gateway endpoint (free)
- com.amazonaws.region.ecr.api: pull container images privately
- com.amazonaws.region.ecr.dkr: ECR Docker registry
- com.amazonaws.region.ssm: SSM agent communication
- com.amazonaws.region.ec2messages: SSM run command
- com.amazonaws.region.sts: assume IAM roles privately
- com.amazonaws.region.secretsmanager: fetch secrets
- com.amazonaws.region.logs: CloudWatch Logs

For a private EKS cluster:
- Set endpointPrivateAccess: true, endpointPublicAccess: false
- kubectl from within VPC only (or VPN)
- CI/CD pipelines must be inside VPC or use VPN/transit gateway
- Enable audit logs to CloudWatch: every API call logged
- Restrict authorized networks even on private endpoint
- Use IRSA (IAM Roles for Service Accounts): no node-level access keys
Quick Reference
Essential Ports Reference
| Port | Service | Open to Internet? | Notes |
|---|---|---|---|
| 22 | SSH | NEVER | Use bastion/SSM. Source: bastion SG only. |
| 80 | HTTP | LB ONLY | Only on load balancer. Redirect to 443. |
| 443 | HTTPS | LB ONLY | Only on load balancer. TLS terminates here. |
| 3306 | MySQL | NEVER | App SG only. Isolated subnet. |
| 5432 | PostgreSQL | NEVER | App SG only. Isolated subnet. |
| 6379 | Redis | NEVER | App SG only. Isolated subnet. |
| 27017 | MongoDB | NEVER | App SG only. Isolated subnet. |
| 8080 | HTTP Alt | NEVER | Internal app port. LB SG → app SG only. |
| 2379/2380 | etcd (K8s) | NEVER | Control plane internal only. |
| 6443 | K8s API Server | PRIVATE | VPN/bastion access only. Never public. |
| 3389 | RDP | NEVER | Use SSM Fleet Manager for Windows. |
| 53 | DNS (UDP/TCP) | INTERNAL | VPC resolver. Block external DNS on nodes; force through VPC DNS. |
CIDR Quick Math
Start with 10.x.0.0/16 for the VPC (65k IPs). Allocate /24 subnets for app and DB tiers. Allocate /28 subnets for NAT gateways and bastion hosts (they only need a few IPs). Reserve 10.x.128.0/17 for future growth. Leave adjacent /16 blocks free if you'll ever need VPC peering, because overlapping CIDRs cannot peer.
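The plan above can be worked through with the `ipaddress` module. This sketch picks x = 0 and assigns the first few blocks in order; the variable names and the particular subnet choices are illustrative, not a prescribed layout.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")   # "10.x.0.0/16" with x = 0

# Split the VPC in half: use the lower /17 now, hold the upper /17 in reserve.
lower, reserved = vpc.subnets(new_prefix=17)

# Carve /24 tiers out of the lower half, in order.
tiers = lower.subnets(new_prefix=24)
app_subnet = next(tiers)     # 10.0.0.0/24
db_subnet = next(tiers)      # 10.0.1.0/24
small_block = next(tiers)    # 10.0.2.0/24, split further into /28s

nat_subnet, bastion_subnet, *_ = small_block.subnets(new_prefix=28)

print(vpc.num_addresses)             # 65536
print(app_subnet, db_subnet)         # 10.0.0.0/24 10.0.1.0/24
print(nat_subnet, bastion_subnet)    # 10.0.2.0/28 10.0.2.16/28
print(reserved)                      # 10.0.128.0/17

# Peering requires non-overlapping ranges.
print(vpc.overlaps(ipaddress.ip_network("10.1.0.0/16")))    # False: safe to peer
print(vpc.overlaps(ipaddress.ip_network("10.0.128.0/17")))  # True: cannot peer
```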