Software Architecture Patterns: The Decision Maker's Handbook
Enterprise-grade architecture guidance focused on trade-offs, organizational constraints, and operational reality.
Table of Contents
This handbook is structured as a decision system, not a trend catalog.
Module 1: The Monolith & The Modular Monolith
- The Standard Monolith
- The Modular Monolith
- When to Use
- Cost Profile
Module 2: Microservices Architecture
- The Concept
- Trade-offs: fallacies of distributed computing, latency, consistency
- When to Use
- Cost Profile
Module 3: Event-Driven Architecture (EDA)
- The Concept
- Choreography vs. Orchestration
- When to Use
- Cost Profile
Module 4: Serverless Architecture (FaaS)
- The Concept
- Trade-offs: cold starts, lock-in, stateless design
- When to Use
- Cost Profile
Module 5: The Architect's Decision Matrix
- Comparative matrix across team size, complexity, scalability, and cost
Module 6: Common Pitfalls & Anti-Patterns
- The Distributed Monolith
- Resume-Driven Development
- Ignoring the Database
- Premature Scale Architecture
Module 1: The Monolith & The Modular Monolith
The Standard Monolith
A standard monolith is a single codebase deployed as a single unit, usually backed by a shared database. Components interact through in-process calls, not network calls.
Real-world analogy: Think of a high-performing restaurant kitchen run by one team in one room. The grill station, pastry station, and plating station are specialized, but they can coordinate instantly without radio calls, shipping queues, or customs checks.
- What you gain: Simple local development, one CI/CD pipeline, easy refactoring across features, and strong transactional consistency.
- What you sacrifice: As the codebase and team grow, release coordination can become painful and deployment windows become risk-heavy.
The Modular Monolith
The modular monolith keeps the deployment simplicity of a monolith while introducing strict internal boundaries so parts of the system evolve with high autonomy.
Why the comeback is real: Many organizations overpaid for premature microservices and are now consolidating. The modular monolith recovers delivery speed by removing network overhead while preserving clear domain ownership and future extraction options.
Real-world analogy: Instead of splitting one building into many distant offices connected by highways, you build fireproof departments in the same building with controlled doors and documented handoff rules.
How to enforce strict boundaries in one codebase
- Package by business capability: Organize modules around domains (Billing, Catalog, Identity), not solely around technical layers.
- Use explicit module contracts: Each module exposes a small public API and hides internal implementation.
- Ban cross-module backdoor access: Disallow direct imports to internal folders via lint rules and architecture tests.
- Use separate schemas/tables by module where practical: Shared database engine is acceptable, but ownership boundaries must be explicit.
- Prefer domain events internally: Publish in-process events between modules to reduce tight coupling.
- Treat module boundaries as service boundaries: If a module cannot be extracted later without a rewrite, boundaries are too weak.
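To make the "domain events internally" rule concrete, here is a minimal sketch of an in-process event bus. The event names, the `InProcessEventBus` class, and the invoice logic are all hypothetical and exist only for illustration; the point is that Billing reacts to an Orders event without importing anything from Orders.

```typescript
// Hypothetical events for illustration; names are not taken from a real codebase.
interface OrderPlaced { type: "OrderPlaced"; orderId: string; totalCents: number }
interface InvoiceIssued { type: "InvoiceIssued"; orderId: string }
type DomainEvent = OrderPlaced | InvoiceIssued;
type Listener = (event: DomainEvent) => void;

// Synchronous, in-memory bus: no broker and no network hop.
class InProcessEventBus {
  private listeners = new Map<DomainEvent["type"], Listener[]>();

  subscribe(type: DomainEvent["type"], listener: Listener): void {
    const existing = this.listeners.get(type) ?? [];
    this.listeners.set(type, [...existing, listener]);
  }

  publish(event: DomainEvent): void {
    for (const listener of this.listeners.get(event.type) ?? []) {
      listener(event);
    }
  }
}

// Billing subscribes to an Orders event instead of calling Orders code directly.
const bus = new InProcessEventBus();
const issuedInvoices: string[] = [];

bus.subscribe("OrderPlaced", (event) => {
  if (event.type === "OrderPlaced") {
    issuedInvoices.push(`invoice-for-${event.orderId}`);
    bus.publish({ type: "InvoiceIssued", orderId: event.orderId });
  }
});

bus.publish({ type: "OrderPlaced", orderId: "o-1", totalCents: 4200 });
```

Because delivery is synchronous, the publisher still shares a transaction and a failure domain with its listeners. That is a deliberate simplification compared with a broker, and it is one reason the monolith keeps strong consistency.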
Illustrative boundary contract (conceptual)
    // Modular monolith layout (single deployable, strict boundaries)
    src/
      modules/
        billing/
          public-api.ts      // only import entry point for other modules
          application/
          domain/
          infrastructure/
          internal/          // inaccessible outside billing
        catalog/
          public-api.ts
        identity/
          public-api.ts
    // Rule: cross-module imports must use modules/*/public-api.ts only
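The import rule in the layout above can be checked mechanically. The following is a minimal sketch, assuming repository-relative paths with forward slashes; `isAllowedImport` is a hypothetical helper, not part of any lint tool. In practice the same logic would sit inside a lint rule or an architecture test that walks the real import graph.

```typescript
// Extracts the module name from a path such as "src/modules/billing/domain/x.ts".
function moduleOf(path: string): string | null {
  const match = /^src\/modules\/([^/]+)\//.exec(path);
  return match?.[1] ?? null;
}

// Returns true when an import respects the boundary rule:
// a file may import anything inside its own module, but may reach another
// module only through that module's public-api.ts.
function isAllowedImport(importingFile: string, importedFile: string): boolean {
  const fromModule = moduleOf(importingFile);
  const toModule = moduleOf(importedFile);

  if (toModule === null) return true;       // target lives outside src/modules/
  if (fromModule === toModule) return true; // import stays inside one module
  return importedFile === `src/modules/${toModule}/public-api.ts`;
}
```

For example, a file under `src/modules/catalog/` may import `src/modules/billing/public-api.ts`, but an import of a file under `src/modules/billing/internal/` from the same place would be rejected.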
When to Use
- Early-stage startups: Speed of iteration matters more than horizontal autonomy.
- Small teams (under 15 engineers): Communication overhead of distributed systems usually outweighs benefits.
- Highly coupled data domains: Strong consistency and transactional workflows are easier and safer.
Cost Profile
Explicit costs (easy to see):
- Low infrastructure cost: fewer runtimes, fewer data stores, smaller platform footprint.
- Low DevOps complexity: one pipeline, one deployment artifact, simpler observability baseline.
Hidden costs (often ignored until later):
- Deployment blast radius: one bad release can impact the full product surface.
- Downtime pressure during growth: deployment windows and rollback strategies become more critical as revenue and traffic rise.
- Scaling mismatch: hot paths and cold paths scale together even when only one domain needs extra capacity.
- Team contention risk: as headcount grows, merge conflicts and release coordination can slow delivery.
Start with a modular monolith by default unless you can prove independent scaling and independent deployment are urgent constraints today, not hypothetical constraints for next year.
Module 2: Microservices Architecture
Microservices break a system into independent, deployable services aligned to business capabilities. Each service should own its data and API contract, often resulting in polyglot persistence across the estate.
The Concept
- Service per business capability: For example, Orders, Payments, Catalog, Identity, and Fulfillment.
- Independent deployability: Teams can release one service without redeploying the entire platform.
- Data ownership per service: A service owns its schema and persistence technology.
- Polyglot persistence: Different services may use PostgreSQL, DynamoDB, Redis, or Elasticsearch based on workload shape.
Real-world analogy: Think of an airport ecosystem. Security, baggage, boarding, fueling, and air traffic control are specialized organizations with their own systems and operating procedures. The system scales because each unit is autonomous, but coordination overhead is constant and non-trivial.
The Trade-offs
The fallacies of distributed computing are not merely theoretical: the network is unreliable, latency is never zero, bandwidth is finite, topology changes, and transport has a cost. Every cross-service call is an operational risk surface.
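A small worked calculation shows why each extra call matters. This sketch assumes calls fail or slow down independently of each other, which real systems only approximate; the function names and the input figures are illustrative.

```typescript
// Probability that a request succeeds when it needs `calls` downstream
// calls, each succeeding independently with probability `perCallSuccess`.
function chainSuccessProbability(perCallSuccess: number, calls: number): number {
  return Math.pow(perCallSuccess, calls);
}

// Probability that at least one of `calls` independent calls lands in the
// slowest `tailFraction` of its latency distribution (0.01 = slower than p99).
function probabilityOfHittingTail(tailFraction: number, calls: number): number {
  return 1 - Math.pow(1 - tailFraction, calls);
}

// Illustrative inputs: 20 downstream calls per request.
console.log(chainSuccessProbability(0.999, 20).toFixed(4)); // roughly 0.98
console.log(probabilityOfHittingTail(0.01, 20).toFixed(4)); // roughly 0.18
```

Under these assumptions, twenty calls that are each 99.9% reliable yield a request that succeeds only about 98% of the time, and nearly one request in five touches at least one p99-slow dependency.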
- Latency compounds: A single request path may fan out into 8 to 20 downstream calls; each one adds latency and raises the odds of hitting a slow tail response.
- Eventual consistency becomes default: Read-after-write behavior often requires explicit compensation in UX and workflows.
- Distributed transactions are hard: Two-phase commit is rarely practical at scale; most teams implement Saga-based compensation instead.
- Versioning and contract drift: API evolution becomes a product discipline with strict compatibility management.
- Operational blast radius shifts: You reduce code-level coupling but increase runtime failure modes (timeouts, retries, cascading failures).
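The Saga-based compensation mentioned above can be sketched as follows. This is a minimal in-memory illustration, not a production saga: `SagaStep`, `runSaga`, and the order flow are hypothetical names, and a real implementation would also persist saga state and retry failed compensations.

```typescript
// One step of a saga: an action plus the compensation that undoes it.
interface SagaStep {
  name: string;
  run: () => Promise<void>;
  compensate: () => Promise<void>;
}

// Runs steps in order. If one fails, compensates the already completed
// steps in reverse order and reports which step failed.
async function runSaga(steps: SagaStep[]): Promise<{ ok: boolean; failedStep?: string }> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.run();
      completed.push(step);
    } catch {
      for (const done of completed.reverse()) {
        await done.compensate();
      }
      return { ok: false, failedStep: step.name };
    }
  }
  return { ok: true };
}

// Hypothetical order flow: payment fails, so the inventory reservation is released.
const sagaLog: string[] = [];
const orderSaga: SagaStep[] = [
  {
    name: "reserveInventory",
    run: async () => { sagaLog.push("reserve"); },
    compensate: async () => { sagaLog.push("release"); },
  },
  {
    name: "capturePayment",
    run: async () => { throw new Error("card declined"); },
    compensate: async () => { sagaLog.push("refund"); },
  },
];

const sagaResult = runSaga(orderSaga);
sagaResult.then((result) => console.log(result, sagaLog));
```

Note that the failed step itself is not compensated, only the steps that completed before it. Between the reservation and its release, other services can observe the intermediate state, which is the eventual-consistency cost described in the list above.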
When to Use
- Large organizations (50+ engineers): Multiple teams need autonomous release cadence.
- Independent scaling requirements: Some domains need 10x scale without scaling everything else.
- Regulatory or isolation boundaries: Different capabilities require strict security or compliance controls.
- Mature platform engineering function: You can sustain CI/CD templates, observability, service mesh policies, and reliability engineering.
Cost Profile
Explicit costs:
- High infrastructure cost, rising to very high at scale: replicated databases, API gateways, service discovery, load balancers, queues, and cache layers.
- High DevOps/platform cost: Kubernetes expertise, GitOps, secrets management, canary tooling, and incident response maturity.
Hidden costs:
- Cognitive overload: Engineers must understand distributed tracing, retry semantics, and failure choreography.
- Testing tax: Contract tests, environment parity, and cross-service integration tests increase cycle time.
- On-call burden: More deployables means more alerts, more failure domains, and more MTTR pressure.
Module 3: Event-Driven Architecture (EDA)
EDA lets services communicate asynchronously by publishing and consuming events through brokers like Kafka, RabbitMQ, or AWS EventBridge, instead of chaining synchronous REST calls for every interaction.
The Concept
- Producers emit facts: For example, OrderPlaced, PaymentCaptured, and InventoryReserved.
- Consumers react independently: Multiple downstream services subscribe without producer-side coupling.
- Temporal decoupling: Producer and consumer do not need to be online at the same time.
- Elastic buffering: Brokers absorb spikes and smooth consumer processing rates.
Real-world analogy: A newsroom wire feed. A central feed publishes breaking events; finance, sports, and regional desks each consume and act based on their own priorities and timelines.
Choreography vs. Orchestration
- Choreography: Services react to events autonomously. There is no central conductor. This improves decoupling but can become hard to reason about at scale.
- Orchestration: A central coordinator decides workflow steps and sends commands. This improves visibility and control but introduces a control-plane dependency.
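The difference is easiest to see side by side. The sketch below runs the same order flow both ways in memory; the function names and command names are hypothetical, and a real system would put a broker or a workflow engine where the arrays and maps are.

```typescript
type EventName = "OrderPlaced" | "PaymentCaptured" | "InventoryReserved";

// --- Choreography: no central conductor ---
const subscribers = new Map<EventName, Array<() => void>>();
const choreographyTrace: string[] = [];

function subscribe(eventName: EventName, handler: () => void): void {
  subscribers.set(eventName, [...(subscribers.get(eventName) ?? []), handler]);
}

function emitEvent(eventName: EventName): void {
  choreographyTrace.push(eventName);
  for (const handler of subscribers.get(eventName) ?? []) {
    handler();
  }
}

// Each service knows only the event it listens for and the event it emits.
subscribe("OrderPlaced", () => emitEvent("PaymentCaptured"));       // payment service
subscribe("PaymentCaptured", () => emitEvent("InventoryReserved")); // inventory service
emitEvent("OrderPlaced");

// --- Orchestration: one coordinator owns the sequence ---
const orchestrationTrace: string[] = [];

function orchestrateOrder(commands: Array<[string, () => void]>): void {
  for (const [commandName, run] of commands) {
    orchestrationTrace.push(commandName);
    run();
  }
}

orchestrateOrder([
  ["CapturePayment", () => {}],
  ["ReserveInventory", () => {}],
]);
```

In the choreographed version the full workflow is written down nowhere; it emerges from the subscriptions. In the orchestrated version the sequence is readable in one place, at the price of every step depending on the coordinator.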
When to Use
- High decoupling requirements: Teams and services must evolve with minimal direct API dependencies.
- Real-time processing: Streaming analytics, fraud signals, and near real-time personalization.
- Massive unpredictable spikes: Checkout events, flash sales, and campaign traffic bursts.
- Audit trail needs: Durable event logs required for replay, compliance, or forensic analysis.
Cost Profile
Explicit costs:
- Medium-to-high infrastructure cost: managed brokers, partition strategy, retention policies, replication, and secure connectivity.
- Schema governance overhead: event contracts need strict versioning and compatibility controls.
Hidden costs:
- Debugging complexity is very high: following one business transaction across five consumers is difficult without distributed tracing and correlation IDs.
- Operational edge cases: duplicate delivery, out-of-order events, poison messages, and replay handling require careful design.
- Data correctness risk: idempotency and deduplication bugs can silently corrupt business workflows.
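Duplicate delivery is the edge case teams most often underestimate. The sketch below shows one common defense, an idempotent consumer that deduplicates by event ID. The envelope fields and class name are assumptions for illustration, and the in-memory `Set` stands in for a durable store that a real consumer would update atomically with the side effect.

```typescript
// Hypothetical event envelope; field names are illustrative.
interface EventEnvelope {
  eventId: string;       // unique per event; used for deduplication
  correlationId: string; // shared by all events in one business transaction
  type: string;
  amountCents: number;
}

class IdempotentConsumer {
  private processed = new Set<string>();
  balanceCents = 0;

  // Returns true if the event was applied, false if it was a duplicate.
  handle(event: EventEnvelope): boolean {
    if (this.processed.has(event.eventId)) {
      return false; // duplicate delivery: skip side effects
    }
    this.balanceCents += event.amountCents;
    this.processed.add(event.eventId);
    return true;
  }
}
```

Delivering the same `PaymentCaptured` envelope twice changes the balance once. Without the check, an at-least-once broker would double-count the payment silently, which is exactly the data correctness risk listed above.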
Module 4: Serverless Architecture (FaaS)
Serverless architecture uses fully managed ephemeral compute (such as AWS Lambda or Azure Functions) with managed services for storage, messaging, and identity. Teams ship business logic while the platform handles most infrastructure operations.
The Concept
- Event-triggered functions: HTTP, queue, file upload, timer, or pub/sub events execute short-lived code.
- Managed platform primitives: database, auth, messaging, observability, and scaling are delegated to cloud services.
- Operational abstraction: No server patching or node pool management for most workloads.
Real-world analogy: Hiring an on-demand specialist crew instead of running a full-time factory. You pay only when they are actively doing work, which is excellent for bursty demand and expensive for constant heavy throughput.
The Trade-offs
- Cold starts: idle functions may incur startup latency that impacts p95 and p99 response times.
- Vendor lock-in: platform-native triggers and managed services can make portability expensive.
- Statelessness requirement: execution context is ephemeral; persistent state must be externalized.
- Runtime constraints: execution time limits, package size limits, and networking constraints influence design.
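The statelessness requirement shapes handler code directly. Below is a minimal sketch with a hypothetical `CounterStore` interface standing in for a managed database or cache; it does not use any real provider SDK, and the handler signature is simplified for illustration.

```typescript
// Hypothetical storage contract; in a real deployment this would be backed
// by a managed database or cache, not process memory.
interface CounterStore {
  get(key: string): Promise<number>;
  set(key: string, value: number): Promise<void>;
}

// Anti-pattern, shown for contrast: module-level state can vanish whenever
// the platform recycles the execution context.
// let visits = 0;

// Stateless handler: everything that must survive lives in the store.
async function handleVisit(userId: string, store: CounterStore): Promise<number> {
  const next = (await store.get(userId)) + 1;
  await store.set(userId, next);
  return next;
}

// In-memory stand-in, used only to exercise the handler locally.
class InMemoryCounterStore implements CounterStore {
  private data = new Map<string, number>();

  async get(key: string): Promise<number> {
    return this.data.get(key) ?? 0;
  }

  async set(key: string, value: number): Promise<void> {
    this.data.set(key, value);
  }
}
```

The read-then-write sequence here is not atomic, so concurrent invocations could lose updates. A real store would typically offer an atomic increment or conditional write for this purpose.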
When to Use
- Unpredictable workloads: traffic is spiky or seasonal.
- Integration and glue logic: event handlers, automations, file processors, webhooks.
- Teams with minimal DevOps capacity: product teams prioritize feature delivery over platform management.
- Rapid experimentation: prototypes and new product surfaces where speed-to-market dominates.
Cost Profile
Explicit costs:
- Pay-for-usage model: extremely cheap at low or intermittent traffic.
- Low infrastructure operations cost: patching and capacity planning are largely outsourced to the cloud provider.
Hidden costs:
- Runaway cost at steady massive scale: sustained high-throughput workloads can become materially more expensive than container or monolith alternatives.
- Observability blind spots: short-lived executions complicate end-to-end diagnostics unless telemetry is engineered up front.
- Platform coupling debt: migration away from a provider-native event model can become a large re-architecture project.
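The point at which pay-per-use stops being cheap can be estimated with simple arithmetic. The sketch below uses placeholder prices that are not real provider pricing, and it folds compute duration and memory into a single per-invocation price, which real billing models usually split out.

```typescript
// All prices are hypothetical placeholders, not real provider pricing.
interface CostInputs {
  costPerMillionInvocations: number; // pay-per-use price, in currency units
  fixedMonthlyCost: number;          // always-on alternative (containers, VMs)
}

function payPerUseMonthlyCost(invocations: number, inputs: CostInputs): number {
  return (invocations / 1_000_000) * inputs.costPerMillionInvocations;
}

// Monthly invocation count at which both options cost the same.
function breakEvenInvocations(inputs: CostInputs): number {
  return (inputs.fixedMonthlyCost / inputs.costPerMillionInvocations) * 1_000_000;
}

const placeholderPrices: CostInputs = { costPerMillionInvocations: 5, fixedMonthlyCost: 400 };
console.log(breakEvenInvocations(placeholderPrices)); // 80000000 with these placeholder prices
```

Below the break-even volume the pay-per-use model wins; above it, sustained traffic pays a growing premium every month. Running the numbers with your own prices and traffic forecast is a cheap check before committing either way.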
Module 5: The Architect's Decision Matrix
Use this matrix as a starting point. Then adjust for your constraints: regulatory requirements, uptime targets, budget, team skill profile, and migration horizon.
| Architecture | Team Size Required | Deployment Complexity | Scalability | Infrastructure Cost (Low Traffic vs High Traffic) | Best Real-World Use Case |
|---|---|---|---|---|---|
| Monolith / Modular Monolith | Small to medium (3-15 engineers) | Low initially; medium at larger codebase size | Good vertical scaling; selective horizontal scaling with effort | Low at low traffic; moderate at high traffic | SaaS products finding product-market fit with tightly coupled domains |
| Microservices | Large (50+ engineers, multiple autonomous teams) | Very high | Excellent independent scaling by domain | High at low traffic; very high at high traffic due to platform overhead | Large platforms with independent team ownership and release autonomy requirements |
| Event-Driven Architecture | Medium to large (15+ engineers) | High | Excellent for burst handling and decoupled throughput | Medium at low traffic; high at high traffic | E-commerce, fintech, telemetry, and real-time processing pipelines |
| Serverless (FaaS) | Small to medium (2-20 engineers) | Low-to-medium for simple systems; medium at scale | Excellent auto-scaling for bursty workloads | Very low at low traffic; can become very high at sustained high traffic | Event-driven backends, workflow automation, APIs with volatile demand |
Choose the cheapest architecture that safely satisfies your next 18 to 24 months of business and team constraints. Re-architect with evidence, not anticipation.
Module 6: Common Pitfalls & Anti-Patterns
Most architecture failures come from a mismatch between the chosen pattern and the organization's readiness or the workload's reality.
1) The Distributed Monolith
Teams split services physically but keep them logically coupled. Deployments still need lockstep coordination, and one service outage cascades through the graph.
Typical symptoms:
- Multiple services must be deployed together for one feature release.
- Synchronous dependency chains dominate critical paths.
- Shared DTOs and libraries force upgrade lockstep.
- Incidents require all teams in the same war room.
Why it hurts: You pay distributed-system costs without gaining deployment independence.
2) Resume-Driven Development
Organizations adopt Kubernetes, service mesh, and microservices because they are fashionable, not because business constraints demand them.
Typical symptoms:
- Complex stack for a product with modest traffic (for example, 500 daily active users).
- Platform effort consumes roadmap velocity.
- Incident frequency rises while feature throughput drops.
Why it hurts: Architecture prestige is mistaken for product leverage; teams spend budget on complexity instead of customer outcomes.
3) Ignoring the Database
Compute is split into services while data remains in one shared, tightly coupled database. Service boundaries become cosmetic.
Typical symptoms:
- "Independent" services depend on cross-schema joins.
- One schema migration can break multiple services.
- Data ownership is unclear; no team can safely evolve the model.
Why it hurts: True autonomy requires data ownership boundaries. Without them, you get the coordination pain of both monolith and microservices.
4) Premature Scale Architecture
Teams over-engineer their initial architecture to handle anticipated scale that never materializes. This trades velocity for hypothetical future headroom.
Typical symptoms:
- Multi-region active-active setup for a product with 200 users.
- CQRS and event sourcing applied system-wide before any read/write scale pressure exists.
- Sharding and partitioning strategies built before the data model is proven.
- Weeks spent on infrastructure that would take hours with a simpler design.
Why it hurts: Premature optimization of architecture delays your first real learning, burns runway on speculation, and locks you into designs that may be wrong for the product shape that actually emerges. Start simple; complexity is usually far cheaper to add than to remove.