Modular Monolith & DevOps Architecture
A complete architectural blueprint for an Angular + .NET Core modular monolith — covering application structure, 20-repo strategy, GitOps, CI/CD pipeline, Helm, ArgoCD, Kubernetes, and production observability. Every decision is explained with its rationale.
Modular Monolith Design
Why not microservices, and why not a big ball of mud
A modular monolith delivers microservice-level domain separation without the operational overhead of inter-service networking, distributed tracing, and service mesh. For most teams up to ~30 engineers, it is the pragmatic default. It deploys as one unit, shares one database (with schema-level isolation per module), and can be sliced into microservices later along well-established seams — if that day ever comes.
- One deploy artifact — simpler CI/CD
- In-process calls — no network latency between modules
- Shared transaction boundary — ACID across modules
- Easier debugging — single process, single log stream
- Lower infra cost — fewer containers, no service mesh
- Enforce module boundaries via compiler — no cross-module internal references
- Each module owns its own database schema
- Cross-module communication via interfaces, not concrete classes
- Architecture tests (ArchUnitNET) fail the build on violations
- Module = vertical slice: API → Application → Domain → Infra
Module Boundaries
Domain-driven slicing of the .NET Core backend
Each module is a separate C# project within the same solution. Modules communicate only through published interfaces or an internal event bus — never by referencing each other's internals.
- For example, one module can expose an ICurrentUser interface to other modules.
- Events such as ProductUpdated travel over the internal event bus.
- A Shared project holds only value objects, interfaces, and contracts that multiple modules genuinely need. No business logic lives there. It is a contract library, not a dumping ground.
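The rule that modules talk only through published contracts or an internal event bus can be pictured with a small in-process publish/subscribe sketch. It is written in TypeScript for brevity rather than C#, and the InProcessEventBus class, the event shape, and the productId field are hypothetical names chosen for illustration, not part of the blueprint.

```typescript
// Hypothetical event contract; a type like this would live in the Shared project.
interface ProductUpdated {
  type: "ProductUpdated";
  productId: string; // assumed field, for illustration only
}

type AppEvent = ProductUpdated; // extend this union as more events are published
type Handler<E> = (event: E) => void;

// Minimal in-process bus: modules depend on the bus and on event contracts,
// never on each other's internals.
class InProcessEventBus {
  private readonly handlers: Record<string, Handler<AppEvent>[]> = {};

  subscribe(type: AppEvent["type"], handler: Handler<AppEvent>): void {
    const list = this.handlers[type] ?? [];
    list.push(handler);
    this.handlers[type] = list;
  }

  publish(event: AppEvent): void {
    // Handlers run synchronously, in subscription order, inside the same process.
    for (const handler of this.handlers[event.type] ?? []) {
      handler(event);
    }
  }
}

// Usage: one module publishes, another reacts, and neither imports the other.
const bus = new InProcessEventBus();
const reindexed: string[] = [];
bus.subscribe("ProductUpdated", (e) => {
  reindexed.push(e.productId);
});
bus.publish({ type: "ProductUpdated", productId: "p-42" });
```

Because the only shared types are the event contract and the bus, a module like this could later be moved behind a real message broker without changing its subscribers.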
Angular Frontend
Standalone SPA aligned to backend module boundaries
The Angular frontend follows a feature-module structure that mirrors the backend's domain modules. This is not cosmetic — it means each Angular feature team owns its slice end-to-end.
The source tree has four top-level folders: features/ (one folder per backend module), core/ (auth, HTTP interceptors, guards), shared/ (UI components, design system), and shell/ (app router, layout, nav).
- Lazy-loaded feature modules — Angular standalone APIs
- Built to static assets — served via Nginx container
- Runtime config via /assets/config.json: no rebuild per env
- CSP headers enforced at Nginx level
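Loading configuration from /assets/config.json at runtime could look roughly like the sketch below. The RuntimeConfig fields (apiBaseUrl, environment) and the injected fetchJson function are assumptions made for illustration; the blueprint does not specify the file's contents.

```typescript
// Hypothetical shape of /assets/config.json; the field names are assumptions.
const environments = ["dev", "staging", "prod"] as const;
type EnvironmentName = (typeof environments)[number];

interface RuntimeConfig {
  apiBaseUrl: string;
  environment: EnvironmentName;
}

function isEnvironmentName(value: unknown): value is EnvironmentName {
  return typeof value === "string" && (environments as readonly string[]).indexOf(value) !== -1;
}

// Validate untyped JSON before the application trusts it.
function parseRuntimeConfig(raw: unknown): RuntimeConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("config.json must contain a JSON object");
  }
  const { apiBaseUrl, environment } = raw as Record<string, unknown>;
  if (typeof apiBaseUrl !== "string" || apiBaseUrl.length === 0) {
    throw new Error("config.json: apiBaseUrl must be a non-empty string");
  }
  if (!isEnvironmentName(environment)) {
    throw new Error("config.json: environment must be dev, staging, or prod");
  }
  return { apiBaseUrl, environment };
}

// The fetcher is injected so the loader can run before the app bootstraps
// and can be exercised without a browser.
type JsonFetcher = (url: string) => Promise<unknown>;

async function loadRuntimeConfig(fetchJson: JsonFetcher): Promise<RuntimeConfig> {
  return parseRuntimeConfig(await fetchJson("/assets/config.json"));
}
```

Because the same static bundle reads a different file in each environment, the image built once in CI can be promoted unchanged from dev to production.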
20-Repo Strategy
How to organize 20 GitHub repos without chaos
With 20 repos and a modular monolith, a true monorepo (Nx/Turborepo) introduces significant tooling overhead for non-JS stacks. A polyrepo with strong conventions is simpler to reason about and gives cleaner CODEOWNERS boundaries, and each repo maps to a clear deployment artifact or library.
Repo Categories
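Each repo's workflow calls the shared pipeline with uses: org/platform-shared-ci/.github/workflows/build.yml@main. Because every repo references the same workflow at @main, a pipeline change made in one place reaches all 20 repos on their next run.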
Branch Strategy
Environment-aligned branching: Gitflow with trunk evolution
Pure trunk-based development works for teams with very high test confidence and feature flags. For a 20-repo modular monolith with a QA cycle and staging environment, modified Gitflow gives clearer environment promotion and safer hotfix paths. develop → release → main maps directly to dev → staging → production.
| Branch | Maps to Environment | Deploy Trigger | Protection Rules |
|---|---|---|---|
| main | Production | Manual approval gate after merge | 2 reviewers, CI must pass, no force-push |
| release/* | Staging | Auto on push to release branch | 1 reviewer, CI must pass |
| develop | Dev / Integration | Auto on merge | 1 reviewer, CI must pass |
| feature/* | None (PR preview optional) | CI only, no deploy | CI must pass before merge |
| hotfix/* | Production fast-track | Manual gate, expedited | 1 reviewer + senior lead |
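The branch table can be read as a lookup from branch name to deploy target. The sketch below encodes it that way; the decideDeploy function and its return shape are hypothetical helpers, not something the pipeline is stated to contain.

```typescript
type TargetEnvironment = "dev" | "staging" | "production";

interface DeployDecision {
  environment: TargetEnvironment | null; // null means CI only, no deploy
  trigger: "auto" | "manual-gate" | "none";
}

// Encodes the branch table: main and hotfix/* go to production behind a manual gate,
// release/* goes to staging, develop goes to dev, everything else only runs CI.
function decideDeploy(branch: string): DeployDecision {
  if (branch === "main" || branch.startsWith("hotfix/")) {
    return { environment: "production", trigger: "manual-gate" };
  }
  if (branch.startsWith("release/")) {
    return { environment: "staging", trigger: "auto" };
  }
  if (branch === "develop") {
    return { environment: "dev", trigger: "auto" };
  }
  return { environment: null, trigger: "none" }; // feature/* and unrecognised branches
}
```

Keeping the mapping in one function (or one reusable workflow input) avoids each of the 20 repos drifting toward its own interpretation of the table.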
GitOps Repo Decision
Should you have a dedicated gitops-config repo? Yes. Here's why.
The gitops-config repo is the single source of truth for what is deployed where. It contains Helm values files per environment, image tag references, and resource overrides. It does NOT contain application code, Dockerfiles, or business logic.
- Clean audit trail — every production change is a git commit
- ArgoCD watches only this repo — reduces blast radius of app repo access
- Environment promotion = PR + review, not a kubectl command
- Different access rights: devs push to app repos; only CD pipeline + SRE push to gitops repo
- No plaintext secrets ever live here, only references (Vault paths) or encrypted Sealed Secrets
├─ apps/
│  ├─ app-frontend/
│  │  ├─ dev.values.yaml
│  │  ├─ staging.values.yaml
│  │  └─ prod.values.yaml
│  └─ app-backend/ …
├─ clusters/
│  └─ dev/ staging/ prod/
└─ argocd-apps/
   └─ ApplicationSet manifests
Example flow: CI builds and pushes an image tagged sha-abc123 → CI opens an automated PR to gitops-config updating dev.values.yaml → the PR auto-merges for dev and requires review for staging/prod → ArgoCD detects the diff → ArgoCD reconciles the cluster. No human types kubectl commands.
Pipeline Overview
local dev → pre-commit → CI → image push → GitOps update → ArgoCD sync → K8s → alerts
Local Development
Fast feedback loop without a Kubernetes cluster on your laptop
Docker Compose spins up the full stack locally: .NET backend, Angular dev server (or built Nginx container), PostgreSQL, Redis, and any sidecar services. Developers should be able to run docker compose up and have a working environment in under 3 minutes.
- app-backend: hot-reload with dotnet watch
- app-frontend: Angular dev server with proxy to backend
- postgres — schema-per-module, seeded with test data
- redis — distributed cache
- mailhog — local email catcher
- seq — local structured log viewer
- One .env.local file, never committed
- tools-seed-data repo provides realistic dev data
- DB migrations run automatically on startup
- No need for cloud credentials for basic development
- Feature flags default to "on" in local env
Pre-commit Hooks
Catch problems before they reach CI, not after
CI minutes are expensive and slow. A pre-commit hook that runs in 10 seconds catches 80% of the obvious issues that would otherwise waste a 4-minute CI run. The golden rule: if it can run locally fast, it should run locally first.
| Hook | Tool | Checks |
|---|---|---|
| Format | dotnet format --verify-no-changes · ng lint + Prettier | Code style, whitespace, import order |
| Secrets scan | detect-secrets / gitleaks | API keys, connection strings, tokens in diff |
| Commit message | commitlint + conventional commits | feat(auth): add MFA support [JIRA-123] |
| Unit test smoke | dotnet test --filter Category=Unit (fast subset) | Run only tests touching changed projects |
| File size guard | Custom shell check | Reject files > 1MB committed by accident |
Keep hooks fast: slow hooks push developers to bypass them with git commit --no-verify. If a check needs 2+ minutes, it belongs in CI, not pre-commit.
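Two of the hooks in the table, the commit message check and the file size guard, are simple enough to sketch directly. In practice commitlint performs the first check; the regular expression below is only an approximation of the conventional-commit header format, and the StagedFile type and both function names are hypothetical.

```typescript
// Approximates the conventional-commit header: type, optional (scope), optional "!", then ": subject".
const conventionalHeader =
  /^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([\w-]+\))?!?: .+$/;

function isConventionalCommit(message: string): boolean {
  const header = message.split("\n")[0] ?? "";
  return conventionalHeader.test(header);
}

// Hypothetical description of a staged file; a real hook would get sizes from git.
interface StagedFile {
  path: string;
  sizeBytes: number;
}

const maxFileBytes = 1024 * 1024; // the 1MB limit from the table

// Returns the paths that the file size guard would reject.
function oversizedFiles(files: StagedFile[]): string[] {
  return files.filter((f) => f.sizeBytes > maxFileBytes).map((f) => f.path);
}
```

Both checks are pure string and number comparisons, which is what keeps them inside the few-seconds budget a pre-commit hook has.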
CI: Build + Test + Scan
GitHub Actions: everything happens here before an image is produced
Source code is already in GitHub. GitHub Actions is the zero-friction choice — no separate CI server to maintain, native secret management, reusable workflows, and tight integration with PRs, branch protection, and environments. The platform-shared-ci repo provides reusable workflow templates called by all 20 repos.
CI Pipeline Stages (GitHub Actions)
- Trigger: runs on pull requests targeting develop, release/*, or main. Also on push to develop (for auto-deploy to dev). Feature branch pushes run CI without deploy.
- Build: the backend runs dotnet restore + dotnet build --configuration Release /warnaserror, so warnings are errors. The frontend runs npm ci + ng build --configuration=production. Both run in parallel jobs.
- Unit tests: .NET runs dotnet test with XPlat Code Coverage. Angular runs ng test --watch=false --code-coverage. Coverage threshold enforced: fail CI below 80% on business logic projects. Test results posted as PR comment.
- Integration tests: tests tagged [Integration] run against a real database. They run on PRs to release/* and main, and are skipped on feature-branch PRs to keep feedback fast.
- Static analysis: dotnet format --verify-no-changes enforced. Angular ESLint checked. Security hotspots block merge on main.
- Time budget: a full run on a PR to main (including integration tests) should complete in under 15 minutes. Use matrix builds and parallel jobs aggressively.
Image Push
When, where, and how images are published
Since source is in GitHub, GitHub Container Registry (GHCR) is the natural choice — no extra credentials, native GITHUB_TOKEN auth, image visibility tied to repo visibility, and no additional monthly cost on GitHub Enterprise. For regulated environments needing image scanning at registry level, AWS ECR is an alternative with Clair/Inspector integration.
- sha-<7-char-git-sha>: primary immutable tag, always
- develop: latest develop build (mutable)
- staging: latest staging-ready build
- v1.4.2: on git tag, semantic version
- Never push :latest; it's a footgun in production
- Trivy scan on image before push — fail on CRITICAL CVEs
- Multi-stage Dockerfile — final image is distroless or Alpine
- Run as non-root user inside container
- Image SBOM (Software Bill of Materials) attached to image
- Image signed with cosign — verified at deploy time by K8s admission
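The tagging scheme can be expressed as a function from git ref and commit SHA to the list of tags to push. This is a sketch under two assumptions that the text does not state: that release/* branches are what produce the staging tag, and that version tags follow the vMAJOR.MINOR.PATCH form of the v1.4.2 example. The imageTags name is hypothetical.

```typescript
// ref is a full git ref such as "refs/heads/develop" or "refs/tags/v1.4.2".
function imageTags(ref: string, commitSha: string): string[] {
  // Primary immutable tag, always present.
  const tags = [`sha-${commitSha.slice(0, 7)}`];

  if (ref === "refs/heads/develop") {
    tags.push("develop"); // mutable pointer to the latest develop build
  }
  if (ref.startsWith("refs/heads/release/")) {
    tags.push("staging"); // assumption: release branches are "staging-ready"
  }
  if (ref.startsWith("refs/tags/")) {
    const tag = ref.slice("refs/tags/".length);
    if (/^v\d+\.\d+\.\d+$/.test(tag)) {
      tags.push(tag); // semantic version tag, for example v1.4.2
    }
  }
  return tags; // ":latest" is deliberately never produced
}
```

Only the sha- tag should ever be written into gitops-config values files; the mutable tags are conveniences for humans, not deployment references.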
GitOps Update
The CI pipeline opens a PR against gitops-config; it never directly deploys
The application CI pipeline's job ends when the image is pushed. It does NOT run kubectl apply or helm upgrade. Instead, it opens a PR (or auto-merges for dev) to the gitops-config repo, updating the image tag in the relevant values file. This separation is intentional: it decouples what was built from what is deployed, and makes every environment change auditable in git.
- Dev: CI auto-merges the tag update to gitops-config/apps/app-backend/dev.values.yaml. No human action needed. ArgoCD detects the change and syncs within ~3 minutes.
- Staging: CI opens a PR updating staging.values.yaml. Requires 1 review from a team lead. Merge triggers ArgoCD sync to staging cluster.
- Production: a PR updating prod.values.yaml requires 2 senior approvals + a deployment window comment. Merge is the approval record. ArgoCD sync follows, with a manual sync gate option for major releases.
- Rollback: revert the commit in gitops-config, restoring the previous image tag. ArgoCD reconciles. No custom rollback scripts needed. Full audit trail preserved.
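The promotion rules and the tag update itself can be sketched as data plus one text transformation. The sketch assumes each values file carries the image tag on a line of the form "tag: ..." under an image key, which the blueprint does not specify; the promotionPolicies table and setImageTag function are hypothetical names.

```typescript
type GitOpsEnv = "dev" | "staging" | "prod";

interface PromotionPolicy {
  autoMerge: boolean;
  requiredApprovals: number;
}

// Mirrors the rules in the text: dev auto-merges, staging needs 1 review, prod needs 2.
const promotionPolicies: Record<GitOpsEnv, PromotionPolicy> = {
  dev: { autoMerge: true, requiredApprovals: 0 },
  staging: { autoMerge: false, requiredApprovals: 1 },
  prod: { autoMerge: false, requiredApprovals: 2 },
};

// Rewrites the first "tag:" line of a values file, leaving everything else untouched.
// Assumption: the file has exactly one image tag line of this shape.
function setImageTag(valuesYaml: string, newTag: string): string {
  return valuesYaml.replace(
    /^([ \t]*tag:[ \t]*).*$/m,
    (_line: string, prefix: string) => prefix + newTag,
  );
}

// Usage: the content CI would commit to dev.values.yaml for a new build.
const before = "image:\n  repository: ghcr.io/org/app-backend\n  tag: sha-0000000\nreplicaCount: 2\n";
const after = setImageTag(before, "sha-abc1234");
```

A rollback is the inverse of the same edit: reverting the commit restores the previous tag line, and the policy table decides whether that revert needs review.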
ArgoCD: GitOps Continuous Delivery
Why ArgoCD over Flux or push-based CD
ArgoCD is one of the most widely adopted GitOps controllers for Kubernetes. Its UI makes drift visible at a glance. It is pull-based: the cluster pulls desired state from git rather than CI pushing manifests in. This means no CI system ever needs kubectl credentials to the production cluster, which is a major security win. Flux is pull-based as well, but ArgoCD's UI and ApplicationSet patterns are better suited to an organization with 20 repos.
ArgoCD Setup Decisions
Use an ApplicationSet with a Git generator to auto-create ArgoCD Application objects from directories in gitops-config/apps/. Adding a new service = adding a folder. No manual ArgoCD configuration needed per service.
Kubernetes
Cluster design, namespace strategy, and workload configuration
- Dev cluster — shared, smaller nodes, preemptible/spot instances
- Staging cluster — mirrors prod sizing, same region, used for load tests
- Prod cluster — dedicated, multi-AZ, managed (EKS/AKS/GKE)
- Separate cluster per environment — not namespace-only separation
Namespaces: apps (all application workloads), platform (ArgoCD, cert-manager, ingress-nginx), monitoring (Prometheus, Grafana, Loki), and secrets (External Secrets Operator).
Standard Workload Configuration per Service
| Concern | Decision | Reason |
|---|---|---|
| Health checks | Liveness + readiness + startup probes | Startup probe prevents premature kill during .NET startup |
| Resource limits | CPU request ≠ limit, Memory request = limit | CPU bursting OK; OOM kill is preferable to slow response |
| HPA | Horizontal Pod Autoscaler on CPU + custom metric (request rate) | Scale out under load before hitting CPU ceiling |
| PodDisruptionBudget | minAvailable: 1 on prod | Prevents cluster upgrades taking down all pods |
| Anti-affinity | Prefer spread across nodes and AZs | Single node failure shouldn't take down service |
| Image pull policy | IfNotPresent with immutable tags | Never pull :latest — non-deterministic |
| Network policy | Default deny, explicit allow per service | Module A cannot talk to Module B's DB pod |
Observability & Alerts
Metrics → Logs → Traces → Alerts: the four pillars
Metrics: the .NET backend exports metrics through OpenTelemetry.Exporter.Prometheus. Angular performance is captured via RUM. Grafana dashboards live in the ops-monitoring repo, provisioned as code.
Alerts: alert rules live in the ops-monitoring repo and are applied via Helm chart.
What to Alert On (Signals, Not Symptoms)
| Signal | Threshold | Severity |
|---|---|---|
| HTTP 5xx error rate | > 1% over 5 min window | P2 — Slack + PagerDuty |
| HTTP p99 latency | > 2s sustained 10 min | P2 — Slack |
| Pod crash loop | 3+ restarts in 15 min | P1 — PagerDuty immediate |
| ArgoCD sync failed | Any failure on prod | P2 — Slack |
| Certificate expiry | < 30 days remaining | P3 — Slack warning |
| DB connection pool saturation | > 80% pool used | P2 — Slack |
| Disk usage on PVC | > 80% of claim | P3 — Slack |
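Two rows of the alert table reduce to simple arithmetic, sketched below to make the thresholds concrete. In this architecture the real rules would be Prometheus alert rules routed by Alertmanager; the TypeScript functions and the WindowCounts type are hypothetical and only illustrate the comparisons.

```typescript
// Request counts observed over the 5-minute evaluation window.
interface WindowCounts {
  total: number;
  serverErrors: number; // responses with a 5xx status
}

// "HTTP 5xx error rate > 1% over 5 min window"
function errorRateBreached(counts: WindowCounts, threshold = 0.01): boolean {
  if (counts.total === 0) {
    return false; // no traffic means no error-rate signal
  }
  return counts.serverErrors / counts.total > threshold;
}

// "Pod crash loop: 3+ restarts in 15 min"
function crashLooping(restartTimesMs: number[], nowMs: number): boolean {
  const windowMs = 15 * 60 * 1000;
  const recent = restartTimesMs.filter((t) => t <= nowMs && nowMs - t <= windowMs);
  return recent.length >= 3;
}
```

Note that the error-rate check is a ratio, not a count: 11 errors in 1,000 requests fires, while 11 errors in 100,000 requests does not.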
Why Helm
The definitive answer to "why not just raw YAML?"
With 20 repos and 3 environments (dev, staging, prod), raw Kubernetes YAML means maintaining 3× duplicated manifests per service — 60+ manifest sets. Helm solves environment parameterization: one chart template, three values files, one source of truth.
- Templating — one chart, values files per environment. No YAML duplication.
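- Release management: helm history shows every deploy with rollback support when charts are installed with the Helm CLI; under ArgoCD, which renders charts as templates, the equivalent history is the gitops-config commit log
- Dependency management: subchart dependencies versioned like npm packages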
- ArgoCD native — ArgoCD understands Helm natively, renders values before applying
- Community charts — PostgreSQL, Redis, Ingress-NGINX, Cert-Manager all ship as production-grade Helm charts
├─ Chart.yaml (name, version, appVersion)
├─ values.yaml (sensible defaults)
└─ templates/
   ├─ deployment.yaml
   ├─ service.yaml
   ├─ ingress.yaml
   ├─ hpa.yaml
   ├─ configmap.yaml
   └─ pdb.yaml
Charts live in the application repo they deploy (for example app-backend/charts/). The gitops-config repo references these charts by version and overrides only environment-specific values. This keeps chart logic close to the code it deploys, while keeping environment config separate.
Kustomize is a valid alternative (and is ArgoCD-native too). It is better when you have simple overlays and no need for parameterized logic. Helm wins when you have complex conditionals (HPA only in prod, different ingress annotations per env, feature flag toggles). For this architecture with 3 environments and 20 services, Helm's templating capability justifies the learning curve.
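The relationship between chart defaults in values.yaml and a per-environment values file can be pictured as a recursive merge: nested maps are merged key by key, while scalars and lists in the override replace the default. The sketch below is a simplified model of that behaviour (Helm additionally removes a key when the override is null, which is omitted here); the mergeValues function and the sample keys are hypothetical.

```typescript
type Values = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Values {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Maps merge recursively; scalars and lists from the override replace the default.
function mergeValues(defaults: Values, overrides: Values): Values {
  const result: Values = { ...defaults };
  for (const key of Object.keys(overrides)) {
    const base = result[key];
    const over = overrides[key];
    result[key] = isPlainObject(base) && isPlainObject(over) ? mergeValues(base, over) : over;
  }
  return result;
}

// Usage: chart defaults plus a prod override that sets only what differs.
const chartDefaults: Values = {
  replicaCount: 1,
  image: { repository: "ghcr.io/org/app-backend", tag: "sha-0000000" },
  hpa: { enabled: false },
};
const prodOverrides: Values = {
  replicaCount: 3,
  image: { tag: "sha-abc1234" },
  hpa: { enabled: true },
};
const prodValues = mergeValues(chartDefaults, prodOverrides);
```

The prod file never repeats image.repository, which is the point of the split: each environment file states only its differences from the chart's defaults.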
Tool Decisions Summary
Every choice with its rationale, no cargo-culting
| Stage | Tool Chosen | Why This, Not Alternatives |
|---|---|---|
| Source control | GitHub | Stated requirement. GitHub Actions native CI, GHCR, branch protection, CODEOWNERS all in one platform. |
| CI/CD engine | GitHub Actions | Source is GitHub — zero friction. Reusable workflows (platform-shared-ci) serve as DRY CI templates. No separate Jenkins/GitLab to maintain. |
| Container registry | GHCR | Native GitHub auth, no extra service. ECR if AWS-native is required with Inspector scanning. |
| Image security scan | Trivy | Open source, fast, scans OS packages + language deps, integrates with GitHub Actions via action, widely adopted. |
| Static analysis | SonarCloud + Roslynator | SonarCloud for security hotspots + code smells + PR decoration. Roslynator for .NET-specific patterns at build time. Both are complementary. |
| GitOps controller | ArgoCD | Best UI for drift visibility, ApplicationSet for multi-app management, mature RBAC, active community. Flux is a valid alternative, but ArgoCD's UI wins for teams new to GitOps. |
| K8s packaging | Helm | Environment parameterization, rich community charts, ArgoCD native. Kustomize for simpler single-environment projects. |
| Secrets | External Secrets Operator | Pulls secrets from AWS Secrets Manager / Azure Key Vault at runtime. Sealed Secrets if no cloud secret store. Never store secrets in gitops-config. |
| Metrics | Prometheus + Grafana | Industry standard. OpenTelemetry SDK in .NET exports to Prometheus. Grafana dashboards as code in ops-monitoring repo. |
| Logs | Loki + Grafana | Same Grafana instance queries both metrics and logs. Loki is cost-effective vs Elasticsearch for structured log querying. |
| Tracing | Tempo | Grafana native, same observability stack. OpenTelemetry traces from .NET → Tempo. Jaeger is an alternative. |
| Alerting | Alertmanager → PagerDuty + Slack | Prometheus Alertmanager routes by severity. PagerDuty for P1/P2 on-call escalation. Slack for P3/P4 visibility. |
| Local dev | Docker Compose | No Kubernetes overhead on developer laptops. Full stack in one command. Tilt.dev optional for hot-reload with K8s. |
| E2E testing | Playwright | Cross-browser, fast, great Angular support, component testing mode. Lives in test-e2e repo, runs against deployed staging. |
| Image signing | cosign (Sigstore) | Keyless signing via GitHub OIDC. Supply chain security without managing private keys. Verified at admission by Kyverno. |
Environment Strategy
Dev, Staging, Production, and how they differ
| Environment | Cluster | Deploy Trigger | ArgoCD Sync | Data | Purpose |
|---|---|---|---|---|---|
| Local | Docker Compose | Developer runs manually | N/A | Seeded synthetic | Fast iteration, debugging |
| Dev | Shared K8s (small) | Merge to develop | Auto + self-heal | Seeded synthetic | Integration smoke tests |
| Staging | Dedicated K8s (prod-size) | Release branch push + PR approval | Auto, no self-heal | Anonymized prod copy | QA, load testing, UAT |
| Production | Dedicated K8s (HA, multi-AZ) | Merge to main + manual gitops PR | Manual sync only | Live production data | Serve real users |
Developers have no kubectl exec access to production pods. All production changes go through the gitops-config PR process. Break-glass emergency access exists for SRE leads only, with session recording and automatic ticket creation. Every kubectl command in production is a post-incident finding waiting to happen.