Enterprise Architecture Reference

GraphQL Handbook

A practical, decision-first guide to GraphQL — covering what it is, when to adopt it (and when not to), common ARB objections with honest answers, schema design patterns, and production examples.

Query Language · Type System · Federation · ARB Guide · Subscriptions

🔷 What Is GraphQL

A query language for your API and a runtime for executing those queries — not a database, not a REST replacement in all cases, not a silver bullet.

GraphQL is a specification (not a framework or library) originally developed by Facebook in 2012 and open-sourced in 2015. It gives clients the power to ask for exactly the data they need — no more, no less — and describes the shape of that data via a strongly-typed schema.

Unlike REST, where the server defines fixed endpoints returning fixed shapes, GraphQL exposes a single endpoint through which clients express their own data requirements declaratively. The server schema acts as a contract between client and server teams.

Single Endpoint

One HTTP endpoint (typically POST /graphql). The operation type — query, mutation, subscription — is part of the request body, not the URL.

Typed Schema

Every field has a type. Types, queries, mutations, and relationships are declared in Schema Definition Language (SDL). The schema is introspectable at runtime.

Client-Driven

Clients declare the shape of the response they want. No over-fetching 50 fields when you need 3. No under-fetching that forces multiple round-trips.
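
To make the single-endpoint and client-driven ideas concrete, here is a minimal sketch of the JSON body a client would POST to /graphql. The helper name buildGraphQLRequest and the order id are illustrative assumptions, not part of any library.

```typescript
// Shape of a GraphQL-over-HTTP request body (JSON).
interface GraphQLRequestBody {
  query: string;
  operationName?: string;
  variables?: Record<string, unknown>;
}

interface HttpRequestOptions {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds fetch() options for a single-endpoint GraphQL call.
// The operation type (query/mutation) lives inside `query`, never in the URL.
function buildGraphQLRequest(body: GraphQLRequestBody): HttpRequestOptions {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  };
}

const request = buildGraphQLRequest({
  operationName: 'GetOrderSummary',
  // The client lists exactly the fields it wants back.
  query: `query GetOrderSummary($id: ID!) {
    order(id: $id) { id status total }
  }`,
  variables: { id: 'ord_123' },
});

// A real client would now call: fetch('/graphql', request)
console.log(request.body);
```

A different client can send a different selection set to the same URL; nothing on the server changes.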

GraphQL vs REST vs gRPC — at a glance

Dimension	GraphQL	REST	gRPC
Transport	HTTP/1.1, HTTP/2, WS	HTTP/1.1, HTTP/2	HTTP/2 (required)
Schema / Contract	Strongly typed SDL	OpenAPI (optional)	Protobuf (required)
Over/Under-fetching	Eliminated by design	Common problem	Controlled via proto
Versioning	Schema evolution (no v2)	/v1, /v2 or headers	Proto evolution
Real-time	Subscriptions (WS/SSE)	Polling or SSE	Streaming native
Browser native	Yes (HTTP + JSON)	Yes	Requires gRPC-Web proxy
Learning curve	Medium (SDL + concepts)	Low	High (protobuf, streaming)
Caching	Complex (client-side)	HTTP cache native	Manual
Best fit	Multi-client APIs, BFF	Public APIs, simple CRUD	Internal microservices

🧱 Core Concepts

The five building blocks you need to understand before designing or reviewing a GraphQL API.

Schema Definition Language (SDL)

GraphQL SDL schema.graphql

# Scalar types: String, Int, Float, Boolean, ID (built-in)
# Custom scalars for domain types
scalar DateTime
scalar Money

# Object type — a node in your graph
type Order {
  id:          ID!                  # ! = non-null
  placedAt:    DateTime!
  status:      OrderStatus!
  total:       Money!
  customer:    Customer!           # relationship — resolved separately
  lineItems:   [LineItem!]!         # non-null list of non-null items
}

# Enum
enum OrderStatus {
  PENDING
  PROCESSING
  SHIPPED
  DELIVERED
  CANCELLED
}

# Input type — used for mutation arguments (never reuse output types)
input PlaceOrderInput {
  customerId: ID!
  items:      [OrderItemInput!]!
  couponCode: String              # nullable = optional
}

# Root types
type Query {
  order(id: ID!):          Order
  orders(filter: OrderFilter, page: PageInput): OrderConnection!
  me:                       Customer
}

type Mutation {
  placeOrder(input: PlaceOrderInput!): PlaceOrderPayload!
  cancelOrder(id: ID!):              CancelOrderPayload!
}

type Subscription {
  orderStatusChanged(orderId: ID!): OrderStatusEvent!
}

Queries — asking for data

✓ Client query — asks exactly what it needs

# Mobile app — minimal payload
query GetOrderSummary($id: ID!) {
  order(id: $id) {
    id
    status
    total
    placedAt
  }
}

# Dashboard — richer shape, same root field
query GetOrderDetail($id: ID!) {
  order(id: $id) {
    id
    status
    total
    customer { name email }
    lineItems {
      sku
      quantity
      unitPrice
    }
  }
}

✕ REST equivalent — over-fetches for mobile

# REST: GET /orders/:id
# Returns everything regardless of client
{
  "id": "ord_123",
  "status": "SHIPPED",
  "total": 149.99,
  "customer": {
    "id": "...",
    "name": "...",
    "email": "...",
    "address": { /* ...8 fields */ },
    "preferences": { /* ...12 fields */ }
  },
  "lineItems": [ /* ...full objects */ ],
  "auditLog": [ /* mobile doesn't need this */ ]
}

Resolvers — where data actually comes from

Each field in the schema maps to a resolver function. Resolvers can fetch from databases, microservices, caches, or compute values inline. The schema is the contract; resolvers are the implementation.

TypeScript — Apollo Server resolver

import { GraphQLError } from 'graphql';

const resolvers = {
  Query: {
    // Root resolver — fetches a single order
    order: async (_parent, { id }, { dataSources }) => {
      return dataSources.orderAPI.getById(id);
    },
  },

  Order: {
    // Field resolver: runs only when the client selects `customer`
    customer: async (order, _args, { dataSources }) => {
      return dataSources.customerAPI.getById(order.customerId);
    },
  },

  Mutation: {
    placeOrder: async (_parent, { input }, { dataSources, user }) => {
      // Authentication check. Apollo Server 4 (@apollo/server) no longer exports
      // AuthenticationError, so throw a GraphQLError with an error code instead.
      if (!user) {
        throw new GraphQLError('Login required', {
          extensions: { code: 'UNAUTHENTICATED' },
        });
      }
      const order = await dataSources.orderAPI.place(input, user.id);
      // `errors` is non-null in PlaceOrderPayload, so always return a list
      return { order, success: true, errors: [] };
    },
  },
};

⚠️

N+1 problem: Naive resolvers trigger one DB query per item in a list. A query returning 100 orders and loading each customer fires 101 queries (1 for the list, 100 for the customers). Use DataLoader (batching + per-request memoization) to collapse the 100 customer lookups into a single batched query, for 2 queries in total. This is the single most important performance concept in GraphQL.

✅ When to Use GraphQL

GraphQL earns its complexity overhead in specific scenarios. Use this checklist before proposing adoption.

Multiple clients with different needs

Web, mobile, TV, partner portals — each consuming a different shape from the same domain data. GraphQL eliminates the BFF proliferation problem: one schema, clients self-select their payload.

Rapid frontend iteration

Product teams iterating on UI without waiting for backend changes to add/remove fields. Clients evolve queries independently as long as the schema supports the fields.

Aggregating multiple services

GraphQL Federation or schema stitching lets you present a unified graph across Order, Customer, Inventory, and Billing services — without a bespoke aggregation microservice per use case.

Deeply nested or graph-like data

Social graphs, product catalogs with variants, org hierarchies, knowledge graphs — any domain where relationships matter and REST leads to N+1 round-trips or brittle ?include= parameters.

Teams wanting schema-first contracts

The SDL is a machine-readable, version-controlled contract. Frontend and backend teams can agree on schema, generate types, and work in parallel before implementation is complete.

Real-time requirements alongside queries

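When you need live data updates (order tracking, dashboards, notifications) alongside regular data fetching, GraphQL subscriptions keep real-time events in the same schema, type system, and client tooling as queries. They still run over WebSocket or SSE, but you avoid designing a separate, untyped event API.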

Fitness signals — score your project

Signal	Weight	Interpretation
3+ distinct clients consuming the same domain data	STRONG YES	Core value proposition of GraphQL
Frontend teams blocked by backend for field additions	STRONG YES	Schema evolution solves this exactly
Multiple REST calls chained to build a single view	STRONG YES	GraphQL collapses to a single round-trip
Existing microservices team wants to own sub-graphs	YES	Federation pattern applies well
Partner / public API with diverse consumers	CONSIDER	Powerful but introspection exposure risk
Purely internal service, single consumer	SKIP	gRPC or REST is simpler and faster
Simple CRUD with predictable access patterns	SKIP	REST + OpenAPI is the right tool

🚫 When NOT to Use GraphQL

These are the honest contraindications — and what to reach for instead. Being clear here is how you win ARB trust.

Simple CRUD services

A service that does GET /users/:id, POST /users, PUT /users/:id, DELETE /users/:id has no ambiguity. REST is explicit, cacheable, and universally understood. GraphQL adds resolver infrastructure for zero benefit.

Use instead: REST + OpenAPI / Swagger

High-throughput, low-latency internal services

GraphQL over HTTP adds parsing overhead, query validation, and resolver chains. For internal microservice-to-microservice calls where you control both sides and need sub-millisecond latency, gRPC with binary Protobuf encoding is usually faster (3–5× is a commonly quoted figure; measure it for your own workload).

Use instead: gRPC / Protocol Buffers

Public APIs requiring HTTP caching

GraphQL's single POST endpoint is opaque to HTTP caches (CDNs, proxies, Varnish). Implementing GET-based persisted queries helps, but it's a workaround. REST with proper Cache-Control semantics is far simpler to cache at the edge.

Use instead: REST, or GraphQL with Automatic Persisted Queries (APQ)

Teams without schema discipline

GraphQL's value is the contract. If no one owns the schema, if resolvers call each other recursively, if deprecations aren't enforced — you get a worse REST API. The tool requires process maturity, not just technology adoption.

Use instead: REST until the team establishes API governance

File upload heavy workflows

GraphQL has no standard for binary/multipart payloads. The community-spec workaround (multipart/form-data with operations JSON) is clunky and not supported by all clients. REST is the right fit for file-centric APIs.

Use instead: REST (presigned URLs + S3 / blob store)

Single-consumer APIs

GraphQL's flexibility (clients shaping responses) provides no value when there is exactly one client. You're adding schema infrastructure, resolver overhead, and tooling cost with no payoff. The problem GraphQL solves simply doesn't exist.

Use instead: REST or gRPC, depending on whether simplicity or raw performance matters more

⛔

The "we might need it someday" trap. GraphQL is often proposed preemptively for flexibility. Flexibility has a cost — schema ownership, DataLoader wiring, federation topology, security analysis of arbitrary queries. Start with REST; migrate to GraphQL when the multi-client problem is actually present, not anticipated.

🏛️ ARB Pushback — Objections & Answers

These are the objections most Architecture Review Boards raise about GraphQL proposals. Honest, documented answers — not spin.

ARB #1

"GraphQL allows clients to run arbitrary queries — this is a security risk we can't accept."

This is a real concern, and it's addressable. Without controls, a deeply nested or very wide query can trigger resolver chains that exhaust CPU and memory (a query depth or query complexity attack; alias flooding is a related variant).

Mitigations that must be in place before production:

Query depth limiting — reject queries beyond N levels deep (e.g., depth 7). Available as a validation rule for Apollo Server (e.g., graphql-depth-limit) and as a built-in option in Hot Chocolate and Strawberry.
Query complexity scoring — assign a cost to each field; reject if total exceeds budget.
Persisted queries / trusted documents — for known clients (web/mobile apps), only allow a pre-registered whitelist of operation hashes. Arbitrary queries are disabled entirely.
Disable introspection in production — schema exposure aids attackers. Introspection is for developer tooling only.
Rate limiting per client / IP — same as any API.

With persisted queries, clients cannot send arbitrary operations at all. The "arbitrary query" concern is a non-issue for first-party clients using this pattern.
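
The depth and complexity checks listed above can be sketched without any library. This is a simplified illustration: the FieldNode tree is a hypothetical stand-in for the parsed GraphQL AST, and the cost model (own cost plus children, multiplied by expected list size) is one common convention, not the algorithm of a specific package.

```typescript
// Hypothetical, simplified selection tree; a real server walks the GraphQL AST.
interface FieldNode {
  name: string;
  cost?: number;      // per-field cost, defaults to 1
  listSize?: number;  // expected item count when the field returns a list
  children?: FieldNode[];
}

// Depth = number of nested selection levels below the operation root.
function depthOf(fields: FieldNode[]): number {
  if (fields.length === 0) return 0;
  return 1 + Math.max(...fields.map((f) => depthOf(f.children ?? [])));
}

// Cost = (own cost + cost of children) multiplied by the expected list size.
function costOf(fields: FieldNode[]): number {
  return fields.reduce((sum, f) => {
    const own = f.cost ?? 1;
    const nested = costOf(f.children ?? []);
    return sum + (own + nested) * (f.listSize ?? 1);
  }, 0);
}

function validate(fields: FieldNode[], maxDepth: number, maxCost: number): string[] {
  const errors: string[] = [];
  const depth = depthOf(fields);
  const cost = costOf(fields);
  if (depth > maxDepth) errors.push(`Query depth ${depth} exceeds limit ${maxDepth}`);
  if (cost > maxCost) errors.push(`Query cost ${cost} exceeds budget ${maxCost}`);
  return errors;
}

// orders(first: 100) { id customer { name orders(first: 100) { id } } }
const nestedQuery: FieldNode[] = [
  {
    name: 'orders',
    listSize: 100,
    children: [
      { name: 'id' },
      {
        name: 'customer',
        children: [
          { name: 'name' },
          { name: 'orders', listSize: 100, children: [{ name: 'id' }] },
        ],
      },
    ],
  },
];

console.log(validate(nestedQuery, 7, 1000));
```

The example query is only 4 levels deep, so a depth limit of 7 lets it through, but the two nested lists push its cost to 20400, far over a budget of 1000. That is why depth limiting and complexity scoring are listed as separate controls.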

ARB #2

"How do we version this API? What happens when the schema changes?"

GraphQL is designed to evolve without versioning. The philosophy is "schema evolution, not version proliferation." New fields are additive and non-breaking. Old fields are @deprecated(reason: "...") and kept alive until usage drops to zero.

Practices that make evolution safe:

Never remove a field without a deprecation period (monitor usage via field-level tracing).
Never change a field's type from non-null to nullable or change its semantic meaning.
Use tooling like GraphQL Inspector in CI to detect breaking changes before merge.
Federation supports schema registry with compatibility checks (Apollo's Schema Registry / Cosmo).

When a truly breaking change is unavoidable, the recommended pattern is adding a new root field alongside the old one — not a /v2 endpoint — and migrating clients before removing the old field.
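
The evolution rules above are mechanical enough to check in CI. The sketch below is a hand-rolled illustration of that idea, not how GraphQL Inspector is implemented: SchemaSnapshot is a hypothetical flattened view of output fields ("Type.field" mapped to its SDL type), and only the three rules named in the comments are covered.

```typescript
// Hypothetical snapshot of output fields: "Type.field" -> type in SDL notation.
type SchemaSnapshot = Record<string, string>;

interface BreakingChange {
  field: string;
  reason: string;
}

const stripNonNull = (type: string): string => type.replace(/!$/, '');

function findBreakingChanges(oldSchema: SchemaSnapshot, newSchema: SchemaSnapshot): BreakingChange[] {
  const changes: BreakingChange[] = [];
  for (const [field, oldType] of Object.entries(oldSchema)) {
    const newType = newSchema[field];
    if (newType === undefined) {
      // Rule 1: removing a field breaks every client that selects it.
      changes.push({ field, reason: 'field removed' });
    } else if (oldType.endsWith('!') && newType === stripNonNull(oldType)) {
      // Rule 2: clients that relied on non-null now have to handle null.
      changes.push({ field, reason: 'non-null became nullable' });
    } else if (stripNonNull(newType) !== stripNonNull(oldType)) {
      // Rule 3: a different underlying type changes the response shape.
      changes.push({ field, reason: `type changed from ${oldType} to ${newType}` });
    }
  }
  // Fields that exist only in newSchema are additive and therefore safe.
  return changes;
}

const before: SchemaSnapshot = {
  'Order.id': 'ID!',
  'Order.total': 'Money!',
  'Order.notes': 'String',
  'Order.legacyCode': 'String',
};

const after: SchemaSnapshot = {
  'Order.id': 'ID!',
  'Order.total': 'Money',        // breaking: non-null -> nullable
  'Order.notes': 'String!',      // safe for output fields: nullable -> non-null
  'Order.trackingUrl': 'String', // safe: additive
};

console.log(findBreakingChanges(before, after));
```

Note that the nullability rule is reversed for input types and arguments, which this sketch does not model.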

ARB #3

"We can't cache GraphQL responses — this will kill our CDN strategy."

Partially true, fully solvable. Standard HTTP GET caching doesn't work for POST /graphql. However:

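Automatic Persisted Queries (APQ) — the client first sends only a SHA-256 hash of the query. If the server does not recognize the hash, it answers with a PersistedQueryNotFound error and the client retries once with the full query plus the hash, which the server stores. Every later request sends the hash alone and can use GET, so CDNs can cache the response.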
Response caching at the resolver level — Apollo Server, Hot Chocolate, and Strawberry support @cacheControl hints per field/type, which cache at the application layer.
DataLoader — per-request memoization eliminates redundant DB calls within a single operation (not HTTP caching, but equally important for performance).

If CDN-level caching of individual resource responses is the primary requirement, REST is genuinely simpler. GraphQL caching is application-level, not network-level by default.
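
The APQ handshake is easier to reason about as code. This is a minimal server-side sketch under stated assumptions: the in-memory Map stands in for a shared cache, resolveApq and the HASH_MISMATCH code are hypothetical names, and only PERSISTED_QUERY_NOT_FOUND mirrors the error code Apollo uses.

```typescript
import { createHash } from 'node:crypto';

const sha256 = (text: string): string =>
  createHash('sha256').update(text).digest('hex');

interface ApqRequest {
  sha256Hash: string;
  query?: string; // omitted on hash-only requests
}

type ApqResult =
  | { kind: 'execute'; query: string }
  | { kind: 'error'; code: 'PERSISTED_QUERY_NOT_FOUND' | 'HASH_MISMATCH' };

// In-memory store for the sketch; real deployments share a cache across instances.
const queryStore = new Map<string, string>();

function resolveApq(req: ApqRequest): ApqResult {
  if (req.query === undefined) {
    // Hash-only request: succeeds only if the hash was registered earlier.
    const known = queryStore.get(req.sha256Hash);
    return known !== undefined
      ? { kind: 'execute', query: known }
      : { kind: 'error', code: 'PERSISTED_QUERY_NOT_FOUND' };
  }
  // Full request: verify the hash before storing, so a client cannot
  // register one query under another query's hash.
  if (sha256(req.query) !== req.sha256Hash) {
    return { kind: 'error', code: 'HASH_MISMATCH' };
  }
  queryStore.set(req.sha256Hash, req.query);
  return { kind: 'execute', query: req.query };
}

const meQuery = '{ me { name } }';
const meHash = sha256(meQuery);

const firstAttempt = resolveApq({ sha256Hash: meHash });                 // miss
const registration = resolveApq({ sha256Hash: meHash, query: meQuery }); // stores the query
const laterRequest = resolveApq({ sha256Hash: meHash });                 // hit, GET-friendly

console.log(firstAttempt.kind, registration.kind, laterRequest.kind);
```

Because any client can register any query this way, APQ is a bandwidth and caching optimization. It does not restrict which operations may run.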

ARB #4

"What's the operational overhead? Our team knows REST; this adds unknown cost."

Honest answer: there is real overhead. Teams new to GraphQL should expect 4–8 additional weeks of ramp-up for the first production deployment.

Incremental costs vs REST:

Schema design requires more upfront thought (but pays back with fewer API change requests later).
DataLoader patterns must be learned and enforced to prevent N+1 in production.
Observability requires field-level tracing (Apollo Studio, Cosmo, or OpenTelemetry with custom resolvers).
Security analysis of query complexity and depth must be built into your gateway config.

This cost is justified when GraphQL's strengths (multi-client, federation, schema contracts) apply. If the project doesn't clearly exhibit those needs, the overhead is pure tax.

ARB #5

"Our API must be public-facing and consumed by third parties. Is GraphQL appropriate?"

Use with caution for truly public APIs. GraphQL works well for partner APIs with known consumers, but for an open developer ecosystem, REST has significant advantages:

REST is universally understood; GraphQL client libraries vary by platform.
OpenAPI tooling for REST SDKs, documentation, and testing is more mature.
Introspection must be disabled in prod (complicates external developer experience).

Best of both: expose a REST public API for the ecosystem; use GraphQL internally or for known partners through a dedicated partner portal with controlled tooling.

📐 Schema Design

The schema is the most important artifact in a GraphQL system. Design mistakes here are expensive to undo.

Relay-compatible Connections (pagination)

Use cursor-based pagination via the Relay Connection spec. Offset-based pagination skips or repeats items when records are inserted or deleted between page requests. The pattern is widely supported by GraphQL clients and federation routers.

GraphQL SDL — Connections

# Standard connection pattern — don't invent your own pagination
type OrderConnection {
  edges:    [OrderEdge!]!
  pageInfo: PageInfo!
  totalCount: Int!
}

type OrderEdge {
  cursor: String!
  node:   Order!
}

type PageInfo {
  hasNextPage:     Boolean!
  hasPreviousPage: Boolean!
  startCursor:     String
  endCursor:       String
}

type Query {
  # first/after for forward, last/before for backward pagination
  orders(
    first:  Int
    after:  String
    last:   Int
    before: String
    filter: OrderFilter
  ): OrderConnection!
}
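
A resolver for this connection could look roughly like the sketch below. It is an illustration under assumptions: the in-memory array stands in for a database query ordered by id, MAX_PAGE_SIZE is a hypothetical server-side cap, and the cursor is a plain prefixed id (production cursors are usually base64-encoded so clients treat them as opaque). Only forward pagination (first/after) is shown.

```typescript
interface Edge<T> {
  cursor: string;
  node: T;
}

interface Connection<T> {
  edges: Edge<T>[];
  pageInfo: {
    hasNextPage: boolean;
    hasPreviousPage: boolean;
    startCursor: string | null;
    endCursor: string | null;
  };
  totalCount: number;
}

const MAX_PAGE_SIZE = 100; // hypothetical cap against unbounded result sets

const encodeCursor = (id: string): string => `order:${id}`;
const decodeCursor = (cursor: string): string => cursor.replace(/^order:/, '');

function paginate<T extends { id: string }>(rows: T[], first: number, after?: string): Connection<T> {
  const size = Math.min(Math.max(first, 0), MAX_PAGE_SIZE);
  // Start right after the row the cursor points at.
  // A missing or unknown cursor starts from the beginning.
  const start = after !== undefined
    ? rows.findIndex((row) => row.id === decodeCursor(after)) + 1
    : 0;
  const edges = rows
    .slice(start, start + size)
    .map((node) => ({ cursor: encodeCursor(node.id), node }));
  return {
    edges,
    pageInfo: {
      hasNextPage: start + size < rows.length,
      hasPreviousPage: start > 0,
      startCursor: edges.length > 0 ? edges[0].cursor : null,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
    },
    totalCount: rows.length,
  };
}

const allOrders = ['o1', 'o2', 'o3', 'o4', 'o5'].map((id) => ({ id }));

const pageOne = paginate(allOrders, 2);
const pageTwo = paginate(allOrders, 2, pageOne.pageInfo.endCursor ?? undefined);

// Simulate o1 being deleted between the two page requests.
const pageTwoAfterDelete = paginate(allOrders.slice(1), 2, pageOne.pageInfo.endCursor ?? undefined);

console.log(pageTwo.edges.map((e) => e.node.id), pageTwoAfterDelete.edges.map((e) => e.node.id));
```

Both versions of page two contain o3 and o4. With offset pagination (skip 2, take 2), the deletion would have shifted the window and silently skipped o3.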

Mutation payload pattern

Mutations should return rich payload types — not just the affected entity. This allows returning errors inline (without abusing HTTP status codes), metadata, and related objects the client needs to update its local state.

GraphQL SDL — Mutation Payloads

# Return a payload type, not the raw entity
type PlaceOrderPayload {
  order:    Order               # null if failed
  errors:   [UserError!]!       # business validation errors
  success:  Boolean!
}

type UserError {
  field:   String               # which field caused the error
  message: String!
  code:    ErrorCode!
}

enum ErrorCode {
  INVALID_INPUT
  INSUFFICIENT_STOCK
  PAYMENT_DECLINED
  UNAUTHORIZED
}

# Client can now handle errors inline without try/catch on HTTP codes:
# mutation { placeOrder(...) { success errors { field message } order { id } } }
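
On the client, the payload pattern turns error handling into ordinary data handling. The sketch below mirrors the SDL types above by hand (in practice they would come from a code generator); describeResult is a hypothetical helper name.

```typescript
type ErrorCode = 'INVALID_INPUT' | 'INSUFFICIENT_STOCK' | 'PAYMENT_DECLINED' | 'UNAUTHORIZED';

interface UserError {
  field: string | null;
  message: string;
  code: ErrorCode;
}

interface PlaceOrderPayload {
  success: boolean;
  order: { id: string } | null;
  errors: UserError[];
}

// Turns a mutation payload into a message for the UI.
function describeResult(payload: PlaceOrderPayload): string {
  if (payload.success && payload.order !== null) {
    return `Order ${payload.order.id} placed`;
  }
  // Prefix each message with its field so a form can show it next to the right input.
  return payload.errors
    .map((e) => `${e.field ?? 'form'}: ${e.message} [${e.code}]`)
    .join('; ');
}

const okPayload: PlaceOrderPayload = {
  success: true,
  order: { id: 'ord_123' },
  errors: [],
};

const failedPayload: PlaceOrderPayload = {
  success: false,
  order: null,
  errors: [{ field: 'items', message: 'Only 2 left', code: 'INSUFFICIENT_STOCK' }],
};

console.log(describeResult(okPayload));
console.log(describeResult(failedPayload));
```

Business errors arrive as typed data with a stable code, so the client can branch on the code instead of parsing message strings. Transport and server faults still surface through the top-level GraphQL errors array.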

Schema design rules

Rule	Reason
Never expose database IDs directly — use opaque `ID` scalars	Prevents clients from guessing/iterating IDs; allows backend migration
Separate input types from output types	Output types may have computed fields and resolvers not valid as inputs
Name mutations with verb + noun: `createOrder`, `cancelOrder`	Clarity over REST-style resource thinking
Non-null (`!`) everything that will never be null in the domain	Clients get compile-time guarantees; avoids defensive null checks everywhere
Add `@deprecated(reason)` before removing any field — never remove immediately	Prevents client breakage; reason tells clients what to migrate to
Avoid generic types like `JSON` scalar or `Map<String, Any>`	Loses type safety; use typed union/interface instead
Model domain concepts, not database tables	GraphQL is a product API, not a DB reflection; align to business language

⚡ Queries & Mutations

Client operation patterns and resolver implementation best practices.

Fragments — reusable field sets

GraphQL — Client Operations

# Define reusable field sets with fragments
fragment OrderCard on Order {
  id
  status
  total
  placedAt
}

fragment OrderDetail on Order {
  ...OrderCard
  customer { name email }
  lineItems { sku quantity unitPrice }
}

# Use fragments in multiple operations — keeps queries DRY
query ListOrders($after: String) {
  orders(first: 20, after: $after) {
    edges { node { ...OrderCard } }
    pageInfo { hasNextPage endCursor }
  }
}

query GetOrder($id: ID!) {
  order(id: $id) { ...OrderDetail }
}

# Mutation reusing the OrderCard fragment on the payload
mutation PlaceOrder($input: PlaceOrderInput!) {
  placeOrder(input: $input) {
    success
    order { ...OrderCard }
    errors { field message code }
  }
}

DataLoader — eliminating N+1

TypeScript — DataLoader pattern

import DataLoader from 'dataloader';

// Build loaders once per request (e.g., in the server's context function) so the
// memoization cache never leaks data between users or requests.
// `db` and `Customer` are assumed to come from your data layer.
export function createLoaders() {
  return {
    // Batch function: receives array of IDs, returns array of results in same order
    customer: new DataLoader<string, Customer>(async (customerIds) => {
      // ONE query for all IDs — not one per ID
      const customers = await db.customers.findMany({
        where: { id: { in: [...customerIds] } },
      });

      // Must return in the same order as the input keys
      const map = new Map(customers.map((c) => [c.id, c]));
      return customerIds.map(
        (id) => map.get(id) ?? new Error(`Customer ${id} not found`)
      );
    }),
  };
}

// In resolver — DataLoader coalesces all calls within one tick into one batch.
// `loaders` is the per-request object returned by createLoaders().
const resolvers = {
  Order: {
    customer: (order, _args, { loaders }) => {
      return loaders.customer.load(order.customerId); // batched automatically
    },
  },
};
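
To show what the batching above does under the hood, here is a dependency-free sketch of the idea. TinyLoader is a hypothetical teaching class, not the dataloader package: it only batches load() calls made synchronously before the next microtask, whereas the real library has a more careful scheduler plus options such as maximum batch size.

```typescript
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
}

class TinyLoader<K, V> {
  private readonly batchFn: BatchFn<K, V>;
  private queue: Pending<K, V>[] = [];
  private readonly cache = new Map<K, Promise<V>>();

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    // Memoization: the same key is only ever requested once per loader.
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached;

    const promise = new Promise<V>((resolve, reject) => {
      // The first queued key schedules one flush; later keys join the same batch.
      if (this.queue.length === 0) void Promise.resolve().then(() => this.flush());
      this.queue.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);
    return promise;
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      // One call for the whole batch; results must come back in key order.
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, index) => item.resolve(values[index]));
    } catch (error) {
      batch.forEach((item) => item.reject(error));
    }
  }
}

let queryCount = 0;
const customerNames = new TinyLoader<string, string>(async (ids) => {
  queryCount += 1; // stands in for one SQL query with WHERE id IN (...)
  return ids.map((id) => `Customer ${id}`);
});

// 100 orders that reference 10 distinct customers.
async function loadCustomersForOrders(): Promise<string[]> {
  const customerIds = Array.from({ length: 100 }, (_, i) => `c${i % 10}`);
  return Promise.all(customerIds.map((id) => customerNames.load(id)));
}

loadCustomersForOrders().then((names) => {
  console.log(`${names.length} lookups served by ${queryCount} batched call(s)`);
});
```

Create one loader per request, as in the DataLoader example above, because the memoization cache has no expiry.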

📡 Subscriptions

Real-time event streams — when to use them and operational implications.

Good fit for subscriptions

Order status tracking (customer-facing)
Live dashboards and analytics counters
Collaborative editing presence indicators
Notification feeds and alert streams
IoT sensor data aggregation

Poor fit — use alternatives

Polling-based reports (use REST + polling)
Single-event webhooks (use REST webhooks)
High-frequency binary streams (use WebRTC or raw WS)
When WebSocket infra isn't production-ready
Serverless-only deployments (WS requires persistent connections)

TypeScript — Subscription resolver (Apollo)

import { withFilter } from 'graphql-subscriptions';

const resolvers = {
  Subscription: {
    orderStatusChanged: {
      // subscribe — returns an async iterator from your pub/sub system
      subscribe: withFilter(
        // Note: graphql-subscriptions v3 renames asyncIterator to asyncIterableIterator
        (_parent, _args, { pubsub }) =>
          pubsub.asyncIterator('ORDER_STATUS_CHANGED'),

        // filter — only send events relevant to the subscriber
        (payload, variables) =>
          payload.orderStatusChanged.orderId === variables.orderId
      ),
    },
  },
};

// Publishing from your domain layer (e.g., in OrderService)
await pubsub.publish('ORDER_STATUS_CHANGED', {
  orderStatusChanged: {
    orderId: order.id,
    newStatus: order.status,
    updatedAt: new Date().toISOString(),
  },
});

⚠️

Operational caution: Subscriptions require long-lived WebSocket connections. In horizontally-scaled deployments, your pub/sub system (Redis, Kafka, NATS) must route events to whichever instance holds the subscriber's connection. This is non-trivial infrastructure. Use Redis adapter for Apollo or evaluate managed options (Hasura, Grafbase) before rolling your own.
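
The routing problem described above can be modelled in a few lines. This is a conceptual sketch only: SharedBus is a hypothetical in-process stand-in for Redis, Kafka, or NATS, and ServerInstance represents one horizontally-scaled GraphQL server holding its own WebSocket connections.

```typescript
interface OrderStatusEvent {
  orderId: string;
  newStatus: string;
}

type Listener = (event: OrderStatusEvent) => void;

// Stand-in for the external pub/sub system: every attached instance sees every event.
class SharedBus {
  private readonly handlers: Listener[] = [];

  attach(handler: Listener): void {
    this.handlers.push(handler);
  }

  publish(event: OrderStatusEvent): void {
    this.handlers.forEach((handler) => handler(event));
  }
}

// One server instance; it only knows about the connections it holds itself.
class ServerInstance {
  private readonly connections = new Map<string, Listener[]>();

  constructor(bus: SharedBus) {
    bus.attach((event) => this.deliver(event));
  }

  subscribe(orderId: string, listener: Listener): void {
    const existing = this.connections.get(orderId) ?? [];
    this.connections.set(orderId, [...existing, listener]);
  }

  // Forward an event from the bus to local subscribers of that order, if any.
  private deliver(event: OrderStatusEvent): void {
    (this.connections.get(event.orderId) ?? []).forEach((listener) => listener(event));
  }
}

const bus = new SharedBus();
const instanceA = new ServerInstance(bus);
const instanceB = new ServerInstance(bus);

const receivedOnA: OrderStatusEvent[] = [];
const receivedOnB: OrderStatusEvent[] = [];
instanceA.subscribe('ord_1', (event) => receivedOnA.push(event));
instanceB.subscribe('ord_2', (event) => receivedOnB.push(event));

// The mutation may run on any instance; it publishes to the bus, not to a socket.
bus.publish({ orderId: 'ord_1', newStatus: 'SHIPPED' });

console.log(receivedOnA.length, receivedOnB.length);
```

If the mutation handler wrote directly to its own instance's connections, the subscriber on instance A would miss every event produced on instance B. The shared bus removes that coupling.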

🖼️ Example: Backend-for-Frontend (BFF) Pattern

The most common GraphQL adoption pattern in enterprise — a GraphQL layer in front of existing REST or gRPC services, purpose-built for your product surfaces.

Rather than forcing each client (web, iOS, Android) to orchestrate calls to Order Service, Customer Service, and Inventory Service independently, a GraphQL BFF acts as an orchestration layer. Clients make one request; the BFF fans out to upstream services and assembles the response.

Flow: 📱 Clients (Web / iOS / Android) → 🔷 GraphQL BFF (single endpoint) → upstream services: 📦 Order Service, 👤 Customer Service, 🏷️ Inventory Service

TypeScript — BFF resolvers calling REST upstream services

import { RESTDataSource } from '@apollo/datasource-rest';

// DataSource wrapping each upstream service
class OrderAPI extends RESTDataSource {
  override baseURL = 'https://order-service.internal/';

  async getById(id: string): Promise<Order> {
    return this.get<Order>(`orders/${encodeURIComponent(id)}`);
  }

  async listByCustomer(customerId: string): Promise<Order[]> {
    return this.get<Order[]>('orders', { params: { customerId } });
  }
}

// Resolvers orchestrate across data sources — client shapes the result
const resolvers = {
  Query: {
    me: async (_parent, _args, { dataSources, user }) => {
      // `me` is nullable in the schema: return null for anonymous callers
      if (!user) return null;
      return dataSources.customerAPI.getById(user.id);
    },
  },

  Customer: {
    // Lazy — only runs if client requests customer.orders
    orders: (customer, _args, { dataSources }) =>
      dataSources.orderAPI.listByCustomer(customer.id),
  },

  LineItem: {
    // Lazy: only runs if client requests lineItem.inventoryStatus.
    // The SKU lives on the line item, not on the order.
    inventoryStatus: (lineItem, _args, { dataSources }) =>
      dataSources.inventoryAPI.getStatus(lineItem.sku),
  },
};

🕸️ Example: GraphQL Federation

Federation allows multiple teams to own sub-graphs that compose into a unified supergraph — without a central BFF team becoming a bottleneck.

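Each service publishes its own partial schema. The subgraph schemas are composed into a single supergraph schema ahead of time (typically in CI or a schema registry), and a router (Apollo Router or WunderGraph Cosmo) plans and executes each incoming query across the subgraphs at runtime. Teams work independently; the contract is the federation spec.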

GraphQL SDL — Subgraph A: Orders Service orders-subgraph/schema.graphql

# Orders team owns this subgraph
extend schema @link(url: "https://specs.apollo.dev/federation/v2.3", import: ["@key"])

type Order @key(fields: "id") {
  id:       ID!
  status:   OrderStatus!
  total:    Float!
  placedAt: String!
  # Reference to Customer — resolved by Customers subgraph
  customer: Customer!
}

# Stub — orders knows customerId but not Customer fields
type Customer @key(fields: "id", resolvable: false) {
  id: ID!
}

type Query {
  order(id: ID!): Order
  orders(filter: OrderFilter): [Order!]!
}

GraphQL SDL — Subgraph B: Customers Service customers-subgraph/schema.graphql

# Customers team owns this subgraph
extend schema @link(url: "https://specs.apollo.dev/federation/v2.3", import: ["@key"])

type Customer @key(fields: "id") {
  id:      ID!
  name:    String!
  email:   String!
  # Returns Order references (ids only); the Orders subgraph resolves the other fields
  orders:  [Order!]!
}

# Stub: customers knows order ids but not Order fields
type Order @key(fields: "id", resolvable: false) {
  id: ID!
}

# Reference resolver (implemented in resolver code, not SDL):
# receives { id } from the router, returns the full Customer.
# Router calls this to "join" the entities at query-plan time

type Query {
  me:           Customer
  customer(id: ID!): Customer
}
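
The "join" the router performs can be sketched as plain data manipulation. This is a conceptual illustration, not Apollo's implementation: resolveEntities is a hypothetical function that mimics what a subgraph's entity lookup does, and the customer record is made-up sample data.

```typescript
// What the router sends to a subgraph to identify an entity.
interface Representation {
  __typename: string;
  id: string;
}

type Entity = Record<string, unknown>;
type ReferenceResolver = (ref: Representation) => Entity | null;

// Hypothetical data owned by the Customers subgraph.
const customerRows: Record<string, { name: string; email: string }> = {
  c1: { name: 'Ada', email: 'ada@example.com' },
};

// One reference resolver per entity type this subgraph can resolve.
const referenceResolvers: Record<string, ReferenceResolver> = {
  Customer: (ref) => {
    const row = customerRows[ref.id];
    return row ? { __typename: 'Customer', id: ref.id, ...row } : null;
  },
};

// Conceptually what the subgraph's entity lookup does:
// one result per representation, in the same order.
function resolveEntities(representations: Representation[]): (Entity | null)[] {
  return representations.map((ref) => {
    const resolver = referenceResolvers[ref.__typename];
    return resolver ? resolver(ref) : null;
  });
}

// Step 1: the Orders subgraph returns the order with a Customer stub (id only).
const orderFromOrdersSubgraph = {
  id: 'ord_123',
  status: 'SHIPPED',
  customer: { __typename: 'Customer', id: 'c1' },
};

// Step 2: the router asks the Customers subgraph to resolve that stub.
const resolvedCustomers = resolveEntities([orderFromOrdersSubgraph.customer]);

// Step 3: the router merges both results into the response the client sees.
const mergedOrder = { ...orderFromOrdersSubgraph, customer: resolvedCustomers[0] };

console.log(JSON.stringify(mergedOrder));
```

Each hop like this is a sequential network call, so deep chains of cross-subgraph references add latency.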

ℹ️

Router options: Apollo Router (Rust, open-source core) is the most mature. WunderGraph Cosmo is a newer open-source alternative with a managed platform. Hot Chocolate (.NET) has first-class federation support. For .NET teams, Strawberry Shake + Hot Chocolate federation is the recommended stack.

🛡️ Security

GraphQL-specific attack surfaces and mandatory mitigations for production.

Threat	Mitigation	Library
Query depth attack — deeply nested query exhausts resolvers	Enforce max depth (e.g., 10 levels)	`graphql-depth-limit`, built-in to Hot Chocolate
Query complexity attack — expensive field combinations	Assign costs per field; reject over budget	`graphql-cost-analysis`, Apollo cost directives
Introspection disclosure — schema exposed to attackers	Disable introspection in production; enable only in dev/staging	Apollo Server: `introspection: false`
Unbounded results — query returns millions of rows	Enforce max `first`/`limit` arguments at schema level	Custom validation rule or schema directive
Alias flooding — duplicate fields with different aliases	Count aliased fields in complexity scoring	Custom validation or complexity library
Authorization bypass — field accessed without permission	Field-level auth with `@auth` directive or middleware	GraphQL Shield, Hot Chocolate's `@authorize`
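Arbitrary operations (partner/public)	Persisted-query allowlist (trusted documents): only pre-registered operation hashes are accepted	Apollo GraphOS persisted query lists, Relay persisted queries. APQ alone is not an allowlist: it registers any query a client sends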

TypeScript — Apollo Server security configuration

import { ApolloServer } from '@apollo/server';
import depthLimit from 'graphql-depth-limit';
import costAnalysis from 'graphql-cost-analysis';

const server = new ApolloServer({
  typeDefs,
  resolvers,

  // Disable introspection in production
  introspection: process.env.NODE_ENV !== 'production',

  validationRules: [
    // Reject queries deeper than 10 levels
    depthLimit(10),

    // Reject queries with cost score > 1000.
    // Per-field costs and list multipliers are declared with the library's
    // @cost directive in the schema; defaultCost applies to unannotated fields.
    costAnalysis({
      maximumCost: 1000,
      defaultCost: 1,
    }),
  ],
});

// Body size limit (prevents very large query strings): Apollo Server 4 has no
// bodyParserConfig option. Set the limit on the HTTP body parser instead, for
// example express.json({ limit: '50kb' }) in front of expressMiddleware(server).
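
The persisted-query allowlist from the threat table is not covered by the configuration above, so here is a minimal sketch of the gate itself. All names are assumptions for illustration: trustedDocuments stands for a manifest your client build would generate, and documentId is a hypothetical request field.

```typescript
// Hypothetical manifest produced at build time: document id -> operation text.
const trustedDocuments = new Map<string, string>([
  [
    'GetOrderSummary:v1',
    'query GetOrderSummary($id: ID!) { order(id: $id) { id status total } }',
  ],
]);

interface IncomingRequest {
  documentId?: string;
  query?: string;
  variables?: Record<string, unknown>;
}

type Decision =
  | { allowed: true; query: string }
  | { allowed: false; reason: string };

// Runs before parsing or validation: unknown operations never reach the executor.
function authorizeOperation(req: IncomingRequest): Decision {
  if (req.query !== undefined) {
    return { allowed: false, reason: 'Ad-hoc queries are disabled' };
  }
  if (req.documentId === undefined) {
    return { allowed: false, reason: 'Missing documentId' };
  }
  const query = trustedDocuments.get(req.documentId);
  return query !== undefined
    ? { allowed: true, query }
    : { allowed: false, reason: 'Unknown documentId' };
}

const trusted = authorizeOperation({
  documentId: 'GetOrderSummary:v1',
  variables: { id: 'ord_123' },
});
const adHoc = authorizeOperation({ query: '{ orders { customer { orders { id } } } }' });
const unknown = authorizeOperation({ documentId: 'NotRegistered:v1' });

console.log(trusted.allowed, adHoc.allowed, unknown.allowed);
```

Variables remain client-controlled, so limits on arguments such as first still need to be enforced in the schema or resolvers.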

🚀 Performance

DataLoader (mandatory)

Every resolver that loads a related entity must go through DataLoader. Without it, loading 100 orders and their customers issues 101 database queries. With DataLoader, it's 2. Non-negotiable in production.

Field-level tracing

Use Apollo Studio, Cosmo, or OpenTelemetry spans to see resolver execution time per field. Slow resolvers are immediately visible. Optimize before — not after — load testing.

Response caching

Use @cacheControl directives to annotate fields with cache TTLs. The server can return a Cache-Control header representing the minimum TTL across all resolved fields in the response.

Persisted queries

Send a hash instead of the full query string. Reduces request payload size (important on mobile). Enables GET-based requests which CDNs can cache. When the hashes come from a pre-registered allowlist rather than plain APQ, this is a security benefit as well as a performance one.

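Defer and Stream (spec proposal)

Use @defer on non-critical fields to return the primary payload immediately and stream deferred fields as they resolve. Reduces perceived latency for complex pages. @defer and @stream are not yet part of a ratified GraphQL specification release, so confirm that your server, router, and client all support them.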

Query planning (Federation)

The router generates a query plan across subgraphs. Fetch subgraph data in parallel where possible. Avoid deeply nested cross-subgraph entity references that force sequential fetches.
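
The response caching rule described above (the response is only as cacheable as its least cacheable field) is a small calculation. The sketch below illustrates that rule under assumptions: CacheHint and overallCachePolicy are hypothetical names, and the exact header string a given server emits may differ.

```typescript
type CacheScope = 'PUBLIC' | 'PRIVATE';

// One hint per resolved field or type, e.g. from a @cacheControl annotation.
interface CacheHint {
  maxAge: number; // seconds
  scope?: CacheScope;
}

function overallCachePolicy(hints: CacheHint[]): string {
  // No hints at all: treat the response as uncacheable.
  if (hints.length === 0) return 'no-store';

  // The shortest TTL wins, and a single PRIVATE field makes the response private.
  const maxAge = Math.min(...hints.map((hint) => hint.maxAge));
  const isPrivate = hints.some((hint) => hint.scope === 'PRIVATE');

  if (maxAge <= 0) return 'no-store';
  return `max-age=${maxAge}, ${isPrivate ? 'private' : 'public'}`;
}

// Catalog data is cacheable for 5 minutes, the customer's own data for 1 minute.
const mixedResponse = overallCachePolicy([
  { maxAge: 300 },
  { maxAge: 60, scope: 'PRIVATE' },
]);

// A single uncacheable field (maxAge 0) disables caching for the whole response.
const withLiveField = overallCachePolicy([{ maxAge: 300 }, { maxAge: 0 }]);

console.log(mixedResponse);
console.log(withLiveField);
```

One practical consequence: mixing a live field such as stock level into an otherwise cacheable query makes the whole response uncacheable, so such fields are often better fetched in a separate operation.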

✅

Performance baseline checklist before going live: DataLoader on all relational fields ✓ · Query depth + complexity limits configured ✓ · Field-level tracing enabled ✓ · APQ configured ✓ · Load test with realistic query shapes (not synthetic single-field queries) ✓ · N+1 detector run in staging ✓

🔧 Tooling

Category	Tool	Notes
Server (.NET)	Hot Chocolate	First-class .NET GraphQL server. Annotation-based + SDL-first. Federation v2 support.
Server (Node)	Apollo Server, Yoga	Apollo for enterprise features + Studio. Yoga for lightweight/edge deployments.
Client (React)	Apollo Client, urql, Relay	Apollo for full ecosystem. urql for lightweight. Relay for Relay-spec pagination + Facebook scale.
Client (.NET)	Strawberry Shake	Generated typed client from schema. Integrates with Hot Chocolate ecosystem.
Code generation	GraphQL Code Generator	Generates TypeScript types, React hooks, and resolvers from schema + operations. Run in CI.
Schema management	GraphQL Inspector	Breaking change detection in CI. Diff schemas. Validate coverage.
IDE tooling	GraphiQL, Apollo Sandbox	In-browser query explorers. Apollo Sandbox works without local server.
Federation router	Apollo Router, Cosmo	Apollo Router (Rust): mature, open core. Cosmo: fully open-source alternative with managed option.
Schema registry	Apollo GraphOS, Cosmo Platform	Schema versioning, breaking change gating, subgraph composition validation.
Observability	Apollo Studio, OpenTelemetry	Field-level usage metrics, error rates, resolver latency heatmaps.
Testing	jest + graphql-tag, Testcontainers	Unit-test resolvers in isolation. Integration-test against a real schema + DB.

🔗 Reference Links

Quick decision matrix

Scenario	Recommendation	Rationale
Multiple clients, different payload needs	GraphQL	Core value prop — clients shape their own queries
Aggregating 3+ microservices for a product surface	GraphQL Federation	Avoids bespoke BFF per surface; teams own sub-graphs
Simple internal CRUD microservice	REST or gRPC	Zero over/under-fetching problem; added complexity not justified
Public API with unknown consumers	REST preferred	REST has wider tooling, simpler caching, lower learning curve
Real-time + query on the same domain	GraphQL + Subscriptions	Unified protocol; no separate WS service needed
Internal service, team owns both sides	gRPC	Strictly typed contract, 3–5× faster, no browser concern
High CDN cache dependency	REST or APQ	HTTP GET caching native to REST; APQ is a workaround
File uploads are a primary use case	REST + presigned URLs	GraphQL has no standard multipart support

📌

Bottom line for ARB presentations: GraphQL solves a specific problem — multiple clients with divergent data needs consuming overlapping domain services. When that problem exists, it solves it elegantly. When it doesn't, GraphQL adds operational complexity for no customer-facing benefit. The decision should follow the problem, not the hype.