gRPC & Protocol Buffers: Field Handbook
DOC-RPC-2024

// Contract-first. Binary. Multiplexed. Streaming-native.

The definitive guide to building high-performance inter-service communication with gRPC and Protocol Buffers. Covers the binary wire format, all four RPC modes, production hardening, and schema evolution strategies for microservice architectures.

proto3 · HTTP/2 · Bidirectional Streaming · Up to 10× faster than JSON · Code Generation · CNCF Incubating
01

What is gRPC

// GOOGLE'S OPEN-SOURCE RPC FRAMEWORK

gRPC is a high-performance, open-source RPC framework originally developed by Google and now hosted by the CNCF as an incubating project (the name is a recursive acronym: gRPC Remote Procedure Calls). It uses Protocol Buffers as both the interface definition language and the binary serialization format, and transports messages over HTTP/2. Open-sourced in 2015, with the 1.0 release following in 2016, it has become a common choice for inter-microservice communication in polyglot environments.

Contract-First Design

You define your API in a .proto file — the contract. The protoc compiler generates strongly-typed client and server stubs in any supported language. The schema is the documentation.

Binary Wire Format

Protocol Buffers serialize to a compact binary format — typically 3–10× smaller than equivalent JSON, with significantly faster serialization and deserialization. No parsing ambiguity, no whitespace overhead.

Polyglot Native

Official code generation for Go, Java, Python, C++, C#, Node.js, Ruby, Swift, Kotlin, Dart, and more. One .proto file generates idiomatic clients and servers across your entire stack.

  • Proto IDL: Schema definition and code generation
  • HTTP/2: Multiplexing and flow control
  • Streaming: 4 RPC modes including bidi
  • TLS + Auth: mTLS and token interceptors
gRPC is not just for microservices. It's used for mobile-to-backend and browser-to-backend calls (the latter via gRPC-Web), internal tooling, database internals (CockroachDB, TiKV), streaming pipelines, and AI inference APIs (NVIDIA Triton). If two systems need to talk fast and reliably, gRPC is a strong default.
02

Why gRPC over REST

// WHEN THE TRADEOFFS FAVOR RPC

REST over HTTP/1.1 with JSON is excellent for public APIs, browser clients, and developer ergonomics. gRPC wins decisively for internal service-to-service communication where performance, type safety, and streaming matter. The question is not "which is better" — it's which tradeoffs serve your use case.

REST + JSON (HTTP/1.1)
  • Human-readable, easy to debug with curl
  • One in-flight request per connection at a time (pipelining is rarely usable in practice)
  • JSON parsing overhead — always text → object
  • No enforced schema (OpenAPI is optional)
  • No native streaming — SSE/WebSocket are bolt-ons
  • Large payloads — field names repeated per object
  • Versioning via URL (/v1/, /v2/) is messy
  • Excellent browser and public API support
gRPC + Protobuf (HTTP/2)
  • Binary — requires tooling to inspect (grpcurl, grpc-ui)
  • Multiplexed — many requests over one connection
  • Protobuf: up to 10× faster parse, 3–10× smaller
  • Enforced typed contract — compiler catches breakage
  • First-class streaming: server, client, and bidirectional
  • Field numbers encode keys — tiny wire size
  • Backward-compatible field evolution via field numbers
  • No native browser support (gRPC-Web workaround)

Performance At a Glance

  • Payload size: ~7× smaller
  • Serialize speed: ~5× faster
  • Connection reuse: Multiplexed
  • Type safety: Enforced
  • Browser support: gRPC-Web
Think twice before exposing gRPC as your only public API. If third-party developers, browsers, or external systems need to call your API, REST/JSON is usually the safer choice. gRPC shines at the internal seam between your own services. Many teams run both: gRPC internally, REST externally via a transcoding gateway (Envoy's gRPC-JSON transcoder, grpc-gateway, or Kong).
03

HTTP/2 Under the Hood

// WHY THE TRANSPORT LAYER MATTERS

gRPC is built entirely on HTTP/2. Understanding the transport layer explains why gRPC achieves its performance characteristics and why HTTP/1.1-based REST cannot replicate them without fundamental changes.

Multiplexing · Key Feature

Multiple concurrent RPC calls share a single TCP connection as independent HTTP/2 streams. HTTP/1.1 handles one in-flight request per connection, so concurrency requires a pool of connections. HTTP/2 removes head-of-line blocking at the application layer: a slow call doesn't block other calls on the same connection. Packet loss can still stall every stream on the connection, because they all share one TCP byte stream.

Header Compression (HPACK)

HTTP/2 uses HPACK to compress headers. In gRPC, metadata (auth tokens, trace IDs) is sent as headers. HPACK maintains a shared compression context between client and server, meaning repeated headers (like content-type: application/grpc) cost near-zero after the first request.

Binary Framing Layer

HTTP/2 transmits data in binary frames, not text. Each gRPC message is wrapped in a 5-byte length-prefix (1 byte compression flag + 4 byte length), then sent as HTTP/2 DATA frames. This enables precise length-delimited message framing without text parsing.

Flow Control

HTTP/2 has per-stream and connection-level flow control — the receiver advertises how much data it can accept. This is critical for gRPC streaming: a slow consumer can signal backpressure to the sender without dropping messages or needing application-level throttling.

gRPC Frame Format on the Wire

Wire Format
## gRPC message framing (Length-Prefixed Message)
## Sits between the HTTP/2 DATA frames and the Protobuf payload

Byte 0      Compressed-Flag (0 = uncompressed, 1 = compressed with the algorithm named in grpc-encoding)
Bytes 1–4   Message-Length (big-endian uint32, length of the serialized message)
Bytes 5–N   Serialized Protobuf message

## HTTP/2 headers for a gRPC request
:method         POST
:scheme         https
:path           /helloworld.Greeter/SayHello
:authority      api.example.com
content-type    application/grpc     # required by gRPC spec
grpc-timeout    5S                   # optional: 5-second deadline
grpc-encoding   gzip                 # optional: message compression
authorization   Bearer <token>       # custom metadata as headers

## HTTP/2 trailers for gRPC response status
grpc-status     0                    # 0 = OK, non-zero = error
grpc-message    ""                   # human-readable error message
gRPC-Web: Browser APIs do not expose HTTP/2 trailers, which gRPC uses for status codes. gRPC-Web works around this by encoding trailers in the response body and routing calls through a translating proxy (Envoy's gRPC-Web filter is the common choice). It supports unary and server-streaming calls, but not client or bidirectional streaming in the browser.
04

Proto3 Syntax

// WRITING YOUR FIRST .PROTO FILE

A .proto file is the single source of truth for your service contract. It defines messages (data structures) and services (RPC endpoints). The protoc compiler and language-specific plugins generate all boilerplate — you only write the business logic.

Protocol Buffers — Complete Example
syntax = "proto3";   // always specify; files without it default to proto2

package payments.v1; // logical namespace

option go_package = "github.com/acme/payments/gen/go/payments/v1;paymentsv1";
option java_package = "com.acme.payments.v1";

import "google/protobuf/timestamp.proto"; // well-known type
// import "google/type/money.proto";      // google.type.Money ships with googleapis, not google/protobuf

// ── Message definitions ──
message ChargeRequest {
  string order_id = 1;        // field number: NEVER changes
  int64 amount_cents = 2;     // snake_case names, camelCase generated
  string currency = 3;
  string customer_id = 4;
  PaymentMethod method = 5;   // enum defined below
  repeated string tags = 6;   // repeated = list/array
  // field numbers 7–15 kept free for future use (1-byte tag encoding)
}

message ChargeResponse {
  string transaction_id = 1;
  ChargeStatus status = 2;
  google.protobuf.Timestamp processed_at = 3;
  optional string failure_reason = 4; // proto3 optional: has presence
}

enum PaymentMethod {
  PAYMENT_METHOD_UNSPECIFIED = 0; // the first enum value MUST be 0
  PAYMENT_METHOD_CARD = 1;
  PAYMENT_METHOD_BANK = 2;
  PAYMENT_METHOD_CRYPTO = 3;
}

enum ChargeStatus {
  CHARGE_STATUS_UNSPECIFIED = 0;
  CHARGE_STATUS_SUCCESS = 1;
  CHARGE_STATUS_DECLINED = 2;
  CHARGE_STATUS_PENDING = 3;
}

// ── Service definition ──
// TransactionFilter, Transaction, and BatchChargeResponse are assumed to be
// defined elsewhere in this package.
service PaymentService {
  rpc Charge(ChargeRequest) returns (ChargeResponse);
  rpc StreamTransactions(TransactionFilter) returns (stream Transaction);
  rpc BatchCharge(stream ChargeRequest) returns (BatchChargeResponse);
}
Code generation: Run protoc --go_out=. --go-grpc_out=. payments.proto (or use buf generate — the modern alternative). The compiler produces a payments.pb.go (messages) and payments_grpc.pb.go (client/server stubs). Never edit generated files — regenerate from the proto.
05

Field Types & Rules

// SCALARS, WELL-KNOWN TYPES, ONEOF, MAP

Scalar Types

Proto Type | Go | Java | Python | Notes
double | float64 | double | float | 64-bit IEEE 754
float | float32 | float | float | 32-bit IEEE 754
int32 | int32 | int | int | Varint; inefficient for negatives (use sint32)
int64 | int64 | long | int | Varint; inefficient for negatives
uint32/uint64 | uint32/uint64 | int/long | int | Unsigned varint
sint32/sint64 | int32/int64 | int/long | int | ZigZag encoding; use for negative numbers
fixed32/fixed64 | uint32/uint64 | int/long | int | Always 4/8 bytes; faster for large numbers
bool | bool | boolean | bool | Varint 0 or 1
string | string | String | str | UTF-8 encoded, length-delimited
bytes | []byte | ByteString | bytes | Arbitrary binary data

Advanced Field Patterns

Proto3 — Advanced Patterns
// ── repeated: ordered list ──
message Order {
  repeated LineItem items = 1; // generates []*LineItem in Go
}

// ── map: key-value pairs ──
message Config {
  map<string, string> settings = 1;   // key: any integral or string type (no float, bytes, enum, or message)
  map<int32, FeatureFlag> flags = 2;
}

// ── oneof: at most one of these fields is set ──
message Notification {
  string user_id = 1;
  oneof payload {
    EmailPayload email = 2;
    PushPayload push = 3;
    SMSPayload sms = 4;
  } // setting one clears the others
}

// ── Well-Known Types (google/protobuf/*.proto) ──
import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/wrappers.proto";   // nullable scalars
import "google/protobuf/struct.proto";     // arbitrary JSON-like structure
import "google/protobuf/any.proto";        // type-erased container
import "google/protobuf/field_mask.proto"; // partial updates (like PATCH)

message Event {
  google.protobuf.Timestamp created_at = 1; // UTC timestamp with nanos
  google.protobuf.Duration duration = 2;    // time duration
  google.protobuf.StringValue label = 3;    // nullable string (has presence)
  google.protobuf.Struct metadata = 4;      // arbitrary key-value
}

// ── reserved: prevent field number reuse ──
message LegacyUser {
  reserved 2, 5, 9 to 11;               // these field numbers are retired
  reserved "old_email", "phone_number"; // retired field names
  string user_id = 1;
}
Proto3 default values: In proto3, scalar fields without explicit presence always read as their default value (0, false, empty string), so there is no way to distinguish "field not set" from "field set to zero" unless you use the optional keyword (proto3 optional, available by default since protobuf 3.15) or wrapper types. Message-typed fields do track presence. This is a common source of bugs when null-checking is semantically important.
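The presence problem is easy to see in Go, where protoc-gen-go maps a plain proto3 scalar to a value field and an `optional` scalar to a pointer field. The sketch below uses hand-written stand-in structs rather than generated code; the `retry_count` field and both struct names are hypothetical.

```go
package main

import "fmt"

// Stand-in for a message generated from:  int32 retry_count = 1;
type implicitPresence struct {
	RetryCount int32
}

// Stand-in for a message generated from:  optional int32 retry_count = 1;
type explicitPresence struct {
	RetryCount *int32
}

// describeImplicit cannot tell "never set" apart from "set to 0".
func describeImplicit(m implicitPresence) string {
	if m.RetryCount == 0 {
		return "0 (unset or explicitly zero, cannot tell)"
	}
	return fmt.Sprintf("%d", m.RetryCount)
}

// describeExplicit uses the nil pointer as the "unset" signal.
func describeExplicit(m explicitPresence) string {
	if m.RetryCount == nil {
		return "unset"
	}
	return fmt.Sprintf("set to %d", *m.RetryCount)
}

func main() {
	zero := int32(0)
	fmt.Println(describeImplicit(implicitPresence{}))                  // 0 (unset or explicitly zero, cannot tell)
	fmt.Println(describeImplicit(implicitPresence{RetryCount: 0}))     // same output as above
	fmt.Println(describeExplicit(explicitPresence{}))                  // unset
	fmt.Println(describeExplicit(explicitPresence{RetryCount: &zero})) // set to 0
}
```

In real generated code, the getter for an optional field still returns the zero value when the pointer is nil, so presence checks should inspect the pointer (or the generated Has-style accessor where available) rather than the getter result.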
06

Defining Services

// FROM .PROTO TO WORKING SERVER

A gRPC service definition maps directly to a generated interface. On the server side you implement the interface; on the client side the generated stub makes calls look like local function calls, even when they're crossing a network.

Go — Server Implementation
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"

	pb "github.com/acme/payments/gen/go/payments/v1"
)

// server implements pb.PaymentServiceServer (generated interface).
type server struct {
	pb.UnimplementedPaymentServiceServer // forward compatibility: always embed this
	db *Database                         // application type, defined elsewhere
}

// Charge is a unary RPC: one request, one response.
func (s *server) Charge(ctx context.Context, req *pb.ChargeRequest) (*pb.ChargeResponse, error) {
	// Validate input first; InvalidArgument tells the client not to retry.
	if req.GetAmountCents() <= 0 {
		return nil, status.Errorf(codes.InvalidArgument, "amount must be positive, got %d", req.GetAmountCents())
	}
	// Stop early if the deadline passed or the client cancelled.
	if err := ctx.Err(); err != nil {
		return nil, status.FromContextError(err).Err()
	}
	txID, err := s.db.Charge(ctx, req.GetOrderId(), req.GetAmountCents())
	if err != nil {
		return nil, status.Errorf(codes.Internal, "charge failed: %v", err)
	}
	return &pb.ChargeResponse{
		TransactionId: txID,
		Status:        pb.ChargeStatus_CHARGE_STATUS_SUCCESS,
	}, nil
}

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}
	s := grpc.NewServer(
		grpc.UnaryInterceptor(loggingInterceptor), // middleware (see Interceptors)
		grpc.MaxRecvMsgSize(16*1024*1024),         // 16MB max inbound message
	)
	// The db field must be set before serving real traffic; its construction is application-specific.
	pb.RegisterPaymentServiceServer(s, &server{})
	log.Fatal(s.Serve(lis))
}
Go — Client Usage
// Assumes imports: context, fmt, log, time, google.golang.org/grpc,
// google.golang.org/grpc/credentials, google.golang.org/grpc/status, and pb.
// tlsConfig and retryInterceptor are defined elsewhere.

// Client setup: one connection, reused across all calls.
conn, err := grpc.NewClient(
	"api.example.com:443",
	grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)),
	grpc.WithUnaryInterceptor(retryInterceptor),
)
if err != nil {
	log.Fatalf("create client: %v", err)
}
defer conn.Close()

client := pb.NewPaymentServiceClient(conn) // generated stub, safe for concurrent use

// Unary call with deadline
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

resp, err := client.Charge(ctx, &pb.ChargeRequest{
	OrderId:     "ord_abc123",
	AmountCents: 9999,
	Currency:    "USD",
	CustomerId:  "cust_xyz",
	Method:      pb.PaymentMethod_PAYMENT_METHOD_CARD,
})
if err != nil {
	st := status.Convert(err) // works for any error; non-gRPC errors map to Unknown
	log.Printf("RPC failed: code=%s msg=%s", st.Code(), st.Message())
	return
}
fmt.Println(resp.GetTransactionId())
07

Binary Encoding

// HOW PROTOBUF ACHIEVES COMPACTNESS

Protobuf's compactness comes from its tag-value encoding. Field names are never sent over the wire, only field numbers (tags). A varint-encoded integer below 128 takes a single byte. This is why the choice of field number matters: fields 1–15 use one byte for the tag, while 16–2047 use two bytes. Reserve small numbers for frequently-used fields.

Encoding Deep Dive
## Protobuf wire types
## Tag = (field_number << 3) | wire_type
Wire Type 0  VARINT  int32, int64, uint32, uint64, sint32, sint64, bool, enum
Wire Type 1  I64     fixed64, sfixed64, double
Wire Type 2  LEN     string, bytes, embedded messages, packed repeated
Wire Type 5  I32     fixed32, sfixed32, float

## Example: encoding { order_id: "X1" (field 1), amount_cents: 150 (field 2) }
#
# field 1, string: tag = (1 << 3) | 2 = 0x0A
#   0x0A 0x02 0x58 0x31   → tag, length=2, "X" "1"
#
# field 2, varint: tag = (2 << 3) | 0 = 0x10
#   0x10 0x96 0x01        → tag, varint(150) in 2 bytes
#
# Total: 7 bytes
# Equivalent JSON: {"order_id":"X1","amount_cents":150} = 36 chars = 36 bytes
# Roughly 5× smaller

## Varint encoding (base-128)
## 150 in binary = 10010110
## Split into 7-bit groups, least significant group first; MSB of each byte = "more bytes follow"
## 0010110 → 1001 0110 (MSB=1, more follows) = 0x96
## 0000001 → 0000 0001 (MSB=0, last byte)    = 0x01
## Wire: 0x96 0x01

## ZigZag for negative numbers (sint32/sint64)
## Avoids 10-byte varint for small negatives
## n=0 → 0, n=-1 → 1, n=1 → 2, n=-2 → 3
## Encoding: (n << 1) ^ (n >> 31) for sint32; (n << 1) ^ (n >> 63) for sint64
Field number assignment strategy: Fields 1–15 encode in 1 byte (tag only). Fields 16–2047 use 2 bytes. Assign numbers 1–15 to the most frequently populated fields in your most common messages. For an event log message, the timestamp and event type go in 1–2; rarely-populated debug fields go in 100+. This is a micro-optimization but meaningful at scale.
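To make the tag, varint, and ZigZag rules concrete, here is a small hand-rolled encoder that reproduces the 7-byte example from this section. It is a sketch for illustration only: the helper names (`appendVarint`, `appendTag`, `appendString`, `appendUint`, `zigzag32`) are hypothetical, and real code should rely on the generated marshalers.

```go
package main

import "fmt"

const (
	wireVarint = 0 // int32, int64, uint32, uint64, sint32, sint64, bool, enum
	wireLen    = 2 // string, bytes, embedded messages, packed repeated
)

// appendVarint appends v in base-128 form: 7 bits per byte, least significant
// group first, with the high bit set on every byte except the last.
func appendVarint(b []byte, v uint64) []byte {
	for v >= 0x80 {
		b = append(b, byte(v)|0x80)
		v >>= 7
	}
	return append(b, byte(v))
}

// appendTag appends (field_number << 3) | wire_type as a varint.
func appendTag(b []byte, fieldNumber, wireType int) []byte {
	return appendVarint(b, uint64(fieldNumber)<<3|uint64(wireType))
}

// appendString appends a length-delimited string field.
func appendString(b []byte, fieldNumber int, s string) []byte {
	b = appendTag(b, fieldNumber, wireLen)
	b = appendVarint(b, uint64(len(s)))
	return append(b, s...)
}

// appendUint appends a varint field.
func appendUint(b []byte, fieldNumber int, v uint64) []byte {
	b = appendTag(b, fieldNumber, wireVarint)
	return appendVarint(b, v)
}

// zigzag32 maps signed values to unsigned so small negatives stay small.
func zigzag32(n int32) uint32 {
	return uint32(n<<1) ^ uint32(n>>31)
}

func main() {
	var msg []byte
	msg = appendString(msg, 1, "X1") // order_id
	msg = appendUint(msg, 2, 150)    // amount_cents
	fmt.Printf("% x (%d bytes)\n", msg, len(msg)) // 0a 02 58 31 10 96 01 (7 bytes)

	// Tag size: field 15 fits in one byte, field 16 needs two.
	fmt.Println(len(appendTag(nil, 15, wireVarint)), len(appendTag(nil, 16, wireVarint))) // 1 2

	fmt.Println(zigzag32(0), zigzag32(-1), zigzag32(1), zigzag32(-2)) // 0 1 2 3
}
```

The one-byte versus two-byte tag boundary falls at field 16 because 16 << 3 is 128, the first value that no longer fits in the 7 payload bits of a single varint byte.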
08

The Four RPC Modes

// UNARY, SERVER-STREAM, CLIENT-STREAM, BIDI

gRPC supports four communication patterns. Choosing the right mode is an architectural decision — it affects backpressure, latency, error handling, and how you structure your proto definitions. Start with unary unless you have a specific reason to stream.

Unary RPC · Most Common

One request, one response. Behaves like a normal function call. Use for: any CRUD operation, queries, commands where you need a definitive result. Always add a deadline.

rpc GetUser(GetUserRequest) returns (User);
Server-Side Streaming · Push Pattern

One request, many responses. Server streams a sequence of messages. Use for: large result sets, live feeds, progress updates, paginated results without client polling.

rpc WatchEvents(Filter) returns (stream Event);
Client-Side Streaming · Ingestion

Many requests, one response. Client streams data; server waits for all and responds once. Use for: bulk upload, batch ingestion, file upload in chunks, accumulating metrics.

rpc IngestEvents(stream Event) returns (IngestSummary);
Bidirectional Streaming · Advanced

Many requests, many responses, fully independent. Both sides send and receive independently. Use for: chat, collaborative editing, real-time games, live telemetry with acknowledgments.

rpc Chat(stream Message) returns (stream Message);

Server Streaming — Implementation

Go — Server & Client Streaming
// ── Server-side: stream responses to client ──
func (s *server) WatchEvents(filter *pb.Filter, stream pb.EventService_WatchEventsServer) error {
	ctx := stream.Context()
	for event := range s.eventBus.Subscribe(filter) {
		// Stop if the client disconnected or the deadline passed.
		if err := ctx.Err(); err != nil {
			return status.FromContextError(err).Err()
		}
		if err := stream.Send(event); err != nil {
			return err // client gone: stop sending
		}
	}
	return nil // returning nil ends the stream with status OK
}

// ── Client-side: consume the stream ──
stream, err := client.WatchEvents(ctx, &pb.Filter{UserId: "usr_123"})
if err != nil {
	log.Fatalf("open stream: %v", err)
}
for {
	event, err := stream.Recv()
	if err == io.EOF {
		break // server closed stream normally
	}
	if err != nil {
		log.Printf("stream error: %v", err)
		break
	}
	processEvent(event)
}

// ── Bidirectional: send and receive concurrently ──
chat, err := client.Chat(ctx)
if err != nil {
	log.Fatalf("open chat: %v", err)
}

// Goroutine: receive until the stream ends or fails.
go func() {
	for {
		msg, err := chat.Recv()
		if err != nil {
			return
		}
		display(msg)
	}
}()

// Main: send
for input := range userInputCh {
	if err := chat.Send(&pb.Message{Text: input}); err != nil {
		break // stream is broken; Recv reports the final status
	}
}
chat.CloseSend() // signal done sending; can still receive
09

Metadata & Headers

// PASSING CONTEXT ACROSS THE WIRE

gRPC metadata is analogous to HTTP headers. It carries cross-cutting concerns — auth tokens, trace IDs, request IDs, feature flags — without polluting your proto message definitions. Metadata is sent as HTTP/2 headers and trailers.

Go — Metadata Send & Receive
import "google.golang.org/grpc/metadata"

// ── CLIENT: attach metadata to outgoing request ──
md := metadata.Pairs(
	"authorization", "Bearer eyJhbGci...",
	"x-request-id", uuid.New().String(),
	"x-trace-id", tracing.TraceIDFromContext(ctx),
	"x-feature-flags", "new-pricing:true",
)
ctx = metadata.NewOutgoingContext(ctx, md)
resp, err := client.Charge(ctx, req)

// ── SERVER: read incoming metadata ──
func (s *server) Charge(ctx context.Context, req *pb.ChargeRequest) (*pb.ChargeResponse, error) {
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok {
		return nil, status.Error(codes.Unauthenticated, "missing metadata")
	}
	authHeader := md.Get("authorization")
	if len(authHeader) == 0 {
		return nil, status.Error(codes.Unauthenticated, "missing auth")
	}
	token := strings.TrimPrefix(authHeader[0], "Bearer ")
	_ = token // validate the token here

	// Send response metadata (trailers)
	if err := grpc.SetTrailer(ctx, metadata.Pairs("x-charge-latency-ms", "42")); err != nil {
		log.Printf("set trailer: %v", err)
	}
	return &pb.ChargeResponse{ /* ... */ }, nil
}

// ── Metadata key conventions ──
// Keys are lowercase, use hyphens
// Binary keys must end in "-bin" (base64 encoded value)
// Reserved keys: "content-type", "te", "grpc-*"
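The key conventions listed above can be expressed as a small validator. This is a standard-library sketch, not the grpc-go implementation: `validateKey` and `wireValue` are hypothetical helpers, and grpc-go already lowercases keys and base64-encodes "-bin" values for you when it writes headers.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// validateKey checks a custom metadata key against the conventions above:
// lowercase letters, digits, '-', '_' and '.', and no reserved names.
func validateKey(key string) error {
	if key == "" {
		return errors.New("metadata key is empty")
	}
	if key == "content-type" || key == "te" || strings.HasPrefix(key, "grpc-") {
		return fmt.Errorf("key %q is reserved", key)
	}
	for _, r := range key {
		allowed := (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' || r == '_' || r == '.'
		if !allowed {
			return fmt.Errorf("key %q contains disallowed character %q", key, r)
		}
	}
	return nil
}

// wireValue returns the header value as it would travel on the wire:
// base64 (unpadded) for "-bin" keys, the raw string otherwise.
func wireValue(key string, value []byte) string {
	if strings.HasSuffix(key, "-bin") {
		return base64.RawStdEncoding.EncodeToString(value)
	}
	return string(value)
}

func main() {
	fmt.Println(validateKey("x-request-id"))                        // <nil>
	fmt.Println(validateKey("X-Request-ID") != nil)                 // true (uppercase rejected)
	fmt.Println(validateKey("grpc-timeout") != nil)                 // true (reserved prefix)
	fmt.Println(wireValue("x-trace-bin", []byte{0x01, 0x02, 0x03})) // AQID
}
```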
10

Error Handling

// STATUS CODES AND RICH ERROR DETAILS

gRPC uses a standardized set of 17 status codes (0 through 16) rather than HTTP's much larger and looser set. Every RPC returns a status: on success it's OK (0), on failure it's one of the defined codes. Rich error details can be attached using the google.rpc.Status proto with nested detail messages.

Code | Name | Use When | HTTP Analog
0 | OK | Success | 200
1 | CANCELLED | Client cancelled the request | 499
2 | UNKNOWN | Server error not classifiable elsewhere | 500
3 | INVALID_ARGUMENT | Bad input, validation failure (not retryable) | 400
4 | DEADLINE_EXCEEDED | Deadline expired before operation completed | 504
5 | NOT_FOUND | Resource does not exist | 404
6 | ALREADY_EXISTS | Resource already exists (idempotency) | 409
7 | PERMISSION_DENIED | Authenticated but not authorized for this action | 403
8 | RESOURCE_EXHAUSTED | Rate limit, quota exceeded | 429
9 | FAILED_PRECONDITION | System not in correct state (not retryable as-is) | 400
10 | ABORTED | Concurrency conflict (optimistic locking, retry) | 409
11 | OUT_OF_RANGE | Operation attempted past the valid range | 400
12 | UNIMPLEMENTED | Method not implemented or not supported | 501
13 | INTERNAL | Internal server error, bug in server | 500
14 | UNAVAILABLE | Server temporarily unavailable (retryable) | 503
15 | DATA_LOSS | Unrecoverable data loss or corruption | 500
16 | UNAUTHENTICATED | Missing or invalid credentials | 401
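A REST gateway in front of a gRPC service needs this table as code. The sketch below mirrors the HTTP analogs above using only the standard library; `Code` is a local stand-in for `codes.Code` from google.golang.org/grpc/codes, and `httpStatus` is a hypothetical helper. It covers all 17 codes, including 11 (OUT_OF_RANGE), 12 (UNIMPLEMENTED), and 15 (DATA_LOSS).

```go
package main

import (
	"fmt"
	"net/http"
)

// Code is a local stand-in for codes.Code; the numeric values match the gRPC spec.
type Code uint32

const (
	OK Code = iota
	Canceled
	Unknown
	InvalidArgument
	DeadlineExceeded
	NotFound
	AlreadyExists
	PermissionDenied
	ResourceExhausted
	FailedPrecondition
	Aborted
	OutOfRange
	Unimplemented
	Internal
	Unavailable
	DataLoss
	Unauthenticated
)

// httpStatus returns the closest HTTP status for a gRPC code.
func httpStatus(c Code) int {
	switch c {
	case OK:
		return http.StatusOK
	case Canceled:
		return 499 // non-standard "client closed request"; net/http has no constant for it
	case InvalidArgument, FailedPrecondition, OutOfRange:
		return http.StatusBadRequest
	case DeadlineExceeded:
		return http.StatusGatewayTimeout
	case NotFound:
		return http.StatusNotFound
	case AlreadyExists, Aborted:
		return http.StatusConflict
	case PermissionDenied:
		return http.StatusForbidden
	case ResourceExhausted:
		return http.StatusTooManyRequests
	case Unimplemented:
		return http.StatusNotImplemented
	case Unavailable:
		return http.StatusServiceUnavailable
	case Unauthenticated:
		return http.StatusUnauthorized
	default: // Unknown, Internal, DataLoss, and any unrecognized code
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(httpStatus(NotFound), httpStatus(Unavailable), httpStatus(Code(99))) // 404 503 500
}
```

The mapping is lossy in the other direction: several gRPC codes collapse onto 400, 409, and 500, so a gateway that needs to preserve the original code usually carries it in a response header or body field as well.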

Rich Error Details

Go — Rich Error with Details
import (
	"fmt"
	"time"

	errdetails "google.golang.org/genproto/googleapis/rpc/errdetails"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Server: return structured validation errors
func validateAndCharge(req *pb.ChargeRequest) error {
	st := status.New(codes.InvalidArgument, "request validation failed")

	// Attach field violation details, which clients can parse.
	detailed, err := st.WithDetails(&errdetails.BadRequest{
		FieldViolations: []*errdetails.BadRequest_FieldViolation{
			{Field: "amount_cents", Description: "must be between 50 and 99999999"},
			{Field: "currency", Description: "must be a valid ISO 4217 code"},
		},
	})
	if err != nil {
		return st.Err() // details could not be attached; fall back to the plain status
	}
	return detailed.Err()
}

// Client: extract details
resp, err := client.Charge(ctx, req)
if err != nil {
	st := status.Convert(err)
	for _, detail := range st.Details() {
		switch d := detail.(type) {
		case *errdetails.BadRequest:
			for _, v := range d.GetFieldViolations() {
				fmt.Printf("field %s: %s\n", v.GetField(), v.GetDescription())
			}
		case *errdetails.RetryInfo:
			time.Sleep(d.GetRetryDelay().AsDuration()) // server told us when to retry
		}
	}
}
11

Interceptors

// gRPC MIDDLEWARE — LOGGING, AUTH, RETRY, TRACING

Interceptors are gRPC's middleware pattern. They wrap RPC handlers to add cross-cutting behavior — authentication, logging, metrics, tracing, retry logic, and rate limiting — without cluttering business logic. Use go-grpc-middleware/v2 for a battle-tested interceptor chain.

Go — Unary Interceptors
// Assumes imports: context, log, time, google.golang.org/grpc, .../codes, .../metadata,
// .../status, otelgrpc, and the two go-grpc-middleware packages listed further down.
// jwt.Validate, claimsKey, limiter, and authStreamInterceptor are application-defined.

// ── Logging interceptor ──
func loggingInterceptor(
	ctx context.Context,
	req any,
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (any, error) {
	start := time.Now()
	resp, err := handler(ctx, req) // call actual handler
	log.Printf("method=%s latency=%v err=%v", info.FullMethod, time.Since(start), err)
	return resp, err
}

// ── Auth interceptor ──
func authInterceptor(
	ctx context.Context,
	req any,
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (any, error) {
	// Skip for public methods
	if info.FullMethod == "/payments.v1.PaymentService/Ping" {
		return handler(ctx, req)
	}
	md, _ := metadata.FromIncomingContext(ctx) // Get on a nil MD returns nil, so ok can be ignored
	token := md.Get("authorization")
	if len(token) == 0 {
		return nil, status.Error(codes.Unauthenticated, "missing token")
	}
	claims, err := jwt.Validate(token[0])
	if err != nil {
		return nil, status.Error(codes.Unauthenticated, "invalid token")
	}
	ctx = context.WithValue(ctx, claimsKey, claims)
	return handler(ctx, req)
}

// ── Chain interceptors with go-grpc-middleware ──
// Packages to import:
//   "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/recovery"
//   "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/ratelimit"
s := grpc.NewServer(
	grpc.StatsHandler(otelgrpc.NewServerHandler()), // OpenTelemetry tracing (replaces the deprecated otelgrpc interceptors)
	grpc.ChainUnaryInterceptor(
		recovery.UnaryServerInterceptor(), // panic → INTERNAL error
		authInterceptor,
		loggingInterceptor,
		ratelimit.UnaryServerInterceptor(limiter),
	),
	grpc.ChainStreamInterceptor( // same pattern for streaming
		recovery.StreamServerInterceptor(),
		authStreamInterceptor,
	),
)
12

TLS & Authentication

// SECURING GRPC IN PRODUCTION

gRPC traffic must be encrypted in production. The framework natively supports TLS and mutual TLS (mTLS). For authentication, gRPC's credentials.PerRPCCredentials interface allows injecting tokens per-call, while mTLS provides cryptographic service identity.

TLS (Server Auth) · Minimum

Client verifies server certificate. Encrypts the channel. Suitable when services authenticate via tokens (JWT, OAuth). Standard setup for services behind a service mesh with certificate injection.

mTLS (Mutual Auth) · Recommended

Both client and server present and verify certificates. Provides cryptographic service identity — no service can call your gRPC endpoint without a valid certificate from your CA. Required in Zero Trust architectures. Istio/Linkerd can inject mTLS transparently.

Go — TLS Configuration
// ── Server: TLS ──
creds, err := credentials.NewServerTLSFromFile("server.crt", "server.key")
if err != nil {
	log.Fatalf("load server certificate: %v", err)
}
s := grpc.NewServer(grpc.Creds(creds))

// ── Server: mTLS (require client certificate); use instead of the block above ──
cert, err := tls.LoadX509KeyPair("server.crt", "server.key")
if err != nil {
	log.Fatalf("load key pair: %v", err)
}
caCert, err := os.ReadFile("ca.crt")
if err != nil {
	log.Fatalf("read CA certificate: %v", err)
}
caCertPool := x509.NewCertPool()
if !caCertPool.AppendCertsFromPEM(caCert) {
	log.Fatal("ca.crt contains no usable certificates")
}
tlsConfig := &tls.Config{
	Certificates: []tls.Certificate{cert},
	ClientCAs:    caCertPool,
	ClientAuth:   tls.RequireAndVerifyClientCert, // mTLS enforcement
	MinVersion:   tls.VersionTLS13,
}
mtlsServer := grpc.NewServer(grpc.Creds(credentials.NewTLS(tlsConfig)))

// ── Client: JWT per-call credentials ──
type jwtCreds struct{ token string }

func (j jwtCreds) GetRequestMetadata(ctx context.Context, uri ...string) (map[string]string, error) {
	return map[string]string{"authorization": "Bearer " + j.token}, nil
}

func (j jwtCreds) RequireTransportSecurity() bool { return true } // enforce TLS

// clientTLSConfig is the client's own tls.Config (root CAs, plus a client certificate for mTLS).
// The token is captured once here; fetch it inside GetRequestMetadata if it can expire.
conn, err := grpc.NewClient(addr,
	grpc.WithTransportCredentials(credentials.NewTLS(clientTLSConfig)),
	grpc.WithPerRPCCredentials(jwtCreds{token: tokenProvider.Get()}),
)
if err != nil {
	log.Fatalf("create client: %v", err)
}
13

Load Balancing

// CLIENT-SIDE vs PROXY — gRPC IS DIFFERENT

gRPC load balancing works differently from HTTP/1.1 because connections are long-lived and multiplexed. A single TCP connection carries all requests — traditional L4 load balancers see one connection per client and route all its traffic to one server. gRPC requires L7 (application layer) load balancing or client-side load balancing.

Client-Side Load Balancing · Microservices

The gRPC client resolves all backend addresses (via DNS, xDS, or a custom resolver such as one backed by Consul) and distributes RPCs across them using a policy such as round_robin; the default policy, pick_first, sticks to a single address. Each backend gets its own connection. No proxy latency.

// DNS round-robin (headless service in K8s)
conn, err := grpc.NewClient(
	"dns:///payments-svc:50051",
	grpc.WithDefaultServiceConfig(`{"loadBalancingConfig":[{"round_robin":{}}]}`),
	grpc.WithTransportCredentials(...),
)
if err != nil {
	log.Fatalf("create client: %v", err)
}
Proxy Load Balancing · Recommended

Envoy or other L7 proxies understand gRPC and distribute individual RPCs (not connections) across backends. Simpler client configuration, built-in observability, health checks, circuit breaking. Standard in Kubernetes with a service mesh.

Kubernetes Headless Services: For client-side LB in K8s, use a headless service (clusterIP: None) instead of a ClusterIP service. A headless service returns all pod IPs in DNS resolution — the gRPC client gets all addresses and distributes across them. A ClusterIP service returns a single virtual IP, which breaks per-RPC balancing.
Kubernetes — Headless Service for gRPC
apiVersion: v1
kind: Service
metadata:
  name: payments-grpc
spec:
  clusterIP: None   # headless: returns all pod IPs via DNS
  selector:
    app: payments-service
  ports:
    - name: grpc
      port: 50051
      protocol: TCP
---
# Connect from another service:
#   "dns:///payments-grpc.namespace.svc.cluster.local:50051"
# gRPC client resolves all pod IPs and balances across them
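The difference between connection-level and per-RPC balancing described in this section can be shown with a toy simulation. This is a standard-library sketch under simplifying assumptions: one client, three hypothetical backend addresses, and a non-concurrent round-robin picker (`roundRobin`, `perConnection`, and `perRPC` are illustrative names, not gRPC APIs).

```go
package main

import "fmt"

// roundRobin cycles through backends; not safe for concurrent use.
type roundRobin struct {
	backends []string
	next     int
}

func (r *roundRobin) pick() string {
	b := r.backends[r.next%len(r.backends)]
	r.next++
	return b
}

// perConnection models an L4 balancer: the backend is chosen once, when the
// TCP connection opens, and every RPC multiplexed on it lands there.
func perConnection(backends []string, rpcs int) map[string]int {
	lb := &roundRobin{backends: backends}
	pinned := lb.pick()
	counts := map[string]int{}
	for i := 0; i < rpcs; i++ {
		counts[pinned]++
	}
	return counts
}

// perRPC models client-side or L7 balancing: a backend is chosen for each call.
func perRPC(backends []string, rpcs int) map[string]int {
	lb := &roundRobin{backends: backends}
	counts := map[string]int{}
	for i := 0; i < rpcs; i++ {
		counts[lb.pick()]++
	}
	return counts
}

func main() {
	backends := []string{"10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051"}
	fmt.Println(perConnection(backends, 9)) // map[10.0.0.1:50051:9]
	fmt.Println(perRPC(backends, 9))        // map[10.0.0.1:50051:3 10.0.0.2:50051:3 10.0.0.3:50051:3]
}
```

With many short-lived clients an L4 balancer still spreads connections reasonably well; the skew shown here bites hardest when a few long-lived clients generate most of the traffic.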
14

Observability

// METRICS, TRACING, AND REFLECTION

gRPC generates rich telemetry — every RPC has a method name, status code, and latency. The gRPC ecosystem integrates natively with OpenTelemetry for tracing and Prometheus for metrics.

Prometheus Metrics

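Use the Prometheus provider in go-grpc-middleware/v2 (the successor to the now-deprecated go-grpc-prometheus) or the OTel SDK to expose per-method metrics: grpc_server_handled_total (counter by method and status), grpc_server_handling_seconds (latency histogram), and grpc_server_msg_received_total (counter).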

OpenTelemetry Tracing

Trace context propagates via gRPC metadata headers (traceparent, tracestate). go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc instruments all calls automatically.

gRPC Reflection

Enable reflection.Register(s) on your server. Tools like grpcurl and grpc-ui can discover all services and methods at runtime without the proto file — essential for debugging and testing.

Bash — grpcurl (curl for gRPC)
# List all services (requires server-side reflection)
grpcurl -plaintext localhost:50051 list

# Describe a service
grpcurl -plaintext localhost:50051 describe payments.v1.PaymentService

# Call a unary RPC with JSON body
grpcurl -plaintext -d '{"order_id":"ord_123","amount_cents":999,"currency":"USD"}' \
  localhost:50051 payments.v1.PaymentService/Charge

# Call with metadata (auth header)
grpcurl -plaintext \
  -H 'authorization: Bearer eyJhbGci...' \
  -d '{"user_id":"u1"}' \
  localhost:50051 users.v1.UserService/GetUser

# Stream responses
grpcurl -plaintext -d '{"user_id":"u1"}' \
  localhost:50051 events.v1.EventService/WatchEvents

# Use proto file directly (no reflection needed)
grpcurl -import-path ./proto -proto payments.proto \
  -plaintext -d '{}' localhost:50051 payments.v1.PaymentService/Ping
15

Schema Evolution

// BACKWARD & FORWARD COMPATIBILITY RULES

Protobuf was designed for schema evolution in distributed systems. Old clients can talk to new servers and vice versa — as long as you follow the rules. The core rule: field numbers are permanent. Never change a field number or re-use a retired one.

✓ Safe Changes · Backward Compat
  • Add new fields with new field numbers
  • Delete a field (mark as reserved to prevent re-use)
  • Add new values to an enum (but not at 0)
  • Add a new RPC method to a service
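  • Change a field between int32, uint32, int64, uint64, and bool (same varint wire type; values that do not fit are truncated)
  • Change a singular string, bytes, or message field to repeated (not safe for numeric scalars, which proto3 packs)
✗ Breaking Changes · Never Do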
  • Change a field number — silently corrupts data
  • Change a field's type to an incompatible wire type
  • Remove an enum value that clients still use
  • Change enum value 0 meaning (it's the default)
  • Remove or rename an RPC method that clients still call, or add or remove a proto2 required field
  • Re-use a retired field number for a different field

API Versioning Strategy

Versioning Best Practices
## Package versioning: embed version in the package name
package payments.v1; // stable API
package payments.v2; // breaking change requires new major version

## Within a version, use field evolution (no v1 → v2 bump needed):
message ChargeRequest {
  string order_id = 1;        // original
  int64 amount_cents = 2;     // original
  string currency = 3;        // original

  // v1.1 additions: old clients ignore these (unknown fields preserved)
  string idempotency_key = 4;          // new in v1.1, optional
  map<string,string> metadata = 5;     // new in v1.2

  // retired field: was "customer_name" = 6
  reserved 6;
  reserved "customer_name";

  // v1.3 replacement
  string customer_id = 7;     // new field 7, not reusing 6
}

## Unknown fields: proto3 preserves unknown fields by default (since protobuf 3.5)
## Old client → new server: fields the client never set read as default values (safe)
## New server → old client: fields the client does not recognize are kept as unknown
##   fields and re-serialized, so a pass-through proxy never strips data it doesn't understand

## Use buf for schema registry and breaking change detection
## buf breaking --against .git#branch=main   ← CI gate
Use buf instead of raw protoc. The buf CLI from Buf Technologies handles dependency management, code generation across languages, linting (enforcing naming conventions), and breaking change detection as a CI gate. buf breaking --against .git#branch=main will fail if a commit introduces a backward-incompatible change to any proto file.
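Forward compatibility rests on one decoder behavior: fields with unrecognized numbers are skipped and kept, not rejected. The sketch below hand-decodes a message as an "old" reader that only knows fields 1 and 2. It handles just the VARINT and LEN wire types, and `chargeV1` and `decodeChargeV1` are hypothetical names; generated code does this for every wire type.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// chargeV1 is what an old reader knows about: fields 1 and 2 only.
type chargeV1 struct {
	OrderID     string
	AmountCents uint64
	Unknown     []byte // raw bytes of fields this version does not recognize
}

// decodeChargeV1 parses tag-value pairs, keeping unknown fields verbatim.
func decodeChargeV1(b []byte) (chargeV1, error) {
	var msg chargeV1
	for len(b) > 0 {
		start := b
		tag, n := binary.Uvarint(b) // protobuf varints use the same base-128 format
		if n <= 0 {
			return msg, errors.New("malformed tag")
		}
		b = b[n:]
		field, wireType := tag>>3, tag&7

		var num uint64
		var value []byte
		switch wireType {
		case 0: // VARINT
			num, n = binary.Uvarint(b)
			if n <= 0 {
				return msg, errors.New("malformed varint")
			}
			b = b[n:]
		case 2: // LEN
			length, n := binary.Uvarint(b)
			if n <= 0 || uint64(len(b)-n) < length {
				return msg, errors.New("malformed or truncated length-delimited field")
			}
			value, b = b[n:n+int(length)], b[n+int(length):]
		default:
			return msg, fmt.Errorf("wire type %d is not handled in this sketch", wireType)
		}

		switch {
		case field == 1 && wireType == 2:
			msg.OrderID = string(value)
		case field == 2 && wireType == 0:
			msg.AmountCents = num
		default: // unknown field: keep the exact bytes so they can be re-sent
			msg.Unknown = append(msg.Unknown, start[:len(start)-len(b)]...)
		}
	}
	return msg, nil
}

func main() {
	// Bytes from a newer writer: fields 1 and 2, plus field 4 (idempotency_key = "k1").
	wire := []byte{0x0A, 0x02, 0x58, 0x31, 0x10, 0x96, 0x01, 0x22, 0x02, 0x6B, 0x31}
	msg, err := decodeChargeV1(wire)
	fmt.Println(msg.OrderID, msg.AmountCents, err) // X1 150 <nil>
	fmt.Printf("% x\n", msg.Unknown)               // 22 02 6b 31
}
```

This also shows why reusing a retired field number is dangerous: the decoder dispatches purely on the number and wire type, so old bytes written under the previous meaning would be silently interpreted as the new field.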
16

When to Use gRPC

// DECISION FRAMEWORK

gRPC is a powerful but opinionated choice. Use this framework to make the decision. The question is never "is gRPC better than REST?" — it's whether gRPC's tradeoffs serve your specific communication pattern.

Scenario | Recommendation | Reason
Internal microservice-to-microservice | USE gRPC | Performance, type safety, streaming, language agnosticism
Public-facing external API | Use REST | Universal browser/client support, developer ergonomics
Mobile app to backend | Consider | gRPC-Swift/Kotlin exist; REST is simpler; gRPC wins on bandwidth
Real-time streaming / push notifications | USE gRPC | Server/bidi streaming are first-class; SSE/WS are bolt-ons in REST
Large binary payload transfer | USE gRPC | Streaming + binary + no JSON encoding overhead
Simple CRUD, small team, fast iteration | Use REST | Lower tooling overhead, easier debugging, familiar mental model
Polyglot service mesh | USE gRPC | One proto = client/server in 10+ languages; no OpenAPI drift
Browser frontend to service | Use REST or GraphQL | gRPC-Web requires proxy; REST/GraphQL are native
AI inference / ML model serving | USE gRPC | Triton, TF Serving, Ray Serve all use gRPC natively; binary for tensors
Common Mistakes · Avoid
  • Using gRPC for a public API without a REST gateway
  • Not setting deadlines on every unary call
  • Re-using field numbers after deletion
  • Using gRPC through a L4-only load balancer
  • Not enabling server-side reflection in dev/staging
  • Ignoring streaming backpressure — server floods slow client
  • Embedding business data in metadata (use message fields)
  • Running without TLS in any networked environment
Production Checklist · Do These
  • Set deadlines on every client call (never infinite wait)
  • Embed UnimplementedXxxServer for forward compatibility
  • Use buf breaking in CI to catch schema regressions
  • Enable gRPC health checking (grpc_health_v1)
  • Use mTLS or a service mesh for service-to-service auth
  • Instrument with OTel for distributed tracing across calls
  • Limit max message size explicitly (default 4MB)
  • Use Kubernetes headless services for client-side load balancing
Reference implementations & tooling: grpc.io (official docs), buf.build (modern proto toolchain), grpcurl (CLI testing), grpc-ui (web UI for gRPC), Evans (interactive gRPC client), go-grpc-middleware/v2 (interceptor library), connectrpc.com (gRPC-compatible protocol with REST support), google.golang.org/grpc (Go reference implementation).