gRPC

High-Performance RPC Framework

gRPC & Protobuf

// Contract-first. Binary. Multiplexed. Streaming-native.

The definitive guide to building high-performance inter-service communication with gRPC and Protocol Buffers. Covers the binary wire format, all four RPC modes, production hardening, and schema evolution strategies for microservice architectures.

proto3 · HTTP/2 · Bidirectional Streaming · Up to 10× faster than JSON · Code Generation · CNCF Incubating

What is gRPC

// GOOGLE'S OPEN-SOURCE RPC FRAMEWORK

gRPC is a high-performance, open-source RPC framework developed by Google and now hosted by the CNCF as an incubating project. (The name is a recursive acronym: gRPC Remote Procedure Calls.) It uses Protocol Buffers as the interface definition language and binary serialization format, transmitted over HTTP/2. Open-sourced in 2015, with version 1.0 following in 2016, it has become one of the most widely used protocols for inter-microservice communication in polyglot environments.

Contract-First Design

You define your API in a .proto file — the contract. The protoc compiler generates strongly-typed client and server stubs in any supported language. The schema is the documentation.

Binary Wire Format

Protocol Buffers serialize to a compact binary format — typically 3–10× smaller than equivalent JSON, with significantly faster serialization and deserialization. No parsing ambiguity, no whitespace overhead.

Polyglot Native

Official code generation for Go, Java, Python, C++, C#, Node.js, Ruby, Swift, Kotlin, Dart, and more. One .proto file generates idiomatic clients and servers across your entire stack.

📋 Proto IDL: Schema definition and code generation
⚡ HTTP/2: Multiplexing and flow control
🔁 Streaming: 4 RPC modes including bidi
🛡️ TLS + Auth: mTLS and token interceptors

▸

gRPC is not just for microservices. It's used for mobile-to-backend (gRPC-Web), internal tooling, database protocols (CockroachDB, TiKV), streaming pipelines, and AI inference APIs (NVIDIA Triton). If two systems need to talk fast and reliably, gRPC is a strong default.

Why gRPC over REST

// WHEN THE TRADEOFFS FAVOR RPC

REST over HTTP/1.1 with JSON is excellent for public APIs, browser clients, and developer ergonomics. gRPC wins decisively for internal service-to-service communication where performance, type safety, and streaming matter. The question is not "which is better" — it's which tradeoffs serve your use case.

REST + JSON (HTTP/1.1)

Human-readable, easy to debug with curl
One in-flight request per connection (pipelining is rarely usable in practice)
JSON parsing overhead — always text → object
No enforced schema (OpenAPI is optional)
No native streaming — SSE/WebSocket are bolt-ons
Large payloads — field names repeated per object
Versioning via URL (/v1/, /v2/) is messy
Excellent browser and public API support

gRPC + Protobuf (HTTP/2)

Binary — requires tooling to inspect (grpcurl, grpc-ui)
Multiplexed — many requests over one connection
Protobuf: up to 10× faster parse, 3–10× smaller
Enforced typed contract — compiler catches breakage
First-class streaming: server, client, and bidirectional
Field numbers encode keys — tiny wire size
Backward-compatible field evolution via field numbers
No native browser support (gRPC-Web workaround)

Performance At a Glance

Payload size: ~7× smaller (typical; varies with message shape)
Serialize speed: ~5× faster (typical; varies by language and payload)
Connection reuse: Multiplexed
Type safety: Enforced
Browser support: gRPC-Web

⚠

Don't use gRPC for public APIs. If third-party developers, mobile browsers, or external systems need to call your API, REST/JSON remains the right choice. gRPC shines at the internal seam between your own services. Many teams run both: gRPC internally, REST externally via a gateway (Envoy, Kong, AWS API Gateway).

HTTP/2 Under the Hood

// WHY THE TRANSPORT LAYER MATTERS

gRPC is built entirely on HTTP/2. Understanding the transport layer explains why gRPC achieves its performance characteristics and why HTTP/1.1-based REST cannot replicate them without fundamental changes.

Multiplexing Key Feature

Multiple concurrent RPC calls share a single TCP connection via streams. HTTP/1.1 allows only one in-flight request per connection, so concurrency requires a pool of connections. HTTP/2 eliminates head-of-line blocking at the application layer: a slow call doesn't block other calls on the same connection. (Packet loss can still stall every stream on the connection, because they share one TCP byte stream.)

Header Compression (HPACK)

HTTP/2 uses HPACK to compress headers. In gRPC, metadata (auth tokens, trace IDs) is sent as headers. HPACK maintains a shared compression context between client and server, meaning repeated headers (like content-type: application/grpc) cost near-zero after the first request.

Binary Framing Layer

HTTP/2 transmits data in binary frames, not text. Each gRPC message is wrapped in a 5-byte length-prefix (1 byte compression flag + 4 byte length), then sent as HTTP/2 DATA frames. This enables precise length-delimited message framing without text parsing.

Flow Control

HTTP/2 has per-stream and connection-level flow control — the receiver advertises how much data it can accept. This is critical for gRPC streaming: a slow consumer can signal backpressure to the sender without dropping messages or needing application-level throttling.

gRPC Frame Format on the Wire

Wire Format

## gRPC message framing (Length-Prefixed Message)
## Sits between the HTTP/2 DATA frames and the Protobuf payload

┌─────────────────────────────────────────────────────────────┐
│  Byte 0        │ Compressed-Flag (0 = uncompressed, 1 = compressed per grpc-encoding) │
│  Bytes 1–4     │ Message-Length (big-endian uint32, length of proto msg) │
│  Bytes 5–N     │ Serialized Protobuf message                             │
└─────────────────────────────────────────────────────────────┘

## HTTP/2 headers for a gRPC request
:method         POST
:scheme         https
:path           /helloworld.Greeter/SayHello
:authority      api.example.com
content-type    application/grpc           # required by gRPC spec
grpc-timeout    5S                         # optional: 5-second deadline
grpc-encoding   gzip                       # optional: message compression
authorization   Bearer <token>            # custom metadata as headers

## HTTP/2 trailers for gRPC response status
grpc-status     0                          # 0 = OK, non-zero = error
grpc-message    ""                         # human-readable error message
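The length-prefixed framing shown above can be sketched in a few lines of Go. This is a minimal illustration using only the standard library, not the grpc-go implementation; the function names `frameMessage` and `parseFrame` are made up for this example, and the payload is assumed to be already serialized (and already compressed, if the flag is set).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameMessage prepends the 5-byte gRPC prefix to an already-serialized
// message: 1 byte compressed flag, then a 4-byte big-endian length.
func frameMessage(payload []byte, compressed bool) []byte {
	out := make([]byte, 5+len(payload))
	if compressed {
		out[0] = 1
	}
	binary.BigEndian.PutUint32(out[1:5], uint32(len(payload)))
	copy(out[5:], payload)
	return out
}

// parseFrame splits one length-prefixed message off the front of buf and
// returns the payload, the compressed flag, and the remaining bytes.
func parseFrame(buf []byte) (payload []byte, compressed bool, rest []byte, err error) {
	if len(buf) < 5 {
		return nil, false, nil, errors.New("short frame header")
	}
	n := binary.BigEndian.Uint32(buf[1:5])
	if uint64(len(buf)-5) < uint64(n) {
		return nil, false, nil, errors.New("truncated message")
	}
	return buf[5 : 5+n], buf[0] == 1, buf[5+n:], nil
}

func main() {
	// Payload: the protobuf bytes for { order_id: "X1" }.
	framed := frameMessage([]byte{0x0A, 0x02, 0x58, 0x31}, false)
	fmt.Printf("% x\n", framed)

	payload, compressed, rest, err := parseFrame(framed)
	fmt.Println(payload, compressed, len(rest), err)
}
```

Because every message carries its own length, a stream is just these frames back to back; the receiver calls something like `parseFrame` in a loop on whatever bytes the HTTP/2 DATA frames deliver.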

ℹ

gRPC-Web: Browsers cannot access HTTP/2 trailers, which gRPC uses for status codes. gRPC-Web works around this by encoding trailers in the response body and relying on a translating proxy such as Envoy in front of the gRPC server. It supports unary and server-streaming calls, but not client or bidirectional streaming in the browser.

Proto3 Syntax

// WRITING YOUR FIRST .PROTO FILE

A .proto file is the single source of truth for your service contract. It defines messages (data structures) and services (RPC endpoints). The protoc compiler and language-specific plugins generate all boilerplate — you only write the business logic.

Protocol Buffers — Complete Example

syntax = "proto3";                          // always specify; proto2 is legacy

package payments.v1;                          // logical namespace

option go_package = "github.com/acme/payments/gen/go/payments/v1;paymentsv1";
option java_package = "com.acme.payments.v1";

import "google/protobuf/timestamp.proto";    // well-known types
import "google/type/money.proto";            // common type from googleapis (not a protobuf well-known type)

// ── Message definitions ──
message ChargeRequest {
  string   order_id     = 1;               // field number — NEVER changes
  int64    amount_cents  = 2;               // snake_case in proto; Go gets AmountCents, JSON gets amountCents
  string   currency      = 3;
  string   customer_id   = 4;
  PaymentMethod method   = 5;               // enum defined below
  repeated string tags   = 6;              // repeated = list/array
  // keep field numbers 7–15 free for future hot fields (1-byte tag)
}

message ChargeResponse {
  string   transaction_id = 1;
  ChargeStatus status     = 2;
  google.protobuf.Timestamp processed_at = 3;
  optional string failure_reason = 4;     // proto3 optional — has presence
}

enum PaymentMethod {
  PAYMENT_METHOD_UNSPECIFIED = 0;          // enums MUST start at 0
  PAYMENT_METHOD_CARD        = 1;
  PAYMENT_METHOD_BANK        = 2;
  PAYMENT_METHOD_CRYPTO      = 3;
}

enum ChargeStatus {
  CHARGE_STATUS_UNSPECIFIED = 0;
  CHARGE_STATUS_SUCCESS     = 1;
  CHARGE_STATUS_DECLINED    = 2;
  CHARGE_STATUS_PENDING     = 3;
}

// ── Service definition (TransactionFilter, Transaction, BatchChargeResponse omitted for brevity) ──
service PaymentService {
  rpc Charge(ChargeRequest) returns (ChargeResponse);
  rpc StreamTransactions(TransactionFilter) returns (stream Transaction);
  rpc BatchCharge(stream ChargeRequest) returns (BatchChargeResponse);
}

▸

Code generation: Run protoc --go_out=. --go-grpc_out=. payments.proto (or use buf generate — the modern alternative). The compiler produces a payments.pb.go (messages) and payments_grpc.pb.go (client/server stubs). Never edit generated files — regenerate from the proto.

Field Types & Rules

// SCALARS, WELL-KNOWN TYPES, ONEOF, MAP

Scalar Types

Proto Type	Go	Java	Python	Notes
`double`	float64	double	float	64-bit IEEE 754
`float`	float32	float	float	32-bit IEEE 754
`int32`	int32	int	int	Varint; inefficient for negatives (use sint32)
`int64`	int64	long	int	Varint; inefficient for negatives
`uint32/uint64`	uint32/64	int/long	int	Unsigned varint
`sint32/sint64`	int32/64	int/long	int	ZigZag encoding — use for negative numbers
`fixed32/fixed64`	uint32/64	int/long	int	Always 4/8 bytes; more compact than varint when values are usually above 2^28 / 2^56
`bool`	bool	boolean	bool	Varint 0 or 1
`string`	string	String	str	UTF-8 encoded, length-delimited
`bytes`	[]byte	ByteString	bytes	Arbitrary binary data

Advanced Field Patterns

Proto3 — Advanced Patterns

// ── repeated: ordered list ──
message Order {
  repeated LineItem items = 1;            // generates []LineItem in Go
}

// ── map: key-value pairs ──
message Config {
  map<string, string> settings = 1;       // key must be scalar (not float/bytes)
  map<int32, FeatureFlag> flags = 2;
}

// ── oneof: at most one of these fields is set ──
message Notification {
  string user_id = 1;
  oneof payload {
    EmailPayload  email  = 2;
    PushPayload   push   = 3;
    SMSPayload    sms    = 4;
  }                                        // setting one clears others
}

// ── Well-Known Types (google/protobuf/*.proto) ──
import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/wrappers.proto";  // nullable scalars
import "google/protobuf/struct.proto";    // arbitrary JSON-like structure
import "google/protobuf/any.proto";       // type-erased container
import "google/protobuf/field_mask.proto";// partial updates (like PATCH)

message Event {
  google.protobuf.Timestamp created_at = 1;  // UTC timestamp with nanos
  google.protobuf.Duration  duration    = 2;  // time duration
  google.protobuf.StringValue label     = 3;  // nullable string (has presence)
  google.protobuf.Struct    metadata    = 4;  // arbitrary key-value
}

// ── reserved: prevent field number reuse ──
message LegacyUser {
  reserved 2, 5, 9 to 11;              // these field numbers are retired
  reserved "old_email", "phone_number"; // retired field names
  string user_id = 1;
}

⚠

Proto3 default values: In proto3, every scalar field has a default value (0, false, empty string). There is no way to distinguish "field not set" from "field set to zero" unless you use the optional keyword (proto3 optional, generally available since protobuf 3.15) or wrapper types. This is a common source of bugs when null-checking is semantically important.
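To make the presence problem concrete, here is a small Go sketch. The struct is a hand-written stand-in, not generated code: it mirrors how the Go generator represents a plain scalar as a value and a proto3 `optional` scalar as a pointer. The `RetryCount` field and the `describe` helper are hypothetical, added only for illustration.

```go
package main

import "fmt"

// ChargeResponse is a hand-written stand-in for generated code.
type ChargeResponse struct {
	RetryCount    int32   // plain proto3 scalar: 0 and "never set" look identical
	FailureReason *string // proto3 optional: nil means "not set"
}

// describe shows the distinction that only a presence-tracking field offers.
func describe(r ChargeResponse) string {
	if r.FailureReason == nil {
		return "failure_reason not set"
	}
	return fmt.Sprintf("failure_reason set to %q", *r.FailureReason)
}

func main() {
	empty := ""
	fmt.Println(describe(ChargeResponse{}))                      // never set
	fmt.Println(describe(ChargeResponse{FailureReason: &empty})) // explicitly set to ""
}
```

With `RetryCount` there is no equivalent check: a reader sees 0 either way, which is why fields whose zero value is meaningful are good candidates for `optional`.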

Defining Services

// FROM .PROTO TO WORKING SERVER

A gRPC service definition maps directly to a generated interface. On the server side you implement the interface; on the client side the generated stub makes calls look like local function calls, even when they're crossing a network.

Go — Server Implementation

package main

import (
  "context"
  "net"
  "log"
  "google.golang.org/grpc"
  "google.golang.org/grpc/codes"
  "google.golang.org/grpc/status"
  pb "github.com/acme/payments/gen/go/payments/v1"
)

// server implements pb.PaymentServiceServer (generated interface)
type server struct {
  pb.UnimplementedPaymentServiceServer // forward-compat: embed this
  db *Database
}

// Unary RPC — one request, one response
func (s *server) Charge(ctx context.Context, req *pb.ChargeRequest) (*pb.ChargeResponse, error) {
  // validate
  if req.AmountCents <= 0 {
    return nil, status.Errorf(codes.InvalidArgument, "amount must be positive, got %d", req.AmountCents)
  }
  // check deadline / context cancellation
  if ctx.Err() != nil {
    return nil, status.FromContextError(ctx.Err()).Err()
  }
  txID, err := s.db.Charge(ctx, req.OrderId, req.AmountCents)
  if err != nil {
    return nil, status.Errorf(codes.Internal, "charge failed: %v", err)
  }
  return &pb.ChargeResponse{TransactionId: txID, Status: pb.ChargeStatus_CHARGE_STATUS_SUCCESS}, nil
}

func main() {
  lis, err := net.Listen("tcp", ":50051")
  if err != nil {
    log.Fatalf("listen: %v", err)
  }
  s := grpc.NewServer(
    grpc.UnaryInterceptor(loggingInterceptor),   // middleware
    grpc.MaxRecvMsgSize(16 * 1024 * 1024),       // 16 MB max inbound message
  )
  pb.RegisterPaymentServiceServer(s, &server{})  // db is nil here; inject a real *Database
  log.Fatal(s.Serve(lis))
}

Go — Client Usage

// Client setup: one connection, reused across all calls
conn, err := grpc.NewClient(
  "api.example.com:443",
  grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)),
  grpc.WithUnaryInterceptor(retryInterceptor),
)
if err != nil {
  log.Fatalf("create client: %v", err)  // check before deferring Close on a nil conn
}
defer conn.Close()

client := pb.NewPaymentServiceClient(conn)  // generated stub — thread-safe

// Unary call with deadline
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

resp, err := client.Charge(ctx, &pb.ChargeRequest{
  OrderId:     "ord_abc123",
  AmountCents: 9999,
  Currency:    "USD",
  CustomerId:  "cust_xyz",
  Method:      pb.PaymentMethod_PAYMENT_METHOD_CARD,
})
if err != nil {
  st := status.Convert(err)
  log.Printf("RPC failed: code=%s msg=%s", st.Code(), st.Message())
  return
}
fmt.Println(resp.TransactionId)

Binary Encoding

// HOW PROTOBUF ACHIEVES COMPACTNESS

Protobuf's compactness comes from its tag-value encoding. Field names are never sent over the wire, only field numbers (tags). Integers are varint-encoded, so a value below 128 takes a single byte. This is why the choice of field number matters: fields 1–15 use one byte for the tag, while 16–2047 use two bytes. Reserve small numbers for frequently-used fields.

Encoding Deep Dive

## Protobuf wire types
## Tag = (field_number << 3) | wire_type

Wire Type 0  VARINT    int32, int64, uint32, uint64, sint32, sint64, bool, enum
Wire Type 1  I64       fixed64, sfixed64, double
Wire Type 2  LEN       string, bytes, embedded messages, packed repeated
Wire Type 5  I32       fixed32, sfixed32, float

## Example: encoding { order_id: "X1" (field 1), amount_cents: 150 (field 2) }
#
# field 1, string: tag = (1 << 3) | 2 = 0x0A
#   0x0A 0x02 0x58 0x31   → tag, length=2, "X" "1"
#
# field 2, varint: tag = (2 << 3) | 0 = 0x10
#   0x10 0x96 0x01         → tag, varint(150) in 2 bytes
#
# Total: 7 bytes
# Equivalent JSON: {"order_id":"X1","amount_cents":150} = 36 chars = 36 bytes
# ~5× smaller

## Varint encoding (base-128)
## 150 in binary = 10010110
## Split into 7-bit groups, LSB first, MSB = "more bytes follow"
## 0010110 → 1001 0110 (MSB=1, more follows) = 0x96
## 0000001 → 0000 0001 (MSB=0, last byte)    = 0x01
## Wire: 0x96 0x01

## ZigZag for negative numbers (sint32/sint64)
## Avoids 10-byte varint for small negatives
## n=0 → 0, n=-1 → 1, n=1 → 2, n=-2 → 3
## Encoding: (n << 1) ^ (n >> 31) for sint32; (n << 1) ^ (n >> 63) for sint64 (arithmetic shift)
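The worked example above can be reproduced with a short Go sketch. It hand-rolls the encoding with the standard library only (Go's `encoding/binary` varint uses the same base-128 layout); real code would use the generated types and `proto.Marshal`. All helper names here are made up for the illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	wireVarint = 0 // VARINT
	wireLen    = 2 // LEN (length-delimited)
)

// appendVarint writes v as a base-128 varint, least significant group first.
func appendVarint(b []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(b, tmp[:n]...)
}

// appendTag writes (field_number << 3) | wire_type.
func appendTag(b []byte, field, wireType int) []byte {
	return appendVarint(b, uint64(field)<<3|uint64(wireType))
}

// appendString writes a string field: tag, length, raw bytes.
func appendString(b []byte, field int, s string) []byte {
	b = appendTag(b, field, wireLen)
	b = appendVarint(b, uint64(len(s)))
	return append(b, s...)
}

// appendVarintField writes an integer field: tag, then the varint value.
func appendVarintField(b []byte, field int, v uint64) []byte {
	return appendVarint(appendTag(b, field, wireVarint), v)
}

// zigzag32 maps signed to unsigned so small magnitudes stay small.
func zigzag32(n int32) uint32 {
	return uint32(n<<1) ^ uint32(n>>31)
}

func main() {
	var msg []byte
	msg = appendString(msg, 1, "X1")      // order_id
	msg = appendVarintField(msg, 2, 150)  // amount_cents
	fmt.Printf("% x (%d bytes)\n", msg, len(msg))

	for _, n := range []int32{0, -1, 1, -2} {
		fmt.Println(n, "->", zigzag32(n))
	}
}
```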

▸

Field number assignment strategy: Fields 1–15 encode in 1 byte (tag only). Fields 16–2047 use 2 bytes. Assign numbers 1–15 to the most frequently populated fields in your most common messages. For an event log message, the timestamp and event type go in 1–2; rarely-populated debug fields go in 100+. This is a micro-optimization but meaningful at scale.
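The 1-byte versus 2-byte boundary follows directly from the tag formula. As a quick check, the sketch below computes the encoded tag length for a given field number; `tagSize` is a name chosen for this example, not a protobuf API.

```go
package main

import "fmt"

// tagSize returns how many bytes the varint-encoded tag occupies.
// The wire type lives in the low 3 bits, so it never changes the length
// for a valid field number (1 or greater).
func tagSize(field int) int {
	tag := uint64(field) << 3
	n := 1
	for tag >= 0x80 { // each varint byte carries 7 bits of payload
		tag >>= 7
		n++
	}
	return n
}

func main() {
	for _, f := range []int{1, 15, 16, 2047, 2048} {
		fmt.Printf("field %d: %d-byte tag\n", f, tagSize(f))
	}
}
```

A tag has 7 usable bits in its first byte, 3 of which go to the wire type, which leaves 4 bits (values 1 through 15) for the field number.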

The Four RPC Modes

// UNARY, SERVER-STREAM, CLIENT-STREAM, BIDI

gRPC supports four communication patterns. Choosing the right mode is an architectural decision — it affects backpressure, latency, error handling, and how you structure your proto definitions. Start with unary unless you have a specific reason to stream.

Unary RPC Most Common

One request, one response. Behaves like a normal function call. Use for: any CRUD operation, queries, commands where you need a definitive result. Always add a deadline.

rpc GetUser(GetUserRequest) returns (User);

Server-Side Streaming Push Pattern

One request, many responses. Server streams a sequence of messages. Use for: large result sets, live feeds, progress updates, paginated results without client polling.

rpc WatchEvents(Filter) returns (stream Event);

Client-Side Streaming Ingestion

Many requests, one response. Client streams data; server waits for all and responds once. Use for: bulk upload, batch ingestion, file upload in chunks, accumulating metrics.

rpc IngestEvents(stream Event) returns (IngestSummary);

Bidirectional Streaming Advanced

Many requests, many responses, fully independent. Both sides send and receive independently. Use for: chat, collaborative editing, real-time games, live telemetry with acknowledgments.

rpc Chat(stream Message) returns (stream Message);

Server Streaming — Implementation

Go — Server & Client Streaming

// ── Server-side: stream responses to client ──
func (s *server) WatchEvents(filter *pb.Filter, stream pb.EventService_WatchEventsServer) error {
  for event := range s.eventBus.Subscribe(filter) {
    // Check if client has disconnected or deadline exceeded
    if stream.Context().Err() != nil {
      return status.FromContextError(stream.Context().Err()).Err()
    }
    if err := stream.Send(event); err != nil {
      return err  // client gone — stop sending
    }
  }
  return nil  // stream complete
}

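// ── Client-side: consume the stream ──
stream, err := client.WatchEvents(ctx, &pb.Filter{UserId: "usr_123"})
if err != nil {
  log.Fatalf("open stream: %v", err)
}
for {
  event, err := stream.Recv()
  if err == io.EOF {
    break  // server closed stream normally
  }
  if err != nil {
    log.Printf("stream error: %v", err)
    break
  }
  processEvent(event)
}

// ── Bidirectional: send and receive concurrently ──
chat, err := client.Chat(ctx)
if err != nil {
  log.Fatalf("open chat: %v", err)
}

// Goroutine: receive until the server closes the stream or an error occurs
done := make(chan struct{})
go func() {
  defer close(done)
  for {
    msg, err := chat.Recv()
    if err != nil {
      return  // io.EOF on normal close
    }
    display(msg)
  }
}()

// Main: send
for input := range userInputCh {
  if err := chat.Send(&pb.Message{Text: input}); err != nil {
    break  // stream is broken; Recv reports the final status
  }
}
chat.CloseSend()  // signal done sending; the receiver keeps running
<-done            // wait for the server to finish its side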

Metadata & Headers

// PASSING CONTEXT ACROSS THE WIRE

gRPC metadata is analogous to HTTP headers. It carries cross-cutting concerns — auth tokens, trace IDs, request IDs, feature flags — without polluting your proto message definitions. Metadata is sent as HTTP/2 headers and trailers.

Go — Metadata Send & Receive

import "google.golang.org/grpc/metadata"

// ── CLIENT: attach metadata to outgoing request ──
md := metadata.Pairs(
  "authorization",  "Bearer eyJhbGci...",
  "x-request-id",   uuid.New().String(),
  "x-trace-id",     tracing.TraceIDFromContext(ctx),
  "x-feature-flags", "new-pricing:true",
)
ctx = metadata.NewOutgoingContext(ctx, md)
resp, err := client.Charge(ctx, req)

// ── SERVER: read incoming metadata ──
func (s *server) Charge(ctx context.Context, req *pb.ChargeRequest) (*pb.ChargeResponse, error) {
  md, ok := metadata.FromIncomingContext(ctx)
  if !ok { return nil, status.Error(codes.Unauthenticated, "missing metadata") }

  authHeader := md.Get("authorization")
  if len(authHeader) == 0 { return nil, status.Error(codes.Unauthenticated, "missing auth") }
  token := strings.TrimPrefix(authHeader[0], "Bearer ")
  _ = token  // validate the token here (omitted); an unused variable would not compile

  // Send response metadata (trailers)
  grpc.SetTrailer(ctx, metadata.Pairs("x-charge-latency-ms", "42"))
  return &pb.ChargeResponse{...}, nil
}

// ── Metadata key conventions ──
// Keys are lowercase, use hyphens
// Binary keys must end in "-bin" (base64 encoded value)
// Reserved keys: "content-type", "te", "grpc-*"
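The key conventions above are easy to check mechanically. The following is a rough sketch based on the rules in the gRPC over HTTP/2 spec (lowercase letters, digits, `_`, `-`, `.`; `-bin` values are base64). The helpers `validKey` and `wireValue` are invented for this example; grpc-go performs its own validation and encoding internally.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// validKey reports whether key is usable as custom metadata: non-empty,
// built from [0-9a-z_.-], and outside the reserved "grpc-" namespace.
func validKey(key string) bool {
	if key == "" || strings.HasPrefix(key, "grpc-") {
		return false
	}
	for _, r := range key {
		ok := (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') ||
			r == '-' || r == '_' || r == '.'
		if !ok {
			return false
		}
	}
	return true
}

// wireValue returns the header value as sent: base64 for "-bin" keys
// (unpadded, as the spec recommends emitting), raw text otherwise.
func wireValue(key string, value []byte) string {
	if strings.HasSuffix(key, "-bin") {
		return base64.RawStdEncoding.EncodeToString(value)
	}
	return string(value)
}

func main() {
	for _, k := range []string{"x-request-id", "X-Request-Id", "grpc-timeout"} {
		fmt.Println(k, validKey(k))
	}
	fmt.Println(wireValue("trace-bin", []byte{0xff, 0x00}))
	fmt.Println(wireValue("x-request-id", []byte("abc")))
}
```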

Error Handling

// STATUS CODES AND RICH ERROR DETAILS

gRPC uses a standardized set of 17 status codes (0 through 16) rather than HTTP's much larger and looser set. Every RPC returns a status: on success it's OK (0), on failure it's one of the defined codes. Rich error details can be attached using the google.rpc.Status proto with nested detail messages.

Code	Name	Use When	HTTP Analog
`0`	OK	Success	200
`1`	CANCELLED	Client cancelled the request	499
`2`	UNKNOWN	Server error not classifiable elsewhere	500
`3`	INVALID_ARGUMENT	Bad input — validation failure (not retryable)	400
`4`	DEADLINE_EXCEEDED	Deadline expired before operation completed	504
`5`	NOT_FOUND	Resource does not exist	404
`6`	ALREADY_EXISTS	Resource already exists (idempotency)	409
`7`	PERMISSION_DENIED	Authenticated but not authorized for this action	403
`8`	RESOURCE_EXHAUSTED	Rate limit, quota exceeded	429
`9`	FAILED_PRECONDITION	System not in correct state (not retryable as-is)	400
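`10`	ABORTED	Concurrency conflict (optimistic locking, retry)	409
`11`	OUT_OF_RANGE	Operation attempted past the valid range (e.g. read past end)	400
`12`	UNIMPLEMENTED	Method not implemented or not supported by this server	501
`13`	INTERNAL	Internal server error — bug in server	500
`14`	UNAVAILABLE	Server temporarily unavailable — retryable	503
`15`	DATA_LOSS	Unrecoverable data loss or corruption	500
`16`	UNAUTHENTICATED	Missing or invalid credentials	401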
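One practical use of the table is deciding what to retry. The sketch below is a simplified, standard-library-only illustration: `Code` and `rpcError` are local stand-ins for the real status types, and the policy (retry only UNAVAILABLE, with doubling delays) is an assumption for the example rather than a gRPC default. ABORTED is left out on purpose, since it usually means redoing the whole read-modify-write sequence rather than resending one call.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Code is a local stand-in for the gRPC status code type; the numeric
// values match the table above.
type Code uint32

const (
	InvalidArgument Code = 3
	Unavailable     Code = 14
)

// rpcError is a hypothetical error type carrying a status code.
type rpcError struct{ code Code }

func (e *rpcError) Error() string { return fmt.Sprintf("rpc error: code = %d", e.code) }

// retryable reports whether repeating the same call can help.
func retryable(c Code) bool { return c == Unavailable }

// callWithRetry runs call up to attempts times, doubling the delay after
// each retryable failure and returning at once on any other error.
func callWithRetry(attempts int, base time.Duration, call func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = call(); err == nil {
			return nil
		}
		var re *rpcError
		if !errors.As(err, &re) || !retryable(re.code) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(base << i) // base, 2*base, 4*base, ...
		}
	}
	return err
}

func main() {
	calls := 0
	err := callWithRetry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return &rpcError{code: Unavailable} // transient failure
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

In production the retry budget should also respect the caller's deadline, and only idempotent methods should be retried automatically.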

Rich Error Details

Go — Rich Error with Details

import (
  "google.golang.org/grpc/codes"
  "google.golang.org/grpc/status"
  errdetails "google.golang.org/genproto/googleapis/rpc/errdetails"
)

// Server: return structured validation errors
func validateAndCharge(req *pb.ChargeRequest) error {
  st := status.New(codes.InvalidArgument, "request validation failed")

  // Attach field violation details, which clients can parse
  detailed, err := st.WithDetails(&errdetails.BadRequest{
    FieldViolations: []*errdetails.BadRequest_FieldViolation{
      {Field: "amount_cents", Description: "must be between 50 and 99999999"},
      {Field: "currency",     Description: "must be a valid ISO 4217 code"},
    },
  })
  if err != nil {
    return st.Err()  // details could not be attached; return the plain status
  }
  return detailed.Err()
}

// Client: extract details
resp, err := client.Charge(ctx, req)
if err != nil {
  st := status.Convert(err)
  for _, detail := range st.Details() {
    switch d := detail.(type) {
    case *errdetails.BadRequest:
      for _, v := range d.FieldViolations {
        fmt.Printf("field %s: %s\n", v.Field, v.Description)
      }
    case *errdetails.RetryInfo:
      time.Sleep(d.RetryDelay.AsDuration())  // server told us when to retry
    }
  }
}

Interceptors

// gRPC MIDDLEWARE — LOGGING, AUTH, RETRY, TRACING

Interceptors are gRPC's middleware pattern. They wrap RPC handlers to add cross-cutting behavior — authentication, logging, metrics, tracing, retry logic, and rate limiting — without cluttering business logic. Use go-grpc-middleware/v2 for a battle-tested interceptor chain.

Go — Unary Interceptors

// ── Logging interceptor ──
func loggingInterceptor(
  ctx context.Context, req interface{},
  info *grpc.UnaryServerInfo, handler grpc.UnaryHandler,
) (interface{}, error) {
  start := time.Now()
  resp, err := handler(ctx, req)              // call actual handler
  log.Printf("method=%s latency=%v err=%v", info.FullMethod, time.Since(start), err)
  return resp, err
}

// ── Auth interceptor ──
func authInterceptor(ctx context.Context, req interface{},
  info *grpc.UnaryServerInfo, handler grpc.UnaryHandler,
) (interface{}, error) {
  // Skip for public methods
  if info.FullMethod == "/payments.v1.PaymentService/Ping" {
    return handler(ctx, req)
  }
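  md, _ := metadata.FromIncomingContext(ctx)   // a nil md is safe: Get returns nil
  token := md.Get("authorization")
  if len(token) == 0 { return nil, status.Error(codes.Unauthenticated, "missing token") }
  claims, err := jwt.Validate(strings.TrimPrefix(token[0], "Bearer "))  // strip the scheme before validating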
  if err != nil { return nil, status.Error(codes.Unauthenticated, "invalid token") }
  ctx = context.WithValue(ctx, claimsKey, claims)
  return handler(ctx, req)
}

// ── Chain interceptors with go-grpc-middleware ──
import (
  "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/recovery"
  "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/ratelimit"
)
s := grpc.NewServer(
  grpc.StatsHandler(otelgrpc.NewServerHandler()),  // OpenTelemetry tracing (replaces the deprecated otelgrpc interceptors)
  grpc.ChainUnaryInterceptor(
    recovery.UnaryServerInterceptor(),   // panic → INTERNAL error
    authInterceptor,
    loggingInterceptor,
    ratelimit.UnaryServerInterceptor(limiter),
  ),
  grpc.ChainStreamInterceptor(           // same pattern for streaming
    recovery.StreamServerInterceptor(),
    authStreamInterceptor,
  ),
)
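Order matters in a chain: the first interceptor listed is the outermost wrapper. The sketch below shows the idea with plain functions and the standard library only; `handler`, `interceptor`, `chain`, and `tracer` are simplified stand-ins invented for this example, not the grpc-go types.

```go
package main

import (
	"context"
	"fmt"
)

type handler func(ctx context.Context, req any) (any, error)
type interceptor func(ctx context.Context, req any, next handler) (any, error)

// chain wraps h so that interceptors[0] runs outermost.
func chain(h handler, interceptors ...interceptor) handler {
	for i := len(interceptors) - 1; i >= 0; i-- {
		ic, next := interceptors[i], h
		h = func(ctx context.Context, req any) (any, error) {
			return ic(ctx, req, next)
		}
	}
	return h
}

// tracer records when it runs relative to the wrapped handler.
func tracer(name string, log *[]string) interceptor {
	return func(ctx context.Context, req any, next handler) (any, error) {
		*log = append(*log, name+":before")
		resp, err := next(ctx, req)
		*log = append(*log, name+":after")
		return resp, err
	}
}

// demoOrder runs a two-interceptor chain and returns the call order.
func demoOrder() []string {
	var log []string
	h := chain(func(ctx context.Context, req any) (any, error) {
		log = append(log, "handler")
		return "ok", nil
	}, tracer("recovery", &log), tracer("auth", &log))
	h(context.Background(), "req")
	return log
}

func main() {
	fmt.Println(demoOrder())
}
```

This is why recovery is listed first in the chain above: as the outermost wrapper it can catch panics raised by every interceptor and handler inside it.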

TLS & Authentication

// SECURING GRPC IN PRODUCTION

gRPC traffic must be encrypted in production. The framework natively supports TLS and mutual TLS (mTLS). For authentication, gRPC's credentials.PerRPCCredentials interface allows injecting tokens per-call, while mTLS provides cryptographic service identity.

TLS (Server Auth) Minimum

Client verifies server certificate. Encrypts the channel. Suitable when services authenticate via tokens (JWT, OAuth). Standard setup for services behind a service mesh with certificate injection.

mTLS (Mutual Auth) Recommended

Both client and server present and verify certificates. Provides cryptographic service identity — no service can call your gRPC endpoint without a valid certificate from your CA. Required in Zero Trust architectures. Istio/Linkerd can inject mTLS transparently.

Go — TLS Configuration

// ── Server: TLS ──
creds, err := credentials.NewServerTLSFromFile("server.crt", "server.key")
if err != nil {
  log.Fatalf("load server TLS: %v", err)
}
s := grpc.NewServer(grpc.Creds(creds))

// ── Server: mTLS (require client certificate); an alternative to the block above ──
cert, err := tls.LoadX509KeyPair("server.crt", "server.key")
if err != nil {
  log.Fatalf("load key pair: %v", err)
}
caCert, err := os.ReadFile("ca.crt")
if err != nil {
  log.Fatalf("read CA: %v", err)
}
caCertPool := x509.NewCertPool()
if !caCertPool.AppendCertsFromPEM(caCert) {
  log.Fatal("no CA certificates found in ca.crt")
}

tlsConfig := &tls.Config{
  Certificates: []tls.Certificate{cert},
  ClientCAs:    caCertPool,
  ClientAuth:   tls.RequireAndVerifyClientCert,  // mTLS enforcement
  MinVersion:   tls.VersionTLS13,
}
mtlsServer := grpc.NewServer(grpc.Creds(credentials.NewTLS(tlsConfig)))

// ── Client: JWT per-call credentials ──
type jwtCreds struct{ token string }
func (j jwtCreds) GetRequestMetadata(ctx context.Context, uri ...string) (map[string]string, error) {
  return map[string]string{"authorization": "Bearer " + j.token}, nil
}
func (j jwtCreds) RequireTransportSecurity() bool { return true }  // enforce TLS

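// clientTLSConfig is the client's own tls.Config (RootCAs, plus Certificates for mTLS)
conn, err := grpc.NewClient(addr,
  grpc.WithTransportCredentials(credentials.NewTLS(clientTLSConfig)),
  grpc.WithPerRPCCredentials(jwtCreds{token: tokenProvider.Get()}),  // token is read once here
)
if err != nil {
  log.Fatalf("create client: %v", err)
}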

Load Balancing

// CLIENT-SIDE vs PROXY — gRPC IS DIFFERENT

gRPC load balancing works differently from HTTP/1.1 because connections are long-lived and multiplexed. A single TCP connection carries all requests — traditional L4 load balancers see one connection per client and route all its traffic to one server. gRPC requires L7 (application layer) load balancing or client-side load balancing.

Client-Side Load Balancing Microservices

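The gRPC client resolves all backend addresses (via DNS, xDS, or a custom resolver backed by a registry such as Consul) and distributes RPCs across them. The built-in policies are pick_first (the default) and round_robin; richer policies such as weighted round-robin are available through xDS or custom balancers. Each backend gets its own connection. No proxy latency.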

// DNS round-robin (headless service in K8s)
conn, _ := grpc.NewClient(
  "dns:///payments-svc:50051",
  grpc.WithDefaultServiceConfig(`{"loadBalancingConfig":[{"round_robin":{}}]}`),
  grpc.WithTransportCredentials(...),
)
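Conceptually, round_robin picks the next backend for every RPC rather than for every connection. The following standard-library sketch shows only that core idea; the `roundRobin` type is invented for illustration, and the real gRPC balancer additionally tracks connection state and skips backends that are not ready.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out backend addresses in rotation, safely across goroutines.
type roundRobin struct {
	next  uint64 // accessed atomically; kept first for 64-bit alignment
	addrs []string
}

// pick returns the backend for the next RPC.
func (r *roundRobin) pick() string {
	n := atomic.AddUint64(&r.next, 1) - 1
	return r.addrs[n%uint64(len(r.addrs))]
}

func main() {
	// Addresses as a headless service's DNS lookup might return them (made-up values).
	rr := &roundRobin{addrs: []string{"10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051"}}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.pick())
	}
}
```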

Proxy Load Balancing Recommended

Envoy or other L7 proxies understand gRPC and distribute individual RPCs (not connections) across backends. Simpler client configuration, built-in observability, health checks, circuit breaking. Standard in Kubernetes with a service mesh.

ℹ

Kubernetes Headless Services: For client-side LB in K8s, use a headless service (clusterIP: None) instead of a ClusterIP service. A headless service returns all pod IPs in DNS resolution — the gRPC client gets all addresses and distributes across them. A ClusterIP service returns a single virtual IP, which breaks per-RPC balancing.

Kubernetes — Headless Service for gRPC

apiVersion: v1
kind: Service
metadata:
  name: payments-grpc
spec:
  clusterIP: None           # headless — returns all pod IPs via DNS
  selector:
    app: payments-service
  ports:
    - name: grpc
      port: 50051
      protocol: TCP
---
# Connect from another service:
# "dns:///payments-grpc.namespace.svc.cluster.local:50051"
# gRPC client resolves all pod IPs and balances across them

Observability

// METRICS, TRACING, AND REFLECTION

gRPC generates rich telemetry — every RPC has a method name, status code, and latency. The gRPC ecosystem integrates natively with OpenTelemetry for tracing and Prometheus for metrics.

Prometheus Metrics

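Use the Prometheus provider in go-grpc-middleware/v2 (the successor to the deprecated go-grpc-prometheus) or the OTel SDK to expose per-method counters and histograms: grpc_server_handled_total (by method, status), grpc_server_handling_seconds (latency buckets), and grpc_server_msg_received_total.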

OpenTelemetry Tracing

Trace context propagates via gRPC metadata headers (traceparent, tracestate). go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc instruments all calls automatically.

gRPC Reflection

Enable reflection.Register(s) on your server. Tools like grpcurl and grpc-ui can discover all services and methods at runtime without the proto file — essential for debugging and testing.

Bash — grpcurl (curl for gRPC)

# List all services (requires server-side reflection)
grpcurl -plaintext localhost:50051 list

# Describe a service
grpcurl -plaintext localhost:50051 describe payments.v1.PaymentService

# Call a unary RPC with JSON body
grpcurl -plaintext -d '{"order_id":"ord_123","amount_cents":999,"currency":"USD"}' \
  localhost:50051 payments.v1.PaymentService/Charge

# Call with metadata (auth header)
grpcurl -plaintext \
  -H 'authorization: Bearer eyJhbGci...' \
  -d '{"user_id":"u1"}' \
  localhost:50051 users.v1.UserService/GetUser

# Stream responses
grpcurl -plaintext -d '{"user_id":"u1"}' \
  localhost:50051 events.v1.EventService/WatchEvents

# Use proto file directly (no reflection needed)
grpcurl -import-path ./proto -proto payments.proto \
  -plaintext -d '{}' localhost:50051 payments.v1.PaymentService/Ping

Schema Evolution

// BACKWARD & FORWARD COMPATIBILITY RULES

Protobuf was designed for schema evolution in distributed systems. Old clients can talk to new servers and vice versa — as long as you follow the rules. The core rule: field numbers are permanent. Never change a field number or re-use a retired one.

✓ Safe Changes Backward Compat

Add new fields with new field numbers
Delete a field (mark as reserved to prevent re-use)
Add new values to an enum (but not at 0)
Add a new RPC method to a service
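Change a field between int32 and int64 (same wire type; values that overflow int32 are truncated for old readers)
Change a string, bytes, or message field from singular to repeated (not safe for numeric types, which use packed encoding when repeated)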

✗ Breaking Changes Never Do

Change a field number — silently corrupts data
Change a field's type to an incompatible wire type
Remove an enum value that clients still use
Change enum value 0 meaning (it's the default)
Remove or rename an RPC method that clients still call
Re-use a retired field number for a different field
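Why adding a field is safe becomes clearer when you look at how a reader walks the wire format. The sketch below is a deliberately small hand-written decoder using only the standard library (real code uses the generated types); it knows fields 1 and 2 and keeps anything else as opaque bytes. The `chargeV1` type, `decodeV1`, and the extra field 4 in the sample bytes are assumptions for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// chargeV1 is what an old reader knows about: fields 1 and 2 only.
type chargeV1 struct {
	OrderID     string
	AmountCents int64
	unknown     []byte // raw bytes of fields this reader does not recognize
}

var errTruncated = errors.New("truncated message")

// decodeV1 walks the tag-value pairs, keeps fields 1 and 2, and carries
// everything else along untouched instead of failing.
func decodeV1(b []byte) (chargeV1, error) {
	var m chargeV1
	for len(b) > 0 {
		start := b
		tag, n := binary.Uvarint(b)
		if n <= 0 {
			return m, errTruncated
		}
		b = b[n:]
		field, wire := tag>>3, tag&7

		var val uint64
		var payload []byte
		switch wire {
		case 0: // VARINT
			v, n := binary.Uvarint(b)
			if n <= 0 {
				return m, errTruncated
			}
			val, b = v, b[n:]
		case 2: // LEN
			l, n := binary.Uvarint(b)
			if n <= 0 || uint64(len(b)-n) < l {
				return m, errTruncated
			}
			payload, b = b[n:n+int(l)], b[n+int(l):]
		default:
			return m, fmt.Errorf("wire type %d not handled in this sketch", wire)
		}

		switch {
		case field == 1 && wire == 2:
			m.OrderID = string(payload)
		case field == 2 && wire == 0:
			m.AmountCents = int64(val)
		default: // unknown field: keep its raw bytes
			m.unknown = append(m.unknown, start[:len(start)-len(b)]...)
		}
	}
	return m, nil
}

func main() {
	// Written by a newer schema: fields 1 and 2 plus a new string field 4.
	data := []byte("\x0a\x02X1\x10\x96\x01\x22\x02k1")
	m, err := decodeV1(data)
	fmt.Printf("%+v %v\n", m, err)
}
```

The wire type in each tag tells the reader how many bytes to skip, so it never needs the schema for a field it does not know. The same mechanism explains the breaking changes: reusing a field number with a different type makes an old reader misinterpret bytes it believes it understands.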

API Versioning Strategy

Versioning Best Practices

## Package versioning: embed version in the package name
package payments.v1;  // stable API
package payments.v2;  // breaking change requires new major version

## Within a version, use field evolution (no v1 → v2 bump needed):
message ChargeRequest {
  string  order_id     = 1;    // original
  int64   amount_cents  = 2;    // original
  string  currency      = 3;    // original
  // v1.1 additions — old clients ignore these (unknown fields preserved)
  string  idempotency_key = 4;  // new in v1.1, optional
  map<string,string> metadata = 5;  // new in v1.2
  // retired field — was "customer_name" = 6
  reserved 6; reserved "customer_name";
  // v1.3 replacement
  string customer_id = 7;       // new field 7, not reusing 6
}

## Unknown fields: proto3 preserves unknown fields by default (since protobuf 3.5)
## New server receives old client message → the new fields are absent and read as default values (safe)
## Old client receives new server message → unknown new fields are preserved and
## re-serialized when the message is forwarded, so an intermediary never strips data it doesn't understand

## Use buf for schema registry and breaking change detection
## buf breaking --against .git#branch=main  ← CI gate

▸

Use buf instead of raw protoc. The buf CLI from Buf Technologies handles dependency management, code generation across languages, linting (enforcing naming conventions), and breaking change detection as a CI gate. buf breaking --against .git#branch=main will fail if a commit introduces a backward-incompatible change to any proto file.

When to Use gRPC

// DECISION FRAMEWORK

gRPC is a powerful but opinionated choice. Use this framework to make the decision. The question is never "is gRPC better than REST?" — it's whether gRPC's tradeoffs serve your specific communication pattern.

Scenario	Recommendation	Reason
Internal microservice-to-microservice	USE gRPC	Performance, type safety, streaming, language agnosticism
Public-facing external API	Use REST	Universal browser/client support, developer ergonomics
Mobile app to backend	Consider	gRPC-Swift/Kotlin exist; REST is simpler; gRPC wins on bandwidth
Real-time streaming / push notifications	USE gRPC	Server/bidi streaming are first-class; SSE/WS are bolt-ons in REST
Large binary payload transfer	USE gRPC	Streaming + binary + no JSON encoding overhead
Simple CRUD, small team, fast iteration	Use REST	Lower tooling overhead, easier debugging, familiar mental model
Polyglot service mesh	USE gRPC	One proto = client/server in 10+ languages; no OpenAPI drift
Browser frontend to service	Use REST or GraphQL	gRPC-Web requires proxy; REST/GraphQL are native
AI inference / ML model serving	USE gRPC	Triton, TF Serving, Ray Serve all use gRPC natively; binary for tensors

Common Mistakes Avoid

Using gRPC for a public API without a REST gateway
Not setting deadlines on every unary call
Re-using field numbers after deletion
Using gRPC through a L4-only load balancer
Not enabling server-side reflection in dev/staging
Ignoring streaming backpressure — server floods slow client
Embedding business data in metadata (use message fields)
Running without TLS in any networked environment

Production Checklist Do These

Set deadlines on every client call (never infinite wait)
Embed UnimplementedXxxServer for forward compatibility
Use buf breaking in CI to catch schema regressions
Enable gRPC health checking (grpc_health_v1)
Use mTLS or a service mesh for service-to-service auth
Instrument with OTel for distributed tracing across calls
Limit max message size explicitly (default 4MB)
Use Kubernetes headless services for client-side load balancing
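Since deadlines top both lists, here is a minimal, standard-library-only sketch of what a deadline does on the client side. `callWithDeadline` is a made-up helper and `time.After` stands in for the RPC itself; in real code you pass the context to the generated client method, and gRPC transmits the remaining time to the server in the grpc-timeout header.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// callWithDeadline simulates a client call: it waits for "work" to finish
// but gives up as soon as the context deadline expires.
func callWithDeadline(timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // always release the timer

	select {
	case <-time.After(work): // stand-in for the RPC completing
		return nil
	case <-ctx.Done():
		return ctx.Err() // context.DeadlineExceeded
	}
}

func main() {
	fmt.Println("fast call:", callWithDeadline(200*time.Millisecond, 5*time.Millisecond))
	fmt.Println("slow call:", callWithDeadline(5*time.Millisecond, 200*time.Millisecond))
}
```

Without the timeout, the slow call would hold its goroutine and connection resources for as long as the server takes, which is how one stuck dependency turns into a cascading failure.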

▸

Reference implementations & tooling: grpc.io (official docs), buf.build (modern proto toolchain), grpcurl (CLI testing), grpc-ui (web UI for gRPC), Evans (interactive gRPC client), go-grpc-middleware/v2 (interceptor library), connectrpc.com (gRPC-compatible protocol with REST support), google.golang.org/grpc (Go reference implementation).