Agent Dev Kit
Google's open-source framework for building production-grade AI agents and multi-agent pipelines. Supports Python, TypeScript, Go, and Java with first-class Gemini integration and pluggable model backends.
Overview
Agent Development Kit (ADK) is Google's open-source framework designed to help developers quickly build, manage, evaluate, and deploy AI-powered agents. It treats agents as code — composable, testable, and version-controlled. ADK is model-agnostic but optimized for Google's Gemini family, and it natively supports the A2A (Agent-to-Agent) communication protocol.
Agents are defined in code — not drag-and-drop UIs. Every component (identity, tools, instructions, memory) is explicit, version-controlled, and composable.
First-class SDKs for Python, TypeScript, Go, and Java. Core concepts are identical across languages. Python 2.0 Beta introduces graph-based workflows and agent teams.
Built-in evaluation, observability (logging, metrics, traces), deployment targets (Cloud Run, GKE, Agent Runtime), and safety guardrails make ADK enterprise-ready.
See adk.dev/2.0 for details.
Core Architecture
Installation
- Python: 3.10 or later + pip
- TypeScript: Node.js 18+ + npm
- Go: 1.21+
- Java: JDK 17+ + Maven or Gradle
- API Key: Gemini API key from Google AI Studio
- Gemini 2.0 Flash / Pro (default, recommended)
- Anthropic Claude (via LiteLLM or direct)
- Ollama / vLLM (local/self-hosted)
- Any OpenAI-compatible endpoint via LiteLLM
- Google Agent Platform hosted models
Quickstart
The fastest path to a running agent uses the adk create command — it scaffolds the project, wires up the boilerplate, and drops you directly into writing agent logic.
Create a project
Run adk create my_agent to scaffold a new agent directory with agent.py, .env, and __init__.py.
Set your API key
Add GOOGLE_API_KEY="YOUR_KEY" to my_agent/.env. Get a key at aistudio.google.com/app/apikey.
Define your agent
Edit agent.py to define root_agent with a model, name, description, and optional tools.
Run it
Use adk run my_agent for CLI interaction or adk web for the browser dev UI (development only).
Project Structure
| File / Field | Required? | Purpose |
|---|---|---|
| root_agent | YES | The ADK entry-point. The runner looks for this variable name in agent.py. |
| .env | Recommended | Stores GOOGLE_API_KEY and other secrets. Never commit to source control. |
| __init__.py | YES (Python) | Required for Python packaging; the adk CLI discovers agents through packages. |
| agent.name | YES | Unique string identifier. Used for routing in multi-agent systems. Avoid "user". |
| agent.model | YES | LLM identifier string e.g. "gemini-2.0-flash" or "claude-3-5-sonnet". |
| agent.instruction | Recommended | System prompt. Defines persona, task, constraints, and tool usage guidance. |
| agent.description | Recommended | Used by parent agents / routers to decide when to delegate to this agent. |
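As a quick sanity check on the layout in the table above, the sketch below reports which expected files are missing from an agent directory. `check_agent_layout` and `demo_layout_check` are hypothetical helpers written for illustration; they are not part of the adk CLI, and the file list is taken only from the table above.

```python
from pathlib import Path
import tempfile

# Files the table above marks as required or recommended for a Python agent.
REQUIRED_FILES = ("agent.py", "__init__.py")
RECOMMENDED_FILES = (".env",)


def check_agent_layout(project_dir: str) -> dict:
    """Report which expected files are missing from an agent directory.

    Hypothetical helper for illustration; it is not part of the adk CLI.
    """
    root = Path(project_dir)
    return {
        "missing_required": [f for f in REQUIRED_FILES if not (root / f).is_file()],
        "missing_recommended": [f for f in RECOMMENDED_FILES if not (root / f).is_file()],
    }


def demo_layout_check() -> dict:
    """Build a throwaway agent directory that lacks .env and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        agent_dir = Path(tmp) / "my_agent"
        agent_dir.mkdir()
        (agent_dir / "agent.py").write_text("root_agent = None  # placeholder\n")
        (agent_dir / "__init__.py").write_text("from . import agent\n")
        return check_agent_layout(str(agent_dir))
```

In this demo the required files are present and only `.env` is reported as missing.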
LLM Agents
The LlmAgent (aliased as Agent) is the primary building block. It uses a Large Language Model for reasoning, natural language understanding, decision-making, and tool invocation. Unlike workflow agents, its behavior is non-deterministic — it interprets context and decides dynamically how to proceed.
Key Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | str | YES | Unique identifier for the agent. Used internally for delegation and routing. |
| model | str | YES | LLM identifier: "gemini-2.0-flash", "gemini-flash-latest", etc. |
| instruction | str or fn | Recommended | System prompt. Can be a static string or a function returning a string (supports {state_var} templates). |
| description | str | Multi-agent | Summary of capabilities, used by orchestrator agents for routing decisions. |
| tools | list | Optional | List of callables, BaseTool instances, or other agents (as AgentTool). |
| output_schema | Pydantic / Schema | Optional | Enforces structured JSON output. Cannot be combined with tools on most models. |
| output_key | str | Optional | Auto-saves final response text to session.state[output_key]. |
| input_schema | Pydantic / Schema | Optional | Validates that incoming message is a JSON string conforming to this schema. |
| include_contents | "default" or "none" | Optional | Controls whether conversation history is passed to the LLM. 'none' = stateless. |
| generate_content_config | GenContentConfig | Optional | Fine-tune LLM: temperature, max_output_tokens, safety settings, top_p, top_k. |
| planner | BasePlanner | Optional | BuiltInPlanner (Gemini thinking) or PlanReActPlanner (for non-thinking models). |
| code_executor | BaseCodeExecutor | Optional | Allows agent to execute code blocks. Use BuiltInCodeExecutor. |
Instruction Templating
Use {var} syntax to inject session state into instructions. Use {var?} to silently skip if the variable doesn't exist. Use {artifact.var} to inject artifact text content.
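To make the `{var}` and `{var?}` behavior concrete, here is a minimal plain-Python sketch of that substitution over a state dictionary. It is not ADK's implementation: `render_instruction` is a hypothetical name, raising `KeyError` for a missing required variable is an assumption, and the `{artifact.var}` form is not modeled.

```python
import re

# Matches {name} or {name?}; the trailing "?" marks a variable that may be absent.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w:]*)(\?)?\}")


def render_instruction(template: str, state: dict) -> str:
    """Substitute {var} and {var?} placeholders from a state dictionary."""

    def substitute(match: "re.Match") -> str:
        key, optional = match.group(1), match.group(2)
        if key in state:
            return str(state[key])
        if optional:
            return ""  # {var?}: skip silently when the key is missing
        # Assumption: a missing required variable is treated as an error.
        raise KeyError(f"state variable {key!r} is not set")

    return _PLACEHOLDER.sub(substitute, template)
```

For example, `render_instruction("Hi {name}. Plan: {plan?}", {"name": "Ada"})` fills in the name and drops the optional placeholder.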
Pass a callable as instruction to dynamically compute it per-request. The function receives the ReadonlyContext and returns a string — ideal for personalization or conditional prompts.
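A callable instruction might look like the sketch below. `FakeReadonlyContext` is a stand-in defined here so the example runs without ADK installed; it models only a `state` mapping. The `user:tier` key and the prompt wording are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FakeReadonlyContext:
    """Stand-in for ADK's ReadonlyContext; only a state mapping is modeled."""
    state: dict = field(default_factory=dict)


def build_instruction(ctx: FakeReadonlyContext) -> str:
    """Return a per-request system prompt based on session state."""
    tier = ctx.state.get("user:tier", "free")  # hypothetical state key
    base = "You are a support assistant. Answer concisely."
    if tier == "premium":
        return base + " Offer to escalate to a human specialist when asked."
    return base + " Point users to the public help center for escalations."
```

Because the function runs on every request, a change to the state takes effect on the next turn without redefining the agent.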
Workflow Agents
Workflow agents provide deterministic execution: they follow predefined paths rather than using an LLM to decide what to do next. Three built-in types (SequentialAgent, ParallelAgent, and LoopAgent) cover the most common orchestration patterns. They are ideal as orchestrators that coordinate specialist LLM sub-agents.
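The control flow behind sequential, parallel, and loop orchestration can be sketched with plain functions over a shared state dictionary. This is an illustration of the patterns only, not the ADK classes: the `run_*` helpers and the `done` flag used to end the loop are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

State = Dict[str, object]
Step = Callable[[State], None]  # a step reads and writes the shared state


def run_sequential(steps: List[Step], state: State) -> State:
    """Run each step once, in order."""
    for step in steps:
        step(state)
    return state


def run_parallel(steps: List[Step], state: State) -> State:
    """Run independent steps concurrently; each should write its own keys."""
    with ThreadPoolExecutor() as pool:
        # list() waits for completion and re-raises any exception from a step.
        list(pool.map(lambda step: step(state), steps))
    return state


def run_loop(steps: List[Step], state: State, max_iterations: int) -> State:
    """Repeat the steps until one sets state['done'] or the cap is reached."""
    for _ in range(max_iterations):
        for step in steps:
            step(state)
            if state.get("done"):
                return state
    return state


def demo_workflows() -> State:
    """Writer/reviewer loop that stops once the draft is long enough."""
    state: State = {"draft": "", "rounds": 0}

    def write(s: State) -> None:
        s["draft"] = str(s["draft"]) + "x"

    def review(s: State) -> None:
        s["rounds"] = int(s["rounds"]) + 1
        s["done"] = len(str(s["draft"])) >= 3

    return run_loop([write, review], state, max_iterations=10)
```

The `max_iterations` cap matters in the loop case: without it, a reviewer that never approves would run forever.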
Multi-Agent Systems
ADK supports multi-agent composition through three communication mechanisms. Agents can be organized into hierarchies where a parent/orchestrator delegates work to specialist sub-agents. This enables separation of concerns, reusability, and parallelism.
Communication Mechanisms
Agents read and write to session.state — a dict shared within a session. Use output_key to write; use {key} in instructions to read. Simple, explicit, synchronous.
The orchestrator LLM autonomously decides to transfer control to a sub-agent based on its description. Sub-agents listed in sub_agents are presented as options to the parent LLM.
Wrap any agent as a tool using AgentTool(agent). The parent LLM explicitly calls it like a function tool — provides more predictable delegation than implicit transfer.
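The idea of exposing an agent as a callable tool can be sketched without ADK. `MiniAgent`, `as_tool`, and `build_toolbox` are hypothetical stand-ins; they only show why a clear name and description matter when a parent has to pick a specialist.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class MiniAgent:
    """Stand-in for an agent: a name, a description, and a handler function."""
    name: str
    description: str
    handle: Callable[[str], str]


def as_tool(agent: MiniAgent) -> Callable[[str], str]:
    """Expose an agent as a plain callable, the idea behind AgentTool(agent)."""

    def tool(request: str) -> str:
        return agent.handle(request)

    # A parent model would see this name and description when choosing a tool.
    tool.__name__ = agent.name
    tool.__doc__ = agent.description
    return tool


def build_toolbox(*agents: MiniAgent) -> Dict[str, Callable[[str], str]]:
    """Index wrapped agents by name so a parent can call them explicitly."""
    return {agent.name: as_tool(agent) for agent in agents}


# Hypothetical specialist used only for this demonstration.
billing = MiniAgent("billing_agent", "Handles invoices and refunds.",
                    lambda text: f"[billing] {text}")
toolbox = build_toolbox(billing)
```

Calling `toolbox["billing_agent"]("refund order 7")` routes the request to the specialist's handler, much like an explicit tool call.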
Common Multi-Agent Patterns
| Pattern | Structure | When to use |
|---|---|---|
| Coordinator / Dispatcher | LLM root → specialist sub-agents via AgentTool | Customer service routing, intent classification |
| Sequential Pipeline | SequentialAgent → [step1, step2, step3] | ETL, content generation, multi-step analysis |
| Parallel Fan-Out | ParallelAgent → [agentA, agentB, agentC] | Independent research tasks, latency optimization |
| Generator–Critic | writer agent → reviewer agent (LoopAgent) | Document drafting with quality gates |
| Hierarchical Decomposition | Root → managers → workers (nested) | Complex tasks requiring specialization at multiple levels |
| Human-in-the-Loop | Agent raises interrupt → human input → resume | Approval workflows, sensitive actions |
Custom Tools
Tools extend an agent's capabilities beyond built-in LLM knowledge. The LLM uses the tool's name, docstring, and parameter types to decide when and how to call it. Write clear, descriptive docstrings — they are the tool's "contract" with the LLM.
Function Tools (Python)
In Python, any regular function passed to tools=[] is automatically wrapped as a FunctionTool. ADK extracts the schema from type hints and the docstring.
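To show how a tool declaration can be derived from type hints and a docstring, the sketch below uses only the standard library. It approximates the idea and is not ADK's FunctionTool logic: `describe_tool` and the type mapping are assumptions, and `get_weather` is a hypothetical tool that returns placeholder data.

```python
import inspect
from typing import get_type_hints

# Rough mapping from Python annotations to JSON-schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a tool declaration from a function's signature and docstring."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters belong in the schema
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            name: _JSON_TYPES.get(hint, "object") for name, hint in hints.items()
        },
    }


def get_weather(city: str, days: int) -> dict:
    """Return a short weather forecast for a city.

    Args:
        city: City name, for example "Paris".
        days: Number of forecast days.
    """
    # Placeholder data; a real tool would call a weather API here.
    return {"status": "success", "city": city, "forecast": ["sunny"] * days}
```

A function without type hints or a docstring would produce an empty description and no parameter types, which is why both are worth writing carefully.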
Tool Types Summary
| Tool Type | How to define | Best for |
|---|---|---|
| Function Tool | Plain Python/TS/Go/Java function | Custom business logic, API calls, DB queries |
| MCP Tool | MCPToolset connecting to MCP server | Reusing ecosystem tools (filesystem, GitHub, etc.) |
| OpenAPI Tool | OpenAPIToolset from OpenAPI spec URL/file | Auto-generating tools from REST API specs |
| AgentTool | AgentTool(another_agent) | Delegation to specialist agents as callable tools |
| Built-in Tools | Google Search, Code Execution, Vertex Search | Web grounding, code running, enterprise search |
Return dict objects with a consistent structure (include a "status" key). Keep tools focused on one capability. Avoid side-effectful tools without confirmation (use the action confirmation pattern for irreversible operations). Keep parameter names and types clear, since they form the JSON schema the LLM sees.
MCP Tools
ADK natively supports the Model Context Protocol (MCP) — an open standard for connecting AI agents to external tools and data sources. You can consume MCP servers as tools, or expose your ADK agents as MCP-compatible servers.
Use MCPToolset to connect to any MCP server. ADK automatically discovers available tools and exposes them to your agent's LLM.
Any ADK agent can be wrapped and exposed as an MCP server, making it consumable by other agents, Claude, or any MCP-compatible client.
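MCP messages are JSON-RPC 2.0 requests, so tool discovery and invocation come down to the `tools/list` and `tools/call` methods. The sketch below only builds those request payloads; it does not open a connection or perform the initialization handshake a real client needs. `mcp_request` is a hypothetical helper, and `read_file` with its `path` argument is a made-up tool.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # each JSON-RPC request carries a unique id


def mcp_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request of the kind an MCP client sends."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it offers, then call one of them.
list_tools = mcp_request("tools/list")
call_tool = mcp_request(
    "tools/call",
    {"name": "read_file", "arguments": {"path": "notes.txt"}},  # hypothetical tool
)
```

A toolset wrapper such as MCPToolset hides this exchange, but seeing the payloads can help when debugging a server that does not list the tools you expect.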
Supported Models
| Model / Backend | Identifier / Config | Notes |
|---|---|---|
| Gemini 2.0 Flash | "gemini-2.0-flash" | Recommended default. Fast, multimodal, tool-use optimized. |
| Gemini 2.5 Pro | "gemini-2.5-pro-preview-03-25" | Best reasoning. Use with BuiltInPlanner for thinking mode. |
| Gemini Flash Latest | "gemini-flash-latest" | Tracks latest flash release automatically. |
| Anthropic Claude | Via LiteLLM: "claude-3-5-sonnet" | Requires ANTHROPIC_API_KEY. Use LiteLLM integration. |
| Ollama (local) | OllamaModel("llama3.2") | Runs fully local. Requires Ollama installed and model pulled. |
| LiteLLM | LiteLlm("openai/gpt-4o") | Proxy to any OpenAI-compatible API. Supports 100+ models. |
| vLLM | VLLMModel(...) | For self-hosted GPU inference. OpenAI-compatible endpoint. |
| Gemma 2/3 (local) | Via LiteRT-LM or Ollama | On-device inference via LiteRT-LM for edge deployments. |
ModelRouter for runtime routing based on task characteristics.
Sessions & State
Every agent interaction happens within a session. Sessions store conversation history and a mutable state dictionary. State is the primary communication channel between agents in a pipeline — write with output_key, read with {state_var} in instructions.
- InMemorySessionService — development / testing. No persistence.
- DatabaseSessionService — SQL-backed, production-ready.
- VertexAISessionService — managed, Vertex AI Agent Engine.
- Custom implementations via the BaseSessionService interface.
- Session state (session.state): scoped to current conversation.
- User state (user: prefix): persists across sessions for a user.
- App state (app: prefix): shared across all users of the app.
- Temp state (temp: prefix): current turn only, not persisted.
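The key prefixes listed above can be read as a small scoping rule. The sketch below classifies keys by prefix and filters out turn-only values before saving; `scope_of` and `persistable` are hypothetical helpers, not session service methods.

```python
# Prefixes named in the list above; keys without a prefix are session-scoped.
_PREFIX_SCOPES = {"user:": "user", "app:": "app", "temp:": "temp"}


def scope_of(key: str) -> str:
    """Return the scope a state key belongs to, based on its prefix."""
    for prefix, scope in _PREFIX_SCOPES.items():
        if key.startswith(prefix):
            return scope
    return "session"


def persistable(state: dict) -> dict:
    """Drop temp: keys, which only live for the current turn."""
    return {k: v for k, v in state.items() if scope_of(k) != "temp"}
```

For instance, `persistable({"cart": [1], "temp:raw": "x"})` keeps `cart` and discards the temporary value.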
Memory
Conversation history is managed automatically within a session. Use context_caching to cache large static contexts (system prompts, documents) and reduce latency and cost. Use context_compaction for long conversations that would exceed the context window.
Implement BaseMemoryService to persist and retrieve facts across sessions. Vertex AI Agent Engine provides a managed memory service. Use the load_memory tool to inject relevant memories into context on each turn.
Context caching (CachingConfig) can cut costs by 75%+ on repeated large prompts like extensive system instructions or reference documents.
Callbacks
Callbacks let you intercept and observe — or modify — the agent execution lifecycle without subclassing. They are the recommended way to add logging, guardrails, caching, and custom behavior.
| Callback | Fires when | Can override? |
|---|---|---|
| before_agent_callback | Before any agent starts executing | Yes: return early to skip execution |
| after_agent_callback | After an agent finishes executing | Yes: modify the response |
| before_model_callback | Before LLM is called (has the LLM request) | Yes: short-circuit with cached response |
| after_model_callback | After LLM returns (has the LLM response) | Yes: modify response, add guardrails |
| before_tool_callback | Before a tool is executed | Yes: modify args, skip, or mock |
| after_tool_callback | After a tool returns | Yes: modify tool result |
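The override behavior in the table can be sketched as a wrapper around a model call: a "before" hook that returns a value short-circuits the call, and an "after" hook can replace the response. This is a plain-Python illustration with simplified signatures. `call_model`, `fake_model`, and the cache helpers are hypothetical, and real ADK callbacks receive context and request/response objects rather than bare strings.

```python
from typing import Callable, Dict, List, Optional

ModelFn = Callable[[str], str]
BeforeModel = Callable[[str], Optional[str]]
AfterModel = Callable[[str, str], Optional[str]]


def call_model(prompt: str, model: ModelFn,
               before: Optional[BeforeModel] = None,
               after: Optional[AfterModel] = None) -> str:
    """Run a model call with optional before/after hooks.

    A hook that returns None leaves the flow unchanged; any other return
    value replaces the model's response.
    """
    if before is not None:
        override = before(prompt)
        if override is not None:
            return override  # short-circuit: the model is never called
    response = model(prompt)
    if after is not None:
        replaced = after(prompt, response)
        if replaced is not None:
            response = replaced
    return response


_cache: Dict[str, str] = {}
model_calls: List[str] = []  # records prompts that actually reached the model


def fake_model(prompt: str) -> str:
    model_calls.append(prompt)
    return prompt.upper()


def cache_lookup(prompt: str) -> Optional[str]:
    return _cache.get(prompt)


def cache_store(prompt: str, response: str) -> None:
    _cache[prompt] = response  # returning None keeps the response as-is
```

Calling `call_model("hi", fake_model, cache_lookup, cache_store)` twice reaches the model only once; the second call is answered from the cache.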
Runtime & Interfaces
Streaming
ADK supports bidirectional streaming over the Gemini Live API, using the Gemini Live API Toolkit. This enables real-time voice and video agents with low-latency responses. Streaming agents handle audio input, image frames, and text simultaneously.
The run_async method yields events as they arrive — text is streamed token by token in the event stream. Works with all ADK agents out of the box.
For voice/video: uses the LiveRequestQueue and websocket transport. Supports audio input/output, video frames, and real-time interruption. Requires Gemini models with Live API support.
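Consuming an event stream follows the usual async-iterator pattern. In the sketch below, `fake_run_async` is a stand-in generator that yields text chunks, so the example runs without ADK or a model; the chunk contents are made up, and a real run_async yields event objects rather than plain strings.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_run_async(prompt: str) -> AsyncIterator[str]:
    """Stand-in for an event stream: yields text chunks as they become ready."""
    for chunk in ["Streaming ", "keeps ", "latency ", "low."]:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield chunk


async def collect(prompt: str) -> str:
    """Consume the stream chunk by chunk and assemble the final text."""
    parts: List[str] = []
    async for chunk in fake_run_async(prompt):
        parts.append(chunk)  # a UI would render each chunk immediately
    return "".join(parts)


final_text = asyncio.run(collect("Why stream?"))
```

The point of the pattern is that each chunk can be shown to the user as soon as it arrives, instead of waiting for the full response.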
is_long_running=True.
Deployment
| Target | Best for | Command / Notes |
|---|---|---|
| Agent Runtime | Managed, Vertex AI-integrated, sessions + memory included | agents deploy <path> via agents-cli |
| Cloud Run | Serverless HTTP, auto-scaling, simple containerized agents | Dockerfile + gcloud run deploy; ADK docs provide templates |
| GKE | High-traffic, custom networking, GPU workloads, fine control | Kubernetes deployment manifest; use Workload Identity for secrets |
- Never commit .env or API keys; use Secret Manager
- Set GOOGLE_CLOUD_PROJECT for Vertex AI authentication
- Run adk eval against your test set before deploying
- Configure logging to Cloud Logging (structured JSON format)
- Set max_llm_calls in RunConfig to prevent runaway loops
- Use DatabaseSessionService, not InMemory, in production
- Logs: Cloud Logging + structured ADK event logs
- Traces: Cloud Trace via OpenTelemetry integration
- Metrics: LLM call count, tool call rate, latency P95
- Eval: Automated eval runs in CI/CD pipeline
- Alerts: Error rate > 1%, P95 latency > 10s
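The alert thresholds above (error rate over 1%, P95 latency over 10 s) can be checked with a few lines of arithmetic. This sketch uses a nearest-rank percentile; the function names are hypothetical, and a production setup would normally rely on the monitoring system's own percentile aggregation.

```python
from typing import List

ERROR_RATE_LIMIT = 0.01      # alert when more than 1% of requests fail
P95_LATENCY_LIMIT_S = 10.0   # alert when P95 latency exceeds 10 seconds


def p95(latencies_s: List[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_s)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def alerts(latencies_s: List[float], errors: int) -> List[str]:
    """Return the names of the thresholds that were exceeded."""
    if not latencies_s:
        return []
    fired = []
    if errors / len(latencies_s) > ERROR_RATE_LIMIT:
        fired.append("error_rate")
    if p95(latencies_s) > P95_LATENCY_LIMIT_S:
        fired.append("p95_latency")
    return fired
```

With 100 requests where 10 take 12 s and 2 fail, both alerts fire; with exactly 1 failure in 100 fast requests, neither does, because the rate must exceed 1%.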
Evaluation
ADK has first-class evaluation support. Before deploying, run structured evaluations to measure whether your agent achieves its intended goals, handles edge cases, and uses tools correctly.
Checks that the agent followed the right sequence of steps — correct tool calls in the right order. Deterministic pass/fail.
LLM-judged evaluation of final response quality against reference answers. Configurable rubrics and scoring criteria.
A "user simulator" LLM plays the user role, driving multi-turn conversations automatically. Useful for testing conversation flows.
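The trajectory check described above is the easiest of the three to sketch, because it is a deterministic comparison of tool-call sequences. The helpers and tool names below are hypothetical; ADK's evaluator has its own data formats and match criteria.

```python
from typing import List, Tuple

ToolCall = Tuple[str, dict]  # (tool name, arguments)


def trajectory_matches(expected: List[ToolCall], actual: List[ToolCall]) -> bool:
    """Pass only if the same tools were called, with the same args, in order."""
    return expected == actual


def tool_names_in_order(expected: List[str], actual: List[ToolCall]) -> bool:
    """Looser check: expected tool names appear in order, extra calls allowed."""
    remaining = iter(name for name, _ in actual)
    # "name in remaining" advances the iterator, so order is enforced.
    return all(name in remaining for name in expected)


# Hypothetical recorded run of a travel-booking agent.
recorded = [
    ("search_flights", {"to": "NRT"}),
    ("get_weather", {"city": "Tokyo"}),
    ("book_flight", {"id": 7}),
]
```

The strict check suits regression tests for fixed workflows, while the looser one tolerates harmless extra calls such as the weather lookup here.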
A2A Protocol
The Agent-to-Agent (A2A) Protocol is an open standard that enables interoperability between AI agents built on different frameworks. An ADK agent can both expose itself as an A2A server and consume other A2A agents as tools — enabling cross-framework multi-agent systems.
Wrap any ADK agent with A2AServer to make it discoverable and callable by any A2A-compatible client (ADK, LangChain, custom). Publishes an agent.json capability descriptor.
Use A2AClient to call remote A2A agents from within an ADK agent. The remote agent appears as a regular tool. Works across networks, clouds, and frameworks.