SmolAgents Practitioner's Handbook
A complete, production-ready reference for building agentic AI systems with Hugging Face's minimalist smolagents framework — covering models, CodeAgents, tools, sandboxing, multi-agent orchestration, vision, and telemetry.
Install & Setup
smolagents is Hugging Face's minimalist agentic framework — approximately 1,000 lines of core logic. Its philosophy: agents should write and execute Python code rather than select from a menu of JSON tool calls. This makes them far more expressive and composable.
```bash
# Install smolagents with all optional extras
pip install smolagents

# With E2B sandbox support
pip install smolagents[e2b]

# With LiteLLM for multi-provider support (OpenAI, Anthropic, etc.)
pip install smolagents[litellm]

# With Gradio UI
pip install smolagents[gradio]

# Everything at once
pip install smolagents[e2b,litellm,gradio]

# Set your HF token (needed for gated/private models)
export HF_TOKEN=hf_...
```
CodeAgent generates runnable Python instead of structured JSON — giving it the full power of the language for loops, conditionals, and data manipulation.

Core Architecture at a Glance
Module 1 — Core Models & Setup
Hugging Face Hub Models
Use HfApiModel to call any model hosted on the Hugging Face Hub via the Inference API — no GPU required. For local execution with full control, use TransformersModel to load weights directly.
```python
# ── HfApiModel: serverless inference via HF Hub API ──────────────────────
from smolagents import HfApiModel

# Points to a hosted model; HF_TOKEN env var is picked up automatically.
# Use any model ID from hf.co/models — gated models require an approved token.
model = HfApiModel(
    model_id="Qwen/Qwen2.5-72B-Instruct",  # model to call
    token="hf_...",                        # or set HF_TOKEN env var
    timeout=120,                           # seconds before giving up
)

# Quick smoke-test — call the model directly (no agent wrapping yet)
response = model([{"role": "user", "content": "What is 2 + 2?"}])
print(response.content)  # → "4"
```
```python
# ── TransformersModel: run weights locally on your GPU ───────────────────
from smolagents import TransformersModel

# Downloads (or reads from cache) and runs the model locally.
# Requires transformers + accelerate installed. Great for offline/privacy use.
local_model = TransformersModel(
    model_id="Qwen/Qwen2.5-7B-Instruct",  # smaller model for local GPU
    device_map="auto",                    # auto-assign to GPU/CPU
    torch_dtype="auto",                   # bfloat16 where possible
    max_new_tokens=2048,
)

response = local_model([{"role": "user", "content": "Explain gradient descent briefly."}])
print(response.content)
```
With a CodeAgent, the model only needs to generate Python code blocks — making it compatible with a wider range of open models than JSON-tool-calling alternatives.

External Providers (OpenAI / Anthropic)
Use LiteLLMModel for a unified interface to 100+ providers, or OpenAIServerModel for any OpenAI-compatible endpoint (including Ollama, vLLM, Together AI).
```python
# ── LiteLLMModel: unified interface to OpenAI, Anthropic, Gemini, etc. ───
import os
from smolagents import LiteLLMModel

# OpenAI GPT-4o — set OPENAI_API_KEY in your environment
model_openai = LiteLLMModel(
    model_id="openai/gpt-4o",  # litellm provider/model format
    api_key=os.environ["OPENAI_API_KEY"],
    temperature=0.1,  # lower temp = more deterministic code
)

# Anthropic Claude — set ANTHROPIC_API_KEY in your environment
model_claude = LiteLLMModel(
    model_id="anthropic/claude-sonnet-4-5",
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

# Google Gemini
model_gemini = LiteLLMModel(
    model_id="gemini/gemini-2.0-flash",
    api_key=os.environ["GEMINI_API_KEY"],
)
```
```python
# ── OpenAIServerModel: any OpenAI-compatible REST endpoint ───────────────
import os
from smolagents import OpenAIServerModel

# Works with Ollama, vLLM, Together AI, Groq, Fireworks, etc.
model_local = OpenAIServerModel(
    model_id="qwen2.5:7b",                 # model name as registered by the server
    api_base="http://localhost:11434/v1",  # Ollama local endpoint
    api_key="ollama",                      # Ollama ignores the key, but it's required
)

# Together AI example
model_together = OpenAIServerModel(
    model_id="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_base="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
)
```
Basic Execution
Before wrapping a model in an agent, you can call it directly to verify connectivity and inspect the raw ChatMessage response object.
```python
# ── Direct model call without any agent wrapper ──────────────────────────
from smolagents import HfApiModel
from smolagents.models import ChatMessage  # type returned by all model backends

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# Messages follow the OpenAI chat format: list of role/content dicts
messages = [
    {"role": "system", "content": "You are a concise Python expert."},
    {"role": "user", "content": "Write a one-liner to flatten a nested list."},
]

response: ChatMessage = model(messages)
print(response.content)     # the text reply
print(response.role)        # "assistant"
print(response.tool_calls)  # None for plain text response

# Streaming (where supported)
for chunk in model.stream(messages):
    print(chunk, end="", flush=True)
```
Module 2 — The Core Agents
CodeAgent — The Star ⭐
The CodeAgent is smolagents' flagship: instead of emitting a JSON blob like {"tool": "search", "query": "..."}, it writes real Python code that calls tool functions, uses variables across steps, and leverages the full language (loops, list comprehensions, error handling). The code is executed in a sandboxed interpreter and the output is fed back as the next observation.
```python
# ── Minimal CodeAgent ────────────────────────────────────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool

# 1. Choose a backbone model
model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# 2. Equip the agent with tools (built-ins or custom — covered in Module 3)
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # tools the generated code may call
    model=model,
    max_steps=10,       # max reasoning steps before giving up
    verbosity_level=1,  # 0=silent, 1=steps, 2=full code
)

# 3. Run a task — the agent writes Python to fulfill it
result = agent.run(
    "What is the current price of Bitcoin in USD? Search the web and return just the number."
)
print(result)
```
```python
# ── What the agent generates internally (example) ────────────────────────
# The LLM produces something like this Python snippet, which smolagents
# executes in its sandboxed interpreter:
#
# results = web_search("Bitcoin price USD today")
# price_text = results[0]["snippet"]
# # Extract just the dollar amount
# import re
# match = re.search(r'\$[\d,]+', price_text)
# final_answer(match.group() if match else price_text)
#
# Notice: the agent used re (a standard library) inside its generated code,
# combined the tool call with string processing — impossible with JSON calls.
```
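The loop driving this is easy to sketch in plain Python. Below is a toy mock — not smolagents internals — where a scripted `fake_model` stands in for the LLM and `web_search` is a stub tool: the model proposes code, the executor runs it, and the result feeds the next step until `final_answer()` is called.

```python
# Toy sketch of the code-agent loop. NOT smolagents internals: fake_model
# and web_search are stubs invented for illustration.

def web_search(query: str) -> str:  # stub tool
    return "Bitcoin price today: $97,123"

def fake_model(observations: list) -> str:
    # A real agent would call the LLM here; we return canned code actions.
    if not observations:
        return "result = web_search('Bitcoin price USD today')"
    return "final_answer(result.split('$')[-1])"

def run_agent(max_steps: int = 5) -> str:
    answer = None
    def final_answer(value):
        nonlocal answer
        answer = value
    # Names visible to the generated code: tools plus final_answer
    namespace = {"web_search": web_search, "final_answer": final_answer}
    observations = []
    for _ in range(max_steps):
        code = fake_model(observations)  # 1. model writes a code action
        exec(code, namespace)            # 2. executor runs it (sandboxed in smolagents)
        if answer is not None:           # 3. final_answer() ends the loop
            return answer
        observations.append(repr(namespace.get("result")))
    return "max steps reached"

print(run_agent())  # → "97,123"
```

The real framework adds the sandboxed AST interpreter, prompt construction, and logging around this skeleton, but the generate → execute → observe cycle is the whole idea.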
ToolCallingAgent
The ToolCallingAgent follows the traditional function-calling approach: the LLM outputs a structured JSON object specifying which tool to call and with which arguments. It's more predictable but less expressive — one tool call per step, no Python logic between calls.
| Feature | CodeAgent | ToolCallingAgent |
|---|---|---|
| Output format | Python code block | JSON tool call |
| Multi-tool per step | ✅ Yes | ❌ One at a time |
| Uses variables across steps | ✅ Yes | ❌ No |
| Model requirement | Any chat model | Needs function-calling support |
| Predictability | Medium | High |
| Best for | Complex reasoning, data manipulation | Simple routing, structured APIs |
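The expressiveness gap is concrete: one CodeAgent step can call several tools and post-process their results with ordinary Python, where a ToolCallingAgent needs one LLM round-trip per call. A toy illustration with stub tools (the tool names and figures are invented for the example, not smolagents built-ins):

```python
# Two stub tools — names and numbers are illustrative only.
def get_price(ticker: str) -> float:
    return {"NVDA": 131.5, "AMD": 122.0}[ticker]

def get_shares_outstanding(ticker: str) -> float:  # in billions
    return {"NVDA": 24.5, "AMD": 1.62}[ticker]

# One CodeAgent "step": four tool calls, a loop, and arithmetic in a single action
caps = {t: get_price(t) * get_shares_outstanding(t) for t in ["NVDA", "AMD"]}
leader = max(caps, key=caps.get)
print(f"{leader} leads with ~${caps[leader]:.0f}B market cap")

# A ToolCallingAgent would need 4 separate JSON tool calls (2 tools × 2 tickers),
# with the LLM re-reading the transcript between each — and no shared variables.
```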
```python
# ── ToolCallingAgent: classic JSON-based tool routing ────────────────────
from smolagents import ToolCallingAgent, HfApiModel, DuckDuckGoSearchTool

# Requires a model that natively supports tool/function calling
model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

agent = ToolCallingAgent(
    tools=[DuckDuckGoSearchTool()],
    model=model,
    max_steps=5,
)

# Same .run() interface as CodeAgent
result = agent.run("Search for the latest news about open-source LLMs.")
print(result)

# When to prefer ToolCallingAgent over CodeAgent:
# - You need deterministic, auditable tool selection (compliance contexts)
# - Your tools have strict input schemas and you want validation at the call site
# - The model you're using doesn't generate good Python code
# - Task is simple single-tool routing (e.g., always call one API endpoint)
```
System Prompts
Every smolagents agent has a default system prompt that instructs the model on how to use tools and format its output. You can override it entirely or use the template variables to extend it.
```python
# ── Inspect the default system prompt ────────────────────────────────────
from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[], model=model)

# See what the default prompt looks like
print(agent.system_prompt_template)  # raw template with {{tool_descriptions}}
```
```python
# ── Custom system prompt with persona and constraints ────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool

# The template MUST include {{tool_descriptions}} and {{tool_names}}
# so smolagents can inject the available tools at runtime.
CUSTOM_SYSTEM_PROMPT = """You are FinBot, a senior quantitative analyst assistant.
You ONLY answer questions related to finance, economics, and markets.
For all other topics, politely decline and redirect to finance.

You have access to the following tools:
{{tool_descriptions}}

Rules you MUST follow:
1. Always cite the source of any data you retrieve.
2. Express monetary values with proper currency symbols and commas.
3. When computing statistics, show your work in the code.
4. Use `final_answer()` to return your response once you have enough data.

Available tool names: {{tool_names}}
"""

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
    system_prompt=CUSTOM_SYSTEM_PROMPT,  # pass your custom template here
)

result = agent.run("What is the current P/E ratio of the S&P 500?")
print(result)
```
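The placeholder mechanics are plain string substitution: at construction time, each tool's name and description is rendered into the template. A minimal sketch of the idea — `render_system_prompt` is a hypothetical helper written for this example, not the smolagents internals:

```python
# Hypothetical template renderer illustrating how the {{...}} placeholders
# get filled. smolagents does this internally when the agent is built.
template = """You are an assistant.

You have access to the following tools:
{{tool_descriptions}}

Available tool names: {{tool_names}}
"""

tools = {
    "web_search": "Searches the web and returns top results.",
    "final_answer": "Returns the final answer and ends the run.",
}

def render_system_prompt(template: str, tools: dict) -> str:
    descriptions = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    return (template
            .replace("{{tool_descriptions}}", descriptions)
            .replace("{{tool_names}}", ", ".join(tools)))

prompt = render_system_prompt(template, tools)
print(prompt)
```

This is also why omitting the placeholders is fatal: without them, the substitution has nowhere to land and the model never learns which tools exist.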
Always include {{tool_descriptions}} and {{tool_names}} in your custom system prompt. Omitting them means the agent won't know which tools are available and will fail to call them.

Module 3 — Tools & Tool Building
Built-in Tools
smolagents ships a curated set of ready-to-use tools. Import them directly and pass to any agent.
| Tool class | What it does | Extra deps |
|---|---|---|
| `DuckDuckGoSearchTool` | Web search via DuckDuckGo (no API key) | `duckduckgo-search` |
| `PythonInterpreterTool` | Execute arbitrary Python code | — |
| `WikipediaSearchTool` | Search and retrieve Wikipedia articles | `wikipedia-api` |
| `VisitWebpageTool` | Fetch and parse a URL's text content | `requests`, `markdownify` |
| `SpeechToTextTool` | Transcribe audio files via Whisper | `transformers` |
| `TextToImageTool` | Generate images from text prompts | `diffusers` |
| `GoogleSearchTool` | Web search via Google Custom Search API | API key required |
```python
# ── Agent with multiple built-in tools ───────────────────────────────────
from smolagents import (
    CodeAgent,
    HfApiModel,
    DuckDuckGoSearchTool,
    VisitWebpageTool,
    WikipediaSearchTool,
)

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# Equip the agent with several tools; it decides which to use
agent = CodeAgent(
    tools=[
        DuckDuckGoSearchTool(),  # live web search
        VisitWebpageTool(),      # read specific page content
        WikipediaSearchTool(),   # structured encyclopedia lookup
    ],
    model=model,
    max_steps=8,
)

result = agent.run(
    "Compare the market cap of NVIDIA and AMD. Use Wikipedia for historical context "
    "and DuckDuckGo for current figures."
)
print(result)
```
```python
# ── PythonInterpreterTool: let the agent run Python ──────────────────────
from smolagents import CodeAgent, HfApiModel, PythonInterpreterTool

# PythonInterpreterTool lets sub-agents or ToolCallingAgents run Python
# (CodeAgent already executes code natively; this tool is mainly for
# ToolCallingAgents that need a Python execution capability).
python_tool = PythonInterpreterTool(
    authorized_imports=["math", "statistics", "numpy"],
)

agent = CodeAgent(tools=[python_tool], model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"))

result = agent.run("Calculate the standard deviation of [2, 4, 4, 4, 5, 5, 7, 9].")
print(result)
```
Custom Tools — @tool Decorator
The fastest way to create a custom tool. Decorate a plain Python function with @tool. smolagents parses the docstring and type hints to build the tool's schema — both are mandatory.
```python
# ── Custom tool via @tool decorator ──────────────────────────────────────
from smolagents import tool, CodeAgent, HfApiModel


@tool
def get_stock_price(ticker: str) -> str:
    """Retrieves the current stock price for a given ticker symbol.

    This tool fetches real-time stock price data from Yahoo Finance.
    Use it when you need the current market price of a publicly traded company.

    Args:
        ticker: The stock ticker symbol (e.g. 'AAPL', 'MSFT', 'GOOGL').
            Must be a valid US stock exchange ticker.

    Returns:
        A string describing the current price, e.g. 'AAPL: $182.34'
    """
    # In production, replace with a real finance API (yfinance, Alpha Vantage, etc.)
    import random
    fake_price = round(random.uniform(50, 500), 2)
    return f"{ticker.upper()}: ${fake_price}"


@tool
def calculate_percentage_change(old_value: float, new_value: float) -> float:
    """Calculates the percentage change between two numerical values.

    Use this tool for computing growth rates, price changes, or any
    relative change calculation.

    Args:
        old_value: The original or baseline value. Must be non-zero.
        new_value: The new or current value to compare against old_value.

    Returns:
        The percentage change as a float. Positive means increase,
        negative means decrease.
    """
    if old_value == 0:
        raise ValueError("old_value cannot be zero — division by zero.")
    return ((new_value - old_value) / abs(old_value)) * 100


# Pass the decorated functions directly — smolagents wraps them automatically
agent = CodeAgent(
    tools=[get_stock_price, calculate_percentage_change],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
)

result = agent.run(
    "Get the price of AAPL and MSFT, then compute the percentage difference between them."
)
print(result)
```
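It's worth seeing why the type hints and docstring are mandatory: the decorator introspects the function to build the schema the LLM sees. A rough sketch of that idea using only the standard library — `build_tool_schema` is a simplified stand-in, not smolagents' actual parser, which handles Args sections, nullable types, and more:

```python
# Simplified illustration of schema extraction from hints + docstring.
# build_tool_schema is a toy helper, NOT the smolagents implementation.
import inspect
from typing import get_type_hints

def build_tool_schema(fn):
    """Derive a minimal tool schema from a function's hints and docstring."""
    hints = get_type_hints(fn)
    doc = inspect.getdoc(fn) or ""
    if not doc or not hints:
        raise ValueError("tools need both type hints and a docstring")
    return {
        "name": fn.__name__,
        "description": doc.split("\n\n")[0],  # first docstring paragraph
        "inputs": {
            p: {"type": hints[p].__name__}
            for p in inspect.signature(fn).parameters
        },
        "output_type": hints.get("return", str).__name__,
    }

def get_stock_price(ticker: str) -> str:
    """Retrieves the current stock price for a given ticker symbol."""
    return f"{ticker}: $100.00"

schema = build_tool_schema(get_stock_price)
print(schema["name"], schema["inputs"])
```

Strip the hints or the docstring and there is simply no material to build a description from — which is exactly why smolagents raises an error in that case.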
Class-Based Tools
For tools that need initialization parameters (API keys, database connections, config) or stateful behavior, subclass Tool and define the required class attributes and a forward() method.
```python
# ── Complex tool via Tool subclass ───────────────────────────────────────
from smolagents import Tool, CodeAgent, HfApiModel
from typing import Any


class DatabaseQueryTool(Tool):
    """Query a SQLite database with natural-language-friendly SQL."""

    # --- Required class attributes ---------------------------------------
    name = "database_query"  # snake_case identifier
    description = (
        "Executes a SQL SELECT query against the application database and returns "
        "the results as a formatted string. Use this when you need to look up or "
        "aggregate data from the database."
    )
    inputs = {  # input schema dict
        "query": {
            "type": "string",
            "description": (
                "A valid SQL SELECT query. Only SELECT statements are allowed; "
                "INSERT/UPDATE/DELETE will be rejected."
            ),
        }
    }
    output_type = "string"  # type returned by forward()

    def __init__(self, db_path: str = ":memory:", **kwargs: Any):
        # Call super().__init__() — smolagents sets up internal state here
        super().__init__(**kwargs)
        import sqlite3
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self._seed_demo_data()

    def _seed_demo_data(self) -> None:
        # Populate in-memory DB with sample data for demonstration
        self.conn.executescript("""
            CREATE TABLE IF NOT EXISTS products (
                id INTEGER PRIMARY KEY, name TEXT, category TEXT,
                price REAL, stock INTEGER
            );
            INSERT OR IGNORE INTO products VALUES
                (1, 'Laptop Pro', 'Electronics', 1299.99, 45),
                (2, 'Wireless Mouse', 'Electronics', 29.99, 200),
                (3, 'Standing Desk', 'Furniture', 599.00, 12),
                (4, 'Coffee Mug', 'Kitchen', 14.99, 500),
                (5, 'Notebook', 'Stationery', 4.99, 1000);
        """)
        self.conn.commit()

    def forward(self, query: str) -> str:
        # Security: only allow SELECT queries
        stripped = query.strip().upper()
        if not stripped.startswith("SELECT"):
            return "Error: Only SELECT queries are permitted."
        try:
            cursor = self.conn.execute(query)
            columns = [desc[0] for desc in cursor.description]
            rows = cursor.fetchall()
            if not rows:
                return "Query returned no results."
            # Format as a simple ASCII table
            header = " | ".join(columns)
            divider = "-" * len(header)
            data = "\n".join(" | ".join(str(v) for v in row) for row in rows)
            return f"{header}\n{divider}\n{data}"
        except Exception as e:
            return f"SQL Error: {e}"


# Instantiate with a specific DB path (or use :memory: for in-process testing)
db_tool = DatabaseQueryTool(db_path=":memory:")

agent = CodeAgent(
    tools=[db_tool],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
)

result = agent.run(
    "Which product category has the highest average price? Show me the top 3 products."
)
print(result)
```
Module 4 — Secure Code Execution
Local Sandboxing
When a CodeAgent generates Python code, smolagents does not use Python's built-in exec() directly. Instead it runs the code through a custom AST (Abstract Syntax Tree) evaluator that restricts dangerous operations.
Blocked by default: file I/O (`open`, `os.remove`), subprocess execution (`subprocess`, `os.system`), network access (`socket`, `urllib`), import of arbitrary modules, `eval`/`exec`, `__import__`, and attribute access to dunder methods.

Allowed by default: a safe subset of builtins such as `print`, `len`, `range`, `sorted`, `zip`, `map`, `filter`.
```python
# ── The AST evaluator rejects dangerous code patterns ────────────────────
from smolagents import CodeAgent, HfApiModel

agent = CodeAgent(tools=[], model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"))

# These would be blocked if the agent tried to generate them:
#   import os; os.system("rm -rf /")      ← subprocess blocked
#   open("/etc/passwd").read()            ← file I/O blocked
#   __import__("subprocess").run(["ls"])  ← dynamic import blocked
#
# The agent CAN do:
#   result = [x**2 for x in range(10)]    ← pure Python ✅
#   answer = some_tool(arg)               ← tool call ✅
#   final_answer(result)                  ← built-in special function ✅

print("Sandbox active — only safe operations permitted by default.")
```
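You can get a feel for the mechanism with Python's own `ast` module. The toy checker below is a pre-screening simplification — smolagents' evaluator actually walks and interprets the tree node by node rather than approving code up front — but it shows how AST inspection catches the patterns listed above:

```python
# Toy AST safety screen — a simplified illustration, not smolagents' evaluator.
import ast

ALLOWED_IMPORTS = {"math", "re", "statistics"}

def check_code(source: str) -> bool:
    """Return True if the code passes the toy screen, False otherwise."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Block imports outside the allowlist
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            names = ([a.name for a in node.names] if isinstance(node, ast.Import)
                     else [node.module or ""])
            if any(n.split(".")[0] not in ALLOWED_IMPORTS for n in names):
                return False
        # Block dunder attribute access like x.__class__
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return False
        # Block direct calls to eval/exec/open/__import__
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in {"eval", "exec", "open", "__import__"}):
            return False
    return True

print(check_code("import math\nprint(math.sqrt(2))"))  # True
print(check_code("import os\nos.system('ls')"))        # False
print(check_code("open('/etc/passwd')"))               # False
print(check_code("().__class__.__bases__"))            # False
```

The dunder check matters: without it, `().__class__.__bases__` is a classic route to climbing back up to `object` subclasses and escaping a naive sandbox.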
E2B Cloud Sandbox
For maximum isolation — especially with untrusted user inputs or when the agent needs full OS access (file I/O, networking, etc.) — use E2BExecutor to run generated code in an ephemeral cloud micro-VM provided by E2B.
```python
# ── E2B cloud sandbox execution ──────────────────────────────────────────
# Install: pip install smolagents[e2b]
# Requires: E2B_API_KEY environment variable
import os
from smolagents import CodeAgent, HfApiModel
from smolagents.executors import E2BExecutor

# E2B spins up an isolated Firecracker microVM for each agent run.
# The VM is destroyed after execution — no state leaks between runs.
e2b_executor = E2BExecutor(
    api_key=os.environ["E2B_API_KEY"],
    # Optional: specify a custom E2B sandbox template with pre-installed packages
    # template="my-custom-template-id",
)

agent = CodeAgent(
    tools=[],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
    executor=e2b_executor,  # swap out the local executor
    additional_authorized_imports=[  # fully isolated, so wider imports are safe
        "pandas", "numpy", "requests", "beautifulsoup4", "matplotlib"
    ],
)

# This code runs inside E2B's cloud VM, not on your machine
result = agent.run(
    """
    Write a Python script that fetches the top 5 Hacker News stories using
    the HN API (https://hacker-news.firebaseio.com/v0/topstories.json),
    then formats them as a numbered list with title, score, and URL.
    """
)
print(result)
```
Module 5 — Multi-Agent Orchestration
Manager & Worker Architecture
In smolagents, a multi-agent system is built by wrapping a worker agent as a tool and giving it to a manager agent. The manager delegates sub-tasks via normal tool calls; the worker executes them with its own specialized tools and returns the result.
```python
# ── Multi-agent: Manager delegates to a specialized Web Scraper worker ────
from smolagents import (
    CodeAgent,
    HfApiModel,
    DuckDuckGoSearchTool,
    VisitWebpageTool,
    ManagedAgent,  # wraps an agent as a callable tool
)

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# ── Worker Agent: specialized in web research ────────────────────────────
web_research_agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), VisitWebpageTool()],
    model=model,
    name="web_researcher",  # used in manager's tool description
    max_steps=6,
)

# Wrap the worker as a ManagedAgent so the manager can call it like a tool.
# description tells the manager WHEN and HOW to invoke this sub-agent.
web_research_tool = ManagedAgent(
    agent=web_research_agent,
    name="web_research",
    description=(
        "Performs deep web research on any topic. "
        "Input: a detailed research question or task as a string. "
        "Output: a comprehensive summary with sources."
    ),
)

# ── Manager Agent: orchestrates the overall workflow ─────────────────────
# The manager sees web_research_tool as just another tool to call
manager_agent = CodeAgent(
    tools=[web_research_tool],  # worker agents appear as tools here
    model=model,
    max_steps=5,
)

# The manager breaks down the task and delegates research to the worker
result = manager_agent.run(
    """
    Research the three most popular open-source LLM frameworks in 2025
    (smolagents, LangChain, LlamaIndex). For each, find:
    1. GitHub star count
    2. Primary use case
    3. Latest version
    Then produce a comparison table.
    """
)
print(result)
```
State Passing Between Agents
Because both agents run Python, complex objects — DataFrames, images, dictionaries, even trained model weights — can be passed as return values and accessed in the manager's variable scope.
```python
# ── Passing pandas DataFrames between agents ─────────────────────────────
from smolagents import CodeAgent, HfApiModel, ManagedAgent, tool
import pandas as pd

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")


@tool
def load_sales_data() -> str:
    """Loads the company's monthly sales data and returns it as a CSV string.

    Returns:
        A CSV-formatted string with columns: month, region, revenue, units_sold.
    """
    # In production, this would read from a database or file system
    data = {
        "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
        "region": ["North", "North", "South", "South", "East", "East"],
        "revenue": [120000, 135000, 98000, 112000, 145000, 160000],
        "units_sold": [240, 270, 196, 224, 290, 320],
    }
    return pd.DataFrame(data).to_csv(index=False)


# Worker: data loading and cleaning specialist
data_agent = CodeAgent(
    tools=[load_sales_data],
    model=model,
    additional_authorized_imports=["pandas", "io"],
    name="data_loader",
)

data_tool = ManagedAgent(
    agent=data_agent,
    name="load_and_clean_data",
    description=(
        "Loads sales data and returns a cleaned, analysis-ready CSV string. "
        "No inputs required. Call this first before doing any analysis."
    ),
)

# Manager: analysis and reporting
manager = CodeAgent(
    tools=[data_tool],
    model=model,
    additional_authorized_imports=["pandas", "io", "numpy"],
    max_steps=8,
)

# The manager will:
# 1. Call data_tool → gets CSV string
# 2. Parse CSV with pandas in its own code
# 3. Compute statistics and produce the final report
result = manager.run(
    """
    Load the sales data, then:
    - Compute total revenue and units sold per region
    - Find the best and worst performing months
    - Calculate month-over-month revenue growth rate
    - Summarize findings in a concise business report
    """
)
print(result)
```
Worker agents can also return PIL.Image objects or base64-encoded strings. The manager can pass these to vision tools or save them. For large binary objects, use shared file paths as the exchange medium instead of in-memory passing.

Module 6 — Vision & Multimodality
Processing Images with a Multimodal Agent
smolagents supports vision models (Qwen-VL, LLaVA, Llama-3.2-Vision, etc.) natively. Pass images as URLs or local files alongside text prompts. The model receives both modalities in a single message.
```python
# ── Multimodal CodeAgent with image input ────────────────────────────────
from smolagents import CodeAgent, HfApiModel
from smolagents.utils import encode_image_base64  # helper for local files

# Use a vision-capable model — Qwen2-VL and Llama-3.2-Vision are recommended
vision_model = HfApiModel(
    model_id="Qwen/Qwen2-VL-72B-Instruct",  # powerful vision-language model
)

agent = CodeAgent(
    tools=[],  # can add tools; the model will also reason about the image
    model=vision_model,
    max_steps=5,
)

# ── Option A: pass an image URL directly ─────────────────────────────────
result = agent.run(
    task="Describe this image in detail. Identify all objects, their colors, "
         "and the overall scene composition.",
    images=["https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"],
)
print(result)
```
```python
# ── Option B: pass a local image file ────────────────────────────────────
from PIL import Image
from smolagents import CodeAgent, HfApiModel

vision_model = HfApiModel(model_id="Qwen/Qwen2-VL-72B-Instruct")
agent = CodeAgent(tools=[], model=vision_model)

# Load from disk as PIL Image — smolagents handles encoding automatically
image = Image.open("/path/to/chart.png")

result = agent.run(
    task="This is a business chart. Extract all the data points, axis labels, "
         "and title. Then describe the main trend shown.",
    images=[image],  # PIL Image objects are accepted directly
)
print(result)
```
```python
# ── Option C: multi-image analysis ───────────────────────────────────────
from smolagents import CodeAgent, HfApiModel
from PIL import Image

vision_model = HfApiModel(model_id="meta-llama/Llama-3.2-11B-Vision-Instruct")
agent = CodeAgent(
    tools=[],
    model=vision_model,
    additional_authorized_imports=["PIL"],
)

before_img = Image.open("before.jpg")
after_img = Image.open("after.jpg")

# Pass multiple images — the model receives them in order alongside the text
result = agent.run(
    task="You are given a BEFORE and AFTER image of a room renovation. "
         "List all the changes you can identify between the two images. "
         "Be specific about colors, furniture, and layout changes.",
    images=[before_img, after_img],  # order matches the text references
)
print(result)
```
Any HfApiModel with Qwen2-VL or Llama-3.2-Vision will work; LiteLLMModel with gpt-4o or claude-opus-4 also supports image inputs via the same API.

```python
# ── Vision agent with LiteLLM (GPT-4o) ───────────────────────────────────
import os
from smolagents import CodeAgent, LiteLLMModel, DuckDuckGoSearchTool

vision_model = LiteLLMModel(
    model_id="openai/gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
)

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=vision_model,
)

# Mix vision and tool use in a single task
result = agent.run(
    task="Analyze the logo in this image. Identify the brand, then search the web "
         "to find their current stock price and latest news.",
    images=["https://example.com/logo.png"],
)
print(result)
```
Module 7 — Telemetry & Memory
Step Logs & Agent Memory
After (or during) a run, every intermediate step is captured in agent.logs. Each log entry contains the generated code, execution output, timestamps, and errors — invaluable for debugging and auditing.
```python
# ── Inspecting agent.logs after a run ────────────────────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model, max_steps=5)

result = agent.run("What are the top 3 AI research papers published in 2025?")

# agent.logs is a list of ActionStep and PlanningStep objects
print(f"\n{'='*60}")
print(f"Total steps taken: {len(agent.logs)}")
print(f"{'='*60}\n")

for i, step in enumerate(agent.logs):
    step_type = type(step).__name__
    print(f"── Step {i+1} ({step_type}) ──────────────────────────")

    # ActionStep: contains the LLM reasoning and code execution
    if hasattr(step, "model_output") and step.model_output:
        print(f"[LLM Output]:\n{step.model_output[:300]}...")
    if hasattr(step, "tool_calls") and step.tool_calls:
        for call in step.tool_calls:
            print(f"[Code executed]:\n{call.arguments.get('code', '')[:400]}")
    if hasattr(step, "observations") and step.observations:
        print(f"[Observation]: {str(step.observations)[:200]}")
    if hasattr(step, "step_number"):
        print(f"[Step number]: {step.step_number}")
    # Duration is available on ActionStep if timing was captured
    if hasattr(step, "duration") and step.duration:
        print(f"[Duration]: {step.duration:.2f}s")
    print()
```
```python
# ── Export logs for analysis or storage ──────────────────────────────────
import json
from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[], model=model)
agent.run("Calculate the sum of all prime numbers under 100.")

# Build a structured log record for each step
log_records = []
for step in agent.logs:
    record = {
        "type": type(step).__name__,
        "step_number": getattr(step, "step_number", None),
        "duration_s": getattr(step, "duration", None),
        "model_output": getattr(step, "model_output", None),
        "observations": str(getattr(step, "observations", "")),
        "error": str(getattr(step, "error", "")) if getattr(step, "error", None) else None,
    }
    log_records.append(record)

# Save to file for later analysis
with open("agent_run.json", "w") as f:
    json.dump(log_records, f, indent=2, default=str)

print(f"Saved {len(log_records)} step records to agent_run.json")
```
```python
# ── Replaying agent memory for continued conversations ───────────────────
from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[], model=model)

# First run
agent.run("My name is Alice and I love Python. Remember this.")

# Access raw memory (list of messages passed to the LLM)
memory_snapshot = agent.memory.get_full_message_list()
print(f"Memory contains {len(memory_snapshot)} messages")

# Second run — agent retains context from the first run within the same session
result = agent.run("What is my name and what programming language do I love?")
print(result)  # → "Your name is Alice and you love Python."

# Reset memory for a fresh session
agent.memory.reset()
print("Memory cleared.")
```
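Under the hood, session memory is nothing more exotic than an accumulating message list that gets replayed to the model on every call. A plain-Python sketch with a mock model (invented for illustration, not smolagents code) makes the mechanism concrete:

```python
# Toy memory: a growing message list replayed on every turn.
# mock_model is a scripted stand-in for the LLM.
memory = []

def mock_model(messages):
    # Stand-in for the LLM: scans the history for a remembered name.
    for m in messages:
        if "my name is" in m["content"].lower():
            name = m["content"].split("name is ")[1].split(" ")[0].rstrip(".")
            return f"Your name is {name}."
    return "I don't know your name yet."

def run(task: str) -> str:
    memory.append({"role": "user", "content": task})   # append the new turn
    reply = mock_model(memory)                          # model sees full history
    memory.append({"role": "assistant", "content": reply})
    return reply

run("My name is Alice and I love Python.")
print(run("What is my name?"))  # → "Your name is Alice."
print(len(memory))              # 4 messages retained across both runs
```

Resetting memory, as in `agent.memory.reset()`, corresponds to clearing this list; token cost grows with its length, which is why long sessions eventually need truncation or summarization.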
Gradio UI
smolagents ships a GradioUI helper that wraps any agent in a polished web chat interface — zero frontend code required. One line to deploy.
```python
# ── Instant web UI for any agent ─────────────────────────────────────────
# Install: pip install smolagents[gradio]
from smolagents import (
    CodeAgent,
    HfApiModel,
    DuckDuckGoSearchTool,
    VisitWebpageTool,
    GradioUI,
)

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), VisitWebpageTool()],
    model=model,
    max_steps=10,
    verbosity_level=1,
)

# Launch: opens a browser tab at http://localhost:7860
# The UI shows the chat, intermediate steps, and generated code in real time
ui = GradioUI(agent)
ui.launch(
    server_name="0.0.0.0",  # bind to all interfaces (for Docker/remote)
    server_port=7860,
    share=False,  # set True to get a public gradio.live URL
    debug=False,
)
```
```python
# ── Customized Gradio UI ─────────────────────────────────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool, GradioUI
import gradio as gr

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

# GradioUI wraps your agent; you can extend the underlying gr.Blocks app
ui = GradioUI(
    agent,
    file_upload_folder="./uploads",  # allow file uploads to this folder
)

# Add custom components by accessing the underlying Gradio app
with ui.demo:
    gr.Markdown(
        """
        ## 🤖 SmolAgents Research Assistant
        Ask me anything — I can search the web, analyze data, and write code.
        """
    )

ui.launch()
```
When deploying to a Hugging Face Space, keep share=False; the Space handles routing. Add HF_TOKEN and any API keys as Space secrets.

Reference Links
- Docs huggingface.co/docs/smolagents — Official smolagents documentation
- GitHub github.com/huggingface/smolagents — Source code (~1,000 lines of core)
- Course huggingface.co/learn/agents-course — HF Agents Course (free)
- Models HF Hub — Text Generation models — All models usable with HfApiModel
- E2B e2b.dev/docs — E2B cloud sandboxing documentation
- LiteLLM docs.litellm.ai — All supported providers and model strings
- Paper smolagents: Tiny Agents, Big Impact (HF Blog)
- Gradio gradio.app/docs — Gradio documentation for UI customization