SmolAgents Practitioner's Handbook
A complete, production-ready reference for building agentic AI systems with Hugging Face's minimalist smolagents framework — covering models, CodeAgents, tools, sandboxing, multi-agent orchestration, vision, and telemetry.
Install & Setup
smolagents is Hugging Face's minimalist agentic framework — approximately 1,000 lines of core logic. Its philosophy: agents should write and execute Python code rather than select from a menu of JSON tool calls. This makes them far more expressive and composable.
```bash
# Install smolagents with all optional extras
pip install smolagents

# With E2B sandbox support
pip install smolagents[e2b]

# With LiteLLM for multi-provider support (OpenAI, Anthropic, etc.)
pip install smolagents[litellm]

# With Gradio UI
pip install smolagents[gradio]

# Everything at once
pip install smolagents[e2b,litellm,gradio]

# Set your HF token (needed for gated/private models)
export HF_TOKEN=hf_...
```
CodeAgent generates runnable Python instead of structured JSON — giving it the full power of the language for loops, conditionals, and data manipulation.

Core Architecture at a Glance
Module 1 — Core Models & Setup
Hugging Face Hub Models
Use HfApiModel to call any model hosted on the Hugging Face Hub via the Inference API — no GPU required. For local execution with full control, use TransformersModel to load weights directly.
```python
# ── HfApiModel: serverless inference via HF Hub API ──────────────────────
from smolagents import HfApiModel

# Points to a hosted model; HF_TOKEN env var is picked up automatically.
# Use any model ID from hf.co/models — gated models require an approved token.
model = HfApiModel(
    model_id="Qwen/Qwen2.5-72B-Instruct",  # model to call
    token="hf_...",                        # or set HF_TOKEN env var
    timeout=120,                           # seconds before giving up
)

# Quick smoke-test — call the model directly (no agent wrapping yet)
response = model([{"role": "user", "content": "What is 2 + 2?"}])
print(response.content)  # → "4"
```
```python
# ── TransformersModel: run weights locally on your GPU ───────────────────
from smolagents import TransformersModel

# Downloads (or reads from cache) and runs the model locally.
# Requires transformers + accelerate installed. Great for offline/privacy use.
local_model = TransformersModel(
    model_id="Qwen/Qwen2.5-7B-Instruct",  # smaller model for local GPU
    device_map="auto",                    # auto-assign to GPU/CPU
    torch_dtype="auto",                   # bfloat16 where possible
    max_new_tokens=2048,
)

response = local_model([{"role": "user", "content": "Explain gradient descent briefly."}])
print(response.content)
```
With a CodeAgent, the model only needs to generate Python code blocks — making it compatible with a wider range of open models than JSON-tool-calling alternatives.

External Providers (OpenAI / Anthropic)
Use LiteLLMModel for a unified interface to 100+ providers, or OpenAIServerModel for any OpenAI-compatible endpoint (including Ollama, vLLM, Together AI).
```python
# ── LiteLLMModel: unified interface to OpenAI, Anthropic, Gemini, etc. ───
import os
from smolagents import LiteLLMModel

# OpenAI GPT-4o — set OPENAI_API_KEY in your environment
model_openai = LiteLLMModel(
    model_id="openai/gpt-4o",  # litellm provider/model format
    api_key=os.environ["OPENAI_API_KEY"],
    temperature=0.1,  # lower temp = more deterministic code
)

# Anthropic Claude — set ANTHROPIC_API_KEY in your environment
model_claude = LiteLLMModel(
    model_id="anthropic/claude-sonnet-4-5",
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

# Google Gemini
model_gemini = LiteLLMModel(
    model_id="gemini/gemini-2.0-flash",
    api_key=os.environ["GEMINI_API_KEY"],
)
```
```python
# ── OpenAIServerModel: any OpenAI-compatible REST endpoint ───────────────
import os
from smolagents import OpenAIServerModel

# Works with Ollama, vLLM, Together AI, Groq, Fireworks, etc.
model_local = OpenAIServerModel(
    model_id="qwen2.5:7b",                 # model name as registered by the server
    api_base="http://localhost:11434/v1",  # Ollama local endpoint
    api_key="ollama",                      # Ollama ignores the key, but it's required
)

# Together AI example
model_together = OpenAIServerModel(
    model_id="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_base="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
)
```
Basic Execution
Before wrapping a model in an agent, you can call it directly to verify connectivity and inspect the raw ChatMessage response object.
```python
# ── Direct model call without any agent wrapper ──────────────────────────
from smolagents import HfApiModel
from smolagents.models import ChatMessage  # type returned by all model backends

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# Messages follow the OpenAI chat format: list of role/content dicts
messages = [
    {"role": "system", "content": "You are a concise Python expert."},
    {"role": "user", "content": "Write a one-liner to flatten a nested list."},
]

response: ChatMessage = model(messages)
print(response.content)     # the text reply
print(response.role)        # "assistant"
print(response.tool_calls)  # None for plain text response

# Streaming (where supported)
for chunk in model.stream(messages):
    print(chunk, end="", flush=True)
```
Module 2 — The Core Agents
CodeAgent — The Star ⭐
The CodeAgent is smolagents' flagship: instead of emitting a JSON blob like {"tool": "search", "query": "..."}, it writes real Python code that calls tool functions, uses variables across steps, and leverages the full language (loops, list comprehensions, error handling). The code is executed in a sandboxed interpreter and the output is fed back as the next observation.
```python
# ── Minimal CodeAgent ────────────────────────────────────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool

# 1. Choose a backbone model
model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# 2. Equip the agent with tools (built-ins or custom — covered in Module 3)
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # tools the generated code may call
    model=model,
    max_steps=10,       # max reasoning steps before giving up
    verbosity_level=1,  # 0=silent, 1=steps, 2=full code
)

# 3. Run a task — the agent writes Python to fulfill it
result = agent.run(
    "What is the current price of Bitcoin in USD? Search the web and return just the number."
)
print(result)
```
```python
# ── What the agent generates internally (example) ────────────────────────
# The LLM produces something like this Python snippet, which smolagents
# executes in its sandboxed interpreter:
#
# results = web_search("Bitcoin price USD today")
# price_text = results[0]["snippet"]
# # Extract just the dollar amount
# import re
# match = re.search(r'\$[\d,]+', price_text)
# final_answer(match.group() if match else price_text)
#
# Notice: the agent used re (a standard library) inside its generated code,
# combined the tool call with string processing — impossible with JSON calls.
```
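The loop driving this is easy to sketch in plain Python. Below is a toy mock — not smolagents internals — where a scripted `fake_model` stands in for the LLM and `web_search` is a stub tool: the model proposes code, the executor runs it, and the result feeds the next step until `final_answer()` is called.

```python
# Toy sketch of the code-agent loop. NOT smolagents internals: fake_model
# and web_search are stubs invented for illustration.

def web_search(query: str) -> str:  # stub tool
    return "Bitcoin price today: $97,123"

def fake_model(observations: list) -> str:
    # A real agent would call the LLM here; we return canned code actions.
    if not observations:
        return "result = web_search('Bitcoin price USD today')"
    return "final_answer(result.split('$')[-1])"

def run_agent(max_steps: int = 5) -> str:
    answer = None
    def final_answer(value):
        nonlocal answer
        answer = value
    # Names visible to the generated code: tools plus final_answer
    namespace = {"web_search": web_search, "final_answer": final_answer}
    observations = []
    for _ in range(max_steps):
        code = fake_model(observations)  # 1. model writes a code action
        exec(code, namespace)            # 2. executor runs it (sandboxed in smolagents)
        if answer is not None:           # 3. final_answer() ends the loop
            return answer
        observations.append(repr(namespace.get("result")))
    return "max steps reached"

print(run_agent())  # → "97,123"
```

The real framework adds the sandboxed AST interpreter, prompt construction, and logging around this skeleton, but the generate → execute → observe cycle is the whole idea.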
ToolCallingAgent
The ToolCallingAgent follows the traditional function-calling approach: the LLM outputs a structured JSON object specifying which tool to call and with which arguments. It's more predictable but less expressive — one tool call per step, no Python logic between calls.
| Feature | CodeAgent | ToolCallingAgent |
|---|---|---|
| Output format | Python code block | JSON tool call |
| Multi-tool per step | ✅ Yes | ❌ One at a time |
| Uses variables across steps | ✅ Yes | ❌ No |
| Model requirement | Any chat model | Needs function-calling support |
| Predictability | Medium | High |
| Best for | Complex reasoning, data manipulation | Simple routing, structured APIs |
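The expressiveness gap is concrete: one CodeAgent step can call several tools and post-process their results with ordinary Python, where a ToolCallingAgent needs one LLM round-trip per call. A toy illustration with stub tools (the tool names and figures are invented for the example, not smolagents built-ins):

```python
# Two stub tools — names and numbers are illustrative only.
def get_price(ticker: str) -> float:
    return {"NVDA": 131.5, "AMD": 122.0}[ticker]

def get_shares_outstanding(ticker: str) -> float:  # in billions
    return {"NVDA": 24.5, "AMD": 1.62}[ticker]

# One CodeAgent "step": four tool calls, a loop, and arithmetic in a single action
caps = {t: get_price(t) * get_shares_outstanding(t) for t in ["NVDA", "AMD"]}
leader = max(caps, key=caps.get)
print(f"{leader} leads with ~${caps[leader]:.0f}B market cap")

# A ToolCallingAgent would need 4 separate JSON tool calls (2 tools × 2 tickers),
# with the LLM re-reading the transcript between each — and no shared variables.
```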
```python
# ── ToolCallingAgent: classic JSON-based tool routing ────────────────────
from smolagents import ToolCallingAgent, HfApiModel, DuckDuckGoSearchTool

# Requires a model that natively supports tool/function calling
model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

agent = ToolCallingAgent(
    tools=[DuckDuckGoSearchTool()],
    model=model,
    max_steps=5,
)

# Same .run() interface as CodeAgent
result = agent.run("Search for the latest news about open-source LLMs.")
print(result)

# When to prefer ToolCallingAgent over CodeAgent:
# - You need deterministic, auditable tool selection (compliance contexts)
# - Your tools have strict input schemas and you want validation at the call site
# - The model you're using doesn't generate good Python code
# - Task is simple single-tool routing (e.g., always call one API endpoint)
```
System Prompts
Every smolagents agent has a default system prompt that instructs the model on how to use tools and format its output. You can override it entirely or use the template variables to extend it.
```python
# ── Inspect the default system prompt ────────────────────────────────────
from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[], model=model)

# See what the default prompt looks like
print(agent.system_prompt_template)  # raw template with {{tool_descriptions}}
```
```python
# ── Custom system prompt with persona and constraints ────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool

# The template MUST include {{tool_descriptions}} and {{tool_names}}
# so smolagents can inject the available tools at runtime.
CUSTOM_SYSTEM_PROMPT = """You are FinBot, a senior quantitative analyst assistant.
You ONLY answer questions related to finance, economics, and markets.
For all other topics, politely decline and redirect to finance.

You have access to the following tools:
{{tool_descriptions}}

Rules you MUST follow:
1. Always cite the source of any data you retrieve.
2. Express monetary values with proper currency symbols and commas.
3. When computing statistics, show your work in the code.
4. Use `final_answer()` to return your response once you have enough data.

Available tool names: {{tool_names}}
"""

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
    system_prompt=CUSTOM_SYSTEM_PROMPT,  # pass your custom template here
)

result = agent.run("What is the current P/E ratio of the S&P 500?")
print(result)
```
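The placeholder mechanics are plain string substitution: at construction time, each tool's name and description is rendered into the template. A minimal sketch of the idea — `render_system_prompt` is a hypothetical helper written for this example, not the smolagents internals:

```python
# Hypothetical template renderer illustrating how the {{...}} placeholders
# get filled. smolagents does this internally when the agent is built.
template = """You are an assistant.

You have access to the following tools:
{{tool_descriptions}}

Available tool names: {{tool_names}}
"""

tools = {
    "web_search": "Searches the web and returns top results.",
    "final_answer": "Returns the final answer and ends the run.",
}

def render_system_prompt(template: str, tools: dict) -> str:
    descriptions = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    return (template
            .replace("{{tool_descriptions}}", descriptions)
            .replace("{{tool_names}}", ", ".join(tools)))

prompt = render_system_prompt(template, tools)
print(prompt)
```

This is also why omitting the placeholders is fatal: without them, the substitution has nowhere to land and the model never learns which tools exist.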
Always include {{tool_descriptions}} and {{tool_names}} in your custom system prompt. Omitting them means the agent won't know which tools are available and will fail to call them.

Module 3 — Tools & Tool Building
Built-in Tools
smolagents ships a curated set of ready-to-use tools. Import them directly and pass to any agent.
| Tool class | What it does | Extra deps |
|---|---|---|
| `DuckDuckGoSearchTool` | Web search via DuckDuckGo (no API key) | `duckduckgo-search` |
| `PythonInterpreterTool` | Execute arbitrary Python code | — |
| `WikipediaSearchTool` | Search and retrieve Wikipedia articles | `wikipedia-api` |
| `VisitWebpageTool` | Fetch and parse a URL's text content | `requests`, `markdownify` |
| `SpeechToTextTool` | Transcribe audio files via Whisper | `transformers` |
| `TextToImageTool` | Generate images from text prompts | `diffusers` |
| `GoogleSearchTool` | Web search via Google Custom Search API | API key required |
```python
# ── Agent with multiple built-in tools ───────────────────────────────────
from smolagents import (
    CodeAgent,
    HfApiModel,
    DuckDuckGoSearchTool,
    VisitWebpageTool,
    WikipediaSearchTool,
)

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# Equip the agent with several tools; it decides which to use
agent = CodeAgent(
    tools=[
        DuckDuckGoSearchTool(),  # live web search
        VisitWebpageTool(),      # read specific page content
        WikipediaSearchTool(),   # structured encyclopedia lookup
    ],
    model=model,
    max_steps=8,
)

result = agent.run(
    "Compare the market cap of NVIDIA and AMD. Use Wikipedia for historical context "
    "and DuckDuckGo for current figures."
)
print(result)
```
```python
# ── PythonInterpreterTool: let the agent run Python ──────────────────────
from smolagents import CodeAgent, HfApiModel, PythonInterpreterTool

# PythonInterpreterTool lets sub-agents or ToolCallingAgents run Python
# (CodeAgent already executes code natively; this tool is mainly for
# ToolCallingAgents that need a Python execution capability).
python_tool = PythonInterpreterTool(
    authorized_imports=["math", "statistics", "numpy"],
)

agent = CodeAgent(tools=[python_tool], model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"))

result = agent.run("Calculate the standard deviation of [2, 4, 4, 4, 5, 5, 7, 9].")
print(result)
```
Custom Tools — @tool Decorator
The fastest way to create a custom tool. Decorate a plain Python function with @tool. smolagents parses the docstring and type hints to build the tool's schema — both are mandatory.
```python
# ── Custom tool via @tool decorator ──────────────────────────────────────
from smolagents import tool, CodeAgent, HfApiModel


@tool
def get_stock_price(ticker: str) -> str:
    """Retrieves the current stock price for a given ticker symbol.

    This tool fetches real-time stock price data from Yahoo Finance.
    Use it when you need the current market price of a publicly traded company.

    Args:
        ticker: The stock ticker symbol (e.g. 'AAPL', 'MSFT', 'GOOGL').
            Must be a valid US stock exchange ticker.

    Returns:
        A string describing the current price, e.g. 'AAPL: $182.34'
    """
    # In production, replace with a real finance API (yfinance, Alpha Vantage, etc.)
    import random
    fake_price = round(random.uniform(50, 500), 2)
    return f"{ticker.upper()}: ${fake_price}"


@tool
def calculate_percentage_change(old_value: float, new_value: float) -> float:
    """Calculates the percentage change between two numerical values.

    Use this tool for computing growth rates, price changes, or any
    relative change calculation.

    Args:
        old_value: The original or baseline value. Must be non-zero.
        new_value: The new or current value to compare against old_value.

    Returns:
        The percentage change as a float. Positive means increase,
        negative means decrease.
    """
    if old_value == 0:
        raise ValueError("old_value cannot be zero — division by zero.")
    return ((new_value - old_value) / abs(old_value)) * 100


# Pass the decorated functions directly — smolagents wraps them automatically
agent = CodeAgent(
    tools=[get_stock_price, calculate_percentage_change],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
)

result = agent.run(
    "Get the price of AAPL and MSFT, then compute the percentage difference between them."
)
print(result)
```
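It's worth seeing why the type hints and docstring are mandatory: the decorator introspects the function to build the schema the LLM sees. A rough sketch of that idea using only the standard library — `build_tool_schema` is a simplified stand-in, not smolagents' actual parser, which handles Args sections, nullable types, and more:

```python
# Simplified illustration of schema extraction from hints + docstring.
# build_tool_schema is a toy helper, NOT the smolagents implementation.
import inspect
from typing import get_type_hints

def build_tool_schema(fn):
    """Derive a minimal tool schema from a function's hints and docstring."""
    hints = get_type_hints(fn)
    doc = inspect.getdoc(fn) or ""
    if not doc or not hints:
        raise ValueError("tools need both type hints and a docstring")
    return {
        "name": fn.__name__,
        "description": doc.split("\n\n")[0],  # first docstring paragraph
        "inputs": {
            p: {"type": hints[p].__name__}
            for p in inspect.signature(fn).parameters
        },
        "output_type": hints.get("return", str).__name__,
    }

def get_stock_price(ticker: str) -> str:
    """Retrieves the current stock price for a given ticker symbol."""
    return f"{ticker}: $100.00"

schema = build_tool_schema(get_stock_price)
print(schema["name"], schema["inputs"])
```

Strip the hints or the docstring and there is simply no material to build a description from — which is exactly why smolagents raises an error in that case.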
Class-Based Tools
For tools that need initialization parameters (API keys, database connections, config) or stateful behavior, subclass Tool and define the required class attributes and a forward() method.
```python
# ── Complex tool via Tool subclass ───────────────────────────────────────
from smolagents import Tool, CodeAgent, HfApiModel
from typing import Any


class DatabaseQueryTool(Tool):
    """Query a SQLite database with natural-language-friendly SQL."""

    # --- Required class attributes ---------------------------------------
    name = "database_query"  # snake_case identifier
    description = (
        "Executes a SQL SELECT query against the application database and returns "
        "the results as a formatted string. Use this when you need to look up or "
        "aggregate data from the database."
    )
    inputs = {  # input schema dict
        "query": {
            "type": "string",
            "description": (
                "A valid SQL SELECT query. Only SELECT statements are allowed; "
                "INSERT/UPDATE/DELETE will be rejected."
            ),
        }
    }
    output_type = "string"  # type returned by forward()

    def __init__(self, db_path: str = ":memory:", **kwargs: Any):
        # Call super().__init__() — smolagents sets up internal state here
        super().__init__(**kwargs)
        import sqlite3
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self._seed_demo_data()

    def _seed_demo_data(self) -> None:
        # Populate in-memory DB with sample data for demonstration
        self.conn.executescript("""
            CREATE TABLE IF NOT EXISTS products (
                id INTEGER PRIMARY KEY, name TEXT, category TEXT,
                price REAL, stock INTEGER
            );
            INSERT OR IGNORE INTO products VALUES
                (1, 'Laptop Pro', 'Electronics', 1299.99, 45),
                (2, 'Wireless Mouse', 'Electronics', 29.99, 200),
                (3, 'Standing Desk', 'Furniture', 599.00, 12),
                (4, 'Coffee Mug', 'Kitchen', 14.99, 500),
                (5, 'Notebook', 'Stationery', 4.99, 1000);
        """)
        self.conn.commit()

    def forward(self, query: str) -> str:
        # Security: only allow SELECT queries
        stripped = query.strip().upper()
        if not stripped.startswith("SELECT"):
            return "Error: Only SELECT queries are permitted."
        try:
            cursor = self.conn.execute(query)
            columns = [desc[0] for desc in cursor.description]
            rows = cursor.fetchall()
            if not rows:
                return "Query returned no results."
            # Format as a simple ASCII table
            header = " | ".join(columns)
            divider = "-" * len(header)
            data = "\n".join(" | ".join(str(v) for v in row) for row in rows)
            return f"{header}\n{divider}\n{data}"
        except Exception as e:
            return f"SQL Error: {e}"


# Instantiate with a specific DB path (or use :memory: for in-process testing)
db_tool = DatabaseQueryTool(db_path=":memory:")

agent = CodeAgent(
    tools=[db_tool],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
)

result = agent.run(
    "Which product category has the highest average price? Show me the top 3 products."
)
print(result)
```
Module 4 — Secure Code Execution
Local Sandboxing
When a CodeAgent generates Python code, smolagents does not use Python's built-in exec() directly. Instead it runs the code through a custom AST (Abstract Syntax Tree) evaluator that restricts dangerous operations.
Blocked by default: file I/O (`open`, `os.remove`), subprocess execution (`subprocess`, `os.system`), network access (`socket`, `urllib`), import of arbitrary modules, `eval`/`exec`, `__import__`, and attribute access to dunder methods.

Allowed by default: a safe subset of builtins such as `print`, `len`, `range`, `sorted`, `zip`, `map`, `filter`.
```python
# ── The AST evaluator rejects dangerous code patterns ────────────────────
from smolagents import CodeAgent, HfApiModel

agent = CodeAgent(tools=[], model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"))

# These would be blocked if the agent tried to generate them:
#   import os; os.system("rm -rf /")      ← subprocess blocked
#   open("/etc/passwd").read()            ← file I/O blocked
#   __import__("subprocess").run(["ls"])  ← dynamic import blocked
#
# The agent CAN do:
#   result = [x**2 for x in range(10)]    ← pure Python ✅
#   answer = some_tool(arg)               ← tool call ✅
#   final_answer(result)                  ← built-in special function ✅

print("Sandbox active — only safe operations permitted by default.")
```
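You can get a feel for the mechanism with Python's own `ast` module. The toy checker below is a pre-screening simplification — smolagents' evaluator actually walks and interprets the tree node by node rather than approving code up front — but it shows how AST inspection catches the patterns listed above:

```python
# Toy AST safety screen — a simplified illustration, not smolagents' evaluator.
import ast

ALLOWED_IMPORTS = {"math", "re", "statistics"}

def check_code(source: str) -> bool:
    """Return True if the code passes the toy screen, False otherwise."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Block imports outside the allowlist
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            names = ([a.name for a in node.names] if isinstance(node, ast.Import)
                     else [node.module or ""])
            if any(n.split(".")[0] not in ALLOWED_IMPORTS for n in names):
                return False
        # Block dunder attribute access like x.__class__
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return False
        # Block direct calls to eval/exec/open/__import__
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in {"eval", "exec", "open", "__import__"}):
            return False
    return True

print(check_code("import math\nprint(math.sqrt(2))"))  # True
print(check_code("import os\nos.system('ls')"))        # False
print(check_code("open('/etc/passwd')"))               # False
print(check_code("().__class__.__bases__"))            # False
```

The dunder check matters: without it, `().__class__.__bases__` is a classic route to climbing back up to `object` subclasses and escaping a naive sandbox.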
E2B Cloud Sandbox
For maximum isolation — especially with untrusted user inputs or when the agent needs full OS access (file I/O, networking, etc.) — use E2BExecutor to run generated code in an ephemeral cloud micro-VM provided by E2B.
```python
# ── E2B cloud sandbox execution ──────────────────────────────────────────
# Install: pip install smolagents[e2b]
# Requires: E2B_API_KEY environment variable
import os
from smolagents import CodeAgent, HfApiModel
from smolagents.executors import E2BExecutor

# E2B spins up an isolated Firecracker microVM for each agent run.
# The VM is destroyed after execution — no state leaks between runs.
e2b_executor = E2BExecutor(
    api_key=os.environ["E2B_API_KEY"],
    # Optional: specify a custom E2B sandbox template with pre-installed packages
    # template="my-custom-template-id",
)

agent = CodeAgent(
    tools=[],
    model=HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct"),
    executor=e2b_executor,  # swap out the local executor
    additional_authorized_imports=[  # fully isolated, so wider imports are safe
        "pandas", "numpy", "requests", "beautifulsoup4", "matplotlib"
    ],
)

# This code runs inside E2B's cloud VM, not on your machine
result = agent.run(
    """
    Write a Python script that fetches the top 5 Hacker News stories using
    the HN API (https://hacker-news.firebaseio.com/v0/topstories.json),
    then formats them as a numbered list with title, score, and URL.
    """
)
print(result)
```
Module 5 — Multi-Agent Orchestration
Manager & Worker Architecture
In smolagents, a multi-agent system is built by wrapping a worker agent as a tool and giving it to a manager agent. The manager delegates sub-tasks via normal tool calls; the worker executes them with its own specialized tools and returns the result.
```python
# ── Multi-agent: Manager delegates to a specialized Web Scraper worker ────
from smolagents import (
    CodeAgent,
    HfApiModel,
    DuckDuckGoSearchTool,
    VisitWebpageTool,
    ManagedAgent,  # wraps an agent as a callable tool
)

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

# ── Worker Agent: specialized in web research ────────────────────────────
web_research_agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), VisitWebpageTool()],
    model=model,
    name="web_researcher",  # used in manager's tool description
    max_steps=6,
)

# Wrap the worker as a ManagedAgent so the manager can call it like a tool.
# description tells the manager WHEN and HOW to invoke this sub-agent.
web_research_tool = ManagedAgent(
    agent=web_research_agent,
    name="web_research",
    description=(
        "Performs deep web research on any topic. "
        "Input: a detailed research question or task as a string. "
        "Output: a comprehensive summary with sources."
    ),
)

# ── Manager Agent: orchestrates the overall workflow ─────────────────────
# The manager sees web_research_tool as just another tool to call
manager_agent = CodeAgent(
    tools=[web_research_tool],  # worker agents appear as tools here
    model=model,
    max_steps=5,
)

# The manager breaks down the task and delegates research to the worker
result = manager_agent.run(
    """
    Research the three most popular open-source LLM frameworks in 2025
    (smolagents, LangChain, LlamaIndex). For each, find:
    1. GitHub star count
    2. Primary use case
    3. Latest version
    Then produce a comparison table.
    """
)
print(result)
```
State Passing Between Agents
Because both agents run Python, complex objects — DataFrames, images, dictionaries, even trained model weights — can be passed as return values and accessed in the manager's variable scope.
```python
# ── Passing pandas DataFrames between agents ─────────────────────────────
from smolagents import CodeAgent, HfApiModel, ManagedAgent, tool
import pandas as pd

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")


@tool
def load_sales_data() -> str:
    """Loads the company's monthly sales data and returns it as a CSV string.

    Returns:
        A CSV-formatted string with columns: month, region, revenue, units_sold.
    """
    # In production, this would read from a database or file system
    data = {
        "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
        "region": ["North", "North", "South", "South", "East", "East"],
        "revenue": [120000, 135000, 98000, 112000, 145000, 160000],
        "units_sold": [240, 270, 196, 224, 290, 320],
    }
    return pd.DataFrame(data).to_csv(index=False)


# Worker: data loading and cleaning specialist
data_agent = CodeAgent(
    tools=[load_sales_data],
    model=model,
    additional_authorized_imports=["pandas", "io"],
    name="data_loader",
)

data_tool = ManagedAgent(
    agent=data_agent,
    name="load_and_clean_data",
    description=(
        "Loads sales data and returns a cleaned, analysis-ready CSV string. "
        "No inputs required. Call this first before doing any analysis."
    ),
)

# Manager: analysis and reporting
manager = CodeAgent(
    tools=[data_tool],
    model=model,
    additional_authorized_imports=["pandas", "io", "numpy"],
    max_steps=8,
)

# The manager will:
# 1. Call data_tool → gets CSV string
# 2. Parse CSV with pandas in its own code
# 3. Compute statistics and produce the final report
result = manager.run(
    """
    Load the sales data, then:
    - Compute total revenue and units sold per region
    - Find the best and worst performing months
    - Calculate month-over-month revenue growth rate
    - Summarize findings in a concise business report
    """
)
print(result)
```
Worker agents can also return PIL.Image objects or base64-encoded strings. The manager can pass these to vision tools or save them. For large binary objects, use shared file paths as the exchange medium instead of in-memory passing.

Module 6 — Vision & Multimodality
Processing Images with a Multimodal Agent
smolagents supports vision models (Qwen-VL, LLaVA, Llama-3.2-Vision, etc.) natively. Pass images as URLs or local files alongside text prompts. The model receives both modalities in a single message.
```python
# ── Multimodal CodeAgent with image input ────────────────────────────────
from smolagents import CodeAgent, HfApiModel
from smolagents.utils import encode_image_base64  # helper for local files

# Use a vision-capable model — Qwen2-VL and Llama-3.2-Vision are recommended
vision_model = HfApiModel(
    model_id="Qwen/Qwen2-VL-72B-Instruct",  # powerful vision-language model
)

agent = CodeAgent(
    tools=[],  # can add tools; the model will also reason about the image
    model=vision_model,
    max_steps=5,
)

# ── Option A: pass an image URL directly ─────────────────────────────────
result = agent.run(
    task="Describe this image in detail. Identify all objects, their colors, "
         "and the overall scene composition.",
    images=["https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"],
)
print(result)
```
```python
# ── Option B: pass a local image file ────────────────────────────────────
from PIL import Image
from smolagents import CodeAgent, HfApiModel

vision_model = HfApiModel(model_id="Qwen/Qwen2-VL-72B-Instruct")
agent = CodeAgent(tools=[], model=vision_model)

# Load from disk as PIL Image — smolagents handles encoding automatically
image = Image.open("/path/to/chart.png")

result = agent.run(
    task="This is a business chart. Extract all the data points, axis labels, "
         "and title. Then describe the main trend shown.",
    images=[image],  # PIL Image objects are accepted directly
)
print(result)
```
```python
# ── Option C: multi-image analysis ───────────────────────────────────────
from smolagents import CodeAgent, HfApiModel
from PIL import Image

vision_model = HfApiModel(model_id="meta-llama/Llama-3.2-11B-Vision-Instruct")
agent = CodeAgent(
    tools=[],
    model=vision_model,
    additional_authorized_imports=["PIL"],
)

before_img = Image.open("before.jpg")
after_img = Image.open("after.jpg")

# Pass multiple images — the model receives them in order alongside the text
result = agent.run(
    task="You are given a BEFORE and AFTER image of a room renovation. "
         "List all the changes you can identify between the two images. "
         "Be specific about colors, furniture, and layout changes.",
    images=[before_img, after_img],  # order matches the text references
)
print(result)
```
Any HfApiModel with Qwen2-VL or Llama-3.2-Vision will work; LiteLLMModel with gpt-4o or claude-opus-4 also supports image inputs via the same API.

```python
# ── Vision agent with LiteLLM (GPT-4o) ───────────────────────────────────
import os
from smolagents import CodeAgent, LiteLLMModel, DuckDuckGoSearchTool

vision_model = LiteLLMModel(
    model_id="openai/gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
)

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=vision_model,
)

# Mix vision and tool use in a single task
result = agent.run(
    task="Analyze the logo in this image. Identify the brand, then search the web "
         "to find their current stock price and latest news.",
    images=["https://example.com/logo.png"],
)
print(result)
```
Module 7 — Telemetry & Memory
Step Logs & Agent Memory
After (or during) a run, every intermediate step is captured in agent.logs. Each log entry contains the generated code, execution output, timestamps, and errors — invaluable for debugging and auditing.
```python
# ── Inspecting agent.logs after a run ────────────────────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model, max_steps=5)

result = agent.run("What are the top 3 AI research papers published in 2025?")

# agent.logs is a list of ActionStep and PlanningStep objects
print(f"\n{'='*60}")
print(f"Total steps taken: {len(agent.logs)}")
print(f"{'='*60}\n")

for i, step in enumerate(agent.logs):
    step_type = type(step).__name__
    print(f"── Step {i+1} ({step_type}) ──────────────────────────")

    # ActionStep: contains the LLM reasoning and code execution
    if hasattr(step, "model_output") and step.model_output:
        print(f"[LLM Output]:\n{step.model_output[:300]}...")
    if hasattr(step, "tool_calls") and step.tool_calls:
        for call in step.tool_calls:
            print(f"[Code executed]:\n{call.arguments.get('code', '')[:400]}")
    if hasattr(step, "observations") and step.observations:
        print(f"[Observation]: {str(step.observations)[:200]}")
    if hasattr(step, "step_number"):
        print(f"[Step number]: {step.step_number}")
    # Duration is available on ActionStep if timing was captured
    if hasattr(step, "duration") and step.duration:
        print(f"[Duration]: {step.duration:.2f}s")
    print()
```
```python
# ── Export logs for analysis or storage ──────────────────────────────────
import json
from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[], model=model)
agent.run("Calculate the sum of all prime numbers under 100.")

# Build a structured log record for each step
log_records = []
for step in agent.logs:
    record = {
        "type": type(step).__name__,
        "step_number": getattr(step, "step_number", None),
        "duration_s": getattr(step, "duration", None),
        "model_output": getattr(step, "model_output", None),
        "observations": str(getattr(step, "observations", "")),
        "error": str(getattr(step, "error", "")) if getattr(step, "error", None) else None,
    }
    log_records.append(record)

# Save to file for later analysis
with open("agent_run.json", "w") as f:
    json.dump(log_records, f, indent=2, default=str)

print(f"Saved {len(log_records)} step records to agent_run.json")
```
```python
# ── Replaying agent memory for continued conversations ───────────────────
from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[], model=model)

# First run
agent.run("My name is Alice and I love Python. Remember this.")

# Access raw memory (list of messages passed to the LLM)
memory_snapshot = agent.memory.get_full_message_list()
print(f"Memory contains {len(memory_snapshot)} messages")

# Second run — agent retains context from the first run within the same session
result = agent.run("What is my name and what programming language do I love?")
print(result)  # → "Your name is Alice and you love Python."

# Reset memory for a fresh session
agent.memory.reset()
print("Memory cleared.")
```
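Under the hood, session memory is nothing more exotic than an accumulating message list that gets replayed to the model on every call. A plain-Python sketch with a mock model (invented for illustration, not smolagents code) makes the mechanism concrete:

```python
# Toy memory: a growing message list replayed on every turn.
# mock_model is a scripted stand-in for the LLM.
memory = []

def mock_model(messages):
    # Stand-in for the LLM: scans the history for a remembered name.
    for m in messages:
        if "my name is" in m["content"].lower():
            name = m["content"].split("name is ")[1].split(" ")[0].rstrip(".")
            return f"Your name is {name}."
    return "I don't know your name yet."

def run(task: str) -> str:
    memory.append({"role": "user", "content": task})   # append the new turn
    reply = mock_model(memory)                          # model sees full history
    memory.append({"role": "assistant", "content": reply})
    return reply

run("My name is Alice and I love Python.")
print(run("What is my name?"))  # → "Your name is Alice."
print(len(memory))              # 4 messages retained across both runs
```

Resetting memory, as in `agent.memory.reset()`, corresponds to clearing this list; token cost grows with its length, which is why long sessions eventually need truncation or summarization.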
Gradio UI
smolagents ships a GradioUI helper that wraps any agent in a polished web chat interface — zero frontend code required. One line to deploy.
```python
# ── Instant web UI for any agent ─────────────────────────────────────────
# Install: pip install smolagents[gradio]
from smolagents import (
    CodeAgent,
    HfApiModel,
    DuckDuckGoSearchTool,
    VisitWebpageTool,
    GradioUI,
)

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), VisitWebpageTool()],
    model=model,
    max_steps=10,
    verbosity_level=1,
)

# Launch: opens a browser tab at http://localhost:7860
# The UI shows the chat, intermediate steps, and generated code in real time
ui = GradioUI(agent)
ui.launch(
    server_name="0.0.0.0",  # bind to all interfaces (for Docker/remote)
    server_port=7860,
    share=False,  # set True to get a public gradio.live URL
    debug=False,
)
```
```python
# ── Customized Gradio UI ─────────────────────────────────────────────────
from smolagents import CodeAgent, HfApiModel, DuckDuckGoSearchTool, GradioUI
import gradio as gr

model = HfApiModel(model_id="Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

# GradioUI wraps your agent; you can extend the underlying gr.Blocks app
ui = GradioUI(
    agent,
    file_upload_folder="./uploads",  # allow file uploads to this folder
)

# Add custom components by accessing the underlying Gradio app
with ui.demo:
    gr.Markdown(
        """
        ## 🤖 SmolAgents Research Assistant
        Ask me anything — I can search the web, analyze data, and write code.
        """
    )

ui.launch()
```
When deploying to a Hugging Face Space, keep share=False; the Space handles routing. Add HF_TOKEN and any API keys as Space secrets.

Reference Links
- Docs huggingface.co/docs/smolagents — Official smolagents documentation
- GitHub github.com/huggingface/smolagents — Source code (~1,000 lines of core)
- Course huggingface.co/learn/agents-course — HF Agents Course (free)
- Models HF Hub — Text Generation models — All models usable with HfApiModel
- E2B e2b.dev/docs — E2B cloud sandboxing documentation
- LiteLLM docs.litellm.ai — All supported providers and model strings
- Paper smolagents: Tiny Agents, Big Impact (HF Blog)
- Gradio gradio.app/docs — Gradio documentation for UI customization