Trace your agents - Weights & Biases Documentation

Learn how to instrument a multi-turn agentic application using the Weave SDK so that you can view, debug, and evaluate your agent’s behavior in the Weave Agents tab. This is intended for developers who are building or integrating agents and want structured visibility into sessions, turns, LLM calls, and tool executions.

The Weave SDK for Agents models the full lifecycle of a multi-turn agent conversation: the session that groups turns together, each user-agent exchange (turn), the LLM calls within a turn, and the tool executions that an LLM triggers. If you are tracing individual function calls with @weave.op, see Trace LLM applications instead.

Traces appear in the Agents tab of your Weave project. Each session shows a multi-turn timeline with nested tool calls, token usage, and feedback.

Before you begin

To get started, install the weave package and initialize your project. Initialization tells Weave which team and project you are using so that spans are routed to the correct location in the UI. Install the package:

pip install weave

Replace [YOUR-TEAM] with your W&B team name and [YOUR-PROJECT] with your W&B project name.

import weave

weave.init("[YOUR-TEAM]/[YOUR-PROJECT]")

Call weave.init before any start_session, start_turn, start_llm, or start_tool call. All agent tracing functions no-op silently when tracing is disabled or the init call is absent, so you can leave instrumentation in production code and control it through configuration.

The agent data model

Weave’s agentic observability focuses on the following core concepts:

Concept	Python class	OTel span type	Description
Session	`Session`	(no span; turns are grouped by `conversation_id`)	A conversation or run that contains one or more turns
Turn	`Turn`	`invoke_agent`	One user message and the agent’s complete response
LLM call	`LLM`	`chat`	One call to a language model API
Tool call	`Tool`	`execute_tool`	One tool call triggered by an LLM response

A session groups turns by a shared conversation_id attribute rather than a parent span, so each turn starts its own OTel trace. This design supports distributed tracing and parallel execution. The client sends spans directly to the OTel collector without any server-side aggregation.

Session (conversation_id — no span)
├── Turn 1 (invoke_agent — root span, own trace)
│   ├── LLM call (chat)
│   │   └── Tool call (execute_tool)
│   └── LLM call (chat)
└── Turn 2 (invoke_agent — root span, own trace)
    └── LLM call (chat)
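
The grouping rule in this section can be illustrated without the SDK. The following is a minimal sketch, assuming each turn's root span is a plain dictionary with hypothetical trace_id, span_type, and conversation_id keys. These names are illustrative only and are not Weave's storage schema.

```python
from collections import defaultdict

# Hypothetical root spans: one invoke_agent span per turn, each in its own trace.
root_spans = [
    {"trace_id": "t1", "span_type": "invoke_agent", "conversation_id": "conv-a"},
    {"trace_id": "t2", "span_type": "invoke_agent", "conversation_id": "conv-a"},
    {"trace_id": "t3", "span_type": "invoke_agent", "conversation_id": "conv-b"},
]


def group_turns_by_session(spans):
    """Group turn root spans by conversation_id, preserving arrival order."""
    sessions = defaultdict(list)
    for span in spans:
        if span["span_type"] == "invoke_agent":
            sessions[span["conversation_id"]].append(span["trace_id"])
    return dict(sessions)


sessions = group_turns_by_session(root_spans)
print(sessions)
```

Because the grouping key is an attribute rather than a parent span, two turns of the same conversation can be emitted from different processes and still land in the same session.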

Agent tracing functions

Weave exposes the following top-level functions on the weave module. Each function returns an object that works as a context manager (using with) or that you can close manually by calling .end().

`weave.start_session`

session = weave.start_session(
    agent_name="my-agent",    # Required: identifies the agent in the UI.
    session_id="",            # Optional: stable ID to group turns; auto-generated when empty.
    model="",                 # Optional: default model for turns in this session.
    session_name="",          # Optional: human-readable label shown in the UI.
    include_content=True,     # Optional: set False to omit message bodies from spans.
    continue_parent_trace=False,  # Optional: attach to an existing OTel trace instead of starting a new one.
)

start_session sets a conversation_id attribute on all child spans so that turns are grouped in the Agents tab. If you pass a session_id, it must be stable across the lifetime of the conversation. Re-use the same ID to add new turns to an existing session. When you omit session_id, the SDK generates a UUID automatically. The active session is stored in a Python ContextVar, so any code running in the same async context (or thread) can retrieve it with weave.get_current_session() without passing the session object explicitly.
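
To show why a ContextVar removes the need to pass the session object around, here is a minimal standard-library sketch. The start_session and get_current_session functions below are simplified, hypothetical stand-ins written for this illustration. They are not Weave's implementation.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical stand-in for the SDK's internal state (illustration only).
_current_session = contextvars.ContextVar("current_session", default=None)


@contextmanager
def start_session(agent_name):
    """Set the active session for the duration of the with block."""
    session = {"agent_name": agent_name}
    token = _current_session.set(session)
    try:
        yield session
    finally:
        # Restore whatever was active before this session started.
        _current_session.reset(token)


def get_current_session():
    return _current_session.get()


def deep_helper():
    # No session argument is passed in; it is resolved from the context.
    return get_current_session()["agent_name"]


with start_session("weather-bot"):
    seen_inside = deep_helper()

seen_outside = get_current_session()
print(seen_inside, seen_outside)
```

Each thread and each asyncio task sees its own value of a ContextVar, which is why concurrent conversations do not overwrite each other's active session.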

`weave.start_turn`

turn = weave.start_turn(
    user_message="What is the weather in Tokyo?",  # The user's input text.
    agent_name="my-agent",   # Optional: overrides the session-level agent name.
    model="gpt-4o",          # Optional: model used for this turn.
)

start_turn creates a new invoke_agent span that becomes the root of a new OTel trace. Weave uses this span to represent one complete user-agent exchange in the timeline view. When called as a top-level function, start_turn resolves the active session from its ContextVar and delegates to session.start_turn(...). If no session is active, the turn is created without a conversation_id and won’t be grouped with other turns.

`weave.start_llm`

llm = weave.start_llm(
    model="gpt-4o",             # The model identifier.
    provider_name="openai",     # Required: provider name, for example "openai", "anthropic".
    system_instructions=["Be concise."],  # Optional: system prompt strings.
)

start_llm creates a chat span nested under the current turn. Weave uses this span to display token usage, model name, input and output messages, and reasoning in the Agents view. After the LLM call completes, assign the response data to the llm object before it closes:

from weave.session.session import Message, Usage

with weave.start_llm(model="gpt-4o", provider_name="openai") as llm:
    response = openai_client.chat.completions.create(...)
    # Record what the model received and what it produced.
    llm.input_messages = [Message(role="user", content="...")]
    llm.output_messages = [Message(role="assistant", content=response.choices[0].message.content)]
    # Map the provider's token counts onto Weave's Usage fields.
    llm.usage = Usage(
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
    )

Pass provider_name explicitly. Weave doesn’t infer it from the model string.
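
Providers name their token counts differently: the example above maps OpenAI's prompt_tokens and completion_tokens onto input_tokens and output_tokens. If your agent calls more than one provider, a small helper can do this mapping in one place. The sketch below is a hypothetical helper, not part of the Weave SDK. It assumes the raw usage is available as a dictionary, and you should check the key names against your provider's response format.

```python
def normalize_usage(raw: dict) -> dict:
    """Map provider-specific token keys to input_tokens / output_tokens.

    Hypothetical helper for illustration; missing keys default to 0.
    """
    input_tokens = raw.get("input_tokens", raw.get("prompt_tokens", 0))
    output_tokens = raw.get("output_tokens", raw.get("completion_tokens", 0))
    return {"input_tokens": input_tokens, "output_tokens": output_tokens}


# Usage dictionary with OpenAI-style key names.
openai_style = normalize_usage({"prompt_tokens": 100, "completion_tokens": 20})

# Usage dictionary that already uses input_tokens / output_tokens.
already_normalized = normalize_usage({"input_tokens": 150, "output_tokens": 30})

print(openai_style, already_normalized)
```

The resulting dictionary can then be unpacked into the Usage type, for example Usage(**normalize_usage(raw)), assuming Usage accepts those two keyword arguments as shown in the example above.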

`weave.start_tool`

tool = weave.start_tool(
    name="get_weather",                  # Tool name as declared to the LLM.
    arguments='{"city": "Tokyo"}',       # JSON string of the tool arguments.
    tool_call_id="call_abc123",          # Optional: tool call ID from the LLM response.
)

start_tool creates an execute_tool span. The span becomes a child of whatever OTel span is active in context (typically the chat span of the LLM call that produced the tool call). Assign the tool result before closing:

with weave.start_tool(name="get_weather", arguments='{"city": "Tokyo"}') as tool:
    result = get_weather_api("Tokyo")
    tool.result = result  # Accepts dict, list, or string; JSON-encoded automatically.
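
The arguments parameter is a JSON string. If your code holds the tool arguments as a dictionary, serialize it before passing it in. The following is a minimal sketch using only the standard library. The encode_tool_arguments helper is a hypothetical name introduced for this illustration.

```python
import json


def encode_tool_arguments(arguments: dict) -> str:
    """Serialize tool arguments to a JSON string with stable key order."""
    return json.dumps(arguments, sort_keys=True)


args_json = encode_tool_arguments({"units": "metric", "city": "Tokyo"})
print(args_json)

# The string round-trips back to the original dictionary.
decoded = json.loads(args_json)
```

Sorting the keys is optional. It keeps the recorded string identical for identical arguments, which can make spans easier to compare.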

Usage patterns for agent tracing

The following sections describe how to combine these functions depending on how your agent code is structured. The examples below use two types imported from the Weave SDK:

Message represents a single entry in a conversation — a user input, an assistant response, a system prompt, or a tool result — and is what you assign to llm.input_messages and llm.output_messages to record what the model received and produced.
Usage captures token counts from the LLM response and is assigned to llm.usage.

Weave uses both to populate the Agents view with the inputs, outputs, and token usage of each LLM call. For all supported data types, see the API reference.

Context manager pattern

This is the recommended approach for most agents. Each level of the hierarchy is a context manager, and each span closes and is sent on __exit__, even if an exception occurs. Weave stores the active session, turn, and LLM call in Python ContextVars, so any function called within a with block can call weave.start_llm() or weave.start_tool() without holding an explicit reference to the parent. This works across module boundaries as long as the code runs in the same async context. Use weave.get_current_session(), weave.get_current_turn(), and weave.get_current_llm() to retrieve the active objects from anywhere in the call stack.

import weave
from weave.session.session import Message, Usage

weave.init("[YOUR-TEAM]/[YOUR-PROJECT]")

with weave.start_session(agent_name="weather-bot") as session:
    with session.start_turn(user_message="What is the weather in Tokyo?") as turn:

        # First LLM call: returns a tool call.
        with weave.start_llm(model="gpt-4o", provider_name="openai") as llm:
            response = call_openai(...)
            llm.input_messages = [Message(role="user", content="What is the weather in Tokyo?")]
            llm.think("User wants weather data, I should call get_weather.")
            llm.output("Let me check the weather for you.")
            llm.usage = Usage(input_tokens=100, output_tokens=20)

            # Tool call: child of the LLM call that requested it.
            with weave.start_tool(name="get_weather", arguments='{"city":"Tokyo"}') as tool:
                tool.result = get_weather_api("Tokyo")  # Returns "24°C, sunny".

        # Second LLM call: synthesizes the final answer.
        with weave.start_llm(model="gpt-4o", provider_name="openai") as llm:
            llm.input_messages = [Message(role="user", content="What is the weather in Tokyo?")]
            llm.output("It is 24°C and sunny in Tokyo today.")
            llm.usage = Usage(input_tokens=150, output_tokens=30)

Manual start and end pattern

Use .end() explicitly when you can’t use with blocks (for example, when spans are opened and closed in different function calls, or when managing async lifecycle outside a coroutine).

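# Assumes weave.init(...) has been called and that Message and Usage
# are imported from weave.session.session.
session = weave.start_session(agent_name="weather-bot")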
turn = session.start_turn(user_message="What is the weather?")

llm = weave.start_llm(model="gpt-4o", provider_name="openai")
llm.input_messages = [Message(role="user", content="What is the weather?")]
llm.output("Let me check.")
llm.usage = Usage(input_tokens=100, output_tokens=20)

tool = weave.start_tool(name="get_weather", arguments='{"city": "Tokyo"}')
tool.result = "24°C, sunny"
tool.end()   # end() is idempotent — safe to call more than once.

llm.end()

llm2 = weave.start_llm(model="gpt-4o", provider_name="openai")
llm2.output("It is 24°C and sunny in Tokyo.")
llm2.usage = Usage(input_tokens=150, output_tokens=30)
llm2.end()

turn.end()
session.end()
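
Unlike a with block, the manual pattern does not close a span for you when an exception is raised between the start call and .end(). One way to guard against this is try/finally. The sketch below demonstrates the idea with FakeSpan, a hypothetical stand-in class that only mimics the .end() method and result attribute described on this page. It does not use the Weave SDK.

```python
class FakeSpan:
    """Hypothetical stand-in for a session, turn, LLM, or tool object."""

    def __init__(self, name):
        self.name = name
        self.result = None
        self.closed = False
        self.end_calls = 0

    def end(self):
        # Counts calls so the example can show end() ran exactly once.
        self.end_calls += 1
        self.closed = True


def run_tool(span, fn):
    """Run fn, record its result on the span, and always close the span."""
    try:
        span.result = fn()
    finally:
        span.end()


def failing_tool():
    raise RuntimeError("weather API unavailable")


tool = FakeSpan("get_weather")
try:
    run_tool(tool, failing_tool)
except RuntimeError:
    pass  # The error still propagates, but the span was closed first.

print(tool.closed, tool.end_calls)
```

With the real objects, the same shape applies: start the span, do the work inside try, and call .end() in finally.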

Semantic conventions

The Weave SDK emits OTel spans that conform to the GenAI semantic conventions and GenAI agent span conventions. Any OTel span is accepted — Weave stores all attributes and makes them queryable. You can add arbitrary attributes to spans using the standard OTel span API alongside Weave’s tracing objects.

How spans appear in the Weave UI

Once you run instrumented code, your traces appear in the Agents tab of your Weave project at https://wandb.ai/[YOUR-TEAM]/[YOUR-PROJECT]/weave/agents.

The Sessions list shows all sessions with a minimap of turn activity.
Clicking a session opens the multi-turn session view showing each turn, its LLM calls, tool executions, token counts, and any attached feedback.
Each chat span shows the input messages, output messages, model name, and usage.
Each execute_tool span shows the tool name, arguments, and result.

For details on viewing Agents data in Weave, see Navigate the Agents view.
