summarization

Create a SummarizationToolMiddleware with model-aware defaults.

Convenience factory that builds a SummarizationMiddleware via create_summarization_middleware and wraps it in a SummarizationToolMiddleware. It saves a construction step and accepts the model as a string.

What you get

Only the tool layer is registered — the wrapped SummarizationMiddleware is the engine the tool calls into, not a middleware that runs on its own. The agent gains:

A compact_conversation tool to compact its own context window
A system-prompt nudge hinting when to call it
An eligibility gate at ~50% of the auto-summarization trigger so the tool refuses to compact too early
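
The eligibility gate in the last item can be read as simple arithmetic on token counts. The following is a minimal sketch of that idea only: the function name, its parameters, and the exact comparison are hypothetical and are not taken from the deepagents source, which states just that the gate sits at roughly 50% of the auto-summarization trigger.

```python
# Hypothetical helper, not part of deepagents: illustrates a gate set at
# about half of the auto-summarization trigger.
def compaction_eligible(
    used_tokens: int,
    trigger_tokens: int,
    gate_fraction: float = 0.5,  # assumed value, from the "~50%" in the text
) -> bool:
    """Return True once usage reaches gate_fraction of the trigger."""
    return used_tokens >= gate_fraction * trigger_tokens


# Assume auto-summarization would fire at 170,000 tokens.
trigger_tokens = 170_000
early = compaction_eligible(60_000, trigger_tokens)  # below the 85,000 gate
later = compaction_eligible(90_000, trigger_tokens)  # above the 85,000 gate
```

Under these assumptions, a request to compact at 60,000 tokens would be refused, while one at 90,000 tokens would be allowed.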

Pairing with auto-summarization

To also summarize automatically at the trigger threshold, register a SummarizationMiddleware as well. create_deep_agent adds one by default, so passing create_summarization_tool_middleware(...) in its middleware=[...] list gives you both layers. The two layers share state via the _summarization_event key.
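
As a rough illustration of what sharing state through a single key means, the sketch below has two writers record their result under `_summarization_event`, so a reader always sees the most recent compaction regardless of which layer produced it. Only the key name comes from the text above. The helper functions and the payload fields (`source`, `summary`) are hypothetical and do not reflect the library's actual schema.

```python
from typing import Optional

# Key name taken from the text above; everything else is illustrative.
SUMMARIZATION_EVENT_KEY = "_summarization_event"


def record_compaction(state: dict, source: str, summary: str) -> dict:
    """Return a new state with the latest compaction stored under the shared key."""
    # Hypothetical payload shape, not the deepagents schema.
    return {**state, SUMMARIZATION_EVENT_KEY: {"source": source, "summary": summary}}


def latest_summary(state: dict) -> Optional[str]:
    """Read the most recent summary, or None if nothing was compacted yet."""
    event = state.get(SUMMARIZATION_EVENT_KEY)
    return None if event is None else event["summary"]


state = {"messages": ["m1", "m2", "m3"]}
# The tool layer compacts first, then the automatic layer compacts later.
state = record_compaction(state, source="tool", summary="User asked about X.")
state = record_compaction(state, source="auto", summary="User asked about X, then Y.")
```

Because both writers use the same key, neither layer needs to know which one ran last.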

Summarization middleware for automatic and tool-based conversation compaction.

This module provides two middleware classes and a convenience factory:

SummarizationMiddleware: automatically compacts the conversation when token usage exceeds a configurable threshold. Older messages are summarized via an LLM call, and the full history is offloaded to a backend for later retrieval.

SummarizationToolMiddleware: exposes a compact_conversation tool that lets the agent (or a human-in-the-loop approval flow) trigger compaction on demand. It composes with a SummarizationMiddleware instance and reuses its summarization engine.

create_summarization_tool_middleware: convenience factory that builds a SummarizationMiddleware and wraps it in a SummarizationToolMiddleware, with model-aware defaults.

Usage

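from deepagents import create_deep_agent
from deepagents.middleware.summarization import (
    SummarizationMiddleware,
    SummarizationToolMiddleware,
)
from deepagents.backends import FilesystemBackend

# Backend that receives the offloaded conversation history.
backend = FilesystemBackend(root_dir="/data")

# Automatic layer: trigger and keep are both given as fractions.
summ = SummarizationMiddleware(
    model="gpt-5.5",
    backend=backend,
    trigger=("fraction", 0.85),
    keep=("fraction", 0.10),
)
# Tool layer: reuses the summarization engine of the middleware above.
tool_mw = SummarizationToolMiddleware(summ)

# Register both layers on the agent.
agent = create_deep_agent(middleware=[summ, tool_mw])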
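
To make the `trigger` and `keep` settings above concrete, the sketch below converts a `("fraction", value)` pair into a token count. It assumes the fraction is taken relative to the model's context window, and it uses a made-up window of 200,000 tokens. The helper name is hypothetical and is not a deepagents or LangChain function.

```python
from typing import Tuple


def fraction_to_tokens(setting: Tuple[str, float], context_window: int) -> int:
    """Convert a ("fraction", value) setting into an absolute token count."""
    kind, value = setting
    if kind != "fraction":
        raise ValueError(f"this sketch only handles 'fraction', got {kind!r}")
    # round() avoids off-by-one results from floating-point multiplication.
    return round(value * context_window)


context_window = 200_000  # assumed window size, for illustration only
trigger_tokens = fraction_to_tokens(("fraction", 0.85), context_window)
keep_tokens = fraction_to_tokens(("fraction", 0.10), context_window)
```

With these assumed numbers, summarization would start at 170,000 tokens and about 20,000 tokens of recent context would be kept.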

Storage

Offloaded messages are stored as markdown at /conversation_history/{thread_id}.md.

Each summarization event appends a new section to this file, creating a running log of all evicted messages. Base64 media in evicted messages is written separately under <artifacts_root>/conversation_history/media/ and referenced by path from the markdown, so the history file stays text-only (see _offload_inline_media for the exact path).
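
The append-only behavior described above can be sketched with the standard library. This is an illustration under stated assumptions, not the deepagents implementation: the function name, the section heading format, and the use of a temporary directory as the backend root are all hypothetical. Only the `conversation_history/{thread_id}.md` layout comes from the text.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Sequence


def append_evicted(root: Path, thread_id: str, messages: Sequence[str]) -> Path:
    """Append one section of evicted messages to the thread's history file."""
    history = root / "conversation_history" / f"{thread_id}.md"
    history.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat()
    # Hypothetical section format: a heading followed by the evicted messages.
    section = f"## Summarized at {stamp}\n\n" + "\n\n".join(messages) + "\n\n"
    # Mode "a" appends, so earlier sections are never overwritten.
    with history.open("a", encoding="utf-8") as fh:
        fh.write(section)
    return history


with tempfile.TemporaryDirectory() as tmp:
    path = append_evicted(Path(tmp), "thread-1", ["user: hi", "assistant: hello"])
    append_evicted(Path(tmp), "thread-1", ["user: next question"])
    text = path.read_text(encoding="utf-8")
    section_count = text.count("## Summarized at ")
```

Two summarization events on the same thread produce two sections in one file, which is the running log the text describes.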

Summary prompt

DEEPAGENTS_DEFAULT_SUMMARY_PROMPT augments LangChain's DEFAULT_SUMMARY_PROMPT with a deepagents-specific addendum. The addendum explains the media reference tags that the offloading behavior introduces, so the summarizing model knows to preserve them. It is the default summary_prompt for SummarizationMiddleware and for both factories (create_summarization_middleware and create_summarization_tool_middleware).

Why this exists in `deepagents`