10 Memory and Callbacks
Managing conversation history and customizing the agent loop.
Note
Code Reference: code/v0.7/src/agentsilex/ (callbacks.py, runner.py)
10.1 The Memory Problem
Sessions grow indefinitely. After many turns:
session = Session()
# ... 100 conversation turns later ...
len(session.dialogs) # Hundreds of messages!
# Token limit exceeded, API errors, slow responses

We need ways to manage history without building it into the core.
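As a rough illustration: if each turn averages around 200 tokens, 100 turns means roughly 20,000 tokens of history resent with every request, before the model produces a single new token.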
10.2 Solution: Callbacks
Hook into the agent loop with callbacks that run before each LLM call:
class Runner:
    def __init__(
        self,
        session: Session,
        context: dict | None = None,
        before_llm_call_callbacks: list | None = None,  # NEW
    ):
        self.session = session
        self.context = context or {}
        self.before_llm_call_callbacks = before_llm_call_callbacks or []

    def run(self, agent: Agent, prompt: str) -> RunResult:
        # ...
        while loop_count < 10 and not should_stop:
            # Run callbacks before each LLM call
            for callback_func in self.before_llm_call_callbacks:
                callback_func(self.session)

            dialogs = self.session.get_dialogs()
            # ... LLM call ...

Callbacks receive the session and can modify it before each LLM call.
10.3 Built-in: keep_most_recent_2_turns
A simple callback that keeps only recent conversation turns (callbacks.py):
from agentsilex.session import Session


def keep_most_recent_2_turns(session: Session):
    MOST_RECENT = 2  # dialog turns to keep
    msg_count = 2 * MOST_RECENT  # include user and agent messages
    if msg_count < len(session.dialogs):
        session.dialogs = session.dialogs[-msg_count:]

This keeps only the last 2 turns (4 messages: 2 user + 2 assistant).
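To see the pruning in action, here is a quick sketch that fills a session by hand and calls the callback directly. It assumes a fresh Session exposes an empty dialogs list and that messages are plain role/content dicts, as they appear elsewhere in this chapter:

from agentsilex.callbacks import keep_most_recent_2_turns
from agentsilex.session import Session

session = Session()
# Simulate 5 completed turns (10 messages) of fake history,
# assuming dialogs holds plain role/content dicts
for i in range(5):
    session.dialogs.append({"role": "user", "content": f"question {i}"})
    session.dialogs.append({"role": "assistant", "content": f"answer {i}"})

keep_most_recent_2_turns(session)
print(len(session.dialogs))  # 4 -- only the last 2 turns remain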
10.4 Usage
from agentsilex import Agent, Runner, Session
from agentsilex.callbacks import keep_most_recent_2_turns
agent = Agent(
name="assistant",
model="gpt-4o",
instructions="You are helpful.",
tools=[],
)
session = Session()
runner = Runner(
session,
before_llm_call_callbacks=[keep_most_recent_2_turns],
)
# Even after many turns, only recent history is sent to LLM
for i in range(100):
runner.run(agent, f"Message {i}")
# Session is pruned before each LLM call
len(session.dialogs)  # Small number, not 100+

10.5 Custom Callbacks
Create your own memory strategies:
10.5.1 Keep Last N Turns
def keep_most_recent_n_turns(n: int):
    """Factory for keeping N recent turns."""
    def callback(session: Session):
        msg_count = 2 * n
        if msg_count < len(session.dialogs):
            session.dialogs = session.dialogs[-msg_count:]
    return callback


# Usage
runner = Runner(
    session,
    before_llm_call_callbacks=[keep_most_recent_n_turns(5)],
)

10.5.2 Token-Based Truncation
import tiktoken


def keep_under_token_limit(max_tokens: int = 4000):
    """Keep history under a token limit."""
    encoder = tiktoken.get_encoding("cl100k_base")

    def callback(session: Session):
        while True:
            text = str(session.dialogs)
            tokens = len(encoder.encode(text))
            if tokens <= max_tokens:
                break
            # Remove the oldest message, always leaving at least one
            if len(session.dialogs) > 1:
                session.dialogs.pop(0)
            else:
                break
    return callback
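If you would rather not pull in tiktoken, a rough character-count heuristic gets close enough for pruning. This sketch assumes roughly 4 characters per token; keep_under_char_limit is our own name, not part of AgentSilex:

def keep_under_char_limit(max_chars: int = 16000):
    """Rough, dependency-free variant: ~4 characters per token,
    so 16000 characters approximates a 4000-token budget."""
    def callback(session: Session):
        # Drop the oldest message until the serialized history fits
        while len(str(session.dialogs)) > max_chars and len(session.dialogs) > 1:
            session.dialogs.pop(0)
    return callback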
10.5.3 Summarization
def summarize_old_history(summarizer_agent: Agent, keep_recent: int = 3):
    """Summarize old turns, keep recent ones."""
    def callback(session: Session):
        if len(session.dialogs) <= keep_recent * 2:
            return

        # Split old and recent
        cutoff = -(keep_recent * 2)
        old = session.dialogs[:cutoff]
        recent = session.dialogs[cutoff:]

        # Summarize old history
        summary_session = Session()
        summary_runner = Runner(summary_session)
        result = summary_runner.run(
            summarizer_agent,
            f"Summarize this conversation briefly:\n{old}"
        )

        # Replace with summary + recent
        session.dialogs = [
            {"role": "system", "content": f"Previous context: {result.final_output}"}
        ] + recent
    return callback
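Wiring it up takes a dedicated summarizer agent. The agent below is illustrative; its name, instructions, and choice of model are placeholders rather than anything the library prescribes:

summarizer = Agent(
    name="summarizer",
    model="gpt-4o",
    instructions="You condense conversations into short, factual summaries.",
    tools=[],
)

runner = Runner(
    session,
    before_llm_call_callbacks=[summarize_old_history(summarizer, keep_recent=3)],
)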
10.5.4 Logging Callback
def log_conversation():
    """Log each conversation turn."""
    def callback(session: Session):
        print(f"[Turn ~{len(session.dialogs) // 2}] Messages: {len(session.dialogs)}")
        # Or write to file, database, etc.
    return callback

10.6 Combining Callbacks
Callbacks run in order:
runner = Runner(
    session,
    before_llm_call_callbacks=[
        log_conversation(),
        keep_under_token_limit(8000),
        inject_user_context(user_id),
    ],
)

Order matters! In this example:
- Log current state
- Truncate if too long
- Inject user-specific context (sketched below)
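The example references inject_user_context, which AgentSilex does not ship. One possible sketch, assuming per-user context can be supplied as a system message at the front of the history:

def inject_user_context(user_id: str):
    """Hypothetical callback: prepend per-user context as a system message."""
    def callback(session: Session):
        context_msg = {
            "role": "system",
            "content": f"The current user id is {user_id}.",
        }
        # Avoid stacking duplicates across repeated LLM calls
        if not session.dialogs or session.dialogs[0] != context_msg:
            session.dialogs.insert(0, context_msg)
    return callback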
10.7 Callback Signature
All callbacks must accept a Session:
def my_callback(session: Session) -> None:
    # Modify session.dialogs as needed
    pass

10.8 Example: Stateful Counter
def count_turns():
    """Track turn count in a callback closure."""
    turn_count = 0

    def callback(session: Session):
        nonlocal turn_count
        turn_count += 1
        print(f"Turn {turn_count}")
    return callback


runner = Runner(session, before_llm_call_callbacks=[count_turns()])

Because callbacks run before every LLM call, the counter advances once per call, which can be more than once per user turn when the agent loops through tool calls.

10.9 Why Callbacks Instead of Built-in Memory?
| Built-in Memory | Callbacks |
|---|---|
| One-size-fits-all | You decide the strategy |
| Hidden behavior | Explicit and visible |
| Hard to customize | Easy to customize |
| Framework decides | You decide |
AgentSilex philosophy: give you the hooks, not the implementation.
10.10 Key Design Decisions
| Decision | Why |
|---|---|
| Callbacks as functions | Simple, no class hierarchy |
| Run before LLM call | Last chance to modify history |
| Session passed directly | Full access to modify |
| List of callbacks | Composable behaviors |
Tip
Checkpoint
cd code/v0.7

Callback system ready! Implement any memory strategy you need.