Overview
Memories let a Realtime agent carry forward small, durable facts about each end-user — preferences, goals, constraints — from one conversation to the next. After a chat ends, an LLM extracts relevant information and stores it. The next time that user starts a session, the agent is primed with their context.
How It Works
---
config:
  theme: redux
  look: neo
---
flowchart LR
    CE([fa:fa-comment Chat ends]) --> Q([fa:fa-bolt Extract])
    Q --> M([fa:fa-database Memory])
    M --> R([fa:fa-up-down Rerank])
    R --> NS([fa:fa-thought-bubble Next session])
Extraction is asynchronous: it runs after the session closes, so it adds no latency to the conversation. At the start of the user's next session, the extracted facts are injected into the agent's system prompt.
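To make the injection step concrete, here is a minimal sketch of how stored facts might be merged into a system prompt. The function name `build_system_prompt`, the "Known facts about this user" heading, and the bullet layout are all assumptions for illustration; the actual prompt format used by the platform is not documented here.

```python
def build_system_prompt(base_prompt: str, memories: list[str]) -> str:
    """Append stored facts to the base prompt (hypothetical format)."""
    if not memories:
        # No memory yet (for example, a first-time user): prompt is unchanged.
        return base_prompt
    facts = "\n".join(f"- {fact}" for fact in memories)
    return f"{base_prompt}\n\nKnown facts about this user:\n{facts}"


# Example facts are invented for illustration.
prompt = build_system_prompt(
    "You are a helpful support agent.",
    ["Prefers email over phone", "Goal: migrate to the annual plan"],
)
print(prompt)
```

Because the facts are part of the prompt from the first turn, the agent does not need a lookup call mid-conversation to recall them.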
Key Concepts
Scope: user (default) keeps one memory per end-user, shared across all agents; agent keeps one memory per end-user × agent pair.
Limits: up to 10 items per user, each up to 150 characters. Memories are auto-deleted after 90 days of inactivity.
Extraction: runs asynchronously after the chat ends and requires at least 4 messages. Only durable facts (preferences, goals, constraints) are stored.
Eviction: when the 10-item limit is reached, items are evicted by priority and recency, so high-priority, recently updated facts survive longest.
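The rules above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the platform's implementation: the names (`memory_key`, `MemoryItem`, `add_item`, and so on) are hypothetical, priority is modeled as an integer where higher means more important, eviction ranks by priority first and recency second, and over-length items are rejected rather than truncated. The source only states that eviction considers priority and recency, so the exact ordering may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_ITEMS = 10                       # items per user
MAX_CHARS = 150                      # characters per item
MIN_MESSAGES = 4                     # messages required before extraction
INACTIVITY_TTL = timedelta(days=90)  # auto-delete window


def memory_key(user_id: str, agent_id: str, scope: str = "user") -> tuple[str, ...]:
    """Return the storage key for a memory under the given scope."""
    if scope == "user":
        return (user_id,)            # shared across all agents
    if scope == "agent":
        return (user_id, agent_id)   # one memory per user x agent pair
    raise ValueError(f"unknown scope: {scope!r}")


@dataclass
class MemoryItem:
    text: str
    priority: int          # assumption: higher value = more important
    updated_at: datetime


def should_extract(message_count: int) -> bool:
    """Extraction only runs for chats with at least MIN_MESSAGES messages."""
    return message_count >= MIN_MESSAGES


def add_item(items: list[MemoryItem], new_item: MemoryItem) -> list[MemoryItem]:
    """Add an item, evicting the lowest-ranked ones beyond MAX_ITEMS."""
    if len(new_item.text) > MAX_CHARS:
        # Assumption: reject over-length items instead of truncating them.
        raise ValueError(f"item exceeds {MAX_CHARS} characters")
    ranked = sorted(
        items + [new_item],
        key=lambda item: (item.priority, item.updated_at),
        reverse=True,  # high priority first, then most recently updated
    )
    return ranked[:MAX_ITEMS]


def is_expired(last_active: datetime, now: datetime) -> bool:
    """True once more than 90 days have passed since the last activity."""
    return now - last_active > INACTIVITY_TTL


# Demo: a full memory of ten low-priority facts receives one high-priority fact.
now = datetime(2025, 1, 1)
items = [
    MemoryItem(f"fact {n}", priority=1, updated_at=now - timedelta(days=n))
    for n in range(10)
]
items = add_item(items, MemoryItem("Allergic to peanuts", priority=5, updated_at=now))
print([item.text for item in items])
```

In the demo, the oldest low-priority fact ("fact 9") is the one dropped, while the new high-priority fact ranks first. Ranking on a `(priority, updated_at)` tuple is one simple way to combine the two signals; a weighted score would be another.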
Next Steps
Ready to try it? Follow the Quickstart to enable memory on an agent.
FAQ
