Overview

Memories let a Realtime agent carry forward small, durable facts about each end-user — preferences, goals, constraints — from one conversation to the next. After a chat ends, an LLM extracts relevant information and stores it. The next time that user starts a session, the agent is primed with their context.

How It Works

---
config:
  theme: redux
  look: neo
---
flowchart LR
CE([fa:fa-comment Chat ends]) --> Q([fa:fa-bolt Extract])
    Q --> M([fa:fa-database Memory])
    M --> R([fa:fa-up-down Rerank])
    R --> NS([fa:fa-thought-bubble Next session])

Extraction is async — it runs after the session closes and has no impact on latency. The extracted facts are injected into the agent's system prompt at the start of the next session.

Key Concepts

Scope

user (default): one memory per end-user, shared across all agents. agent: one memory per end-user × agent pair.

Storage

Up to 10 items per user, each up to 150 characters. Auto-deleted after 90 days of inactivity.

Extraction

Runs async after chat ends. Only durable facts are stored — preferences, goals, constraints. Requires at least 4 messages.

Reranking

When the 10-item limit is hit, items are evicted by priority and recency. High-priority, recently updated facts survive longest.

Next Steps

Ready to try it? Follow the Quickstart to enable memory on an agent.

Overview

How It Works

Key Concepts

Next Steps

FAQ

What’s Next