AI Memory
Persistent storage across sessions — the Pensieve, but the AI can access it without drowning in someone else's memories.
By default, large language models are stateless — each API call is independent, and the model has no memory of previous conversations with a user. AI Memory systems overcome this limitation by externally storing information from previous interactions and retrieving it as context for future ones. Memory systems can store several types of information: episodic memory (summaries of past conversations and decisions), semantic memory (facts and preferences the user has stated), procedural memory (patterns of how the user likes to work), and entity memory (structured information about specific people, companies, or projects that have been discussed). At the start of a new conversation, the memory system retrieves the most relevant stored information and includes it in the context, creating the appearance of continuity across sessions.
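As a rough illustration of the four memory types and of retrieval at the start of a session, here is a minimal Python sketch. Every name in it (`MemoryKind`, `MemoryStore`, `build_context`) is hypothetical and invented for this example, and the word-overlap relevance score is a deliberately crude placeholder for whatever ranking a real system would use.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryKind(Enum):
    EPISODIC = "episodic"      # summaries of past conversations and decisions
    SEMANTIC = "semantic"      # facts and preferences the user has stated
    PROCEDURAL = "procedural"  # patterns of how the user likes to work
    ENTITY = "entity"          # structured info about people, companies, projects


@dataclass
class Memory:
    kind: MemoryKind
    text: str


@dataclass
class MemoryStore:
    memories: list[Memory] = field(default_factory=list)

    def remember(self, kind: MemoryKind, text: str) -> None:
        self.memories.append(Memory(kind, text))

    def relevant(self, query: str, limit: int = 3) -> list[Memory]:
        # Toy relevance score (an assumption for this sketch):
        # the number of lowercase words a memory shares with the query.
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(m.text.lower().split())), m)
            for m in self.memories
        ]
        matches = [(score, m) for score, m in scored if score > 0]
        matches.sort(key=lambda pair: pair[0], reverse=True)
        return [m for _, m in matches[:limit]]


def build_context(store: MemoryStore, user_message: str) -> str:
    # Prepend the retrieved memories to the new message before calling the model.
    lines = [f"[{m.kind.value}] {m.text}" for m in store.relevant(user_message)]
    return (
        "Known from earlier sessions:\n"
        + "\n".join(lines)
        + "\n\nUser: "
        + user_message
    )


store = MemoryStore()
store.remember(MemoryKind.SEMANTIC, "User prefers weekly summary emails")
store.remember(MemoryKind.ENTITY, "Acme Corp renewal is due in March")
store.remember(MemoryKind.EPISODIC, "Last call covered pricing concerns at Acme")

prompt = build_context(store, "Draft a note about the Acme renewal")
print(prompt)
```

Only the two Acme-related memories end up in the prompt; the email preference is left out because it shares no words with the request. The model itself remains stateless, and the continuity comes entirely from what is placed in its context.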
Memory architecture varies in sophistication. Simple memory systems store a rolling summary of recent conversations that's always included in context. More sophisticated systems use vector embeddings to index memories semantically, then retrieve only the most relevant memories for each new conversation rather than loading everything. The most advanced systems combine multiple memory types and update them selectively as new information is learned, differentiating between temporary conversation context (which shouldn't persist) and long-term user preferences and facts (which should). Memory management also involves forgetting: outdated facts must be updated or removed, and privacy considerations require clear policies about what is stored, how long it's retained, and who can access it.
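The retrieval and forgetting ideas above can be sketched in a few lines of Python. This is an illustrative sketch only: `embed` is a bag-of-words stand-in for a real embedding model, and `VectorMemory`, `upsert`, and the `ttl_seconds` parameter are hypothetical names chosen for the example, not any particular library's API. The `now` arguments exist only to make the example deterministic.

```python
import math
import time
from collections import Counter
from dataclasses import dataclass
from typing import Optional


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a sparse bag-of-words vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Record:
    key: str                     # stable key so a newer fact overwrites an older one
    text: str
    expires_at: Optional[float]  # None marks a long-term memory


class VectorMemory:
    def __init__(self) -> None:
        self.records: dict[str, Record] = {}

    def upsert(self, key: str, text: str,
               ttl_seconds: Optional[float] = None,
               now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self.records[key] = Record(key, text, expires_at)

    def forget_expired(self, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self.records = {
            k: r for k, r in self.records.items()
            if r.expires_at is None or r.expires_at > now
        }

    def search(self, query: str, top_k: int = 2,
               now: Optional[float] = None) -> list[str]:
        # Drop expired records first, then rank the rest by similarity.
        self.forget_expired(now)
        q = embed(query)
        scored = [(cosine(q, embed(r.text)), r.text) for r in self.records.values()]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


memory = VectorMemory()
memory.upsert("billing_contact", "billing contact is Dana", now=0)
memory.upsert("billing_contact", "billing contact is Lee", now=10)  # replaces the outdated fact
memory.upsert("draft_tone", "current draft tone is casual", ttl_seconds=60, now=10)
memory.upsert("report_format", "user prefers reports as bullet points", now=10)

hits = memory.search("who is the billing contact", top_k=1, now=20)
print(hits)
```

Keying each record lets an updated fact replace the stale one instead of sitting beside it, and the time-to-live models temporary context that should not persist. A production system would also need retention and access policies, which this sketch does not attempt to cover.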
For B2B teams building AI products, persistent memory is often the difference between an AI tool that users tolerate and one they become genuinely dependent on. A sales copilot that remembers the key facts about an account, the concerns that came up in previous meetings, and the stakeholders' communication preferences is dramatically more useful than one that starts fresh with each interaction. A customer success AI that maintains a history of each account's health events, milestones, and product usage patterns can make proactive recommendations grounded in actual account history rather than generic advice. Implementing memory thoughtfully — storing what matters, forgetting what doesn't, and being transparent with users about what's retained — is a product design challenge as much as a technical one.
Related terms
- Context Window: How much the AI holds in working memory. The Pensieve has infinite capacity; LLMs are still catching up.
- Retrieval-Augmented Generation (RAG): Giving your LLM access to the Restricted Section, so it answers from real knowledge instead of confident hallucination.
- AI Agent: Software that acts without being told what to do next, like house elves, except they work for everyone and can quit.
- Embedding: Numeric representation of meaning. Elvish rune encoding, but as floating-point vectors optimized for semantic search.