How to Add Persistent Memory to a LangChain Agent
LangChain's built-in memory classes reset when your process dies. This guide shows how to replace them with REM Labs -- a persistent memory backend that survives restarts, works across sessions, and scores 90% on LongMemEval. Three new lines of code. Nothing else changes.
The Problem with LangChain's Default Memory
LangChain ships with several memory classes -- ConversationBufferMemory, ConversationSummaryMemory, ConversationEntityMemory. They all work the same way: conversation history is stored in a Python object in RAM. When your server restarts, when your Lambda function cold-starts, when your notebook kernel dies -- the memory is gone.
For demos, that is fine. For anything in production, you need memory that persists. That means an external store with semantic search, entity extraction, and multi-signal retrieval. That is what REM Labs provides.
Step 1: Install
The Python SDK wraps the REM Labs API. You will also need an API key -- get one free at remlabs.ai/console or by running npx @remlabs/memory.
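A minimal setup might look like the following. The package name and environment variable are assumptions -- check the REM Labs docs for the exact names:

```shell
# Package name is assumed here -- verify against the REM Labs docs
pip install remlabs langchain

# API key from remlabs.ai/console; the env var name is an assumption
export REMLABS_API_KEY="..."
```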
Step 2: Configure REM as the Memory Backend
The RemLabsMemory class implements LangChain's memory interface, so it drops in anywhere you would use ConversationBufferMemory. The namespace parameter isolates this agent's memories from other agents or users. The search_type="rrf" option enables multi-signal fusion search -- combining vector similarity, full-text, and entity graph lookups for 90% recall accuracy.
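LangChain's memory contract boils down to two methods: load_memory_variables and save_context. The toy class below sketches that contract with a plain Python list so you can see the shape RemLabsMemory fills in. The real class keeps the same interface but writes every turn to the REM Labs service; this stand-in is illustrative only, and any details beyond the namespace and search_type parameters named above are assumptions:

```python
class InMemorySketch:
    """Toy stand-in for RemLabsMemory: same interface, no persistence."""

    def __init__(self, namespace: str, search_type: str = "rrf"):
        self.namespace = namespace      # isolates this agent's memories
        self.search_type = search_type  # "rrf" = multi-signal fusion search
        self._turns: list[tuple[str, str]] = []

    def load_memory_variables(self, inputs: dict) -> dict:
        # The real backend runs a multi-path retrieval query here;
        # this toy version just replays the full history.
        history = "\n".join(f"Human: {h}\nAI: {a}" for h, a in self._turns)
        return {"history": history}

    def save_context(self, inputs: dict, outputs: dict) -> None:
        # The real backend writes the turn to persistent storage,
        # indexing it by embedding, full text, and extracted entities.
        self._turns.append((inputs["input"], outputs["output"]))


memory = InMemorySketch(namespace="support-bot/user-42")
memory.save_context({"input": "We use Acme Corp as our vendor."},
                    {"output": "Noted."})
print(memory.load_memory_variables({})["history"])
```

Because the interface matches, anything that accepts ConversationBufferMemory accepts the REM-backed class unchanged.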
Step 3: Use It in a Chain
The second call returns an answer referencing Acme Corp -- even if the process was restarted between calls. REM stores each conversation turn as a separate memory unit with its own embedding, full-text index entry, and entity extraction pass. When the chain calls memory.load_memory_variables(), REM runs a multi-path retrieval query and returns the most relevant context.
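To make the restart behavior concrete, here is a toy file-backed memory that survives a "fresh process" (a new object reading the same store). This is purely illustrative -- with the real integration, persistence happens on the REM Labs service and none of this file handling lives in your code:

```python
import json
import os
import tempfile


class FileBackedSketch:
    """Toy memory that persists turns to a JSON file between 'processes'."""

    def __init__(self, path: str):
        self.path = path

    def _read(self) -> list:
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)
        return []

    def save_context(self, inputs: dict, outputs: dict) -> None:
        turns = self._read()
        turns.append({"human": inputs["input"], "ai": outputs["output"]})
        with open(self.path, "w") as f:
            json.dump(turns, f)

    def load_memory_variables(self, inputs: dict) -> dict:
        history = "\n".join(f"Human: {t['human']}\nAI: {t['ai']}"
                            for t in self._read())
        return {"history": history}


path = os.path.join(tempfile.gettempdir(), "memory_demo.json")
first = FileBackedSketch(path)
first.save_context({"input": "Our vendor is Acme Corp."}, {"output": "Got it."})

# A new instance simulates a fresh process after a restart -- the
# earlier turn is still retrievable.
second = FileBackedSketch(path)
print("Acme Corp" in second.load_memory_variables({})["history"])  # True
```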
Step 4: Search Memories Directly
You can also query the memory store outside the chain, which is useful for building agent tools or debugging.
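The sketch below stands in for a direct query against the memory store, using naive keyword overlap so it runs anywhere. The real SDK call (whose name and signature are not documented here) would run the same multi-signal retrieval the chain uses internally:

```python
import re


def search(memories: list[str], query: str, k: int = 3) -> list[str]:
    """Toy ranker: score each memory by keyword overlap with the query."""
    q = set(re.findall(r"\w+", query.lower()))
    scored = sorted(memories,
                    key=lambda m: len(q & set(re.findall(r"\w+", m.lower()))),
                    reverse=True)
    return scored[:k]


store = [
    "Acme Corp is our primary vendor.",
    "The user prefers email over phone.",
    "Invoice #4417 from Acme Corp is overdue.",
]
print(search(store, "acme vendor", k=2))
```

A search like this is a natural building block for an agent tool: expose it as a "recall" function the LLM can call when it needs older context.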
What Gets Stored
Every memory stored through the REM Labs integration is automatically indexed three ways:
- Vector embedding -- for semantic similarity search
- Full-text index -- for exact keyword, proper noun, and acronym matching
- Entity graph -- extracted entities and relationships for structured queries
You do not need to configure any of this. It happens at write time, automatically.
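Conceptually, the write path looks something like this toy pipeline. Every piece here is a stand-in -- the real service computes a neural embedding, a proper full-text index, and model-based entity extraction server-side -- but it shows what "indexed three ways" means for a single memory:

```python
import re


def index_memory(text: str) -> dict:
    """Toy write-time indexing: one memory, three index entries."""
    tokens = re.findall(r"\w+", text.lower())
    return {
        # 1. Vector embedding (stand-in: bag-of-words counts)
        "embedding": {t: tokens.count(t) for t in set(tokens)},
        # 2. Full-text entry (exact tokens, incl. proper nouns and acronyms)
        "fulltext": set(tokens),
        # 3. Entity graph nodes (stand-in: capitalized words as entities)
        "entities": set(re.findall(r"\b[A-Z][a-z]+\b", text)),
    }


record = index_memory("Acme Corp renewed the SLA in March")
print(sorted(record["entities"]))
```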
Why Not Just Use a Vector Database?
You could wire Pinecone or Weaviate into LangChain's memory interface yourself. But vector-only retrieval tops out around 50-67% accuracy on real memory benchmarks. The hard queries -- proper nouns, temporal reasoning, knowledge updates -- require full-text search, entity graphs, and neural reranking working together. REM Labs runs all of these in parallel and fuses the results. That is how it reaches 90% on LongMemEval.
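Reciprocal rank fusion (RRF) is the standard published technique for fusing ranked lists from heterogeneous retrievers: each document scores the sum of 1 / (k + rank) across every list it appears in. Whether REM Labs uses exactly this variant is an assumption based on the search_type="rrf" name, but the mechanics look like this:

```python
def rrf(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)


vector_hits   = ["m2", "m1", "m3"]   # semantic-similarity order
fulltext_hits = ["m1", "m4"]         # exact keyword matches
entity_hits   = ["m1", "m2"]         # entity-graph lookups

print(rrf([vector_hits, fulltext_hits, entity_hits]))
```

Note how m1 wins despite never ranking first in the vector list: appearing in all three lists outweighs a single top placement, which is exactly why fusion helps on proper-noun and entity-heavy queries that embeddings alone miss.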
Full API docs: The complete LangChain integration reference, including namespace management, tag filtering, and metadata queries, is in the integration docs.
Give your LangChain agent a memory
Free tier. No credit card. pip install and go.
Get started free →