Add Memory to Haystack RAG Pipelines

Haystack pipelines excel at document retrieval and generation, but they have no built-in concept of conversational memory. This guide shows how to add REM Labs as a custom Haystack component so your RAG pipelines remember previous interactions and retrieve context with 90% accuracy.

Why RAG Pipelines Need Memory

A standard Haystack RAG pipeline retrieves documents from a document store, then feeds them to an LLM. But it has no awareness of what the user asked yesterday, what the system answered last week, or what facts have been established across sessions. Every query starts from scratch.

Adding a persistent memory layer means your pipeline can blend document retrieval with conversational context, giving answers that are both factually grounded and personally relevant.

Step 1: Install

```shell
pip install remlabs-memory haystack-ai
```

Step 2: Create a Haystack Memory Component

```python
from haystack import component

from remlabs import RemMemory


@component
class RemMemoryRetriever:
    """Searches REM for memories relevant to the query."""

    def __init__(self, api_key: str, namespace: str, limit: int = 5):
        self.mem = RemMemory(api_key=api_key)
        self.namespace = namespace
        self.limit = limit

    @component.output_types(context=str)
    def run(self, query: str):
        results = self.mem.search(query, namespace=self.namespace, limit=self.limit)
        context = "\n".join(r["value"] for r in results)
        return {"context": context}


@component
class RemMemoryWriter:
    """Persists new information to REM after an interaction."""

    def __init__(self, api_key: str, namespace: str):
        self.mem = RemMemory(api_key=api_key)
        self.namespace = namespace

    @component.output_types(stored=bool)
    def run(self, value: str, tags: list[str] | None = None):
        self.mem.store(value=value, namespace=self.namespace, tags=tags or [])
        return {"stored": True}
```

The RemMemoryRetriever component searches REM for relevant memories and returns them as a context string. The RemMemoryWriter persists new information after each interaction.
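To see the retriever's contract in isolation, here is a minimal sketch of how search results become the context string. The results list below is hypothetical stand-in data, not a real REM response; it only mirrors the shape the component reads (each hit carrying its text under "value"):

```python
# Hypothetical search hits in the shape RemMemoryRetriever expects.
results = [
    {"value": "User prefers blue-green deployments", "score": 0.91},
    {"value": "Prod cluster runs on GKE", "score": 0.88},
]

# The same join the component performs before returning {"context": ...}
context = "\n".join(r["value"] for r in results)
print(context)
```

The retriever deliberately returns one flat string rather than a list, so the prompt template can drop it in with a single `{{context}}` slot.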

Step 3: Wire into a RAG Pipeline

```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

template = """
Previous context:
{{context}}

Documents:
{{documents}}

Question: {{query}}

Answer the question using the documents and any relevant previous context.
"""

pipeline = Pipeline()
pipeline.add_component("memory", RemMemoryRetriever(
    api_key="sk-slop-...",
    namespace="haystack-rag",
))
pipeline.add_component("prompt", PromptBuilder(template=template))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))

pipeline.connect("memory.context", "prompt.context")
pipeline.connect("prompt.prompt", "llm.prompt")

# docs: documents from your existing retriever / document store
result = pipeline.run({
    "memory": {"query": "deployment architecture"},
    "prompt": {"query": "deployment architecture", "documents": docs},
})
```

The memory component runs in parallel with your document retriever. Both feed into the prompt builder. The LLM sees document results and conversational memory side by side.
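Conceptually, the prompt builder just fills the template with both sources. A plain-Python sketch (the memory context and document snippet below are hypothetical values, and Python's `str.format` stands in for the Jinja rendering PromptBuilder performs) shows what the LLM ends up seeing:

```python
# Simplified template using str.format slots instead of Jinja's {{ }}.
template = (
    "Previous context:\n{context}\n\n"
    "Documents:\n{documents}\n\n"
    "Question: {query}\n\n"
    "Answer the question using the documents and any relevant previous context."
)

# Hypothetical inputs: context from RemMemoryRetriever,
# documents from your existing document store retriever.
prompt = template.format(
    context="User asked about blue-green deployments last week.",
    documents="deploy.md: The service ships via a canary rollout...",
    query="deployment architecture",
)

print(prompt)
```

Because both sources land in one prompt, the model can ground its answer in the documents while tailoring it to what this user has already discussed.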

Step 4: Store Interactions

```python
writer = RemMemoryWriter(api_key="sk-slop-...", namespace="haystack-rag")

# After the pipeline produces a response
writer.run(
    value=f"Q: {query}\nA: {response}",
    tags=["rag-session", "deployment"],
)
```

What Gets Indexed

REM builds all three of its indexes at write time. At query time, multi-signal fusion combines results from all three, which is how REM reaches 90% recall on LongMemEval.

Works with any document store: REM handles conversational memory. Your existing Haystack document store (Elasticsearch, Weaviate, Qdrant) handles document retrieval. They complement each other in the same pipeline.

Give your Haystack pipeline a memory

Free tier. No credit card. pip install and go.

Get started free →