Persistent State for LangGraph Agent Graphs

LangGraph gives you fine-grained control over agent execution with graph-based state machines, but that state resets on every invocation. This guide adds REM Labs as a persistent memory layer so your graph nodes can recall facts from previous runs, share context across branches, and retrieve them accurately (90% on LongMemEval).

The Gap in LangGraph

LangGraph manages state within a single graph invocation using a typed State dict. When the graph finishes, the state evaporates. LangGraph's built-in checkpointing saves graph execution state, but it does not provide semantic search over accumulated knowledge. For an agent that needs to recall "what did the user say about deployment last week," you need a real memory layer.

Step 1: Install

pip install remlabs-memory langgraph langchain-openai

Step 2: Define a Graph with Memory Nodes

from typing import TypedDict

from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from remlabs import RemMemory

mem = RemMemory(api_key="sk-slop-...")
llm = ChatOpenAI(model="gpt-4o")

class AgentState(TypedDict):
    query: str
    context: str
    response: str

def retrieve_memory(state: AgentState) -> AgentState:
    """Search REM for relevant past context."""
    results = mem.search(state["query"], namespace="langgraph-agent", limit=5)
    context = "\n".join([r["value"] for r in results])
    return {**state, "context": context}

def generate_response(state: AgentState) -> AgentState:
    """Generate a response using retrieved context."""
    prompt = f"Context:\n{state['context']}\n\nQuestion: {state['query']}"
    response = llm.invoke(prompt)
    return {**state, "response": response.content}

def store_memory(state: AgentState) -> AgentState:
    """Persist the exchange for future retrieval."""
    mem.store(
        value=f"Q: {state['query']}\nA: {state['response']}",
        namespace="langgraph-agent",
        tags=["conversation"],
    )
    return state

Step 3: Wire the Graph

graph = StateGraph(AgentState)
graph.add_node("retrieve", retrieve_memory)
graph.add_node("generate", generate_response)
graph.add_node("store", store_memory)

graph.set_entry_point("retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", "store")
graph.add_edge("store", END)

app = graph.compile()

# Run it
result = app.invoke({"query": "What region did we decide on for deployment?"})
print(result["response"])

The graph flows through three nodes: retrieve context from REM, generate a response with that context, then store the exchange back into REM. On the next invocation -- even days later -- the retrieve node surfaces relevant past conversations.

Step 4: Conditional Memory Routing

LangGraph's conditional edges let you route based on whether memory exists.

def has_context(state: AgentState) -> str:
    return "generate" if state["context"] else "ask_clarification"

graph.add_conditional_edges("retrieve", has_context, {
    "generate": "generate",
    "ask_clarification": "clarify",
})

If the memory search returns nothing, the graph routes to a clarification node instead of hallucinating from empty context.
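The clarification node itself is not shown above. A minimal sketch of what it could look like (the node name matches the routing table; the message wording and registration lines are illustrative, reusing AgentState from Step 2):

```python
from typing import TypedDict

class AgentState(TypedDict):
    query: str
    context: str
    response: str

def clarify(state: AgentState) -> AgentState:
    """Fallback node: ask the user for detail instead of answering from empty context."""
    message = (
        "I don't have any saved context for that yet -- could you add "
        "more detail, such as the project or timeframe you mean?"
    )
    return {**state, "response": message}

# Register it alongside the Step 3 wiring:
#   graph.add_node("clarify", clarify)
#   graph.add_edge("clarify", END)
```

Terminating at END after clarification keeps the empty-context exchange out of memory, so the store node never persists a non-answer.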

Multi-Signal Retrieval

Every memory stored in REM is indexed three ways: vector embeddings for semantic similarity, full-text for exact keyword matching, and entity graphs for structured lookups. When the retrieve node calls mem.search(), all three signals are fused using reciprocal rank fusion. This is how REM reaches 90% on LongMemEval -- the hard queries involving proper nouns, temporal references, and knowledge updates all get handled.
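REM's fusion internals are not public, but reciprocal rank fusion itself is a standard formula: each result scores the sum of 1/(k + rank) across every list it appears in, so items ranked well by multiple signals float to the top. A self-contained sketch (function name, k=60, and the sample IDs are illustrative, not REM's API):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs (best first) into one ordering.

    k is a damping constant that softens the gap between adjacent ranks;
    60 is the value commonly used in the RRF literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three signals rank the same candidate memories differently:
vector_hits = ["m3", "m1", "m7"]    # semantic similarity
fulltext_hits = ["m1", "m3", "m9"]  # exact keyword match
entity_hits = ["m1", "m4"]          # entity-graph lookup
fused = reciprocal_rank_fusion([vector_hits, fulltext_hits, entity_hits])
print(fused[0])  # m1 wins: it sits near the top of all three lists
```

Note that a result appearing in all three lists beats one ranked first by a single signal, which is why fusion helps on queries where any one signal is weak.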

Works with LangGraph Cloud: Since REM is an external API called via HTTP, it works identically in local development and LangGraph Cloud deployments. No infrastructure changes needed.

Give your LangGraph agents persistent state

Free tier. No credit card. pip install and go.

Get started free →