Add Persistent Memory to Microsoft AutoGen Agents

AutoGen excels at multi-agent conversations, but every agent starts with a blank slate when the process restarts. This guide wires REM Labs into AutoGen so your agents remember previous conversations, share context across team members, and retrieve facts with 90% recall on LongMemEval.

Why AutoGen Agents Need External Memory

Microsoft AutoGen lets you build multi-agent systems where a planner, coder, and critic collaborate in a group chat. The conversation history lives in Python lists attached to each agent. Restart the process and the history is gone. Scale to multiple workers and they cannot share what they learned.

REM Labs solves both problems. It stores every memory unit externally with vector, full-text, and entity graph indexing. Any agent, on any machine, can retrieve relevant context in under 50ms.

Step 1: Install

pip install remlabs-memory pyautogen

Step 2: Create a Memory-Aware Agent

import autogen
from remlabs import RemMemory

mem = RemMemory(api_key="sk-slop-...")

# Store a fact before the conversation
mem.store(
    value="The deployment target is Kubernetes on GCP us-east1.",
    namespace="autogen-team",
    tags=["infrastructure"]
)

# Retrieve relevant context inside a custom reply function
def memory_enhanced_reply(recipient, messages, sender, config):
    last_msg = messages[-1]["content"]
    results = mem.search(last_msg, namespace="autogen-team", limit=3)
    context = "\n".join([r["value"] for r in results])
    # Prepend the retrieved context to the latest message, then return
    # (False, None): register_reply hooks must return a (final, reply)
    # tuple, and final=False lets the default LLM reply run next.
    messages[-1]["content"] = f"Context from memory:\n{context}\n\n{last_msg}"
    return False, None

assistant = autogen.AssistantAgent(
    name="memory_assistant",
    llm_config={"model": "gpt-4o", "api_key": "..."},
    system_message="You are a helpful assistant with access to persistent memory."
)

user_proxy = autogen.UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    code_execution_config=False
)

assistant.register_reply([autogen.Agent], memory_enhanced_reply)

The register_reply hook runs before the agent's default LLM reply. It searches REM for context relevant to the latest message and prepends it, giving the agent cross-session recall without any changes to AutoGen internals.
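Retrieved context competes with the rest of the prompt for the model's context window, so it is worth capping how much memory you prepend. A minimal sketch of a budget guard you might add inside the reply function; the helper name and the 2,000-character budget are assumptions for illustration, not REM or AutoGen defaults:

```python
def fit_context(snippets, budget_chars=2000):
    """Join memory snippets until the character budget is exhausted.
    Snippets are assumed pre-sorted by relevance, as mem.search returns them."""
    out, used = [], 0
    for s in snippets:
        if used + len(s) + 1 > budget_chars:
            break
        out.append(s)
        used += len(s) + 1  # +1 for the joining newline
    return "\n".join(out)

# With a 12-character budget, only the first snippet fits.
print(fit_context(["fact one", "fact two"], budget_chars=12))
```

Swap `fit_context(...)` in for the plain `"\n".join(...)` if your memory hits start crowding out the user's actual question.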

Step 3: Persist Group Chat History

# After a group chat completes, persist the full transcript
def persist_group_chat(chat_result):
    for msg in chat_result.chat_history:
        mem.store(
            value=f"[{msg['role']}]: {msg['content']}",
            namespace="autogen-team",
            tags=["group-chat", "session-042"]
        )

result = user_proxy.initiate_chat(
    assistant,
    message="What region should we deploy to?"
)
persist_group_chat(result)

Each message from the group chat is stored as its own memory unit. On the next run, the memory search retrieves the most relevant previous exchanges -- even weeks later.
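To reuse those stored exchanges in a later session, you turn search results back into a prompt preamble. Here is a minimal sketch of that seeding step; `build_session_context` is a hypothetical helper name, and the only assumption about the SDK is the result shape already shown above (dicts with a `value` key):

```python
# Build a context preamble from prior-session search results.
# `results` uses the shape returned by mem.search(): dicts with a "value" key.
def build_session_context(results, header="Relevant history from earlier sessions:"):
    if not results:
        return ""
    lines = [header]
    for r in results:
        lines.append(f"- {r['value']}")
    return "\n".join(lines)

# Stubbed results for illustration; in practice these come from mem.search().
stub = [
    {"value": "[user]: What region should we deploy to?"},
    {"value": "[assistant]: us-east1, per the infrastructure notes."},
]
print(build_session_context(stub))
```

Prepend the returned string to the opening message of `initiate_chat` so the new session starts from last week's conclusions instead of a blank slate.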

Step 4: Shared Memory Across Agents

Because REM is an external API, multiple AutoGen agents can share a namespace. A research agent stores findings; a writer agent recalls them.

# Research agent stores
mem.store(
    value="Competitor X launched a new pricing tier at $49/mo.",
    namespace="research-team",
    tags=["competitive-intel"]
)

# Writer agent retrieves
results = mem.search(
    "competitor pricing changes",
    namespace="research-team",
    limit=5
)
for r in results:
    print(r["value"], r["score"])
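When several agents write into the same namespace, the writer may pull near-duplicate memories. A small post-processing step can dedupe and rank results before they reach the prompt; this helper is an assumed add-on, not part of the REM SDK, and relies only on the `value` and `score` fields shown above:

```python
def top_unique_memories(results, limit=3):
    """Sort search results by score (descending), drop exact duplicate
    values, and keep at most `limit` entries."""
    seen = set()
    unique = []
    for r in sorted(results, key=lambda r: r["score"], reverse=True):
        if r["value"] not in seen:
            seen.add(r["value"])
            unique.append(r)
    return unique[:limit]

# Stubbed results: two agents stored the same finding independently.
stub = [
    {"value": "Competitor X launched a $49/mo tier.", "score": 0.91},
    {"value": "Competitor X launched a $49/mo tier.", "score": 0.88},
    {"value": "Competitor Y cut prices 10%.", "score": 0.74},
]
for r in top_unique_memories(stub):
    print(r["score"], r["value"])
```

Exact-string dedup is deliberately conservative; near-duplicate detection would need embedding similarity, which the search scores already approximate.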

What Gets Indexed

Every memory stored through the API is automatically indexed three ways:

- Vector embeddings for semantic similarity search
- Full-text indexing for exact keyword matches
- An entity graph linking related people, projects, and facts

No configuration needed. Multi-signal fusion retrieval combines all three at query time for 90% recall on LongMemEval.

Full API reference: See the REM Labs docs for namespace management, tag filtering, metadata queries, and the complete Python SDK reference.

Give your AutoGen agents a memory

Free tier. No credit card. pip install and go.

Get started free →