Side-by-Side · Updated 2026-04-17

REM Labs vs Zep
when you want your coding agent to compound.

Zep ships temporal conversational memory. REM Labs is the compounding layer for coding agents: build logs, stack traces, and diffs pipe into the CLI, the Dream Engine synthesizes per-task context, and the same model ships fewer bugs on your codebase over time. Supporting retrieval number: LongMemEval 94.6% vs Zep's 63.8% under the byte-exact upstream GPT-4o judge. Here are fourteen dimensions, side by side.

REM: +10.67PP SWE-BENCH LITE · 78MS COLD / 42MS WARM · 9 STRATEGIES · 80+ INTEGRATIONS

Zep's pitch, answered

For every dimension Zep markets, here's REM's matching number.

Zep is a well-engineered product. So is REM — and the numbers show it.

Cold-start retrieval

REM hits 78ms cold / 42ms warm on 1M-memory corpora using an edge-cached hot index — faster than Zep's ~200ms. One-shot lookups, long-horizon recall, hybrid FTS5+vector+graph — REM leads across all three.
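As a rough illustration of what hybrid retrieval involves, the sketch below merges three ranked result lists (full-text, vector, graph) with reciprocal rank fusion. RRF is a stand-in chosen for illustration only: the page does not say how REM combines its FTS5, vector, and graph results, and the function name, the constant k=60, and the memory IDs are all assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, not a REM setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda m: (-scores[m], m))

# Hypothetical result lists from three retrievers (IDs are made up).
fts_hits = ["m1", "m2", "m3"]
vector_hits = ["m2", "m1", "m4"]
graph_hits = ["m2", "m5"]

fused = reciprocal_rank_fusion([fts_hits, vector_hits, graph_hits])
print(fused)
```

A memory that ranks well in several lists (here m2) rises to the top even if no single retriever puts it first everywhere, which is the usual motivation for fusing retrievers rather than picking one.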

Temporal reasoning

Temporal Merge + Contradiction Resolution + Cross-Memory Association give REM time-scoped knowledge with bi-temporal queries ("what did X believe last March?") while also auto-resolving conflicts — something Graphiti punts to the caller.
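To make the bi-temporal idea concrete, here is a minimal sketch of a query like "what did X believe last March?". It is not REM's or Graphiti's implementation; the Fact fields, the believed_on helper, and the sample data are hypothetical. Each fact carries two time axes: when it was true (valid time) and when the system recorded it (transaction time).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class Fact:
    subject: str
    belief: str
    valid_from: date           # when the belief started being true
    valid_to: Optional[date]   # None means it is still true
    recorded_on: date          # when the system learned about it

def believed_on(facts, subject, as_of, known_by=None):
    """Return the beliefs `subject` held on `as_of`.

    `known_by` restricts the answer to facts recorded on or before that
    date, which is the second (transaction-time) axis of the query.
    """
    results = []
    for f in facts:
        if f.subject != subject:
            continue
        if known_by is not None and f.recorded_on > known_by:
            continue
        if f.valid_from <= as_of and (f.valid_to is None or as_of < f.valid_to):
            results.append(f.belief)
    return results

# Made-up sample data.
facts = [
    Fact("X", "prefers Postgres", date(2025, 1, 10), date(2025, 6, 1), date(2025, 1, 12)),
    Fact("X", "prefers SQLite", date(2025, 6, 1), None, date(2025, 6, 3)),
]

march = believed_on(facts, "X", date(2025, 3, 15))
later = believed_on(facts, "X", date(2025, 9, 1))
early = believed_on(facts, "X", date(2025, 3, 15), known_by=date(2025, 1, 11))
print(march, later, early)
```

The third query returns nothing because, as of January 11, the system had not yet recorded either fact; that distinction between "was true" and "was known" is what the second time axis buys you.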

Production engineering

Typed Python + TypeScript + Node SDKs, typed webhook events, REST + MCP + A2A, SOC 2 Type II continuously monitored (report Q3), HIPAA-ready data plane on private cloud, GDPR `forget()` native. Self-host in one command, ~90s, Apache 2.0.
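For readers wondering what a per-memory forget with an audit trail might look like, here is a toy sketch. It is not the REM SDK: the MemoryStore class, its method names, and the audit record shape are hypothetical, chosen only to show deletion and auditing happening in one step.

```python
from datetime import datetime, timezone

class MemoryStore:
    """Toy in-memory store used for illustration; not the REM SDK."""

    def __init__(self):
        self._memories = {}
        self.audit_log = []

    def remember(self, memory_id, text):
        self._memories[memory_id] = text

    def forget(self, memory_id, reason):
        """Delete one memory and log what, when, and why, without keeping the content."""
        existed = self._memories.pop(memory_id, None) is not None
        self.audit_log.append({
            "action": "forget",
            "memory_id": memory_id,
            "deleted": existed,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return existed

store = MemoryStore()
store.remember("m1", "user email: a@example.com")
first = store.forget("m1", reason="GDPR erasure request")
second = store.forget("m1", reason="GDPR erasure request")
print(first, second, len(store.audit_log))
```

The audit entry stores the memory ID and the reason but never the deleted text, so the log itself does not become a second copy of the data being erased.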

What REM does differently

Retrieval is one row. We built the system.

Zep stores and retrieves facts well. REM runs nine consolidation strategies over those facts every night — and federates them across agents, with webhook reactivity baked in. Memory should evolve, not just sit there.

SYNTHESIZE: Merge related memories into higher-order insights.

PATTERN EXTRACT: Detect recurring themes and behavioral signatures.

CONTRADICTION: Flag conflicting facts before they poison retrieval.

COMPRESS: Summarize stale long-form content without losing semantics.

ASSOCIATE: Build implicit graph edges between memories.

VALIDATE: Check facts against prior evidence and sources.

EVOLVE: Rewrite summaries as new context arrives.

FORECAST: Predict next-need memories before the user asks.

REFLECT: Self-audit retrieval quality and tune weights.
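As one example of what a consolidation pass can do, the sketch below shows a very simple form of the CONTRADICTION idea: group facts by subject and attribute, then flag any key that has more than one distinct value. This is an assumption about the general technique, not a description of the Dream Engine; the triple format and the function name are hypothetical.

```python
from collections import defaultdict

def flag_contradictions(memories):
    """Report (subject, attribute) keys that carry more than one distinct value."""
    values = defaultdict(set)
    for subject, attribute, value in memories:
        values[(subject, attribute)].add(value)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}

# Made-up memories as (subject, attribute, value) triples.
memories = [
    ("api", "default_port", "8080"),
    ("api", "default_port", "9090"),
    ("api", "language", "Go"),
]

conflicts = flag_contradictions(memories)
print(conflicts)
```

A real system would also need to decide which value wins (for example by recency or source trust); this sketch stops at detection.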

Four pillars, not one

Zep is strong on Persist and partly covers Federate (within a single agent). REM spans all four pillars: Persist · Evolve · Federate · React. Consolidation, cross-agent namespaces, and webhook reactivity are the three capabilities Zep doesn't ship today.
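Webhook reactivity typically means your service receives a signed event and routes it to a handler. The sketch below shows that pattern with the Python standard library. The event type "contradiction.detected", the payload fields, and the HMAC-SHA256 signing scheme are assumptions for illustration; check the actual webhook documentation for the real event names and signature format.

```python
import hashlib
import hmac
import json

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

def handle_event(secret, body, signature_hex, handlers):
    """Reject unsigned or tampered payloads, then dispatch on the event type."""
    if not verify_signature(secret, body, signature_hex):
        return "rejected"
    event = json.loads(body)
    handler = handlers.get(event.get("type"))
    if handler is None:
        return "ignored"
    return handler(event)

# Hypothetical secret and payload, signed locally to simulate a delivery.
secret = b"demo-secret"
body = json.dumps({"type": "contradiction.detected", "memory_ids": ["m1", "m2"]}).encode()
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

handlers = {
    "contradiction.detected": lambda e: f"review {len(e['memory_ids'])} memories",
}

result = handle_event(secret, body, signature, handlers)
tampered = handle_event(secret, body + b" ", signature, handlers)
print(result, tampered)
```

Verifying against the raw request bytes before parsing JSON matters: re-serializing the parsed payload can change whitespace or key order and break the signature check.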

Side by side

Fourteen dimensions. Sourced, dated, honest.

Zep is a commercial SaaS with an OSS core (Graphiti). Numbers below are from their docs, the published Graphiti paper, and public LongMemEval results.

Dimension	REM Labs	Zep
Category	Continuity layer for intelligence	Conversational memory with temporal graph
LongMemEval (500q)	94.6% · byte-exact upstream GPT-4o judge	63.8% (Graphiti paper, 2024)
Consolidation strategies	9 (Dream Engine)	1 (fact extraction + graph update)
Cold-start retrieval p50	78ms cold / 42ms warm (edge-cached hot index)	~200ms
Model-agnostic	Yes — every LLM vendor + local	Yes — OpenAI, Anthropic, local
Self-hostable	Yes — Docker + K8s + bare metal, one command, ~90s, unlimited everything	Partial — Graphiti OSS + Zep community edition
Open source	Apache 2.0 core — SDKs + self-host + extractors	Partial — Graphiti OSS, Zep Cloud closed
GDPR / forget API	Yes — per-memory + audit	Yes — session + user delete
Federation across agents	Yes — shared namespaces + A2A	Partial — groups, but no cross-agent consolidation
Webhooks / reactivity	Yes — memory / dream / contradiction events	No native; poll the API
MCP / A2A protocol	Yes	No native MCP endpoint
Multi-agent / hive	Yes — DreamHive coordinated consolidation	Partial — user groups share graph facts
Pricing start	Free (unlimited memories, 500 dreams/mo) → $19 Pro	Free tier; Cloud paid plans from ~$20-50/mo
Temporal graph	Yes — Temporal Merge + Contradiction + Cross-Memory Association with time-travel queries	Yes — Graphiti

LONGMEMEVAL METHODOLOGY · /benchmarks

Head-to-head

vs. Zep.

The two dimensions Zep markets hardest — and REM's actual numbers on each.

Cold-start retrieval latency

REM hits 78ms cold / 42ms warm on 1M-memory corpora via an edge-cached hot index. Zep's best public number is ~200ms. REM leads on one-shot lookups, warm recall, and the long tail alike.

8 retrieval modes (verbatim, semantic, graph, temporal, hybrid, neural-rerank, creative-leap, honest-abstention).
94.6% LongMemEval under the byte-exact upstream GPT-4o judge — well ahead of Zep's 63.8%.

Bi-temporal graph depth

REM runs Temporal Merge, Cross-Memory Association, and Contradiction Resolution as three of its nine consolidation strategies, giving you time-scoped facts, auto-resolved conflicts, and synthesis across 200+ memories overnight. Graphiti does the first; REM does all three.

Time-travel queries ("what did X believe last March?") with native audit log.
Overnight synthesis across your entire corpus — 38× token reduction via Episodic Compression.

How to pick

Simple decision tree.

Pick REM if…

You need memory to consolidate, not just accumulate — 9 strategies overnight.
You run multi-agent systems that share memory (DreamHive, swarms).
You need webhooks / reactivity — trigger flows when a contradiction is detected.
You want a protocol-native layer (REST, MCP, A2A, channels, webhooks).
You need accuracy: 94.6% LongMemEval vs 63.8%.
You need fast cold-start: 78ms cold / 42ms warm on 1M-memory corpora.
You want bi-temporal graph queries, creative leaps, and forecasting — not just retrieval.

Pick Zep if…

Your app is a single-agent chatbot that never needs to evolve, reason across memories, or federate across agents.
You only need basic fact storage and a single graph index, with no overnight consolidation.

They stack

Use Zep as your low-latency retrieval tier and REM as the consolidation layer that runs on Zep's graph overnight. Graphiti's fact edges → REM's 9 strategies → updated fact edges pushed back. We share the integration recipe on request.

Continuity beats retrieval alone.

Bring your retrieval layer if you want — or let REM handle the whole stack. Free tier, no credit card.

Try REM free · View 25-competitor comparison →

Honest comparison policy · email hey@remlabs.ai if any row is wrong — we fix in 48h.