Redis + REM Labs: Fast AI Memory Caching

REM Labs handles persistent AI memory. Redis handles speed. This guide shows how to use Redis as a cache layer in front of REM Labs, cutting recall latency for repeated queries and enabling real-time AI features that need sub-10ms memory access.

The Cache-Through Pattern

The idea is simple: before calling rem.recall(), check Redis. If the query was recently asked, return the cached result. If not, call REM, cache the result, and return it. This eliminates redundant API calls and drops latency for hot queries to under 2ms.
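The pattern itself is independent of Redis. Here is a minimal in-memory sketch, where a plain Map stands in for Redis and an injected `fetcher` stands in for the REM Labs call (names are illustrative, not part of either API):

```typescript
// Minimal cache-through sketch: a Map stands in for Redis,
// and `fetcher` stands in for the upstream REM Labs recall call.
type Entry = { value: string; expiresAt: number };

async function cacheThrough(
  cache: Map<string, Entry>,
  key: string,
  ttlMs: number,
  fetcher: () => Promise<string>
): Promise<string> {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.value; // hot path: no upstream call
  }
  const value = await fetcher(); // miss: call upstream, then cache
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}
```

The real implementation below follows the same shape, with Redis GET/SETEX replacing the Map.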

Step 1: Install Dependencies

npm install @remlabs/sdk ioredis

Step 2: Build the Cached Memory Client

import { RemClient } from "@remlabs/sdk";
import Redis from "ioredis";
import crypto from "crypto";

const rem = new RemClient({ apiKey: process.env.REMLABS_API_KEY });
const redis = new Redis(process.env.REDIS_URL);

function cacheKey(query: string, namespace?: string): string {
  const hash = crypto
    .createHash("sha256")
    .update(`${namespace || ""}:${query}`)
    .digest("hex")
    .slice(0, 16);
  return `rem:recall:${hash}`;
}

export async function cachedRecall(opts: {
  query: string;
  namespace?: string;
  limit?: number;
  ttl?: number;
}) {
  const key = cacheKey(opts.query, opts.namespace);

  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }

  const results = await rem.recall({
    query: opts.query,
    namespace: opts.namespace,
    limit: opts.limit || 5,
  });

  // Cache for 5 minutes by default
  await redis.setex(key, opts.ttl || 300, JSON.stringify(results));
  return results;
}
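The key derivation is worth checking in isolation: identical query/namespace pairs must always map to the same Redis key, or repeated queries would never hit the cache. A standalone copy of the same function (re-declared here so the snippet runs on its own):

```typescript
import crypto from "crypto";

// Same key derivation as cachedRecall above: hash namespace + query
// into a short, fixed-length Redis key under the rem:recall: prefix.
function cacheKey(query: string, namespace?: string): string {
  const hash = crypto
    .createHash("sha256")
    .update(`${namespace || ""}:${query}`)
    .digest("hex")
    .slice(0, 16);
  return `rem:recall:${hash}`;
}
```

Because the key is a pure function of its inputs, every app instance computes the same key for the same query, so the cache is shared across instances with no coordination.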

Step 3: Write-Through on Remember

When you store a new memory, invalidate cached recall results so stale data is not served:

export async function cachedRemember(opts: {
  content: string;
  namespace?: string;
  tags?: string[];
  metadata?: Record<string, any>;
}) {
  // Store in REM Labs
  const result = await rem.remember(opts);

  // Invalidate all cached recall results. The cache keys are opaque
  // hashes, so they cannot be filtered by namespace; this flushes the
  // whole recall cache. Note: KEYS blocks Redis while it scans -- for
  // production workloads, prefer the non-blocking SCAN (scanStream).
  const keys = await redis.keys("rem:recall:*");
  if (keys.length > 0) {
    await redis.del(...keys);
  }
  return result;
}
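Flushing everything on each write is blunt. One hypothetical alternative (not part of the REM Labs SDK) is to keep the namespace visible in the key, so invalidation can target a single namespace by prefix:

```typescript
// Hypothetical key scheme that keeps the namespace visible, so
// invalidation can target one namespace instead of flushing everything.
function namespacedKey(namespace: string, hash: string): string {
  return `rem:recall:${namespace}:${hash}`;
}

// Given a list of keys (e.g. collected from a Redis SCAN), keep
// only the ones belonging to a single namespace.
function keysForNamespace(keys: string[], namespace: string): string[] {
  const prefix = `rem:recall:${namespace}:`;
  return keys.filter((k) => k.startsWith(prefix));
}
```

The trade-off: keys are longer and leak namespace names into Redis, but a write in one namespace no longer evicts every other namespace's hot entries.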

Step 4: Session Memory with Redis Pub/Sub

For real-time AI chat, use Redis to share session context across instances while REM handles long-term persistence:

const pub = new Redis(process.env.REDIS_URL);
const sub = new Redis(process.env.REDIS_URL);

// When a user sends a message, publish to the session channel
async function onUserMessage(sessionId: string, message: string) {
  // Short-term: broadcast to all instances via Redis
  await pub.publish(
    `session:${sessionId}`,
    JSON.stringify({ role: "user", content: message, ts: Date.now() })
  );

  // Long-term: persist important context to REM
  await rem.remember({
    content: message,
    namespace: `session:${sessionId}`,
    tags: ["conversation"],
    metadata: { session_id: sessionId },
  });
}

// Subscribe to session updates. Glob patterns require PSUBSCRIBE,
// which emits "pmessage" events (plain SUBSCRIBE does not match globs).
sub.psubscribe("session:*");
sub.on("pmessage", (pattern, channel, message) => {
  const data = JSON.parse(message);
  // Update local session state in real-time
});
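The `session:*` subscription works because PSUBSCRIBE matches channel names against glob patterns. A simplified matcher illustrating just the trailing-wildcard case used above (real Redis patterns also support `?` and `[...]` classes):

```typescript
// Simplified glob matcher covering only the `prefix*` case used
// in the session channels above. Real Redis patterns are richer.
function channelMatches(pattern: string, channel: string): boolean {
  if (pattern.endsWith("*")) {
    return channel.startsWith(pattern.slice(0, -1));
  }
  return pattern === channel;
}
```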

TTL Strategy

Use shorter TTLs (60s) for active conversations where context changes fast, and longer TTLs (300-600s) for reference queries like project docs or architecture decisions that change rarely.
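That rule of thumb can be encoded in a small helper so the choice lives in one place (the categories and values here are illustrative, taken from the guidance above):

```typescript
// Illustrative TTL picker following the rule of thumb above:
// short TTLs for fast-moving conversation context, longer TTLs
// for slow-changing reference material.
type QueryKind = "conversation" | "reference";

function ttlSeconds(kind: QueryKind): number {
  switch (kind) {
    case "conversation":
      return 60; // active chats: context changes fast
    case "reference":
      return 600; // docs / architecture decisions: change rarely
  }
}
```

The result plugs straight into the `ttl` option of `cachedRecall` from Step 2.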

Add sub-millisecond memory to your AI

Free tier. Redis cache layer. Multi-signal retrieval when you need depth.

Get Started