Integration
April 13, 2026
Redis + REM Labs: Fast AI Memory Caching
REM Labs handles persistent AI memory. Redis handles speed. This guide shows how to use Redis as a cache layer in front of REM Labs, cutting recall latency for repeated queries and enabling real-time AI features that need sub-10ms memory access.
The Cache-Through Pattern
The idea is simple: before calling rem.recall(), check Redis. If the query was recently asked, return the cached result. If not, call REM, cache the result, and return it. This eliminates redundant API calls and drops latency for hot queries to single-digit milliseconds.
Step 1: Install Dependencies
npm install @remlabs/sdk ioredis
Step 2: Build the Cached Memory Client
import { RemClient } from "@remlabs/sdk";
import Redis from "ioredis";
import crypto from "crypto";

const rem = new RemClient({ apiKey: process.env.REMLABS_API_KEY });
const redis = new Redis(process.env.REDIS_URL);

// Hash the namespace + query into a short, stable cache key
function cacheKey(query: string, namespace?: string): string {
  const hash = crypto.createHash("sha256")
    .update(`${namespace || ""}:${query}`)
    .digest("hex")
    .slice(0, 16);
  return `rem:recall:${hash}`;
}
export async function cachedRecall(opts: {
  query: string;
  namespace?: string;
  limit?: number;
  ttl?: number;
}) {
  const key = cacheKey(opts.query, opts.namespace);

  // Fast path: serve from Redis if this query was asked recently
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }

  // Slow path: recall from REM Labs, then cache the result
  const results = await rem.recall({
    query: opts.query,
    namespace: opts.namespace,
    limit: opts.limit || 5
  });

  // Cache for 5 minutes by default
  await redis.setex(key, opts.ttl || 300, JSON.stringify(results));
  return results;
}
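The key scheme is worth sanity-checking in isolation. This standalone sketch reproduces the same hashing logic (the query strings and namespace names are illustrative) to show that identical query + namespace pairs map to the same key while different namespaces stay isolated:

```typescript
import crypto from "crypto";

// Same derivation as cacheKey above, reproduced standalone
function cacheKey(query: string, namespace?: string): string {
  const hash = crypto.createHash("sha256")
    .update(`${namespace || ""}:${query}`)
    .digest("hex")
    .slice(0, 16);
  return `rem:recall:${hash}`;
}

// Identical inputs produce identical keys, so repeat queries hit the cache
console.log(cacheKey("deploy steps", "proj-a") === cacheKey("deploy steps", "proj-a")); // true
// Different namespaces produce different keys, so caches stay separate
console.log(cacheKey("deploy steps", "proj-a") === cacheKey("deploy steps", "proj-b")); // false
```

Truncating the digest to 16 hex characters keeps keys short while leaving collisions vanishingly unlikely at cache-sized keyspaces.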
Step 3: Write-Through on Remember
When you store a new memory, invalidate cached recall results so stale data is not served. Because the keys are hashed, entries cannot be targeted per namespace, so the whole recall cache is cleared:
export async function cachedRemember(opts: {
  content: string;
  namespace?: string;
  tags?: string[];
  metadata?: Record<string, any>;
}) {
  // Store in REM Labs
  const result = await rem.remember(opts);

  // Clear all cached recall results. SCAN iterates incrementally;
  // KEYS would block Redis on large keyspaces.
  const stream = redis.scanStream({ match: "rem:recall:*", count: 100 });
  for await (const keys of stream) {
    if (keys.length > 0) {
      await redis.del(...keys);
    }
  }
  return result;
}
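Deleting keys on every write gets expensive as the cache grows. A common alternative is namespace versioning: embed a per-namespace version number in each key and bump it on write, so old entries become unreachable and simply age out via TTL. This sketch uses a plain Map as a stand-in for Redis purely for illustration; the helper names are not part of either SDK:

```typescript
// In-memory stand-ins for Redis GET/SET, for illustration only
const store = new Map<string, string>();
const versions = new Map<string, number>();

// The current namespace version is baked into every key
function versionedKey(namespace: string, query: string): string {
  const v = versions.get(namespace) ?? 0;
  return `rem:recall:${namespace}:v${v}:${query}`;
}

function cacheSet(namespace: string, query: string, value: string) {
  store.set(versionedKey(namespace, query), value);
}

function cacheGet(namespace: string, query: string): string | undefined {
  return store.get(versionedKey(namespace, query));
}

// "Invalidation" is a single counter bump: old keys become unreachable
// and expire naturally instead of being deleted one by one.
function invalidate(namespace: string) {
  versions.set(namespace, (versions.get(namespace) ?? 0) + 1);
}

cacheSet("proj-a", "deploy steps", "[cached results]");
console.log(cacheGet("proj-a", "deploy steps")); // "[cached results]"
invalidate("proj-a");
console.log(cacheGet("proj-a", "deploy steps")); // undefined
```

In Redis you would store the version counter under its own key (e.g. with INCR) and read it before building each cache key; the trade-off is one extra round trip per lookup in exchange for O(1) invalidation.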
Step 4: Session Memory with Redis Pub/Sub
For real-time AI chat, use Redis to share session context across instances while REM handles long-term persistence:
const pub = new Redis(process.env.REDIS_URL);
const sub = new Redis(process.env.REDIS_URL);

// When a user sends a message, publish to the session channel
async function onUserMessage(sessionId: string, message: string) {
  // Short-term: broadcast to all instances via Redis
  await pub.publish(`session:${sessionId}`, JSON.stringify({
    role: "user", content: message, ts: Date.now()
  }));

  // Long-term: persist important context to REM
  await rem.remember({
    content: message,
    namespace: `session:${sessionId}`,
    tags: ["conversation"],
    metadata: { session_id: sessionId }
  });
}

// Subscribe to session updates. Channel globs require PSUBSCRIBE,
// which emits "pmessage" events rather than "message".
sub.psubscribe("session:*");
sub.on("pmessage", (pattern, channel, message) => {
  const data = JSON.parse(message);
  // Update local session state in real-time
});
Performance Numbers
- Redis cache hit: 1-3ms (local) / 5-8ms (remote)
- REM Labs recall: 40-80ms (multi-signal fusion)
- Cache hit rate: Typically 60-80% for conversational AI workloads
TTL strategy: Use shorter TTLs (60s) for active conversations where context changes fast, and longer TTLs (300-600s) for reference queries like project docs or architecture decisions that change rarely.
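One way to encode that policy is a small TTL picker keyed on the namespace; the thresholds and the `session:` prefix convention are illustrative assumptions, not part of either SDK:

```typescript
// Pick a cache TTL in seconds based on how volatile the namespace is.
// Session namespaces change with every message; reference material rarely does.
function ttlFor(namespace?: string): number {
  if (namespace?.startsWith("session:")) return 60; // active conversation
  return 600; // project docs, architecture decisions
}

console.log(ttlFor("session:abc123")); // 60
console.log(ttlFor("docs"));           // 600
```

The result plugs straight into the `ttl` option of cachedRecall, so callers get volatility-aware caching without thinking about it.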
Add single-digit-millisecond memory to your AI
Free tier. Redis cache layer. Multi-signal retrieval when you need depth.
Get Started