Add Memory to Cohere Applications

Cohere's Command models and Embed API are popular choices for enterprise AI applications, especially for RAG and search. But Cohere's chat endpoint does not persist context between sessions. This guide shows how to add persistent memory to any Cohere-powered application using REM Labs.

Why Cohere Apps Need External Memory

Cohere is built for enterprise search and generation. Their Embed models are among the best for creating embeddings, and Command R is a strong choice for RAG workloads. But like all LLM APIs, Cohere's chat endpoint is stateless. Each request starts fresh.

For enterprise applications -- customer support agents, internal knowledge assistants, personalized recommendations -- you need memory that persists across sessions and users. REM Labs provides this with a single API: store facts, search by meaning, and retrieve the most relevant context automatically.

Step 1: Get Your API Keys

Get a Cohere API key from dashboard.cohere.com and a REM Labs API key from remlabs.ai/console or by running npx @remlabs/memory.

Step 2: Store Memories from Cohere Conversations

import cohere
import requests

co = cohere.ClientV2(api_key="...")
REM_KEY = "sk-slop-..."
REM_BASE = "https://api.api.remlabs.ai"

user_msg = "Our company uses Snowflake for our data warehouse and dbt for transforms."

resp = co.chat(
    model="command-r-plus",
    messages=[{"role": "user", "content": user_msg}]
)
reply = resp.message.content[0].text

# Store the interaction
requests.post(f"{REM_BASE}/v1/memory-set", json={
    "key": "cohere-support",
    "value": f"User: {user_msg}\nAssistant: {reply}",
    "namespace": "enterprise-acme",
    "tags": ["tech-stack", "data"]
}, headers={"Authorization": f"Bearer {REM_KEY}"})

The memory is automatically indexed with vector embeddings, full-text search, and entity extraction. The namespace isolates this company's data from all others.
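Because the namespace is just a field on each request, tenant isolation comes down to passing the right namespace string per customer. A minimal sketch (the `build_search_payload` and `recall` helpers here are illustrative, not part of the REM Labs SDK):

```python
import requests

REM_BASE = "https://api.api.remlabs.ai"  # base URL from the example above
REM_KEY = "sk-slop-..."

def build_search_payload(query, namespace, limit=5):
    """Build the /v1/memory/search request body. The namespace is the
    only field that differs between tenants, so isolating customers is
    just a matter of scoping each request to its own namespace."""
    return {"query": query, "namespace": namespace, "limit": limit}

def recall(query, namespace):
    """Fetch the most relevant memories for one tenant."""
    resp = requests.post(
        f"{REM_BASE}/v1/memory/search",
        json=build_search_payload(query, namespace),
        headers={"Authorization": f"Bearer {REM_KEY}"},
    )
    return resp.json().get("results", [])

# The same query scoped to "enterprise-globex" would return only that
# tenant's memories -- nothing stored under "enterprise-acme" leaks over.
```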

Step 3: Recall Context in Future Sessions

# New session -- different support agent, same customer
user_msg = "What data warehouse does Acme use?"

# Search for relevant memories
search = requests.post(f"{REM_BASE}/v1/memory/search", json={
    "query": user_msg,
    "namespace": "enterprise-acme",
    "limit": 5
}, headers={"Authorization": f"Bearer {REM_KEY}"})

memories = search.json().get("results", [])
context = "\n".join([m["value"] for m in memories])

resp = co.chat(
    model="command-r-plus",
    messages=[
        {"role": "system", "content": f"Customer context from prior interactions:\n{context}"},
        {"role": "user", "content": user_msg}
    ]
)
print(resp.message.content[0].text)
# "Acme uses Snowflake for their data warehouse, with dbt for transforms."

Step 4: Node.js Example

import { CohereClientV2 } from "cohere-ai";

const co = new CohereClientV2({ token: "..." });
const REM_BASE = "https://api.api.remlabs.ai";
const REM_KEY = "sk-slop-...";

// Store a memory
await fetch(`${REM_BASE}/v1/memory-set`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${REM_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    key: "cohere-support",
    value: "Acme uses Snowflake + dbt. Data team of 5. On Enterprise plan.",
    namespace: "enterprise-acme"
  })
});

// Recall
const search = await fetch(`${REM_BASE}/v1/memory/search`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${REM_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    query: "Acme data stack",
    namespace: "enterprise-acme",
    limit: 5
  })
});
const { results } = await search.json();
const context = results.map(r => r.value).join("\n");

Cohere Embed + REM Labs

If you already use Cohere's Embed API to create document embeddings, REM Labs complements it by handling the memory layer for conversational context. Your document embeddings live in your vector store for RAG, while REM Labs handles user-level memory -- preferences, history, extracted facts -- that enriches each query with personal context.
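In practice, the two layers meet in the prompt: document chunks retrieved from your Cohere-Embed-backed vector store and user-level memories from REM Labs are merged into one grounded system message. A sketch of that merge step (the `merge_context` helper and the sample strings are illustrative, not part of either API):

```python
def merge_context(doc_chunks, user_memories):
    """Format both context sources into one system prompt so Command R
    can ground its answer in documents *and* the user's history."""
    sections = []
    if doc_chunks:
        sections.append("Relevant documents:\n" + "\n".join(doc_chunks))
    if user_memories:
        sections.append(
            "Customer context from prior interactions:\n" + "\n".join(user_memories)
        )
    return "\n\n".join(sections)

system_prompt = merge_context(
    ["Snowflake warehouse sizing guide, section 2 ..."],  # from your RAG vector store
    ["Acme uses Snowflake + dbt. Data team of 5."],       # from REM Labs memory search
)
```

Keeping the two sections labeled separately in the prompt helps the model distinguish general documentation from facts about this specific customer.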

Enterprise-ready: REM Labs supports namespace isolation, tag-based filtering, and API key scoping -- designed for multi-tenant enterprise applications. See the developer docs for details.
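The storage call in Step 2 attached tags like "tech-stack". Assuming the search endpoint accepts a matching tags filter (the field name here is an assumption -- check the developer docs for the exact parameter), a tag-scoped lookup payload could be built like this:

```python
def build_filtered_search(query, namespace, tags=None, limit=5):
    """Build a /v1/memory/search body scoped to a namespace and,
    optionally, to a set of tags. The "tags" field is assumed to
    mirror the tags attached at memory-set time."""
    payload = {"query": query, "namespace": namespace, "limit": limit}
    if tags:
        payload["tags"] = tags  # assumed filter field, not confirmed by the docs above
    return payload

# Only memories tagged "tech-stack" for this tenant would match.
payload = build_filtered_search("data warehouse", "enterprise-acme", tags=["tech-stack"])
```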

Give your Cohere app a memory

Free tier. No credit card. Works with Command R, Command R+, and all Cohere models.

Get started free →