Add Memory to Cohere Applications

Cohere's Command models and Embed API are popular choices for enterprise AI applications, especially for RAG and search. But Cohere's chat endpoint does not persist context between sessions. This guide shows how to add persistent memory to any Cohere-powered application using REM Labs.

Why Cohere Apps Need External Memory

Cohere is built for enterprise search and generation. Their Embed models are among the best for creating embeddings, and Command R is a strong choice for RAG workloads. But like all LLM APIs, Cohere's chat endpoint is stateless. Each request starts fresh.

For enterprise applications -- customer support agents, internal knowledge assistants, personalized recommendations -- you need memory that persists across sessions and users. REM Labs provides this with a single API: store facts, search by meaning, and retrieve the most relevant context automatically.

Step 1: Get Your API Keys

Get a Cohere API key from dashboard.cohere.com and a REM Labs API key from remlabs.ai/console or by running npx @remlabs/memory.

Step 2: Store Memories from Cohere Conversations

import cohere
import requests

co = cohere.ClientV2(api_key="...")
REM_KEY = "sk-slop-..."
REM_BASE = "https://api.api.remlabs.ai"

user_msg = "Our company uses Snowflake for our data warehouse and dbt for transforms."

resp = co.chat(
    model="command-r-plus",
    messages=[{"role": "user", "content": user_msg}]
)
reply = resp.message.content[0].text

# Store the interaction
requests.post(f"{REM_BASE}/v1/memory-set", json={
    "key": "cohere-support",
    "value": f"User: {user_msg}\nAssistant: {reply}",
    "namespace": "enterprise-acme",
    "tags": ["tech-stack", "data"]
}, headers={"Authorization": f"Bearer {REM_KEY}"})

The memory is automatically indexed with vector embeddings, full-text search, and entity extraction. The namespace isolates this company's data from all others.
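Because the namespace is just a field on each request, tenant isolation comes down to passing the right namespace string per customer. A minimal sketch (the `build_search_payload` and `recall` helpers here are illustrative, not part of the REM Labs SDK):

```python
import requests

REM_BASE = "https://api.api.remlabs.ai"  # base URL from the example above
REM_KEY = "sk-slop-..."

def build_search_payload(query, namespace, limit=5):
    """Build the /v1/memory/search request body. The namespace is the
    only field that differs between tenants, so isolating customers is
    just a matter of scoping each request to its own namespace."""
    return {"query": query, "namespace": namespace, "limit": limit}

def recall(query, namespace):
    """Fetch the most relevant memories for one tenant."""
    resp = requests.post(
        f"{REM_BASE}/v1/memory/search",
        json=build_search_payload(query, namespace),
        headers={"Authorization": f"Bearer {REM_KEY}"},
    )
    return resp.json().get("results", [])

# The same query scoped to "enterprise-globex" would return only that
# tenant's memories -- nothing stored under "enterprise-acme" leaks over.
```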

Step 3: Recall Context in Future Sessions

# New session -- different support agent, same customer
user_msg = "What data warehouse does Acme use?"

# Search for relevant memories
search = requests.post(f"{REM_BASE}/v1/memory/search", json={
    "query": user_msg,
    "namespace": "enterprise-acme",
    "limit": 5
}, headers={"Authorization": f"Bearer {REM_KEY}"})

memories = search.json().get("results", [])
context = "\n".join([m["value"] for m in memories])

resp = co.chat(
    model="command-r-plus",
    messages=[
        {"role": "system", "content": f"Customer context from prior interactions:\n{context}"},
        {"role": "user", "content": user_msg}
    ]
)
print(resp.message.content[0].text)
# "Acme uses Snowflake for their data warehouse, with dbt for transforms."

Step 4: Node.js Example

import { CohereClientV2 } from "cohere-ai";

const co = new CohereClientV2({ token: "..." });
const REM_BASE = "https://api.api.remlabs.ai";
const REM_KEY = "sk-slop-...";

// Store a memory
await fetch(`${REM_BASE}/v1/memory-set`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${REM_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    key: "cohere-support",
    value: "Acme uses Snowflake + dbt. Data team of 5. On Enterprise plan.",
    namespace: "enterprise-acme"
  })
});

// Recall
const search = await fetch(`${REM_BASE}/v1/memory/search`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${REM_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    query: "Acme data stack",
    namespace: "enterprise-acme",
    limit: 5
  })
});
const { results } = await search.json();
const context = results.map(r => r.value).join("\n");

Cohere Embed + REM Labs

If you already use Cohere's Embed API to create document embeddings, REM Labs complements it by handling the memory layer for conversational context. Your document embeddings live in your vector store for RAG, while REM Labs handles user-level memory -- preferences, history, extracted facts -- that enriches each query with personal context.
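In practice, the two layers meet in the prompt: document chunks retrieved from your Cohere-Embed-backed vector store and user-level memories from REM Labs are merged into one grounded system message. A sketch of that merge step (the `merge_context` helper and the sample strings are illustrative, not part of either API):

```python
def merge_context(doc_chunks, user_memories):
    """Format both context sources into one system prompt so Command R
    can ground its answer in documents *and* the user's history."""
    sections = []
    if doc_chunks:
        sections.append("Relevant documents:\n" + "\n".join(doc_chunks))
    if user_memories:
        sections.append(
            "Customer context from prior interactions:\n" + "\n".join(user_memories)
        )
    return "\n\n".join(sections)

system_prompt = merge_context(
    ["Snowflake warehouse sizing guide, section 2 ..."],  # from your RAG vector store
    ["Acme uses Snowflake + dbt. Data team of 5."],       # from REM Labs memory search
)
```

Keeping the two sections labeled separately in the prompt helps the model distinguish general documentation from facts about this specific customer.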

Enterprise-ready: REM Labs supports namespace isolation, tag-based filtering, and API key scoping -- designed for multi-tenant enterprise applications. See the developer docs for details.
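The storage call in Step 2 attached tags like "tech-stack". Assuming the search endpoint accepts a matching tags filter (the field name here is an assumption -- check the developer docs for the exact parameter), a tag-scoped lookup payload could be built like this:

```python
def build_filtered_search(query, namespace, tags=None, limit=5):
    """Build a /v1/memory/search body scoped to a namespace and,
    optionally, to a set of tags. The "tags" field is assumed to
    mirror the tags attached at memory-set time."""
    payload = {"query": query, "namespace": namespace, "limit": limit}
    if tags:
        payload["tags"] = tags  # assumed filter field, not confirmed by the docs above
    return payload

# Only memories tagged "tech-stack" for this tenant would match.
payload = build_filtered_search("data warehouse", "enterprise-acme", tags=["tech-stack"])
```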

Give your Cohere app a memory

Free tier. No credit card. Works with Command R, Command R+, and all Cohere models.

Get started free →