Knowledge that compiles itself

Drop anything in.
Understanding comes out.

PDFs, URLs, transcripts, conversations — the Second Brain extracts entities, concepts, and cross-references automatically. Every source strengthens every other source. It doesn't just store what you feed it. It understands it, connects it, and tells you what's missing.

What 40 raw sources become

Most AI memory tools dump isolated notes. The Second Brain builds a structured knowledge graph where everything is linked, indexed, and queryable.

Without Second Brain
40 disconnected files
PDF: "Q2 Strategy Deck" — sits in Downloads
Article: "Why RAG is failing" — bookmarked, never read again
Meeting notes: "Acme wants SSO" — lost in chat history
Transcript: "Podcast on AI agents" — who has time to re-listen?
+ 36 more unlinked, unsearchable, forgotten...
With Second Brain
Structured knowledge base
127 entities extracted — people, companies, products, repos — each with their own page, cross-referenced automatically.
43 concepts documented — frameworks, patterns, ideas — linked to the sources that introduced them.
3 contradictions flagged — the podcast claims X, but the strategy deck says Y. Both cited. You decide.
1 hot cache — ~500-token session context loaded instantly. Your AI always knows where you left off.

Drop any source. The AI does the rest.

PDFs, URLs, transcripts, images, conversations, code — the ingestion pipeline handles them all. Sources are archived immutably. Knowledge pages are generated automatically.

📄
PDFs
Papers, decks, reports
🌐
URLs
Articles, docs, blog posts
🎤
Transcripts
Meetings, podcasts, calls
📸
Images
Screenshots, diagrams, OCR
💬
Conversations
ChatGPT, Claude, Grok

One source in. Eight to fifteen pages out.

Every source goes through an 8-step pipeline. A single 20-page PDF typically produces 8–15 interconnected knowledge pages with an average of 12 cross-references each.

01
Archive
Source is saved immutably. The original is never modified — it's your permanent record. Delta tracking prevents duplicate processing.
02
Extract
AI reads the full source. Key claims, entities, concepts, dates, and relationships are identified and structured.
03
Entity Resolution
People, companies, products, and repos get their own pages. If the entity already exists, the page is updated — never duplicated.
04
Concept Mapping
Significant ideas and frameworks are documented as concept pages with definitions, examples, and connections to related concepts.
05
Cross-Reference
Every mention of an existing entity or concept is linked bidirectionally. The knowledge graph grows denser with every source.
06
Contradiction Detection
New claims are compared against existing knowledge. Conflicts are flagged on both pages — never silently overwritten.
07
Index Update
Master index, domain sub-indexes, and the hot cache are all updated. Every page is findable in under 500 tokens of reads.
08
Log & Notify
Operation is logged with timestamp, pages created, pages updated, and key insights. Full audit trail, always.
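
The eight steps above can be sketched as a composable pipeline. Everything in this sketch — the step names, the `IngestState` shape — is hypothetical illustration of the ordering, not the product's actual internals.

```typescript
// A minimal sketch of the 8-step ingestion pipeline as composable stages.
// Each stage takes the evolving ingestion state and returns it with its
// own work recorded; real stages would mutate pages, indexes, and logs.
type IngestState = { source: string; log: string[] };

const step = (name: string) => (s: IngestState): IngestState => ({
  ...s,
  log: [...s.log, name], // each stage records that it ran, in order
});

const pipeline = [
  step("archive"),               // 01 save immutably, delta-check
  step("extract"),               // 02 pull claims, entities, concepts
  step("resolve-entities"),      // 03 create or update entity pages
  step("map-concepts"),          // 04 document significant ideas
  step("cross-reference"),       // 05 link mentions bidirectionally
  step("detect-contradictions"), // 06 flag conflicts, never overwrite
  step("update-indexes"),        // 07 refresh index and hot cache
  step("log-and-notify"),        // 08 append to the audit trail
];

const ingest = (source: string): IngestState =>
  pipeline.reduce((s, f) => f(s), { source, log: [] });
```

Because every stage has the same shape, steps can be reordered, skipped, or extended without touching the others.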

The hot cache. ~500 tokens. Instant recall.

Your AI loads a ~500-token summary of recent context at the start of every session. It always knows where you left off, what changed, and what matters right now. No crawling thousands of notes.

knowledge/
  ├── index.md          # Master catalog (~1,000 tokens)
  ├── hot.md            # Session cache (~500 tokens) ← loaded first
  ├── entities/         # People, orgs, products (100-300 tokens each)
  ├── concepts/         # Ideas, patterns, frameworks
  ├── sources/          # One summary per ingested source
  ├── questions/        # Filed answers and research results
  ├── log.md            # Chronological operation history
  └── overview.md       # Executive summary of everything
sources/                   # Immutable source archive
  ├── articles/         # Web articles, blog posts
  ├── transcripts/      # Video/audio transcripts
  └── .manifest.json    # Delta tracking (hash, ingested_at)
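
The delta-tracking idea behind `.manifest.json` can be sketched in a few lines: hash each source and skip ingestion when the hash is already recorded. The `hash` and `ingested_at` fields follow the tree above; the function name and manifest shape are illustrative assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory view of .manifest.json: path -> { hash, ingested_at }.
type Manifest = Record<string, { hash: string; ingested_at: string }>;

// Returns true if the source is new or changed; records it in the manifest.
// Unchanged sources short-circuit, which is what prevents re-processing.
function shouldIngest(manifest: Manifest, path: string, content: string): boolean {
  const hash = createHash("sha256").update(content).digest("hex");
  const entry = manifest[path];
  if (entry && entry.hash === hash) return false; // already ingested, unchanged
  manifest[path] = { hash, ingested_at: new Date().toISOString() };
  return true;
}
```
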
~500
Hot cache tokens
Loaded every session. Recent context, active threads, key facts.
~1,500
Quick query cost
Hot cache + index scan. Enough for factual lookups.
~3,000
Standard query cost
Hot cache + index + 3-5 relevant pages. Handles most questions.

How your knowledge organizes itself

Three layers, each with a job. The hot cache rides in every prompt. The index maps everything. Pages hold the depth. Your AI reads exactly what it needs — nothing more.

Layer 1
Hot Cache
~500 tokens
Always loaded. Recent context, active threads, key facts. Your AI picks up exactly where you left off, every session. No retrieval latency — it's already there.
Layer 2
Index
~1,000 tokens
Topic-level catalog of every entity, concept, and source in your knowledge base. One scan tells the AI which pages to read. No embedding search needed.
Layer 3
Pages
100–500 tokens each
Deep knowledge articles auto-generated from your sources. Entity profiles, concept definitions, research summaries — cross-referenced and cited. Loaded on demand.

A factual lookup costs ~1,500 tokens (hot cache + index). A full synthesis costs ~3,000–8,000 (cache + index + relevant pages). Knowledge is compiled once and queried many times — the opposite of RAG, which re-derives everything per query.
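
A rough token-cost model for the three depths, using only the budgets quoted above (~500-token cache, ~1,000-token index, 100–500-token pages). The function is an illustrative sketch, not the product's accounting.

```typescript
// Estimate context cost per query depth. Quick reads only cache + index;
// standard and deep add whatever pages the query pulls in.
function estimateTokens(
  depth: "quick" | "standard" | "deep",
  pageTokens: number[] = []
): number {
  const cache = 500;  // hot cache, loaded every session
  const index = 1000; // master catalog scan
  const pages = pageTokens.reduce((sum, t) => sum + t, 0);
  return depth === "quick" ? cache + index : cache + index + pages;
}
```
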

Knowledge Graph

Every entity, concept, and source becomes a node. Every cross-reference becomes an edge. The graph renders in real time with physics-based layout that scales to thousands of nodes.

Layout engine
Barnes-Hut Quadtree Physics
O(n log n) node repulsion using spatial partitioning. Smooth force-directed layout without the O(n²) cost of naive simulation. Handles 5,000+ nodes at 60fps.
3 layout modes
Force / Radial / Hierarchical
Force-directed for organic exploration. Radial for centered topic views. Hierarchical for dependency chains. Switch modes live without losing position context.
Edge rendering
Angular Bucketing & Animated Particles
Edge bundling groups parallel connections by angular sector, reducing visual clutter. Animated particles trace information flow along edges in real time.
Clustering
Real-Time Cluster Detection
Densely connected subgraphs are identified and color-coded automatically. Clusters emerge as you add knowledge — domains, projects, people networks.
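
The Barnes-Hut trick is worth seeing concretely: distant groups of nodes are approximated by their center of mass whenever a cell is small relative to its distance (s/d below a threshold θ), which is what turns O(n²) repulsion into O(n log n). Below is a minimal 2D sketch with unit-mass bodies; the class shape and constants are illustrative, not the engine's code.

```typescript
// Minimal 2D Barnes-Hut repulsion: quadtree + center-of-mass approximation.
type Body = { x: number; y: number };

class Quad {
  cx = 0; cy = 0; mass = 0;                       // center of mass, body count
  body: Body | null = null;                       // leaf payload
  kids: (Quad | null)[] = [null, null, null, null];
  constructor(public x: number, public y: number, public size: number) {}

  insert(b: Body): void {
    if (this.mass === 0 && this.body === null) {
      this.body = b;                              // empty leaf: store directly
    } else {
      if (this.body) { const old = this.body; this.body = null; this.place(old); }
      this.place(b);                              // push into a child quadrant
    }
    this.cx = (this.cx * this.mass + b.x) / (this.mass + 1); // running center
    this.cy = (this.cy * this.mass + b.y) / (this.mass + 1); // of mass
    this.mass += 1;
  }

  private place(b: Body): void {
    const half = this.size / 2;
    const qx = b.x < this.x + half ? 0 : 1;
    const qy = b.y < this.y + half ? 0 : 1;
    const i = qy * 2 + qx;
    if (!this.kids[i]) this.kids[i] = new Quad(this.x + qx * half, this.y + qy * half, half);
    this.kids[i]!.insert(b);
  }

  // Repulsive force on b. A cell far enough away (size/d < theta) is treated
  // as a single point mass; otherwise we recurse into its children.
  force(b: Body, theta = 0.5, k = 100): { fx: number; fy: number } {
    if (this.body === b && this.mass === 1) return { fx: 0, fy: 0 }; // skip self
    const dx = b.x - this.cx, dy = b.y - this.cy;
    const d = Math.hypot(dx, dy) || 1e-6;
    if (this.body || this.size / d < theta) {
      const f = (k * this.mass) / (d * d);        // Coulomb-style repulsion
      return { fx: (dx / d) * f, fy: (dy / d) * f };
    }
    let fx = 0, fy = 0;
    for (const kid of this.kids) {
      if (kid) { const g = kid.force(b, theta, k); fx += g.fx; fy += g.fy; }
    }
    return { fx, fy };
  }
}
```

Raising θ trades accuracy for speed: more distant cells get summarized, fewer get expanded.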

From chaos to clarity

Three messy meeting notes go in. A structured, cross-referenced knowledge base comes out.

Raw input: 3 meeting notes
"Talked to Sarah Chen from Acme Corp about SSO integration. She wants SAML. Budget approved Q3. Also David Lee mentioned their team is evaluating Zep."
"Product sync: RAG latency is 2.3s p95. Mike thinks we should try compiled knowledge approach. Sarah from Acme pinged again about timeline."
"Investor call with Priya at Maple VC. She asked about our LongMemEval numbers and how Dream Engine differs from Mem0's consolidation."
Organized output
Entity: Sarah Chen — Acme Corp, wants SAML SSO, budget approved Q3, followed up on timeline. 2 sources
Entity: Acme Corp — Evaluating SSO + Zep. Key contact: Sarah Chen, David Lee. 2 sources
Entity: Priya — Maple VC, interested in benchmarks and Dream Engine differentiation. 1 source
Concept: Compiled Knowledge — Alternative to RAG. Mentioned by Mike re: 2.3s latency issue. 1 source
Contradiction — Acme evaluating Zep (David) vs. requesting SSO integration (Sarah). Flagged for review.
Timeline — Q3: Acme SSO budget. Cross-ref: Sarah Chen, Acme Corp, SAML.

8 search modes. One API.

Different questions need different retrieval strategies. The Second Brain picks the right mode automatically, or you specify it.

01
Semantic
Meaning-based vector search. Finds relevant content even when wording differs.
02
Full-Text (FTS)
Exact term matching with AND-first ranking. Fast, precise, zero ambiguity.
03
Hybrid
Combines semantic + FTS with reciprocal rank fusion. Best of both worlds.
04
Graph
Traverses entity relationships. "Who is connected to Sarah Chen?" in one hop.
05
Temporal
Time-windowed retrieval. "What happened last week?" with chronological ordering.
06
Episodic
Retrieves full event sequences. Reconstructs conversations and meeting flows.
07
Associative
Follows cross-references outward. Surfaces knowledge you didn't think to search for.
08
Contextual
Uses your hot cache to bias results toward your current work. Always relevant.
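
Reciprocal rank fusion, the technique named in the Hybrid mode above, is simple enough to show in full: each document scores the sum of 1/(k + rank) across every ranked list it appears in, so agreement between semantic and full-text results floats a document to the top. The constant k = 60 is a common default, not necessarily the product's setting.

```typescript
// Merge ranked result lists (e.g. semantic + FTS) with reciprocal rank fusion.
// score(doc) = sum over lists of 1 / (k + rank), rank starting at 1.
function rrf(rankings: string[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of rankings) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score); // highest fused score first
}
```

A document ranked #2 in both lists beats one ranked #1 in only one list — that cross-signal agreement is the "best of both worlds" the Hybrid mode relies on.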

Ask anything. Get cited answers.

Every answer cites the specific knowledge pages it drew from. Never fabricates. If there's a gap, it tells you what's missing and suggests a source to ingest.

Quick
Factual lookups
Reads only the hot cache and master index. Instant answers for simple factual questions. Zero page loads.
~1,500 tokens
Standard
Most questions
Hot cache + index + 3-5 relevant pages, following cross-references to depth 2. Cited, synthesized answers.
~3,000 tokens
Deep
Comprehensive synthesis
Full knowledge base scan + optional web supplementation. Files the result back as a permanent knowledge page.
~8,000+ tokens

Give it a topic. Get a research dossier.

The autoresearch engine runs 3-5 rounds of web research. Broad search first, then gap-filling, then synthesis. Everything gets filed into your knowledge base with citations and cross-references.

R1
Broad Search
Decomposes the topic into 3-5 angles. Runs 2-3 searches per angle. Fetches top results. Extracts claims, entities, concepts, open questions.
R2
Gap Fill
Identifies contradictions and missing pieces from Round 1. Runs targeted searches (max 5 queries). Resolves conflicts with additional sources.
R3
Synthesis
Creates source pages, concept pages, entity pages, and a master synthesis page. Confidence scored: high (multiple authoritative), medium (single good source), low (unverified).
// Trigger autoresearch via API
const research = await fetch('/v1/memory/research', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    topic: 'LLM memory architectures vs RAG pipelines',
    max_rounds: 3,
    max_sources: 15,
    confidence_threshold: 'medium'
  })
});

// Returns: source pages, concept pages, entity pages,
// master synthesis with citations and confidence scores

Eight checks keep your brain clean.

The lint engine scans your entire knowledge base for structural issues, contradictions, dead links, and gaps. Run it daily or on demand.

0
Orphan pages
Pages with no inbound links
0
Dead links
References to nonexistent pages
3
Contradictions
Conflicting claims flagged for review
12
Missing pages
Concepts mentioned but not yet documented
Missing cross-references
Entity names appearing in text without links. Auto-fixable.
Frontmatter gaps
Pages missing required metadata fields. Auto-fixable.
Stale claims
Assertions contradicted by newer sources. Flagged for review.
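
Two of these checks — dead links and orphan pages — reduce to a single pass over the link graph. A minimal sketch, assuming pages cross-reference each other with [[wiki-style]] links (the link syntax and function shape are assumptions, not the lint engine's implementation):

```typescript
// One pass over a page map finds both dead links and orphan pages.
function lint(pages: Map<string, string>) {
  const linkRe = /\[\[([^\]]+)\]\]/g;
  const inbound = new Map<string, number>();          // target -> inbound count
  const deadLinks: { from: string; to: string }[] = [];

  for (const [name, body] of pages) {
    for (const m of body.matchAll(linkRe)) {
      const target = m[1];
      if (!pages.has(target)) deadLinks.push({ from: name, to: target });
      else inbound.set(target, (inbound.get(target) ?? 0) + 1);
    }
  }
  // Orphans: pages nothing links to.
  const orphans = [...pages.keys()].filter((n) => !inbound.has(n));
  return { deadLinks, orphans };
}
```
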

The complete second brain toolkit.

Every feature from the Karpathy LLM Wiki pattern, reimagined as a cloud-native API with visual tools.

Source Archive
Immutable storage. Originals never modified. Delta tracking prevents re-processing. Full provenance chain.
Dream Engine Integration
Second Brain feeds the Dream Engine. Raw knowledge becomes insights, patterns, and forecasts overnight.
Contradiction Detection
New claims checked against existing knowledge. Conflicts flagged on both pages. Never silently overwrites.
Auto Cross-References
Every entity and concept mention is linked bidirectionally. The knowledge graph grows denser with each source.
Conversation Save
File any conversation as structured knowledge. Rewritten as declarative facts, not chat transcripts. Indexed and linked.
Visual Canvas
Knowledge graphs, flowcharts, timelines, presentation layouts. Auto-positioned nodes with zone grouping.
Confidence Scoring
High (multiple authoritative sources), medium (single good source), low (speculation). Every claim is rated.
Namespace Isolation
Separate knowledge bases per project, client, or domain. Cross-namespace queries when you need the full picture.
Gap Detection
When a query finds missing knowledge, it tells you exactly what's missing and suggests sources to fill the gap.

Not another RAG pipeline.

RAG re-derives answers from raw chunks every query. The Second Brain compiles knowledge once into persistent, cross-referenced pages. Knowledge compounds — retrieval doesn't.

Feature                        | RAG Pipelines | Note-Taking Apps | REM Labs Second Brain
Knowledge compounds over time  | —             | —                | Yes
Auto entity extraction         | —             | —                | Yes
Contradiction detection        | —             | —                | Yes
Bidirectional cross-references | —             | Manual           | Automatic
Source archive (immutable)     | —             | —                | Yes
Session continuity (hot cache) | —             | —                | ~500 tokens
Autonomous research loops      | —             | —                | 3-5 rounds
Knowledge health checks        | —             | —                | 8 categories
API access                     | Yes           | —                | Yes
Works with any LLM             | Yes           | —                | Yes

Three API calls to a second brain.

Ingest sources, query knowledge, and check health — all through the same API you already use for memory.

// 1. Ingest a source
await fetch('/v1/memory/ingest', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    source: 'https://arxiv.org/abs/2402.15449',  // URL, file path, or raw text
    namespace: 'research'
  })
});
// Creates: 1 source page, 4 entity pages, 3 concept pages, 12 cross-refs

// 2. Query your knowledge base
const answer = await fetch('/v1/memory/ask', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    question: 'What are the tradeoffs between RAG and compiled knowledge?',
    depth: 'standard',       // 'quick' | 'standard' | 'deep'
    namespace: 'research'
  })
});
// Returns: cited answer with source pages referenced

// 3. Check knowledge health
const health = await fetch('/v1/memory/health', {
  headers: { 'Authorization': `Bearer ${apiKey}` }
});
// { orphans: 0, dead_links: 0, contradictions: 3, missing_pages: 12 }

Other tools give you a search box. REM gives you a knowledge base that builds itself.

Stop storing documents. Start building understanding.

Three API calls to a knowledge base that understands what you feed it, connects it to everything else, and tells you what's missing. Compounds over time. Feeds the Dream Engine.

Get Your API Key See pricing