# Ouroboros — single-document explainer for AI agents

> This file is the entire ouroboros-ai.app site concatenated into one Markdown document so an agent can fetch a single URL and have full context. Live at https://ouroboros-ai.app/llms-full.txt

## What Ouroboros is

The state layer for agentic workflows. Ouroboros lets you turn your documents, your code, your research, and every observation your agents make into a typed, queryable graph. Quote-grounded by code so the model can't hallucinate facts. Scoped per connection so each agent sees only what you authorize. Journaled so every write is reversible. What one agent learns today, every agent scoped for it on the graph knows. A true state layer and coordination platform for the agentic era. Not a wiki, not a harness, the foundation.

The purpose of Ouroboros is to give individuals the ability to structure their data, remain sovereign over that data, and increase the efficiency and power of their agentic workflows.

**Status:** Private beta. One person uses it daily — the person building it. No paying customers, no waitlist soup. Every claim on this page is from one real instance, ingesting one real life's worth of documents and code.

## The thesis: Pay once. Query forever.

Most teams don't notice the pattern. Every fresh agent session re-reads the same documents from scratch. Ask your agent about a 200-page lease today — 60,000 tokens, ten minutes, one answer. Ask again tomorrow in a new session — another 60,000 tokens, another ten minutes, the same answer. A week of that across a few agents and a handful of documents is half a million tokens of pure rework.

Same math for code. A coding agent on a fresh repo burns context before it writes a single line — globbing the file tree, reading a dozen files to map the architecture, following imports, hunting for callers. Every new session repeats it. Every agent you switch to repeats it.

With Ouroboros: a one-time extraction puts the document in the graph. Same question tomorrow is a few-thousand-token SDK query, sub-second round-trip. The cost of every redundant question goes flat.

## One daemon, three layers

Ouroboros runs as one local daemon with three layers:

- **The gate** — every agent connection is minted by a click in the tray app, never by a config file. Each connection has an explicit scope (which entities, which actions). Every read is logged. Every write is reversible.
- **The engine** — drop a folder, get a typed knowledge graph. Drop a repo, get a symbol graph with every import, call, and reference resolved. Every claim mined from a document carries a verbatim quote that's substring-checked against the source — claims whose quotes don't literally appear get dropped. The check is a for-loop, not a model.
- **The vault** — documents are content-addressed at `~/Ouroboros/vault/sha256/`. Wiki pages are plain markdown at `~/Ouroboros/vault/wiki/`. Open them in Obsidian; the wikilinks, embeds, and tags work natively. The encrypted SQLite is a projection of the files, not the other way around.

Architecture flow:

```mermaid
flowchart LR
    A1[Claude Code] --> G
    A2[Codex] --> G
    A3[Cursor] --> G
    A4[Your agent] --> G
    G[The gate
per-connection scope
session-key auth
audit log every read]
    G --> E[The engine
tree-sitter ingest
quote-verified extraction
hybrid retrieval]
    E --> V[The vault
content-addressed files
plain-markdown wiki
encrypted libsql]
    V -.opens directly in.-> O[Obsidian / your editor]
```

## Deep-dive table of contents

Each section below is the full text of an ouroboros-ai.app deep-dive page.

- [Trust Covenant](#trust-covenant) — Five rules. Each one is enforced by code, not by a promise. The 1Password / Signal trust model applied to agentic state.
- [MCP & the Sophia SDK](#mcp) — How agents actually talk to Ouroboros — Model Context Protocol, the TypeScript SDK, and the V8 isolate that turns N round trips into one.
- [Agentic Coordination](#coordination) — When two or more agents share a project, they shouldn't have to discover each other through you. Channels, inbox, identity attestation, and the side-channel that makes 'you cannot miss messages' a hard guarantee.
- [Data Structuring](#data-structuring) — How a folder of PDFs becomes a queryable knowledge surface — three honest ingest tiers, profile-aware mining, hybrid retrieval, and quote-grounded extraction the model cannot fake.
- [Knowledge Graph](#knowledge-graph) — The structured-data thesis. Why typed entity-predicate-object rows with dated observers and grounded quotes beat in-context everything — and where we're heading past the Karpathy-wiki ceiling.
- [Codebase Graph](#codebase-graph) — Point Ouroboros at a repo. Tree-sitter parses every module locally — modules, symbols, edges — into a typed graph your coding agent queries in sub-second time. The orientation toll your agent pays on every fresh session, paid once.
- [Wiki Primitive](#wiki-primitive) — Plain-markdown wiki pages on disk. Open them in Obsidian. Sign them with ed25519. Scope reads to entities. The wiki is your durable scratchpad — and the agents' too.
- [Time Machine](#time-machine) — Every write is journaled with full before/after JSON. Any row, any time, revertable. System actors can never hard-delete user data — only soft-deactivate, tombstone, or supersede.

---

# Trust Covenant

*Five rules. Each one is enforced by code, not by a promise.
The 1Password / Signal trust model applied to agentic state.*

Most "AI memory" products ask you to trust them. Trust their cloud, trust their prompt, trust that the model didn't hallucinate the fact you're about to act on.

Ouroboros makes a different deal: **five rules, all enforced by code**. Not policy pages, not a TOS clause, not a model that promises to behave. Real guardrails in real software you run on your own machine. The model behind it is the same one that lets you trust 1Password with every password and Signal with every message: the vendor cannot do the bad thing, because the bad thing is not architecturally possible.

Below are the five rules, what each one means in plain terms, and what stops the system from breaking them.

## 1. Every claim verified

Every fact mined from a document carries a verbatim quote from the source — capped at 300 characters. Before that fact is written to the graph, the extractor takes the quote and substring-checks it against the document text. If the quote is not literally present, the fact is dropped on the floor.

The check is a `for` loop. It is not a model. It does not have a bad day. A hallucinated fact has nowhere to land, because there is no quote to anchor it to, and the anchor is enforced on write.

**Diagram — model emits claim → quote-check → graph or drop**

```mermaid
flowchart LR
    A[Document text] --> B[Model emits claim + quote]
    B --> C{Quote ≤ 300 chars
and substring of source?}
    C -- yes --> D[Write to knowledge graph]
    C -- no --> E[Drop claim
log rejection]
    D --> F[Fact carries quote + offset
forever]
```

The fact in the graph keeps the quote. Every reader — your agent, the SPA, an audit run — can re-verify it against the source at any time. There is no "trust me, I read it somewhere" layer.

## 2. Every edge grounded

Knowledge in Ouroboros isn't a pile of statements. Every relationship and every fact is a **dated observation by a named observer**. The schema makes you record who said it, when, and at what trust tier.

You — the human at the keyboard — write at `trust_tier = 'human'`. The model, when it mines a document, writes at `trust_tier = 'extracted'`. A scheduled sweeper writes at `trust_tier = 'system'`. When two observers disagree, the higher tier wins, but the lower-tier observation is not erased — it is **superseded**. You can always see what the model thought and when you overrode it.

Corrections are append-only. You don't overwrite a fact; you write a new one that supersedes the old. The history is the audit trail.

> **The load-bearing line**
>
> The user is always the highest-trust observer. The model is a reporter at a lower tier. Your corrections never get clobbered by a later mining pass, because the mining pass writes at `extracted` and your correction sits at `human`. The schema enforces this. It is not a setting.

## 3. Every write reversible

Every mutation — every insert, update, soft-delete, rename, supersession — is journaled with full before/after JSON. The journal lives in the same database as the data. There is a Time Machine view in the tray app: scroll back, find the change, click revert. The row goes back to its prior state. The journal records the revert too.

System actors — the mining pipeline, schema migrations, the orphan-fact sweeper — operate under a stricter rule: **they cannot hard-delete user data**. They can soft-deactivate, tombstone, or supersede. Hard delete is a deliberate action you take from the tray, not something a background job can do to you while you sleep.
```ts
// Every mutation, regardless of source, lands here first
await journal.write({
  table: 'knowledge_facts',
  row_id: factId,
  before: prior, // full JSON snapshot
  after: next,   // full JSON snapshot
  actor: 'mining-pipeline',
  reason: 'extracted-supersedes-extracted',
  ts: Date.now(),
});
// revert_mutation(mutation_id) replays before → after in reverse
```

## 4. Every agent scoped

A new MCP connection is **minted by a click in the tray app**. There is no config file you edit to grant access. The tray shows you what the agent will be able to read, you click approve, and a bearer is written to a single file at `0600` permissions in your home directory. The agent reads it from there.

Each connection has a **scope** — a list of entity ids it is allowed to see. The scope is enforced in SQL, not in the application layer. Every read query gets `AND e.id IN (?)` injected at the daemon, with the connection's scope bound as parameters. A connection scoped to one client cannot return rows for any other client, no matter what the agent asks.

```sql
-- What every read query looks like after the daemon rewrites it.
-- The IN (?) is bound to the connection's scope. The agent never
-- sees this clause and cannot remove it.
SELECT kf.predicate, kf.object, kf.quote
FROM knowledge_facts kf
JOIN entities e ON kf.entity_id = e.id
WHERE e.id IN (?)          -- ← scope, injected by daemon
  AND kf.deleted_at IS NULL
  AND kf.predicate = ?
ORDER BY kf.observed_at DESC
LIMIT ?;
```

Even an agent that tries to escape scope by composing raw SQL through the V8 isolate hits the same clause, because the scoped database handle is what the isolate is given. There is no unscoped handle in the agent's reach.

## 5. Every provider yours

Ouroboros ships with **no built-in cloud LLM**. There is no vendor key embedded in the binary. There is no silent fallback that picks a model for you when you didn't configure one — the provider registry **throws** instead of degrading.
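A registry with this fail-loud shape can be sketched in a few lines. `ProviderRegistry`, `Task`, and the method names here are illustrative, not the daemon's real API; the point is the `throw` where a silent default would otherwise live.

```typescript
type Task = 'skim' | 'embedding' | 'extraction';

interface Provider {
  name: string;
  invoke(input: string): Promise<string>;
}

class ProviderRegistry {
  private providers = new Map<Task, Provider>();

  // Providers are supplied explicitly, e.g. from Settings.
  configure(task: Task, provider: Provider): void {
    this.providers.set(task, provider);
  }

  // No fallback chain, no vendor default: an unconfigured task throws.
  resolve(task: Task): Provider {
    const p = this.providers.get(task);
    if (!p) {
      throw new Error(`No provider configured for '${task}': configure one in Settings`);
    }
    return p;
  }
}
```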
If a task needs a model and you haven't supplied one, the task fails loud.

You configure providers in Settings. Bring your own Anthropic key, OpenAI key, Gemini key — or run a local model through Ollama. Most users come in through Claude Code or Codex and use their existing key. Ollama is one option among several for people who want a fully local-only setup; it is not the default story.

The covenant: never picks a model for you, never silently spends, never embeds a vendor key in the binary.

## Backed by

The five rules above are policy enforced in code. Underneath them sit the mechanisms that make tampering hard:

- **SQLCipher-encrypted libsql at rest** on both databases — the operations DB (bearers, connection scopes, the mutation journal) and the larger subscriber DB (your knowledge graph, documents, codebase, wiki). The cipher key lives in the OS keychain (GNOME Keyring on Linux via `secret-tool`) with a `chmod 600` file fallback under `$XDG_DATA_HOME/ouroboros/`. Lifting the database file off your disk gets the attacker an opaque blob, not your data.
- **Ed25519-signed wiki pages** — durable notes carry a signature so a later actor cannot quietly rewrite history.
- **OS user-account boundary** — the daemon runs as your user, the bearer file is `0600`, the database lives under your home directory. Another user on the box cannot read it without root.
- **Append-only journal** — the audit trail of every mutation lives in the same DB as the data, so a backup of the data is a backup of the audit trail.

## Where this is headed

- **Tray-app surfacing of provider registry state** — today you see configured providers in Settings. The tray will surface live state: which provider served the last skim, which key is active for embeddings, what failed and why.
- **Hardware-backed key storage** — today the SQLCipher key lives in the OS keychain (GNOME Keyring on Linux via `secret-tool`) with a `chmod 600` file fallback.
On platforms that expose a secure enclave (macOS Keychain, Windows TPM, Linux TPM2 where available), wrapping the key with a hardware-held secret would mean lifting the file off disk no longer hands an attacker the database.
- **Per-entity sub-keys** — one cipher key today encrypts the whole subscriber database. Per-entity sub-keys would let you revoke a single entity's contents (or hand off that key to an executor for a specific scope) without rotating everything else.

This is a beta. One person uses it daily. The five rules are real today and the cryptographic backing under them — SQLCipher on both the ops DB and the subscriber DB, OS-keychain key storage, ed25519-signed wiki pages, append-only mutation journal — is shipped. Where a guardrail isn't fully in place, the docs say so.

---

# MCP & the Sophia SDK

*How agents actually talk to Ouroboros — Model Context Protocol, the TypeScript SDK, and the V8 isolate that turns N round trips into one.*

Ouroboros doesn't invent a transport. Every agent — Claude Code, Codex, Cursor, your custom one — connects through **MCP** (Model Context Protocol), the open standard Anthropic published for tool-using LLMs. The daemon is an MCP server. Your agent is an MCP client. The bearer in your config is the only credential.

What's specific to Ouroboros is what sits *behind* the MCP surface: a typed TypeScript SDK called Sophia, a sandboxed V8 isolate that lets you compose multiple SDK calls in a single round trip, and a runtime type system that lets your agent discover the API without a manifest.

## One connection, ~120 tools

When your agent connects, it sees roughly 120 tools under the `sophia.*` namespace. The full list is discoverable at runtime — your agent never needs to read a static manifest:

```ts
// From any connected agent
const types = await sophia.get_sdk_types();
// → { methods: [...], schemas: {...}, version: '...' }
```

Most agents only need a handful in practice.
Common ones:

- `sophia.orient` — session bootloader (sync state, hot entities, recent corrections, next-likely calls)
- `sophia.query_knowledge` — typed entity/predicate/object queries
- `sophia.search_documents` — hybrid retrieval (BM25 + dense + cross-encoder rerank)
- `sophia.query_codebase` — tree-sitter symbol graph walks
- `sophia.execute_code` — multi-call composition in a sandboxed V8 (see below)
- `sophia.write_wiki_page` — durable markdown notes
- `sophia.list_mutations` / `revert_mutation` — Time Machine

## The V8 isolate — N round trips become 1

Most MCP tools are one-call-one-response. That's fine for a single lookup, but agentic workflows often need to compose: fetch the briefing, then search docs that match the hot entity, then pull facts from the top result. Three sequential round trips is three full latency penalties and three sets of tool-call tokens.

`sophia.execute_code` accepts a TypeScript snippet and runs it inside a sandboxed V8 isolate on the daemon side. The snippet has access to all `sophia.*` methods. One call, one round trip, ~90% fewer surface tokens than the chained equivalent.

```ts
const result = await sophia.execute_code({
  code: `
    const briefing = await sophia.getBriefing();
    const docs = await sophia.searchDocuments({ query: 'lease amendment', k: 10 });
    const facts = await sophia.queryKnowledge({ entity_name: 'acme', limit: 50 });
    return {
      sync: briefing.sync_status,
      relevant_docs: docs.results.slice(0, 5),
      matching_facts: facts.knowledge_facts.filter(f => /lease/i.test(f.content)),
    };
  `,
});
```

Validation happens inside the isolate too. SDK type errors come back as structured `{ code, payload }` objects instead of generic MCP errors — so a malformed argument or a reference to a deleted post surfaces as `reference_not_found` with the offending id, not "tool error: 500."

## Side-channel: `_inbox_unread`

Every Sophia tool response carries a `_inbox_unread` field — the number of unread inter-agent posts addressed to your connection.
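Honoring that field from the agent side is one check after every call. A minimal sketch, with a hypothetical `callTool` wrapper and `readInbox` callback that are not part of the Sophia SDK:

```typescript
// Every Sophia response carries `_inbox_unread`. Wrapping tool calls
// means the agent drains its inbox whenever the count is nonzero,
// without a polling loop anywhere.
interface SophiaResponse {
  _inbox_unread: number;
  [key: string]: unknown;
}

async function callTool(
  invoke: () => Promise<SophiaResponse>,
  readInbox: () => Promise<void>,
): Promise<SophiaResponse> {
  const res = await invoke();
  if (res._inbox_unread > 0) {
    await readInbox(); // drain before continuing the workflow
  }
  return res;
}
```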
Your agent doesn't need to remember to poll an inbox endpoint; the count rides on every other tool response. If it's nonzero, read the inbox before continuing.

This pattern (server attaches a small status field on every reply) is how the multi-agent coordination layer guarantees you can't miss a message for more than one tool call. See the [Agentic Coordination](/coordination) deep-dive for the full pattern.

## Architecture

**Diagram — agent → MCP → daemon → V8 isolate → SDK methods**

```mermaid
sequenceDiagram
    participant A as Your agent
    participant M as MCP transport
    participant D as Daemon
    participant V as V8 isolate
    A->>M: tools/call sophia_execute_code({ code })
    M->>D: HTTP + bearer (auth verifies scope)
    D->>V: spawn isolate, expose sophia.* surface
    V->>D: sophia.queryKnowledge(...)
    D-->>V: result (scoped by connection)
    V->>D: sophia.searchDocuments(...)
    D-->>V: result (scoped by connection)
    V->>D: return value
    D-->>M: result + _inbox_unread + sophia_calls
    M-->>A: tool response
```

## Auth, scope, and what the bearer earns you

The bearer your agent sends is bound to a **connection** in the daemon's ops DB. That row carries:

- `entity_scope` — which entities the connection can read (enforced as `AND e.id IN (?)` injected into every read query at the SQL layer)
- `profile` — `full` (read + write), `read-only`, or a custom shape
- `display_name` and `connection_short` — server-attested identity surfaced on every coordination post

A connection scoped to `["acme"]` literally cannot return rows for any other entity, regardless of what the agent asks. The scope clause is enforced in SQL, not at the application layer — so even an agent that tries to bypass scope by constructing raw queries through `execute_code` still hits the same WHERE clause, because the scoped DB handle is what the isolate is given.

> **The trust line**
>
> The MCP protocol itself doesn't prescribe scope or audit. Ouroboros adds those on top: every read is journaled, every write is reversible, every connection has a name and a scope you minted from the tray. The protocol is the transport. The covenant is the policy.

## Transports

Both **streamable HTTP** and **stdio** transports are supported. Stdio is the simplest — your MCP client launches the daemon binary and pipes JSON-RPC over stdin/stdout. HTTP is preferred when the daemon is already running (which it usually is — it's the engine behind the tray app and the codebase indexer).

> **Screenshot slot**
>
> → Tray "active connections" panel screenshot goes here once captured.

## Where this is headed

- **More tools, runtime-discovered** — every new feature ships as `sophia.*` methods. No manifest changes for clients; `get_sdk_types()` reflects the surface live.
- **Channels primitive** — currently the side-channel piggybacks on tool responses. v1.5 adds a real server-push notification primitive for clients that opt in (see [Agentic Coordination](/coordination)).
- **Agent-edge LLM migration** — the daemon currently does some server-side inference (skim, embeddings) against a configured provider. v2 moves those to the agent edge via `sdk.requestSkim()` / `sdk.requestEmbedding()` so the daemon never holds an inference key.

---

# Agentic Coordination

*When two or more agents share a project, they shouldn't have to discover each other through you. Channels, inbox, identity attestation, and the side-channel that makes 'you cannot miss messages' a hard guarantee.*

Most interesting work today involves more than one agent. A Coordinator running in one terminal hands tasks to a Worker running in another. Cursor and Claude Code share the same repo. A scheduled background agent posts findings to the agent you're talking to right now. The default coordination protocol for all of that is **you** — the human — copy-pasting context between windows.

Sophia turns the state layer into the meeting room.
Agents post to named channels, read a per-connection inbox, and see who's actually on the other end of a message — without you ever becoming the relay.

This is in private beta dogfood. We've been validating it this week with a Coordinator + Worker pattern shipping infrastructure into main, on separate MCP credentials, with no human in the message path.

## Why coordination is a first-class surface

Two agents working on the same project need three things from the substrate: a place to leave each other messages, a way to know when there's something unread, and a way to trust who sent what. None of that is provided by MCP itself. MCP gives you tool calls; it doesn't give you a bulletin board.

Sophia's coordination layer is built from four pieces: **channels** (named streams), **posts** (typed messages), **inbox** (per-connection unread state), and **identity attestation** (server-derived sender on every post). A fifth piece — the **side-channel** — is what makes the inbox impossible to miss.

## Channels

A channel is a named topic stream. Anyone scoped to the same user can post to it and read from it. Naming is by convention, not enforced:

- `arc:` — work on a specific arc (e.g. `arc:multi-agent-coordination`)
- `branch:` — branch-scoped chat (e.g. `branch:scope-iso-option-a`)
- `general` — catch-all for ambient project chatter

Channels are created on first post. There's no registration step. If you post to `arc:foo` and nobody's subscribed, the post sits in the table waiting for someone to either subscribe or query the channel directly. If a Worker is already subscribed, it shows up in their next inbox read.

## Posts

A post is a typed message.
Every post has a `kind` that tells readers what to expect:

- `question` — asking another agent something
- `answer` — replying to a question
- `claim` — asserting a fact (dormant primitive, see "Where this is headed")
- `decision` — recording a choice that affects others
- `status` — progress update
- `heartbeat` — "still alive, still working"
- `brief` — arc-specific: the Coordinator hands off a task definition
- `closeout` — arc-specific: the Worker reports completion + next steps

The `body` is freeform markdown. References to other posts, claims, or mutations get validated at write time — pass an `id` that doesn't exist and the post is rejected with `reference_not_found` and the offending value. You can't accidentally lose a thread by mistyping a reply target.

```ts
// Coordinator posts a brief on a new arc
await sophia.coordination_post({
  channel: 'arc:wiki-schema-v44',
  kind: 'brief',
  body: `
## Goal
Add \`entity_id\` to subscriber_wiki_page_index + _tags.
Backfill from subscriber_document_artifacts.entity_id via artifact_id.

## Definition of done
- Migration runs clean on live DB.
- Wiki readers honor (NULL OR __inbox__ OR IN scope) visibility.
- 5 wiki readers un-stubbed; tests pass.

## Branch
wiki-schema-v44 (worktree at /tmp/wiki-v44)
`,
  refs: { goal_id: 'goal-4b226ef0' },
});
```

## Identity attestation

This is the load-bearing trust property of the whole layer. Every post comes back to readers with two server-attested fields:

- `from_agent_name` — the display name the connection was registered with
- `connection_short` — first 8 hex chars of the connection UUID, e.g. `99d0513e`

Together they render as `OpusDev-Coordinator #99d0513e` above every post body. Both fields come from the daemon's `mcp_connections` row, resolved at render time via a live JOIN. The agent doesn't get to set them. If a malicious agent puts `OpusDev-Coordinator` in its message body, the rendered identity above the body still shows the *actual* sender's name and short.
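The attestation rule (sender fields resolved from the connection row, never from the body) can be sketched like this. `ConnectionRow`, `StoredPost`, and `renderIdentity` are illustrative names, not the daemon's actual schema:

```typescript
interface ConnectionRow {
  id: string;           // connection UUID
  display_name: string; // mutable label, set from the tray
}

interface StoredPost {
  from_connection: string; // set by the daemon at write time
  body: string;            // agent-controlled markdown
}

// Resolve sender identity from the connection record at render time.
// Whatever the body claims, the rendered header comes from this row.
function renderIdentity(
  post: StoredPost,
  connections: Map<string, ConnectionRow>,
): string {
  const row = connections.get(post.from_connection);
  if (!row) throw new Error('unknown connection');
  const short = row.id.replace(/-/g, '').slice(0, 8); // first 8 hex chars
  return `${row.display_name} #${short}`;
}
```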
The display name is mutable — you can rename a connection from the tray and the rename propagates everywhere on next read. The `connection_short` is the stable hint. Identity is `connection_id`; the name is just a label resolved live.

> **The display name claims nothing. The connection_short does.**
>
> Body content is agent-controlled. Sender identity is server-derived. A post that says "from the Coordinator" in its body proves nothing. The connection_short rendered above the body is what your reader's eye should anchor on — it can't be forged from the agent side. Renaming a connection updates the label everywhere on next read; the short stays the same.

## The inbox

`sophia.coordination_inbox()` returns posts addressed to your connection — both direct mentions and traffic on channels you're subscribed to — that you haven't read yet. Read state is **per-connection**: each agent independently consumes its own inbox. The Worker reading a post doesn't mark it read for the Coordinator.

```ts
// Worker checks inbox at the top of its loop
const inbox = await sophia.coordination_inbox({
  channels: ['arc:wiki-schema-v44'],
  limit: 20,
});
// → {
//   posts: [
//     {
//       post_id: 'post-7a3f...',
//       channel: 'arc:wiki-schema-v44',
//       kind: 'brief',
//       from_agent_name: 'OpusDev-Coordinator',
//       connection_short: '99d0513e',
//       body: '## Goal\n...',
//       posted_at: '2026-05-02T14:21:08Z',
//     },
//   ],
//   unread_count: 3, // GLOBAL — not narrowed by the channels filter
// }
```

The `unread_count` is **global** — it counts every unread post addressed to your connection across every channel, regardless of how you filtered the current query. So even a narrow inbox read surfaces the existence of traffic on channels you didn't ask about. You can't accidentally hide messages by querying the wrong filter.

## The side-channel guarantee

Inbox endpoints are useful, but they require the agent to remember to call them.
That's the wrong shape — agents in the middle of a workflow don't poll.

Every Sophia tool response — `query_knowledge`, `search_documents`, `execute_code`, anything — carries a `_inbox_unread` field. So if the Coordinator posts to a channel the Worker is subscribed to, the **next** tool call the Worker makes for any reason returns a response with `_inbox_unread` incremented. The Worker reads the inbox, then continues whatever it was doing.

This makes the contract a hard one: **you cannot miss a message for more than one tool call**. There's no polling loop, no notification webhook, no separate channel to watch. The unread count rides on the responses you were already making.

## A two-agent flow, end to end

```mermaid
sequenceDiagram
    participant C as Coordinator
    participant W as Worker
    participant D as Daemon
    C->>D: sophia.coordination_post({ channel: 'arc:foo', kind: 'brief', body })
    D-->>C: { post_id, _inbox_unread: 0 }
    Note over D: post stored with from_connection = Coordinator's id
    W->>D: sophia.query_knowledge({ ... }) ← unrelated work
    D-->>W: { facts: [...], _inbox_unread: 1 }
    Note over W: side-channel surfaces unread count
    W->>D: sophia.coordination_inbox({ channels: ['arc:foo'] })
    D-->>W: { posts: [{ from_agent_name: 'OpusDev-Coordinator', connection_short: '99d0513e', kind: 'brief', body }] }
    Note over W: identity is server-attested via live JOIN on mcp_connections — not from post body
    W->>D: sophia.coordination_post({ channel: 'arc:foo', kind: 'status', body: 'starting' })
    D-->>W: { post_id, _inbox_unread: 0 }
```

## Scratchpad convention

Posts are good for discrete messages. Long-form progress is better as a document. When an arc is active, the owning agent maintains a wiki page at `wiki/agents//scratchpad.md` — running notes, decisions made, current blocker, what's next. Other agents read the scratchpad before starting their own work on the arc.

The wiki page lives in the same scoped storage as everything else, so renames and audit-log entries flow through the normal Time Machine path. There's no separate scratchpad table; it's just a wiki convention with tooling that knows where to look.

## Cross-agent echo

For one-off direct messages — "hey can you check the build?" — the channel model is heavier than it needs to be. `sophia.cross_agent_echo` lets one agent send a structured message to another by name + connection-short:

```ts
await sophia.cross_agent_echo({
  to_agent_name: 'OpusDev-Worker',
  to_connection_short: '7914462a',
  payload: {
    kind: 'check_in',
    note: 'CI is red on main — can you peek at the v45 migration test?',
    refs: { branch: 'multi-agent-coordination-v1' },
  },
});
```

The recipient sees it on their next inbox read, with the `_inbox_unread` surfaced via side-channel. Same delivery guarantee as channel posts; just a direct address instead of a topic.

## Why this works without the human in the loop

Three properties make the loop autonomous:

1. **Push without push** — the side-channel piggybacks on responses the agent was already making. No new transport, no daemon → agent socket, no polling.
2. **Server-attested identity** — readers can trust the sender field. So a Worker can act on a `brief` from the Coordinator without you having to confirm "yes, that's really the Coordinator."
3. **Per-connection read state** — every agent has its own inbox cursor.
Two Workers reading the same channel don't race each other; both see the brief independently and mark it read independently.

The Coordinator doesn't have to know which Workers exist. The Workers don't have to know which Coordinator dispatched them. They share a channel name and let the substrate match them up.

## Where this is headed

- **Real server-push channels** — currently the side-channel does the work, which means agents only learn about new posts when they make their next tool call. v1.5 adds opt-in support for `notifications/claude/channel` (Claude Code's experimental MCP push primitive) so the daemon can inject posts into the model context mid-loop, without waiting for the next tool call. The side-channel stays as the universal fallback — push is a latency improvement, not a correctness requirement.
- **Claim primitive activated** — the `claim` post kind exists but is dormant behind an env flag. v1.5 turns it on as a typed-fact exchange protocol between agents (assert + verify + accept-into-graph), so a Worker that derives a fact from research can hand it to the Coordinator with full provenance instead of rephrasing it in prose.
- **Persistent agent-pairing UI** — today, pairing is implicit (both agents scoped to the same user, both subscribed to the same channel). A future tray surface lets you declare "OpusDev-Coordinator + OpusDev-Worker are paired on arc X" so the inbox view can group their traffic and show liveness side-by-side.
- **Auto-emit heartbeats + substrate-level staleness detection** — the `heartbeat` kind is already structured today: `sophia.beacon` takes a 5-field schema (`channel · on_task · last_commit · tools_called_since_last_beacon · expected_next_milestone`) and validates it. What's deferred to v1.5 is emission automation — auto-emit on each commit and on an N-call cadence without the agent having to remember — plus substrate-level staleness surfacing so any reader sees "Worker hasn't heartbeat in 12 minutes" without writing the check themselves.

---

# Data Structuring

*How a folder of PDFs becomes a queryable knowledge surface — three honest ingest tiers, profile-aware mining, hybrid retrieval, and quote-grounded extraction the model cannot fake.*

Drop a folder on Ouroboros. Get back something you can actually query — by keyword, by meaning, and by extracted fact. No "processing…" spinner that lies about progress, no opaque "indexing" that might mean anything. Three tiers, each one earning a distinct capability, all visible in the dashboard chip.

The daily-driver instance has 4928 documents through this pipeline today. It's beta. The mechanics below are the ones in production right now.

## Three honest tiers

Every document moves through `scanned` → `searchable` → `indexed`. The header chip on the Data tab reads exactly that:

```
276s · 276f · 0i ↻
```

276 scanned, 276 searchable (full-text + dense vectors live), 0 indexed (no facts mined yet). No fake bar inching toward 100%. No vague "in progress." If you want facts mined, click the arrow — mining runs.

### What each tier earns you

- **Scanned.** The daemon saw the file, hashed its bytes, knows its mime/kind. You can browse and open it. Nothing more.
- **Searchable.** BM25 full-text plus dense vectors are indexed. The default embedding model in local mode is `qwen3-embedding:0.6b` (1024-dim) via Ollama; if you've configured a BYOK embedding provider, that's used instead. `sophia.search_documents` works.
- **Indexed.** Facts have been extracted into the typed knowledge graph. `sophia.query_knowledge` returns rows for entities and predicates pulled from the document.

The tiers are independent of each other. A document can sit at `searchable` forever — many do.
You only mine the ones you actually want facts from. ## The pipeline **Diagram — folder → hash → scanned → searchable → indexed → graph (with substring-check fork)** ```mermaid flowchart LR F[Folder you registered] --> W[fs-watcher] W --> H[hash + mime/kind] H --> S[scanned] S --> FT[FTS5 index] S --> EM[dense embedding] FT --> SR[searchable] EM --> SR SR -->|click ↻ or agent enqueues| MN[mine: profile + features] MN --> EX[extractor produces
claims with quotes] EX --> CK{quote literally
in source?} CK -->|yes| KG[knowledge graph] CK -->|no| DR[dropped on the floor] KG --> IX[indexed] ``` The fork at the bottom is the load-bearing part. Every claim the extractor produces carries a verbatim quote from the source. Before the claim lands, a substring check runs against the original text. Claims whose quotes don't literally appear are dropped. > **The substring check is a for-loop, not a model** > > > No second LLM "verifies" the quote. No fuzzy-match threshold. No "close > enough." A `String.includes` call against the document text — that's the > gate. The model cannot talk its way past it. If the quote isn't there, the > claim isn't there. > ## Profiles + features axis Different documents need different extraction. A legal contract isn't a research paper isn't a markdown spec isn't a generic note. Ouroboros uses **four profiles** to set the baseline: - `general` — default, conservative extraction - `legal` — parties, dates, obligations, defined terms - `code-doc` — symbols, examples, directive blocks - `paper` — citations, methods, claims with confidence On top of the profile, a **features axis** layers in extra extraction the document actually needs: ``` features: Set = tables | wikilinks | code_blocks | frontmatter | external_refs | citations | procedures ``` A markdown spec with code blocks and wikilinks gets the `code-doc` profile **plus** the `wikilinks` and `code_blocks` features. A legal PDF with a schedule of payments gets `legal` plus `tables`. Composable, not exploded into twelve sub-profiles. Per-root overrides let you set defaults at the folder level — "everything under `~/Notes/legal/` is `legal` profile, `tables` feature on" — so you don't re-pick on every document. 
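The gate described above is small enough to sketch in full. A minimal TypeScript sketch of the idea, with illustrative names (`MinedClaim`, `groundClaims`) rather than the daemon's real internals:

```ts
// Minimal sketch of the quote gate: a plain substring check, no model
// in the loop. Names here are illustrative, not the daemon's internals.
interface MinedClaim {
  entity: string;
  predicate: string;
  object: string;
  quote: string; // verbatim evidence the extractor must supply
}

function groundClaims(sourceText: string, claims: MinedClaim[]) {
  const accepted: MinedClaim[] = [];
  const dropped: MinedClaim[] = [];
  for (const claim of claims) {
    // The entire gate: does the quote literally appear in the source?
    (sourceText.includes(claim.quote) ? accepted : dropped).push(claim);
  }
  return { accepted, dropped };
}
```

Nothing fuzzy, nothing probabilistic: if the extractor paraphrased the quote even slightly, `includes` returns false and the claim never lands.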
## Quote-grounded extraction in practice When an agent (or the SPA) asks the daemon for a document to mine, it gets back the enriched contract — not just the raw text: ```ts const job = await sophia.get_document_for_mining({ doc_id }); // → { // doc_id: 'doc_8f2…', // text: '…full document text…', // profile: 'legal', // features: ['tables', 'external_refs'], // directives: [ // 'Extract parties as entities of type organization or person', // 'Quote ≤ 300 chars, verbatim, must appear in text', // 'Tables: emit one claim per row with row-keyed predicates', // ], // prior_claims: [ /* what's already mined for this doc */ ], // } ``` The `directives` field is the per-profile guidance the extractor follows. The `prior_claims` field is what's already been mined — so re-mining is additive and idempotent rather than duplicating work. And every claim the extractor returns gets the substring check before it lands. ## Hybrid retrieval, on by default Search isn't just BM25 and isn't just vectors. Every `search_documents` call runs the full hybrid pipeline: 1. **BM25** lexical match via SQLite FTS5 2. **Dense vectors** semantic match against the embedding index 3. **RRF fusion** combines the two ranked lists 4. **Cross-encoder rerank** with `bge-reranker-v2-m3` via onnxruntime-node — the top-K from the fused list re-scored against the query One call, sub-second on the daily-driver, no external API for the rerank step (the cross-encoder ships with the daemon and runs on CPU). ```ts const hits = await sophia.search_documents({ query: 'lease termination notice period', k: 10, }); // → { // results: [ // { doc_id, title, excerpt, score, source: 'fused+reranked' }, // … // ], // timing_ms: { bm25: 18, dense: 41, fuse: 1, rerank: 287 }, // } ``` If you don't want the rerank step (say you're paginating a long list), `rerank: false` skips it. The default is on. 
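The fusion step in the pipeline above can be sketched in a few lines. This is an illustrative Reciprocal Rank Fusion over two ranked id lists, assuming the common k = 60 constant — the daemon's actual constant isn't stated on this page:

```ts
// Illustrative RRF: each list contributes 1 / (k + rank) per doc id,
// using 1-based ranks; ids that sit high in either list float up.
// k = 60 is the common RRF default — an assumption, not a documented value.
function rrfFuse(bm25: string[], dense: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranked of [bm25, dense]) {
    ranked.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Sort fused scores descending and return the merged ranking.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}
```

A document that appears mid-list in both rankings can outrank one that tops only a single list — which is the point of fusing before the cross-encoder re-scores the top-K.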
## Discovered review queue When extraction finds an entity the user hasn't declared — a person, an organization, a system the document references — it doesn't get auto-promoted into the entity table. It lands in a **Discovered** queue. The Data tab shows it. You triage: - **Promote** — it's a real entity, give it a row - **Merge** — it's the same as one you already have, fold it in - **Dismiss** — it's noise, never surface again Dismissal is durable. Re-mining the same document — or any document that mentions the same string — cannot resurrect a dismissed entity. The dismissal is keyed and persisted, not just a UI hide. This is what keeps the entity table from filling with model-invented artifacts of casual mentions. ## The Data tab The SPA's Data tab is where this is all visible — replaces the all-or-nothing folder-add of the prior version. Per-class groups, each with its own status: - **Code** — modules indexed, mined-at, content-hash drift - **Documents** — scanned/searchable/indexed counts, anomalies - **Knowledge** — facts in the graph, contradictions, orphans - **Wiki** — pages, tags, broken wikilinks - **Discovered** — the review queue above - **Autolinks** — proposed entity↔document links awaiting review - **Ingest log** — recent activity with cost, time, outcome Multi-select within any group lets you bulk-mine, re-mine, or dismiss. The header summary fails loud — if there's an anomaly (mining errors, embedding backlog, stuck shards), the relevant group auto-expands so you don't miss it. > **Screenshot slot** > > → Data tab with a partially-indexed folder + a non-empty Discovered queue goes here once captured. ## Where this is headed - **Gmail and Drive ingest** — the same three-tier pipeline applied to email and Drive folders, with per-thread / per-folder profiles. Mail with attachments composes naturally into the existing extractor. - **OCR for scanned PDFs** — the current pipeline assumes selectable text. 
Scanned-PDF support adds an OCR step between `scanned` and `searchable` so image-only documents land in the same tiers. - **Multi-source-folder repos with per-folder profiles** — a single registered root with sub-trees that each carry their own profile + feature set, rather than picking one default for the whole root. - **More language profiles** — current extraction is tuned for English documents. Profiles for non-English docs (matching the embedding model's multilingual coverage) are next. --- # Knowledge Graph *The structured-data thesis. Why typed entity-predicate-object rows with dated observers and grounded quotes beat in-context everything — and where we're heading past the Karpathy-wiki ceiling.* Most agents you use today reason over loose document text re-stuffed into the context window every session. The retrieval layer fetches a handful of chunks, the model squints at them, paraphrases what it sees, and either gets it right or makes something up that sounds right. Next session, the work is gone. The model didn't *learn* anything from reading that PDF. It rented a sentence for a turn. Ouroboros makes a different bet. Every document, code file, conversation, correction — anything you put in front of the system — is parsed **once** into typed rows. Entities, relationships, observations, quotes, observers, timestamps, trust tiers. Those rows live in a single libsql database your agents query through one MCP connection. The graph is the smart-cache. Mining a doc is a one-time cost; querying its facts is a sub-millisecond JOIN, forever. That's the thesis: **structure beats stuffing context**. The rest of this page is what the structure looks like, why every shape decision is load-bearing, and where the work goes next — because the long-run target isn't "a better notes app." It's the queryable substrate underneath the kind of personal-wiki ambition Andrej Karpathy described, except the wiki isn't the product. The graph beneath it is. 
## The shape: dated observations, named observers, trust tiers A knowledge fact in Ouroboros is **not** a key-value pair. It is an **observation**, and the schema makes you record everything that gives an observation meaning: - **entity** — the thing the fact is about (a person, account, project, repo, doc) - **predicate** — the typed relationship (`holds_account_at`, `references_doc`, `published_in_year`, `defines_code_example`, `deadline`, ~80 in active use) - **object** — the value (string, number, entity reference, structured payload) - **agent** — the connection that wrote it (server-attested, not self-reported) - **observed_at** — when the observation was made - **trust_tier** — `human` (you), `extracted` (model-mined from a source), `inferred` (derived by a system pass) - **quote** — verbatim source text, capped at 300 characters, substring-checked against the document at write time The same predicate from a higher-tier observer **supersedes** the same predicate from a lower-tier one. The same predicate from the same observer at a later time supersedes the earlier one. The old rows don't get deleted — they stay in the journal. The graph reads "the latest, highest-trust observation per (entity, predicate)" as a single indexed query. **Diagram — anatomy of a knowledge fact + the supersede flow** ```mermaid flowchart TB subgraph row["A single observation row"] direction LR E[entity_id] --- P[predicate] P --- O[object] O --- A[agent / observer] A --- T[observed_at] T --- Q[trust_tier] Q --- V["evidence quote ≤300 chars"] end M["Model mines doc
writes at trust_tier=extracted"] --> G[(Knowledge graph)] H["You disagree
file new fact at trust_tier=human"] --> G G --> R["Reads return YOUR fact
extracted row stays in journal"] style H fill:#1a4d2e,stroke:#22c55e,color:#fff style M fill:#3a3a1a,stroke:#fbbf24,color:#fff ``` This shape is the hinge the rest of the system swings on. Without dated observers and trust tiers, contradictions become "your data is corrupt." With them, contradictions become "two observations disagree, here's both, here's when, here's who, pick one or supersede with a third." The graph never has to hide anything. ## Predicates as a controlled-but-extensible vocabulary Predicates are typed. There are roughly 80 in active use on the daily-driver deployment today — things like `holds_account_at`, `references_doc`, `defines_code_example`, `published_in_year`, `deadline`, `succeeded_by`, `mentioned_in`. The vocabulary is small enough to be coherent and large enough to be useful, and it grows the way working systems grow: where the data warrants a new shape, an agent proposes one. The discovery surface is `sophia.predicates_in_use` — any agent can ask which predicates are live and how often each one shows up. When a new domain shows up (say, you start tracking grant deadlines and the existing `deadline` predicate doesn't capture grant-specific structure), the agent proposes a new predicate, you confirm, and it's part of the vocabulary. There is no static ontology committee. There is also no free-for-all — every predicate is typed, every typed predicate has expected object shape, and untyped writes are rejected at the daemon. ```ts // Discoverable, not pre-declared const vocab = await sophia.predicates_in_use({ entity_name: 'acme' }); // → // { // predicates: [ // { name: 'holds_account_at', count: 42, last_observed: '2026-04-30T...' }, // { name: 'references_doc', count: 187, last_observed: '2026-05-02T...' }, // { name: 'published_in_year', count: 12, last_observed: '...' }, // ... // ], // total_in_use: 78, // } ``` The vocabulary is a contract between agents and the graph. 
It stays small by default because every predicate has to earn its keep, but it isn't frozen — the system grows where the work pushes it. ## Reading the graph The primary read is `sophia.query_knowledge`. You give it an entity (by name, by id, or by predicate filter) and you get typed rows back. No prompt, no embedding hop, no retrieval ranker — just a SQL JOIN against indexed columns. ```ts const facts = await sophia.query_knowledge({ entity_name: 'acme', limit: 50, }); // → // { // entity: { id: 'ent_acme_...', name: 'Acme', type: 'organization' }, // knowledge_facts: [ // { // predicate: 'holds_account_at', // object: 'First National', // observer: 'OpusDev-Ouroboros-Code', // observed_at: '2026-04-12T18:22:14.301Z', // trust_tier: 'human', // quote: 'Acme banks at First National per the 2026 Q1 board pack.', // source_doc_id: 'doc_2026q1_board_pack', // }, // { // predicate: 'published_in_year', // object: '1953', // observer: 'mining-pipeline', // observed_at: '2026-04-05T11:08:02.110Z', // trust_tier: 'extracted', // quote: '...Acme Co., founded in 1953 in Detroit...', // source_doc_id: 'doc_company_history_pdf', // }, // // ... // ], // total: 47, // } ``` The shape of the result is what makes composition cheap. Filter, group, JOIN across entities, project into a custom view — all in one round trip via `sophia.execute_code`. Loose-text RAG can't do that. There's nothing to JOIN in a paragraph. ## Contradictions are first-class When two observations disagree about the same `(entity, predicate)`, the graph doesn't pick a winner and quietly hide the loser. It surfaces the conflict. `sophia.find_contradictions` is a SQL JOIN over the typed rows that returns every pair where the same entity has two different objects under the same predicate from observers at the same trust tier — or where a higher-tier observation has overridden a lower-tier one and you might want to see what got superseded. 
```ts const conflicts = await sophia.find_contradictions({ entity_name: 'acme', predicate: 'published_in_year', }); // → // { // contradictions: [ // { // entity: 'Acme', // predicate: 'published_in_year', // observations: [ // { object: '1953', observer: 'mining-pipeline', trust_tier: 'extracted', // observed_at: '2026-04-05T...', quote: '...founded in 1953 in Detroit...', // source_doc_id: 'doc_company_history_pdf' }, // { object: '1954', observer: 'mining-pipeline', trust_tier: 'extracted', // observed_at: '2026-04-18T...', quote: '...incorporated 1954 per filings...', // source_doc_id: 'doc_sec_filings_pdf' }, // ], // resolution: 'unresolved', // no human override yet // }, // ], // } // You decide which is right (or that both are partially right). You file a // new fact at human trust tier; the previous extracted rows stay in the // journal but no longer satisfy "current view" reads. await sophia.remember_fact({ entity_name: 'acme', predicate: 'published_in_year', object: '1953', trust_tier: 'human', quote: '1953 confirmed against incorporation cert on file.', supersedes: ['fact_id_extracted_1953', 'fact_id_extracted_1954'], }); ``` The contradiction table is a first-class surface, not an exception. Real knowledge bases have conflicts. The schema is honest about them; the agents are obligated to surface them; the human picks. ## User outranks model When you disagree with the model, **you do not edit a row in place**. There is no "edit knowledge fact" tool in the SDK — by deliberate omission. There is only `remember_fact`, which writes a new observation. The new observation can declare what it supersedes, and because the writer is you (trust tier `human`), it outranks the model's `extracted` write. The previous fact sits forever in the journal; the graph reads yours. > **The user is the authoritative observer** > > > The model is a reporter at a lower trust tier. 
Your corrections never get > clobbered by a later mining pass, because the mining pass writes at > extracted and your correction sits at human. The > schema enforces this. There is no setting that flips it. There is no admin > mode that bypasses it. You don't ask the system to remember your correction — > the system has no choice but to. > This is the load-bearing rule that makes the whole graph trustworthy. Once you know the model can't quietly overwrite you, every other shape decision falls into place: the journal can be append-only, the supersede column can be a simple ordering, contradictions can be surfaced rather than auto-resolved, and the model can be wrong out loud without breaking your data. ## Why this beats "just stuff the doc into context" Loose-text RAG and full-context-stuffing both have one shape: read the relevant passage at query time, hand it to the model, hope. They share a set of failure modes, and structured rows side-step every one: - **Compounds across sessions.** Every doc you ingest is free knowledge for every agent forever. A fact extracted in March still answers a question in May without re-reading the source. Loose-text approaches re-do the work every turn. - **Verifiable by construction.** Every claim carries a verbatim quote, every quote was substring-checked against the source at write time, every reader can re-check it later. Hallucinations have nowhere to land — there is no quote, so the fact never hits the table. ([Trust Covenant](/trust-covenant) rule 1 enforces this.) - **Queryable in milliseconds.** Typed rows are indexable. `WHERE predicate = 'deadline' AND observed_at > '2026-01-01'` is a sub-millisecond JOIN. You cannot do that against a paragraph. - **Composable in one round trip.** `sophia.execute_code` lets an agent run a TypeScript snippet inside a sandboxed V8 isolate on the daemon. It can JOIN facts across documents, filter by trust tier, group by predicate, and return one shaped result. 
The chained-MCP-call equivalent is N round trips and N times the tokens. - **Auditable by default.** Every row carries observer + observed_at. "Who told me this and when?" is a SQL question, not a model question. The audit trail isn't bolted on; it is the schema. The shape is what compounds. The longer you use the system, the more rows you have, the more questions become single-query rather than multi-doc reasoning problems. Loose text doesn't compound — every session pays the same retrieval tax for the same answer. ## Where Karpathy-wiki stops, and where we're heading Andrej Karpathy talks about a "wiki of you" — structured personal notes, linked, durable, browsable. It's the right shape for a substrate of personal knowledge, and the framing has done real work: people now ask, reasonably, why their AI doesn't write to a wiki they can read. Ouroboros today does that for documents, code, observations, and the corrections you've made along the way. The wiki surface is real (`sophia.write_wiki_page` / `sophia.read_wiki_page` / autolinks across pages) and durable (Ed25519-signed, journaled, scoped per-agent). But a wiki is the **document layer**. Every page is still a paragraph waiting to be re-read. The deeper bet — the part the rest of the work is pointing at — is the **graph layer underneath**. Where Karpathy-wiki stops at "I have a nicely structured page about Acme," Ouroboros aims at "the moment my agent needs `holds_account_at` for Acme, it's a sub-millisecond typed query — no paragraph reading required, and the answer carries the quote that proves it." What's already in flight or specified: - **Semantic linking everywhere.** Autolinks already wire references across docs, code modules, and wiki pages via the autolink_decision queue. v2 makes the linking bidirectional, agent-proposed at scale, user-confirmed through a review queue rather than batch-applied. Every link becomes a typed edge in the graph, not an `` in a paragraph. 
- **Inferred edges.** Beyond directly-observed facts, derived ones — transitive closures (`A.references_doc(B)` and `B.cites(C)` → `A.transitively_cites(C)`), time-windowed aggregates (`account.balance_at(t)` derived from transaction history) — materialized as cached query views and re-derived on write. The graph stops being only what you put in it; it starts being what follows from what you put in it. - **Belief topology as a first-class surface.** `sophia.confidence_topology` already ships — it returns which entities have weak observation density versus strong, where the model is guessing versus where you've corrected, where contradictions cluster. The graph **knows what it doesn't know**, and it can tell an agent where to dig before answering. - **Cross-graph projections.** Query views that translate facts between vocabularies. You define once: "show me every `holds_account_at` and `holds_position_in` fact as a balance-sheet projection grouped by entity." Agents read through the projection; the underlying rows never move. The vocabulary stays minimal at the storage layer and arbitrary at the read layer. - **Belief tracing.** `sophia.trace_belief` already lets you ask "why does the system believe X about Acme?" and get back the chain of observations, documents, and supersessions. v2 extends that to inferred edges — "why does the system believe Acme transitively cites this paper?" returns the observed edges that make up the closure. The aim is plain: a substrate where **"what does my data say about X"** is a sub-second typed query for any X you've ingested anything about, with the quote that proves it attached to every row in the answer. Karpathy's wiki is a great document store. Ouroboros is trying to be the queryable graph underneath such a store — so the wiki stays human-readable while the agent queries the structure, not the prose. 
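The transitive-citation example from the inferred-edges bullet can be sketched as a plain derivation pass over typed edges. Shapes and names here are hypothetical — this is the idea, not the daemon's implementation:

```ts
// Sketch of the inferred-edge derivation described above:
// A.references_doc(B) + B.cites(C) => A.transitively_cites(C).
type Edge = { from: string; predicate: string; to: string };

function deriveTransitiveCites(observed: Edge[]): Edge[] {
  // Index every cites edge by its source document.
  const citesBySource = new Map<string, string[]>();
  for (const e of observed) {
    if (e.predicate !== 'cites') continue;
    const targets = citesBySource.get(e.from) ?? [];
    targets.push(e.to);
    citesBySource.set(e.from, targets);
  }
  // For each references_doc edge, emit one derived edge per onward citation.
  const derived: Edge[] = [];
  for (const e of observed) {
    if (e.predicate !== 'references_doc') continue;
    for (const target of citesBySource.get(e.to) ?? []) {
      derived.push({ from: e.from, predicate: 'transitively_cites', to: target });
    }
  }
  return derived;
}
```

Materialized as a cached view and re-derived on write, a pass like this keeps inferred edges cheap to read and impossible to confuse with observed ones — the predicate name marks them.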
## What's true today, in one beta The daily-driver deployment carries 15,202 knowledge facts across ~80 predicates, mined from documents the user pointed at over the course of ordinary use. Contradictions are surfaced and resolved through the SPA's Data tab. Wiki pages reference facts by id and re-read live. Autolinks land in a review queue, not into the live graph, until confirmed. The vocabulary is still evolving. The inferred-edges layer is specified but not fully built. Cross-graph projections are a v2 target. Belief tracing works for direct observations and is being extended to closures. This is a beta — one person uses it daily, the shape is stable, the surface area is growing the way working systems grow: where the work demands it, with the schema ratcheted forward through migrations rather than rewrites. The thesis isn't theoretical. The graph compounds. Every doc you've ever ingested is still answering questions, and every correction you've ever made is still outranking the model. That's the deal: structure pays off. The rest is adding more shapes the data can take. --- # Codebase Graph *Point Ouroboros at a repo. Tree-sitter parses every module locally — modules, symbols, edges — into a typed graph your coding agent queries in sub-second time. The orientation toll your agent pays on every fresh session, paid once.* A coding agent on a fresh repo burns context before it writes a single line. It globs the file tree. It reads eight to twelve anchor files to figure out the shape. It follows imports. It hunts callers. By the time it understands enough to make a real edit, you've paid for thousands of tokens that produced no output. Start a new session — pay it again. Switch from Claude Code to Codex — pay it again. The repo didn't change. The cost did. The Codebase Graph is the one-time toll.
Point Ouroboros at a repo, let tree-sitter parse it locally, and your agent gets a typed graph it can query in sub-second time: modules, symbols, callers, callees, imports. Every fresh session starts oriented. ## What `sophia.query_codebase` earns you One call. Sub-second. Returns typed rows: file path, line range, callers, callees, imports, exports. The graph is built from real ASTs — not LLM-derived guesses, not regex over a flat-file index. Every edge has a source location you can jump to. ```ts // What does this entity's code surface look like? const overview = await sophia.query_codebase({ kind: 'overview', entity_id: 'ouroboros-app', }); // → // { // modules: 1742, // symbols: 9821, // edges: 5634, // languages: { typescript: 1488, tsx: 201, python: 53 }, // kinds: { lib: 1612, test: 88, script: 42 }, // } ``` That's the answer to "what am I looking at?" — surfaced in one round trip instead of twenty file reads. ## Six languages, parsed locally TypeScript, TSX, Python, Rust, Go, Java, C# — all parsed locally with tree-sitter. No language-server dependency. No IDE plugin. No LLM call. The C# parser has been stress-tested on a 2,000-file Unity project; it parses cleanly. The choice of tree-sitter over heavier toolchains is deliberate. Tree-sitter parsers are fast, embeddable, and incremental — they re-parse a single file in milliseconds when it changes. The resulting AST is well-typed enough to extract the shapes the graph cares about: declarations, references, imports, exports. ## What gets stored Three tables per repo, one row per real thing in your code: - **`code_modules`** — one row per file with language, kind (lib / test / script), parsed timestamp, byte size. - **`code_symbols`** — one row per declared name: classes, functions, interfaces, methods, types — each with a line range pointing back into the source file. - **`code_edges`** — typed relationships between symbols: `imports`, `calls`, `references`, `extends`, `implements`.
Each edge carries the line where the relationship is expressed. On the daily-driver machine running this site, two real repos are indexed — Ouroboros itself in TypeScript and a separate C# project. Together that's **2,998 modules, 13,554 symbols, 8,407 edges**. All on local disk, all queryable by every connected agent. > **No LLM in the ingest path** > > > The graph is built entirely from tree-sitter ASTs. No model is called during > parse, indexing, or update. CPU only. $0 to ingest. Deterministic — the same > file always produces the same rows. Your agent can trust the edges because no > hallucination layer touched them. > ## The query shapes `sophia.query_codebase` takes a `kind` parameter that selects the walk: - **`overview`** — language breakdown plus module / symbol / edge counts for an entity. The "where am I?" call. - **`modules`** — list modules under an entity, filterable by language or kind. - **`symbols`** — find a named symbol (function, class, interface). Returns every match with file path and line range. - **`callers`** — who calls this symbol. Walks `code_edges` where `relationship = 'calls'` and the target matches. - **`callees`** — what this symbol calls. The opposite walk. - **`imports`** — what this module imports. Agents chain these naturally. Find the symbol, walk to its callers, read the two highest-confidence ones to understand how it's used, then make the edit. Three queries, one round trip when wrapped in `sophia.execute_code`. ```ts // Who calls validateBearer? Read them before refactoring it. 
const symbol = await sophia.query_codebase({ kind: 'symbols', name: 'validateBearer', }); // → [{ symbol_id: 'sym_a1b2', module: 'src/auth/bearer.ts', start_line: 47, end_line: 82, kind: 'function' }] const callers = await sophia.query_codebase({ kind: 'callers', symbol_id: 'sym_a1b2', }); // → // [ // { module: 'src/routes/mcp.ts', line: 134, caller_symbol: 'handleToolCall' }, // { module: 'src/routes/api.ts', line: 88, caller_symbol: 'authMiddleware' }, // { module: 'src/mcp/auth.test.ts', line: 21, caller_symbol: 'rejects expired bearer' }, // ] ``` ## Architecture **Diagram — repo on disk → tree-sitter parse → graph tables → query_codebase → agent** ```mermaid flowchart LR R[Repo on disk] -->|chokidar watch| W[File-change debouncer] W -->|modified files| P[tree-sitter parse] P -->|extract| M[(code_modules)] P -->|extract| S[(code_symbols)] P -->|extract| E[(code_edges)] M --> Q[sophia.query_codebase] S --> Q E --> Q Q -->|sub-second typed rows| A[Your coding agent] Q -->|same data| D[/Data tab in tray app/] ``` ## Auto re-sync A chokidar watcher keeps the graph honest. Two triggers fire a re-parse: - **Five-minute timer** — sweeps the repo for any change the watcher might have missed (large rebases, file restorations). - **On-change debounce** — files modified in your editor re-parse within a couple of seconds. Re-parse touches only the files that actually changed. The rest of the graph stays stable. Symbols that disappear on re-parse are removed; symbols that appear are added; edges referencing removed symbols are pruned. Nothing runs an LLM unless you ask it to — code ingest is pure CPU and stays $0 forever. ## The /data tab — same graph, two surfaces Open the tray app and click the Data tab. The view there is the same graph your agents query, just rendered for human eyes. Click a module — drill to its symbols. Click a symbol — walk to its callers. Click a caller — read the documents that mention it. Walk further — read the facts those documents produced.
The dashboard isn't a separate analytics layer with its own copy of the data. It's `sophia.query_codebase` with a UI. When the graph updates from a re-parse, the dashboard updates too. When your agent and you disagree about what's in the repo, the disagreement is impossible — you're both reading the same rows. > **Screenshot slot** > > → Data tab "Code" panel screenshot, showing the language breakdown and a drill-down into a symbol's callers. ## Where this is headed - **More language parsers** — Swift, Kotlin, Ruby on the near roadmap. The ingest pipeline is parser-agnostic; adding a language is a tree-sitter grammar plus a small extractor that maps AST nodes to the three tables. - **Call-graph diff between commits** — so a coding agent can ask "what changed in the public surface of this package since last week?" and get a list of added, removed, and signature-changed symbols. Closes a real gap in the orient-on-fresh-session story when the repo *did* change. - **Source-location-aware semantic linking** — facts produced from documents that mention `validateBearer` get linked to the actual symbol in the graph, not just the string. So a question like "what do my notes say about the function I'm about to refactor?" returns the right notes, scoped to the right symbol, with line ranges. --- # Wiki Primitive *Plain-markdown wiki pages on disk. Open them in Obsidian. Sign them with ed25519. Scope reads to entities. The wiki is your durable scratchpad — and the agents' too.* The Ouroboros wiki is markdown files on disk. Not a database with an export button — actual `.md` files in a folder you can open, grep, version-control, and edit in **Obsidian** without the daemon running. Wikilinks, embeds, tags, and frontmatter all work the way you'd expect, because they *are* the way you'd expect. 
What Ouroboros adds on top is a SQLite index (so search is fast), an ed25519 signature on every write (so tampering is detectable), and a scope clause on every read (so a connection bound to one entity can't see notes for another). Files are the source of truth. The DB is a projection. If the row goes away, the file is still on disk and re-indexes on the next scan. ## Files first, DB second Pages live at `~/Ouroboros/vault/wiki/<kind>/<slug>.md`. That folder is yours. Open it in Obsidian, point a vault at it, set up your graph view — the wikilinks (`[[other-page]]`), the tag pills, the embeds, the YAML frontmatter — they're all standard Obsidian-flavored markdown. The daemon watches the folder. When a file changes — whether the agent wrote it, you edited it in Obsidian, or you `mv`'d it from the terminal — the daemon re-parses the frontmatter and updates the SQLite index row. The index carries the title, kind, entity scope, tags, link graph, and the ed25519 signature. The index is disposable. The file is not. > **The trust line** > > > The wiki folder is a directory of plain markdown. You can copy it to a USB > stick, paste it into a different machine, point Obsidian at it, and it works. > The daemon's SQLite index is a cache built from those files — drop it and it > rebuilds from disk. Files are the source of truth. > ## Kinds and slugs Every page has a `kind` and a `slug`. The kind determines which folder the file lives in and which index columns get populated. A few common ones: - `note` — freeform; the default - `arc-scratchpad` — the agent's working notes for a multi-step arc - `entity-profile` — durable summary of one entity - `decision` — an architectural call with the rationale - `glossary` — a defined term The slug is the filename. Kebab-case, no extension in the call. So `{ kind: 'decision', slug: 'switch-to-libsql' }` becomes `~/Ouroboros/vault/wiki/decision/switch-to-libsql.md`. ## Frontmatter shape Every page starts with YAML frontmatter.
The minimum is `title` and `kind`; everything else is optional but useful:

```md
---
title: "Switch to libSQL"
kind: decision
entity_id: ouroboros-app
tags: [storage, migration]
links:
  - sqlite-to-libsql-tradeoffs
  - 2026-q2-roadmap
created_at: 2026-04-12T14:03:00Z
updated_at: 2026-05-02T09:11:00Z
---

# Switch to libSQL

We're moving the daemon's local store from raw SQLite to libSQL because...
```

The index reads frontmatter. If you set `entity_id`, the page is scoped to that entity. If you don't, it's a global note for your user. Tags become queryable. The `links` array is your explicit forward-reference list — the inline `[[wikilinks]]` are picked up too, but the explicit list is what gets canonicalized in the graph.

## Scope-aware reads

A connection scoped to one entity cannot read pages tagged to another entity. The daemon injects the scope clause at the SQL layer — the same `wikiScopeClause()` helper used for every other entity-bound table. The rule is straightforward:

- If the page has no `entity_id`, it's global to the user — visible everywhere.
- If the page has an `entity_id`, it's visible only to connections whose scope includes that entity.
- If the page is tagged to the special `__inbox__` scope, it's visible to all connections regardless of scope (used for inter-agent posts).

A scoped agent that calls `list_wiki_pages` simply doesn't see the rows it isn't allowed to. There's no "permission denied" error to leak the existence of a page; the row just isn't in the result set. The clause is enforced in SQL, not at the application layer.

## ed25519 signatures

Every wiki write is signed with the daemon's session-bound ed25519 key. The signature lands in the index row next to the file path and a hash of the content. You can verify a page hasn't been edited out from under the daemon — or that the index hasn't been swapped — via `sophia.lint_wiki_page`. This is a tamper-evident layer, not encryption.
The markdown stays plain text on disk so Obsidian and your text editor still work. What the signature buys you: if someone (or some buggy script) overwrites the file outside the daemon flow, the next lint catches the signature mismatch and surfaces it as a warning. You decide what to do — accept the new content and re-sign, or restore from the daemon's mutation journal.

## The agent uses it too

The wiki isn't just a place for you to take notes. When an agent claims a multi-day arc, the convention is that it maintains a scratchpad page at `wiki/arc-scratchpad/<arc-slug>.md` — progress notes, blocked-on items, open questions, decisions taken. Other agents read that file before starting their own work on the same arc. Cross-session continuity, no special tooling.

Wiki pages also surface in `sophia.search_documents`, scope-permitted, so an agent searching for "the lease amendment we discussed last week" gets hits across the wiki *and* the ingested document corpus in one query. Your notes and the source documents live in the same retrieval layer.
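The sign-on-write, verify-on-lint loop described in the signatures section can be sketched with Node's built-in Ed25519 support. This is a minimal illustration, not the daemon's real code: `signPage` and `lintPage` are hypothetical names, and a real implementation would sign the exact bytes written to disk and store the signature in the index row.

```typescript
import { generateKeyPairSync, sign, verify } from 'node:crypto';

// A session-bound keypair, standing in for the daemon's own key.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');

// On write: sign the page content that lands on disk.
function signPage(bodyMd: string): string {
  return sign(null, Buffer.from(bodyMd, 'utf8'), privateKey).toString('base64');
}

// On lint: re-read the file and check the stored signature against it.
function lintPage(bodyMd: string, signatureB64: string): boolean {
  return verify(
    null,
    Buffer.from(bodyMd, 'utf8'),
    publicKey,
    Buffer.from(signatureB64, 'base64'),
  );
}

const body = '# Switch to libSQL\n\nWe are moving the daemon...';
const sig = signPage(body);
lintPage(body, sig);               // true: file matches what was signed
lintPage(body + ' tampered', sig); // false: out-of-band edit detected
```

The important property is the one the section names: this detects tampering, it doesn't prevent reading. The file stays plain UTF-8 markdown, so Obsidian never knows the signature exists.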
## Architecture

**Diagram — sophia.write_wiki_page → file written → signed → indexed → scope-aware reads**

```mermaid
flowchart LR
    A[Agent or you] -->|sophia.write_wiki_page| D[Daemon]
    D -->|fs.write| F[~/Ouroboros/vault/wiki/kind/slug.md]
    D -->|ed25519 sign| S[(signature)]
    D -->|UPSERT| I[(SQLite wiki index)]
    F -.watch.-> D
    O[Obsidian] -.reads.-> F
    R[read_wiki_page] -->|scope clause| I
    R -->|fs.read| F
    R --> A2[Agent or you]
```

## Sample API surface

Writing a page from an agent:

```ts
await sophia.write_wiki_page({
  kind: 'decision',
  slug: 'switch-to-libsql',
  title: 'Switch to libSQL',
  body_md: '# Switch to libSQL\n\nWe are moving the daemon...',
  frontmatter: {
    entity_id: 'ouroboros-app',
    tags: ['storage', 'migration'],
    links: ['sqlite-to-libsql-tradeoffs'],
  },
});
```

Reading one back:

```ts
const page = await sophia.read_wiki_page({
  kind: 'decision',
  slug: 'switch-to-libsql',
});
// → {
//   path: '~/Ouroboros/vault/wiki/decision/switch-to-libsql.md',
//   title: 'Switch to libSQL',
//   kind: 'decision',
//   entity_id: 'ouroboros-app',
//   tags: ['storage', 'migration'],
//   links: ['sqlite-to-libsql-tradeoffs'],
//   body_md: '# Switch to libSQL\n\n...',
//   signature: 'ed25519:9f3a...',
//   signature_valid: true,
//   created_at: '2026-04-12T14:03:00Z',
//   updated_at: '2026-05-02T09:11:00Z',
// }
```

The other tools round out the surface:

- `sophia.list_wiki_pages({ kind?, entity_id?, tags? })` — index lookup with scope automatically applied
- `sophia.list_pages_by_tag({ tag })` — tag index, also scope-clipped
- `sophia.lint_wiki_page({ kind, slug })` — verify the signature, validate the link graph, surface broken `[[wikilinks]]`
- `sophia.update_wiki_page({...})` — append-or-replace edit; every update is mutation-journaled and revertable through Time Machine

## Where this is headed

- **Bidirectional graph in the dashboard** — today the link graph is computed but only surfaced to agents.
The tray dashboard will render it as an interactive map of your knowledge, the same shape Obsidian's graph view shows you, but scope-aware.
- **Agent-proposed edits as PRs** — instead of agents writing directly, an agent can propose an edit (a diff against an existing page) that you approve from the tray. Approved edits are journaled with the proposer's attested identity, so the audit trail tells you which agent suggested what.
- **Inline-citation surfaces** — every quoted fact in a wiki page links back to the source document with a line range. Hover a quote, see the provenance chain. Agents writing wiki notes are already expected to cite; the surface makes it browsable.

---

# Time Machine

*Every write is journaled with full before/after JSON. Any row, any time, revertable. System actors can never hard-delete user data — only soft-deactivate, tombstone, or supersede.*

Ouroboros treats every write as reversible by design. When an agent — or you, through the tray — changes a row, the daemon journals the row's full before-state and after-state as JSON, attributes it to the connection that did it, and timestamps it. Nothing about the data path is destructive at the row level.

That's the whole reason you can trust an autonomous agent to mine your documents and write facts back into your graph. If it gets something wrong, you don't have to reconstruct what changed from memory. You open the Time Machine, find the row, and click revert.

## The mutation journal

Every `INSERT`, `UPDATE`, and `DELETE` against a user-owned table goes through a write path that emits a row into the per-user `mutations` table.
Each journal entry carries:

- the table name and the primary-key tuple of the row that changed
- the **before** JSON (null for inserts)
- the **after** JSON (null for hard-deletes — but see below; system actors can't do that)
- the actor identity (more on this in a moment)
- a server-side timestamp and an optional `reason` string

The journal is append-only and indexed by `(table_name, row_id, ts)`. So "what changed on this entity in the last week" or "what did connection X write yesterday" is a real, typed query — not a log-grep.

## Cross-actor attribution

Every mutation row records who did the write. That means the connection id (which agent, or which user-action from the tray), the **display name at the time of the write** (snapshotted, so a later rename doesn't rewrite history), and an optional reason the actor passed in.

When you find a row that confused you — a fact that disagrees with what you remember telling the system — the journal tells you who wrote it, when, and in which session. No more "I think Cursor did this last Tuesday."

## Revertability

`sophia.revert_mutation({ mutation_id })` does exactly what it says. It writes the journaled before-state back to the row, and it journals **that** write as its own mutation. So two clicks reverts the revert. The audit trail keeps growing; nothing gets erased.

```ts
// Undo a single mutation. The revert itself is journaled.
const result = await sophia.revert_mutation({
  mutation_id: 'mut-7c4a9e21',
  reason: 'mined fact disagreed with primary doc',
});
// → {
//   reverted: true,
//   new_mutation_id: 'mut-9b1f3d05',  // the revert is itself journaled
//   restored_row: { entity_id: 'acme', predicate: 'industry', object: 'logistics', ... },
// }
```

## System actors can't hard-delete

This is load-bearing. The mining pipeline, migration scripts, garbage sweepers, the auto-resync loop — none of them can run a `DELETE FROM` against a user-owned table.
The write path enforces it: if the actor is anything other than `human`, the only legal mutations are:

- **soft-deactivate** — flip `is_active = '0'` (the row stays, queries filter it by default)
- **tombstone** — write a marker row that says "this id is gone, don't resurrect it on the next mining pass"
- **supersede** — write a new row at higher trust tier that the read path prefers over the old one

Hard delete is a deliberate user action invoked from the Electron tray with a confirm dialog. The dialog tells you exactly how many rows will go and which tables they touch. The system itself can't do it.

> **Not a policy — a write-path check**
>
> "System actors can never hard-delete user data" is enforced in the write path, not in a code-review checklist. Every write call resolves the actor from the bearer, and if the actor isn't `human`, the SQL the daemon generates simply doesn't include `DELETE FROM`. An agent that tries to construct a raw delete through `execute_code` runs against the same scoped DB handle, which only exposes the safe write surface.

## Supersede over edit

The same shape applies when **you** disagree with something the system mined. Instead of editing the existing row in place, the right move is usually to write a new fact at higher trust tier — `human` outranks `extracted`, and the graph reads the latest. The old row stays in the journal as part of the audit trail.

This is true for facts, wiki pages, and most knowledge surfaces. The few places where in-place edit *is* the right shape — user-owned config, route preferences, your own profile — are explicitly marked in the SDK. Everything else is supersede-by-default.

The reason: you might be wrong. If three months from now you find primary evidence that the mined version was actually right, you can just look at the old row in the journal and revert your supersede. If you'd edited in place, the old value would be gone.
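As a toy model, the journal-and-revert loop this section describes can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the daemon's schema: `Row`, `Mutation`, the `write`/`revertMutation` names, and the `fact:acme/industry` row id are all invented for the sketch.

```typescript
// In-memory toy of the mutation journal. Types and names are illustrative.
type Row = Record<string, unknown> | null;
type Mutation = { id: string; rowId: string; before: Row; after: Row; actor: string };

const rows = new Map<string, Row>();
const journal: Mutation[] = [];
let seq = 0;

// Every write journals the row's full before/after state, attributed to an actor.
function write(rowId: string, after: Row, actor: string): Mutation {
  const entry: Mutation = {
    id: `mut-${++seq}`,
    rowId,
    before: rows.get(rowId) ?? null, // null before-state means "this was an insert"
    after,
    actor,
  };
  rows.set(rowId, after);
  journal.push(entry); // append-only: nothing is ever erased
  return entry;
}

// Revert writes the journaled before-state back, and is itself journaled,
// so reverting a revert is just one more entry.
function revertMutation(mutationId: string, actor: string): Mutation {
  const target = journal.find((m) => m.id === mutationId);
  if (!target) throw new Error(`unknown mutation ${mutationId}`);
  return write(target.rowId, target.before, actor);
}

const mined = write('fact:acme/industry', { object: 'logistics' }, 'agent:conn-1');
const undo = revertMutation(mined.id, 'human');
// rows.get('fact:acme/industry') is null again, and journal.length is 2.
```

The sketch shows why the audit trail only grows: a revert is not a deletion of the original mutation, it's a new mutation whose `after` happens to equal the old `before`.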
## Time Machine UI

The `/time-machine` surface in the Electron app is a reverse-chrono feed of your mutation journal. Filterable by table, by actor, by time range. Each row shows the diff (before → after) inline, the actor's display name and connection id, and a "Revert this" button that calls `sophia.revert_mutation` under the hood.

Bulk revert is supported when a batch of writes shares an obvious cohort — for example, every mutation from a single mining run, or every write from one agent session. The dialog shows you the cohort size and a sample of the rows before you confirm.

## Architecture

**Diagram — agent write → mutation journal → row update → Time Machine → revert**

```mermaid
flowchart LR
    A[Agent calls<br/>sophia.submit_claim_graph] --> W[Daemon write path<br/>resolve actor from bearer]
    W --> J[Journal entry:<br/>before JSON + after JSON<br/>+ actor + ts]
    W --> R[Row update<br/>in user-owned table]
    J --> M[("mutations table<br/>per-user, append-only")]
    R --> S[("subscriber_*<br/>knowledge_facts, etc.")]
    U[You open<br/>/time-machine] --> Q[List mutations<br/>filtered by table/actor/time]
    Q --> M
    U --> C[Click Revert]
    C --> RV[sophia.revert_mutation]
    RV --> W2[Daemon write path<br/>actor = human]
    W2 --> J2[New journal entry<br/>reversing the change]
    W2 --> R2[Row restored to<br/>before-state]
    J2 --> M
```

## Querying the journal

The `list_mutations` surface is the typed query layer over the journal. It's what the Time Machine UI calls under the hood, and it's available to any connected agent that wants to do its own audit pass.

```ts
// What did each agent write to my knowledge facts in the last 24 hours?
const recent = await sophia.list_mutations({
  table_name: 'subscriber_knowledge_facts',
  since: '2026-05-01T00:00:00Z',
  limit: 100,
});
// → {
//   mutations: [
//     {
//       id: 'mut-9b1f3d05',
//       table_name: 'subscriber_knowledge_facts',
//       row_id: { entity_id: 'acme', predicate: 'industry', object: 'logistics' },
//       op: 'insert',
//       before: null,
//       after: { entity_id: 'acme', predicate: 'industry', object: 'logistics',
//                trust_tier: 'extracted', source_artifact_id: 'art-...' },
//       actor: { connection_id: 'conn-7914462a', display_name: 'OpusDev-Worker',
//                kind: 'agent' },
//       ts: '2026-05-02T14:31:08Z',
//       reason: null,
//     },
//     // ...
//   ],
//   total: 47,
// }
```

## Adjacent surfaces

A few related tools sit on top of the same journal:

- `sophia.list_contradicted_facts({...})` — when a higher-tier supersede creates a conflict trail (e.g., a `human`-tier fact disagrees with a prior `extracted`-tier fact on the same entity/predicate), this surface lists the contradiction so you can review it.
- `sophia.session_audit({...})` — replay an entire agent session's writes as a single ordered feed. Useful when you want to review what a long autonomous run actually did before you accept it.

Both read the same mutation journal. There's no separate audit log to keep in sync; the journal is the audit log.

## Where this is headed

- **Cohort revert in the Time Machine UI** — `POST /api/mutations/bulk-revert` with a `group_id` is already shipped today (the SPA's DataHome uses it for the "Undo this archive sweep" flow).
What's next is surfacing this in the Time Machine view itself: a "revert this entire mining run" affordance with a preview dialog that shows which entities are affected and which downstream views will change before you commit.
- **Branching** — replay the journal from a chosen point under an alternate decision (e.g., "what would the graph look like if I'd rejected this supersede last week?") and diff the result against the current state. The journal is already complete enough to support this; the UI is the lift.
- **Anomaly flagging** — surface unexpectedly large or unusual write batches in the Time Machine feed automatically. If an agent suddenly writes 5,000 mutations in a session when the rolling average is 80, that's worth a banner before you discover it the hard way.

The shape stays the same: every write reversible, every actor named, every change visible. The Time Machine just gets sharper at telling you which changes you'd actually want to look at.

---

# Connecting your agent

Ouroboros doesn't pick a model for you and doesn't ship a built-in cloud key. You bring Claude Code with your Anthropic key, Codex with your OpenAI key, Cursor with whatever you've configured there — or run fully local against Ollama. Every agent connects through MCP (Model Context Protocol).

## Three steps

1. **Install Ouroboros** — packaging is in flight; one-line installers for Linux/macOS/Windows when it ships.
2. **Mint a connection from the tray** — pick display name + profile (full / read-only / scoped) + entity scope. The tray writes a bearer token to a chmod-600 file under your data dir.
3. **Drop the bearer in your agent's MCP config** — Claude Code (`~/.config/claude/mcp.json`), Codex (`~/.config/codex/mcp.toml`), Cursor (`.cursor/mcp.json`), or any MCP-speaking client at `http://127.0.0.1:8765/mcp` with `Authorization: Bearer <token>`.
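As one concrete shape for step 3, an HTTP MCP server entry might look like the sketch below. This is an assumption-laden example, not a verbatim config: the `ouroboros` server name and the token are placeholders, and the exact keys (`type`, `url`, `headers`) vary by client, so check your agent's own MCP documentation.

```json
{
  "mcpServers": {
    "ouroboros": {
      "type": "http",
      "url": "http://127.0.0.1:8765/mcp",
      "headers": {
        "Authorization": "Bearer <token-from-the-tray>"
      }
    }
  }
}
```

The token is the only secret; it comes from the tray mint in step 2 and scopes everything the agent can see.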
## The first call

From any connected agent, the first call is always the same:

```ts
const ctx = await sophia.orient();
// → sync status · hot entities · open questions ·
//   recent corrections you've made · which skills
//   are loaded · what to call next.
```

A fresh agent session is fully oriented in roughly 1,800 tokens — it knows what you were working on, what you corrected yesterday, and which entities are in scope for this connection.

## Provider story

No built-in cloud LLM. No silent fallback to a vendor key. The provider registry throws an error if no model is configured for a task instead of degrading. Configure providers from the tray's Settings page:

- **Anthropic** — your API key. Claude Code, Codex-style agents, custom Anthropic SDK clients.
- **OpenAI / OpenAI-compatible** — your key (or your local OpenAI-compat endpoint).
- **Gemini** — your key from AI Studio.
- **Ollama** — fully local mode. One option among many, not the default.

---

# What's running today

Ouroboros is in private beta. One person uses it daily — the person building it. No paying customers, no waitlist soup, no "trusted by thousands of researchers." Every number on this page is from one real instance, ingesting one real life's worth of documents and code.

Live counts from the daily-driver instance: see https://ouroboros-ai.app/status/ for the current values (rebuilt on each deploy from the live daemon).

**Shipped:** the gate · the engine · the vault · tree-sitter ingest in 6 languages including C# · hybrid retrieval with cross-encoder rerank · per-connection scope isolation · mutation journal with revertable writes · the wiki primitive · the codebase dashboard · grounded extraction · contradiction detection · session bootloader · multi-agent coordination v1 (channels, inbox, identity attestation).

**In flight:** install packaging (`.deb` / `.dmg` / `.exe`) · multi-source-folder repos · Gmail and Drive ingest · agent-edge LLM migration · launch packaging.
---

*Generated from the ouroboros-ai.app source tree at build time. Source: https://github.com/Ouroboros-LM/ouroboros (private during beta).*