Deep dive

Knowledge Graph

The structured-data thesis. Why typed entity-predicate-object rows with dated observers and grounded quotes beat in-context everything — and where we're heading past the Karpathy-wiki ceiling.

Updated May 2, 2026


Most agents you use today reason over loose document text re-stuffed into the context window every session. The retrieval layer fetches a handful of chunks, the model squints at them, paraphrases what it sees, and either gets it right or makes something up that sounds right. Next session, the work is gone. The model didn’t learn anything from reading that PDF. It rented a sentence for a turn.

Ouroboros makes a different bet. Every document, code file, conversation, correction — anything you put in front of the system — is parsed once into typed rows. Entities, relationships, observations, quotes, observers, timestamps, trust tiers. Those rows live in a single libsql database your agents query through one MCP connection. The graph is the smart-cache. Mining a doc is a one-time cost; querying its facts is a sub-millisecond JOIN, forever.

That’s the thesis: structure beats stuffing context. The rest of this page is what the structure looks like, why every shape decision is load-bearing, and where the work goes next — because the long-run target isn’t “a better notes app.” It’s the queryable substrate underneath the kind of personal-wiki ambition Andrej Karpathy described, except the wiki isn’t the product. The graph beneath it is.

The shape: dated observations, named observers, trust tiers

A knowledge fact in Ouroboros is not a key-value pair. It is an observation, and the schema makes you record everything that gives an observation meaning:

  • entity — the thing the fact is about (a person, account, project, repo, doc)
  • predicate — the typed relationship (holds_account_at, references_doc, published_in_year, defines_code_example, deadline, ~80 in active use)
  • object — the value (string, number, entity reference, structured payload)
  • agent — the connection that wrote it (server-attested, not self-reported)
  • observed_at — when the observation was made
  • trust_tier — human (you), extracted (model-mined from a source), inferred (derived by a system pass)
  • quote — verbatim source text, capped at 300 characters, substring-checked against the document at write time
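
The quote rule is simple enough to sketch. Everything here (the function name, the constant) is illustrative rather than the daemon's actual API; it assumes only the two rules above — at most 300 characters, verbatim substring of the source:

```typescript
// Illustrative write-time quote check (names are hypothetical, not the
// daemon's real API). A quote is admissible only if it is non-empty, at
// most 300 characters, and appears verbatim in the source document.
const MAX_QUOTE_CHARS = 300;

function isAdmissibleQuote(quote: string, sourceText: string): boolean {
  if (quote.length === 0 || quote.length > MAX_QUOTE_CHARS) return false;
  // Substring check: a hallucinated "quote" has no match in the source,
  // so the observation carrying it never reaches the table.
  return sourceText.includes(quote);
}
```

A rejected quote rejects the whole observation — there is no path to writing an unquoted extracted fact.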

The same predicate from a higher-tier observer supersedes the same predicate from a lower-tier one. The same predicate from the same observer at a later time supersedes the earlier one. The old rows don’t get deleted — they stay in the journal. The graph reads “the latest, highest-trust observation per (entity, predicate)” as a single indexed query.
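
As a sketch, that read rule is a fold over the journal. The row shape and tier names mirror the schema above; the ranking (human above extracted above inferred) follows the ordering implied by the trust-tier list, and the function itself is illustrative, not the daemon's implementation:

```typescript
// Illustrative read rule: for each (entity, predicate), keep the
// observation with the highest trust tier, breaking ties by recency.
// The journal itself is never mutated; older rows are simply outranked.
type TrustTier = 'human' | 'extracted' | 'inferred';

interface Observation {
  entity: string;
  predicate: string;
  object: string;
  trust_tier: TrustTier;
  observed_at: string; // ISO timestamp, so string comparison orders correctly
}

// Assumed ordering: human > extracted > inferred (per the tier list above).
const TIER_RANK: Record<TrustTier, number> = { human: 2, extracted: 1, inferred: 0 };

function currentView(journal: Observation[]): Map<string, Observation> {
  const view = new Map<string, Observation>();
  for (const row of journal) {
    const key = `${row.entity}\u0000${row.predicate}`;
    const prev = view.get(key);
    const wins =
      !prev ||
      TIER_RANK[row.trust_tier] > TIER_RANK[prev.trust_tier] ||
      (TIER_RANK[row.trust_tier] === TIER_RANK[prev.trust_tier] &&
        row.observed_at > prev.observed_at);
    if (wins) view.set(key, row);
  }
  return view; // superseded rows stay in `journal`; the view just ignores them
}
```

In the real system this is the single indexed query; the fold is just the semantics spelled out.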

anatomy of a knowledge fact + the supersede flow
flowchart TB
  subgraph row["A single observation row"]
    direction LR
    E[entity_id] --- P[predicate]
    P --- O[object]
    O --- A[agent / observer]
    A --- T[observed_at]
    T --- Q[trust_tier]
    Q --- V["evidence quote ≤300 chars"]
  end

  M["Model mines doc<br/>writes at trust_tier=extracted"] --> G[(Knowledge graph)]
  H["You disagree<br/>file new fact at trust_tier=human"] --> G
  G --> R["Reads return YOUR fact<br/>extracted row stays in journal"]

  style H fill:#1a4d2e,stroke:#22c55e,color:#fff
  style M fill:#3a3a1a,stroke:#fbbf24,color:#fff

This shape is the hinge the rest of the system swings on. Without dated observers and trust tiers, contradictions become “your data is corrupt.” With them, contradictions become “two observations disagree, here’s both, here’s when, here’s who, pick one or supersede with a third.” The graph never has to hide anything.

Predicates as a controlled-but-extensible vocabulary

Predicates are typed. There are roughly 80 in active use on the daily-driver deployment today — things like holds_account_at, references_doc, defines_code_example, published_in_year, deadline, succeeded_by, mentioned_in. The vocabulary is small enough to be coherent and large enough to be useful, and it grows the way working systems grow: where the data warrants a new shape, an agent proposes one.

The discovery surface is sophia.predicates_in_use — any agent can ask which predicates are live and how often each one shows up. When a new domain arrives (say, you start tracking grant deadlines and the existing deadline predicate doesn’t capture grant-specific structure), the agent proposes a new predicate, you confirm, and it’s part of the vocabulary. There is no static ontology committee. There is also no free-for-all — every predicate is typed, every typed predicate has an expected object shape, and untyped writes are rejected at the daemon.

// Discoverable, not pre-declared
const vocab = await sophia.predicates_in_use({ entity_name: 'acme' });
// →
// {
//   predicates: [
//     { name: 'holds_account_at', count: 42, last_observed: '2026-04-30T...' },
//     { name: 'references_doc',   count: 187, last_observed: '2026-05-02T...' },
//     { name: 'published_in_year', count: 12, last_observed: '...' },
//     ...
//   ],
//   total_in_use: 78,
// }

The vocabulary is a contract between agents and the graph. It stays small by default because every predicate has to earn its keep, but it isn’t frozen — the system grows where the work pushes it.
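
To make "untyped writes are rejected at the daemon" concrete, here is a hedged sketch of a registry-backed shape check. The predicate names come from the vocabulary above; the shape names, registry, and checkWrite function are assumptions for illustration, not the real daemon code:

```typescript
// Hypothetical daemon-side predicate typing: each registered predicate
// declares an expected object shape; a write whose predicate is
// unregistered, or whose object fails the shape check, is rejected.
type ObjectShape = 'string' | 'number' | 'entity_ref' | 'iso_date';

const registry = new Map<string, ObjectShape>([
  ['holds_account_at', 'string'],
  ['published_in_year', 'number'],
  ['references_doc', 'entity_ref'],
  ['deadline', 'iso_date'],
]);

function checkWrite(predicate: string, object: unknown): { ok: boolean; reason?: string } {
  const shape = registry.get(predicate);
  if (!shape) return { ok: false, reason: `unregistered predicate: ${predicate}` };
  const ok =
    shape === 'number' ? typeof object === 'number' :
    shape === 'iso_date' ? typeof object === 'string' && !Number.isNaN(Date.parse(object)) :
    typeof object === 'string'; // 'string' and 'entity_ref' both carry string ids here
  return ok ? { ok: true } : { ok: false, reason: `object does not match shape ${shape}` };
}
```

An agent proposing a new predicate is, in this sketch, proposing a new registry row — which is exactly why the vocabulary can grow without becoming a free-for-all.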

Reading the graph

The primary read is sophia.query_knowledge. You give it an entity (by name, by id, or by predicate filter) and you get typed rows back. No prompt, no embedding hop, no retrieval ranker — just a SQL JOIN against indexed columns.

const facts = await sophia.query_knowledge({
  entity_name: 'acme',
  limit: 50,
});
// →
// {
//   entity: { id: 'ent_acme_...', name: 'Acme', type: 'organization' },
//   knowledge_facts: [
//     {
//       predicate: 'holds_account_at',
//       object:    'First National',
//       observer:  'OpusDev-Ouroboros-Code',
//       observed_at: '2026-04-12T18:22:14.301Z',
//       trust_tier: 'human',
//       quote:     'Acme banks at First National per the 2026 Q1 board pack.',
//       source_doc_id: 'doc_2026q1_board_pack',
//     },
//     {
//       predicate: 'published_in_year',
//       object:    '1953',
//       observer:  'mining-pipeline',
//       observed_at: '2026-04-05T11:08:02.110Z',
//       trust_tier: 'extracted',
//       quote:     '...Acme Co., founded in 1953 in Detroit...',
//       source_doc_id: 'doc_company_history_pdf',
//     },
//     // ...
//   ],
//   total: 47,
// }

The shape of the result is what makes composition cheap. Filter, group, JOIN across entities, project into a custom view — all in one round trip via sophia.execute_code. Loose-text RAG can’t do that. There’s nothing to JOIN in a paragraph.
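
For illustration, this is the kind of shaping an execute_code snippet can do in one round trip: take typed rows in the query_knowledge result shape above and project them into a grouped view. The helper is an example under that row shape, not part of the SDK:

```typescript
// Example of in-sandbox composition: group typed fact rows by predicate
// into one shaped result. The Fact row shape follows the query result
// above; groupByPredicate itself is illustrative, not an SDK function.
interface Fact {
  entity: string;
  predicate: string;
  object: string;
  trust_tier: string;
}

function groupByPredicate(facts: Fact[]): Record<string, string[]> {
  const grouped: Record<string, string[]> = {};
  for (const f of facts) {
    (grouped[f.predicate] ??= []).push(`${f.entity}: ${f.object}`);
  }
  return grouped;
}
```

Because the result comes back already shaped, the agent pays one round trip instead of one retrieval pass per question.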

Contradictions are first-class

When two observations disagree about the same (entity, predicate), the graph doesn’t pick a winner and quietly hide the loser. It surfaces the conflict. sophia.find_contradictions is a SQL JOIN over the typed rows that returns every pair where the same entity has two different objects under the same predicate from observers at the same trust tier — or where a higher-tier observation has overridden a lower-tier one and you might want to see what got superseded.

const conflicts = await sophia.find_contradictions({
  entity_name: 'acme',
  predicate:   'published_in_year',
});
// →
// {
//   contradictions: [
//     {
//       entity:    'Acme',
//       predicate: 'published_in_year',
//       observations: [
//         { object: '1953', observer: 'mining-pipeline', trust_tier: 'extracted',
//           observed_at: '2026-04-05T...', quote: '...founded in 1953 in Detroit...',
//           source_doc_id: 'doc_company_history_pdf' },
//         { object: '1954', observer: 'mining-pipeline', trust_tier: 'extracted',
//           observed_at: '2026-04-18T...', quote: '...incorporated 1954 per filings...',
//           source_doc_id: 'doc_sec_filings_pdf' },
//       ],
//       resolution: 'unresolved',  // no human override yet
//     },
//   ],
// }

// You decide which is right (or that both are partially right). You file a
// new fact at human trust tier; the previous extracted rows stay in the
// journal but no longer satisfy "current view" reads.
await sophia.remember_fact({
  entity_name: 'acme',
  predicate:   'published_in_year',
  object:      '1953',
  trust_tier:  'human',
  quote:       '1953 confirmed against incorporation cert on file.',
  supersedes:  ['fact_id_extracted_1953', 'fact_id_extracted_1954'],
});

The contradiction table is a first-class surface, not an exception. Real knowledge bases have conflicts. The schema is honest about them; the agents are obligated to surface them; the human picks.

User outranks model

When you disagree with the model, you do not edit a row in place. There is no “edit knowledge fact” tool in the SDK — by deliberate omission. There is only remember_fact, which writes a new observation. The new observation can declare what it supersedes, and because the writer is you (trust tier human), it outranks the model’s extracted write. The previous fact sits forever in the journal; the graph reads yours.

This is the load-bearing rule that makes the whole graph trustworthy. Once you know the model can’t quietly overwrite you, every other shape decision falls into place: the journal can be append-only, the supersede column can be a simple ordering, contradictions can be surfaced rather than auto-resolved, and the model can be wrong out loud without breaking your data.

Why this beats “just stuff the doc into context”

Loose-text RAG and full-context-stuffing both have one shape: read the relevant passage at query time, hand it to the model, hope. They share a set of failure modes, and structured rows side-step every one:

  • Compounds across sessions. Every doc you ingest is free knowledge for every agent forever. A fact extracted in March still answers a question in May without re-reading the source. Loose-text approaches re-do the work every turn.
  • Verifiable by construction. Every claim carries a verbatim quote, every quote was substring-checked against the source at write time, every reader can re-check it later. Hallucinations have nowhere to land — there is no quote, so the fact never hits the table. (Trust Covenant rule 1 enforces this.)
  • Queryable in milliseconds. Typed rows are indexable. WHERE predicate = 'deadline' AND observed_at > '2026-01-01' is a sub-millisecond indexed query. You cannot do that against a paragraph.
  • Composable in one round trip. sophia.execute_code lets an agent run a TypeScript snippet inside a sandboxed V8 isolate on the daemon. It can JOIN facts across documents, filter by trust tier, group by predicate, and return one shaped result. The chained-MCP-call equivalent is N round trips and N times the tokens.
  • Auditable by default. Every row carries observer + observed_at. “Who told me this and when?” is a SQL question, not a model question. The audit trail isn’t bolted on; it is the schema.

The shape is what compounds. The longer you use the system, the more rows you have, the more questions become single-query rather than multi-doc reasoning problems. Loose text doesn’t compound — every session pays the same retrieval tax for the same answer.

Where Karpathy-wiki stops, and where we’re heading

Andrej Karpathy talks about a “wiki of you” — structured personal notes, linked, durable, browsable. It’s the right shape for a substrate of personal knowledge, and the framing has done real work: people now ask, reasonably, why their AI doesn’t write to a wiki they can read. Ouroboros today does that for documents, code, observations, and the corrections you’ve made along the way. The wiki surface is real (sophia.write_wiki_page / sophia.read_wiki_page / autolinks across pages) and durable (Ed25519-signed, journaled, scoped per-agent).

But a wiki is the document layer. Every page is still a paragraph waiting to be re-read. The deeper bet — the part the rest of the work is pointing at — is the graph layer underneath. Where Karpathy-wiki stops at “I have a nicely structured page about Acme,” Ouroboros aims at “the moment my agent needs holds_account_at for Acme, it’s a sub-millisecond typed query — no paragraph reading required, and the answer carries the quote that proves it.”

What’s already in flight or specified:

  • Semantic linking everywhere. Autolinks already wire references across docs, code modules, and wiki pages via the autolink_decision queue. v2 makes the linking bidirectional, agent-proposed at scale, user-confirmed through a review queue rather than batch-applied. Every link becomes a typed edge in the graph, not an <a href> in a paragraph.
  • Inferred edges. Beyond directly-observed facts, derived ones — transitive closures (A.references_doc(B) and B.cites(C) → A.transitively_cites(C)), time-windowed aggregates (account.balance_at(t) derived from transaction history) — materialized as cached query views and re-derived on write. The graph stops being only what you put in it; it starts being what follows from what you put in it.
  • Belief topology as a first-class surface. sophia.confidence_topology already ships — it returns which entities have weak observation density versus strong, where the model is guessing versus where you’ve corrected, where contradictions cluster. The graph knows what it doesn’t know, and it can tell an agent where to dig before answering.
  • Cross-graph projections. Query views that translate facts between vocabularies. You define once: “show me every holds_account_at and holds_position_in fact as a balance-sheet projection grouped by entity.” Agents read through the projection; the underlying rows never move. The vocabulary stays minimal at the storage layer and arbitrary at the read layer.
  • Belief tracing. sophia.trace_belief already lets you ask “why does the system believe X about Acme?” and get back the chain of observations, documents, and supersessions. v2 extends that to inferred edges — “why does the system believe Acme transitively cites this paper?” returns the observed edges that make up the closure.
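
The transitive-closure case in the inferred-edges bullet can be sketched in a few lines. The edge and predicate names mirror that bullet; the derivation function is illustrative of the pass, not the shipped materializer:

```typescript
// Sketch of one inferred-edge pass: derive everything an entity
// transitively cites by walking observed reference/cite edges.
// Illustrative only; the real pass materializes cached query views.
type Edge = [from: string, to: string];

function transitivelyCites(edges: Edge[], start: string): Set<string> {
  const adjacency = new Map<string, string[]>();
  for (const [from, to] of edges) {
    if (!adjacency.has(from)) adjacency.set(from, []);
    adjacency.get(from)!.push(to);
  }
  const reached = new Set<string>();
  const frontier = [start];
  while (frontier.length > 0) {
    const node = frontier.pop()!;
    for (const next of adjacency.get(node) ?? []) {
      if (!reached.has(next)) {
        reached.add(next); // each reached node becomes a derived transitively_cites edge
        frontier.push(next);
      }
    }
  }
  return reached;
}
```

Because derived views are re-derived on write, a new cites observation automatically extends every closure that can reach it.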

The aim is plain: a substrate where “what does my data say about X” is a sub-second typed query for any X you’ve ingested anything about, with the quote that proves it attached to every row in the answer. Karpathy’s wiki is a great document store. Ouroboros is trying to be the queryable graph underneath such a store — so the wiki stays human-readable while the agent queries the structure, not the prose.

What’s true today, in one beta

The daily-driver deployment carries 15,202 knowledge facts across ~80 predicates, mined from documents the user pointed at over the course of ordinary use. Contradictions are surfaced and resolved through the SPA’s Data tab. Wiki pages reference facts by id and re-read live. Autolinks land in a review queue, not into the live graph, until confirmed.

The vocabulary is still evolving. The inferred-edges layer is specified but not fully built. Cross-graph projections are a v2 target. Belief tracing works for direct observations and is being extended to closures. This is a beta — one person uses it daily, the shape is stable, the surface area is growing the way working systems grow: where the work demands it, with the schema ratcheted forward through migrations rather than rewrites.

The thesis isn’t theoretical. The graph compounds. Every doc you’ve ever ingested is still answering questions, and every correction you’ve ever made is still outranking the model. That’s the deal: structure pays off. The rest is adding more shapes the data can take.

