LLM Wiki — Andrej Karpathy
Source: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
A pattern for building personal knowledge bases using LLMs.
Core Idea
Instead of RAG (re-deriving knowledge from raw documents on every query), the LLM incrementally builds and maintains a persistent wiki: structured, interlinked markdown files sitting between the user and the raw sources. Knowledge is compiled once and kept current.
The wiki is a persistent, compounding artifact. Cross-references are already in place, contradictions are already flagged, and the synthesis reflects everything read so far.
The human curates sources and asks questions. The LLM does summarizing, cross-referencing, filing, and bookkeeping.
Architecture — Three Layers
- Raw sources — Immutable curated documents. LLM reads but never modifies.
- The wiki — LLM-generated markdown files. LLM owns entirely — creates, updates, maintains cross-references.
- The schema — Configuration (e.g. CLAUDE.md) that tells the LLM how the wiki is structured, its conventions, and its workflows. Co-evolved with the user.
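The schema file can be as simple as a short markdown document the LLM reads before touching the wiki. A minimal sketch of what such a CLAUDE.md might contain (the specific conventions below are illustrative, not from the source):

```markdown
# Wiki conventions (read before any operation)

- All wiki pages live in wiki/ as lowercase-hyphenated .md files.
- index.md lists every page with a link and a one-line summary; update it on every ingest.
- log.md is append-only, one entry per action: "## [YYYY-MM-DD] action | Title."
- Link related pages with [[wikilinks]]; prefer linking over duplicating content.
- Never modify anything under sources/ — raw documents are immutable.
```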
Operations
- Ingest: New source → LLM reads, discusses takeaways, writes summary, updates index, updates entity/concept pages, appends to log. A single source may touch 10–15 pages.
- Query: Search relevant pages via index, synthesize answer with citations. Good answers filed back as new wiki pages — explorations compound.
- Lint: Health-check for contradictions, stale claims, orphan pages, missing cross-references, data gaps.
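The lint pass is easy to mechanize. A minimal sketch in Python, assuming pages are passed in as a `{filename: content}` mapping and that cross-references use `[[wikilink]]` syntax (both assumptions; the source does not specify a link format). It implements two of the listed checks, orphan pages and missing link targets:

```python
import re

# Matches the target of [[Page]], [[Page|alias]], and [[Page#section]] links.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def lint(pages: dict[str, str]) -> dict:
    """Health-check a wiki given {filename: markdown content}.

    Returns orphan pages (no inbound links, excluding index.md/log.md)
    and broken links (targets that do not exist as files).
    """
    # Outbound link targets per page, normalized to filenames.
    links = {name: {m.strip() + ".md" for m in WIKILINK.findall(text)}
             for name, text in pages.items()}
    linked_to = set().union(*links.values()) if links else set()
    orphans = [p for p in pages
               if p not in linked_to and p not in ("index.md", "log.md")]
    broken = sorted(t for targets in links.values()
                    for t in targets if t not in pages)
    return {"orphans": orphans, "broken_links": broken}
```

Contradiction and staleness checks are semantic and stay with the LLM; the mechanical checks above are cheap enough to run on every ingest.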
Special Files
- index.md — Content-oriented catalog. Each page with link, one-line summary. LLM reads index first to find relevant pages. Works well at ~100 sources / hundreds of pages.
- log.md — Chronological append-only record. Format:
## [YYYY-MM-DD] action | Title.
Use Cases
Personal tracking, research depth, book reading (companion wiki), team knowledge, competitive analysis, due diligence, trip planning, hobbies.
Why It Works
The tedious part of knowledge bases is bookkeeping, not reading or thinking. Humans abandon wikis because the maintenance burden outgrows the value. LLMs don't get bored and can touch 15 files in one pass, so the maintenance cost drops to near zero.
Related to Vannevar Bush’s Memex (1945) — personal curated knowledge with associative trails. Bush couldn’t solve who does the maintenance. The LLM handles that.
Tooling
- Obsidian Web Clipper for source collection
- Obsidian Graph View for visualization
- Dataview plugin for dynamic queries over frontmatter
- qmd for scaled search (BM25/vector/LLM reranking)
- Marp for slide decks from wiki content
- Git for version history