
Stop Reinventing the Wheel: Build a Permanent Knowledge Base with LLM Wiki

Have you ever read a brilliant article, felt like you gained something profound, and then — months later — completely forgotten where those insights were? Or spent hours digging through dozens of documents and web pages just to piece together information for a single report, feeling like you were starting from scratch every single time?

We live in an age of information overload, yet most of us still manage knowledge using a “collect, forget, repeat” cycle. Even modern AI tools that retrieve documents on demand (commonly known as RAG, or Retrieval-Augmented Generation) only scratch the surface of what’s possible.

Recently, renowned AI scientist Andrej Karpathy proposed a fundamentally different approach: Stop asking AI to passively search your files. Instead, let it actively build and maintain a growing personal wiki for you. This isn’t just a tool swap — it’s a paradigm shift in how we think about knowledge management.

This article breaks down Karpathy’s “LLM Wiki” concept and walks you through a practical, step-by-step implementation using Obsidian. No jargon, no fluff — just a clear path to building a self-evolving knowledge system of your own.

Part 1: Why Your Current Knowledge Management Probably Isn’t Working

The Limitation of RAG: A Perpetual Temp Worker

Right now, the most popular way people use AI with their documents is RAG. The workflow looks like this: you upload a pile of PDFs, notes, and web articles to a platform (like NotebookLM or ChatGPT’s file upload feature). When you ask a question, the AI temporarily searches through those documents for relevant snippets and stitches together an answer.

Karpathy identified the fundamental flaw in this approach: there is no accumulation.

Every time you ask a question, the AI is like a temporary worker rummaging through a massive pile of files from scratch. Need an answer that synthesizes five different documents? The AI finds five fragments on the spot, assembles them, and that’s it. Nothing gets saved. Nothing builds on itself. The next time you ask a similar question, it starts over.

Under this model, your knowledge base is static and passive. The AI acts as a temporary search engine, not a knowledge steward working on your behalf over time.

Karpathy’s Alternative: From “Retrieval” to “Construction”

Karpathy’s replacement is what he calls the “LLM Wiki.”

The core idea: let AI incrementally build and maintain a persistent, interlinked wiki — essentially a collection of interconnected Markdown files with clear structure.
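
Concretely, each page is plain Markdown connected to its neighbors through `[[wikilinks]]`. A sketch of what one such page might look like (page names and content are illustrative):

```markdown
# Retrieval-Augmented Generation

A technique where an LLM searches source documents at query time
and answers from the retrieved snippets. Contrast with the
[[LLM Wiki]] approach, where knowledge is persisted and maintained
over time. See also: [[Knowledge Management]], [[Obsidian]].
```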

In this model, the division of labor becomes crystal clear:

  • Your role (the human): You handle input (reading, observing, thinking), set direction, ask high-quality questions, and make final judgments.
  • AI’s role: It handles all the tedious maintenance — organizing notes, updating cross-references, keeping summaries current, flagging contradictions, and maintaining consistency across dozens of pages.

Think of it this way:

  • RAG model: Every time you cook, you go to the market and buy ingredients from scratch.
  • LLM Wiki model: You have a smart kitchen and a tireless assistant. Tell them “I want to learn Italian cooking,” and they’ll stock ingredients, organize recipes, build connections between “pasta,” “sauces,” and “herbs,” and keep updating your private cookbook. You just focus on tasting and creating.

Part 2: Building Your LLM Wiki Step by Step (With Obsidian)

Great theory — but how do you actually do it? Obsidian, a local-first Markdown-based note-taking app, is the perfect foundation for an LLM Wiki thanks to its open architecture, flexibility, and powerful bidirectional linking.

Here’s a detailed walkthrough.

Step 1: Capture Content Fast with the Web Clipper Extension

Knowledge input is the first bottleneck. Manually copying and pasting web content is slow and loses formatting.

Solution: Install the Obsidian Web Clipper browser extension.

  1. Search for and install it from the Chrome, Edge, or Firefox extension store.
  2. When you find a valuable article, click the Web Clipper icon in your browser toolbar.
  3. Select “Add to Obsidian.” The article is automatically converted to clean Markdown and saved to your designated Obsidian vault.

This ensures fast, high-fidelity ingestion of source material — giving your AI assistant quality raw material to work with.

Step 2: Protect Your Assets — Localize Images with One Keystroke

Articles saved via Web Clipper may still reference external image links. Over time, those links break, leaving your notes with broken images — and making them unreadable for AI.

Solution: Configure Obsidian for one-click image localization.

  1. Set a unified attachment path:
    • Open Obsidian Settings → “Files & Links.”
    • Find “Default location for new attachments” and set it to “In subfolder under current folder.” Name the subfolder attachments. This keeps every article’s images in a parallel folder, maintaining a clean structure.
  2. Bind a download shortcut:
    • Open Settings → “Hotkeys.”
    • Search for “download” and find the command “Download all remote images (current file).”
    • Assign a convenient shortcut, such as Ctrl+Shift+D (Windows/Linux) or Cmd+Shift+D (Mac).

Daily workflow: After clipping an article, immediately press your shortcut. All images download locally, and links update automatically. No more broken images — ever.
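
Under the hood, the localization step boils down to rewriting each remote image link to a path inside the attachments folder (the plugin also downloads the files themselves). A minimal Python sketch of just the link rewriting, with a hypothetical function name and a simplified regex rather than the plugin's actual code:

```python
import re
from pathlib import Path

# Matches Markdown image syntax with a remote URL: ![alt](https://...)
IMAGE_LINK = re.compile(r"!\[([^\]]*)\]\((https?://[^)]+)\)")

def localize_image_links(markdown: str, attachment_dir: str = "attachments") -> tuple[str, list[str]]:
    """Rewrite remote image links to local paths under attachment_dir.

    Returns the rewritten text and the list of remote URLs that
    would still need to be downloaded into that folder.
    """
    urls: list[str] = []

    def repl(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        urls.append(url)
        filename = Path(url).name or f"image-{len(urls)}.png"
        return f"![{alt}]({attachment_dir}/{filename})"

    return IMAGE_LINK.sub(repl, markdown), urls
```

For example, `![diagram](https://example.com/img/diagram.png)` becomes `![diagram](attachments/diagram.png)`, while links that already point at local files are left untouched.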

Practical tip: Karpathy notes that current LLMs can’t read Markdown files with many embedded images in one pass. The workaround: have the AI read the text first, then view referenced image files separately as needed. Slightly clunky, but reliable.

Step 3: See the Big Picture with the Graph View

Obsidian’s “Graph View” is your bird’s-eye map of the entire knowledge network. It displays all notes as nodes, with bidirectional links drawn as connecting lines.

To open it: click the graph icon in the left sidebar, or press Ctrl+G.

Karpathy uses the graph view with AI for two critical purposes:

  1. Knowledge base health check: Instantly spot “orphan” notes — those with no incoming links. These represent gaps in cross-referencing that your AI should fill.
  2. Discover blind spots: If a concept is mentioned across many notes but has no dedicated page, it appears as a gray “ghost node” — a clear signal to ask your AI to create a dedicated wiki page for it.
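
If you want your AI assistant to run this health check programmatically rather than eyeballing the graph, both signals can be computed directly from the vault's files. A rough Python sketch under simplifying assumptions (flat page names and basic `[[wikilink]]` syntax only; the function name is illustrative):

```python
import re
from pathlib import Path

# Captures the target of [[Page]], [[Page|alias]], or [[Page#heading]]
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def find_orphans_and_ghosts(vault: Path) -> tuple[set[str], set[str]]:
    """Scan a vault of .md files for orphan pages (no incoming links)
    and ghost nodes (link targets with no page behind them)."""
    pages = {p.stem for p in vault.rglob("*.md")}
    linked: set[str] = set()
    for p in vault.rglob("*.md"):
        linked.update(m.strip() for m in WIKILINK.findall(p.read_text(encoding="utf-8")))
    orphans = pages - linked   # existing pages nothing points to
    ghosts = linked - pages    # mentioned concepts that deserve a page
    return orphans, ghosts
```

The two returned sets map directly onto the graph view: orphans are the unconnected nodes, ghosts are the gray ones.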

Step 4: (Advanced) Structure Your Knowledge with Dataview for Dynamic Queries

As your wiki grows, manually maintaining indexes becomes impractical. Dataview is a powerful Obsidian community plugin that queries note metadata (written as YAML frontmatter at the top of files) like a database, generating dynamic lists and tables.

Installation: Settings → Community Plugins → Community Plugin Browser → Search “Dataview” → Install and enable.

How to use it with LLM Wiki:

  1. Have AI add metadata to pages. Instruct it to write structured frontmatter at the top of each page it creates or organizes. For example:

    ---
    type: source
    title: "Karpathy's LLM Wiki Methodology"
    date: 2026-04-05
    tags: [knowledge-management, AI, Obsidian]
    source_count: 3
    ---
    
  2. Write query blocks. In any note, insert a Dataview query. For instance, to list all “source” notes sorted by date:

    ```dataview
    TABLE title, date, tags
    FROM "wiki/sources"
    SORT date DESC
    ```
    

A dynamic, auto-updating source list appears instantly. The more pages you have, the more value this provides. That said, if your collection is small, a simple manually maintained index.md directory file works perfectly fine.
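
For intuition about what Dataview is doing behind the scenes, here is a rough Python equivalent of that query. It assumes flat `key: value` frontmatter only (no nested YAML or lists), and the function names are illustrative, not part of any plugin API:

```python
import re
from pathlib import Path

# Frontmatter is the block between the leading pair of --- lines
FRONTMATTER = re.compile(r"\A---\n(.*?)\n---", re.DOTALL)

def read_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` frontmatter lines into a dict."""
    match = FRONTMATTER.match(text)
    if not match:
        return {}
    fields: dict[str, str] = {}
    for line in match.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip('"')
    return fields

def sources_by_date(folder: Path) -> list[dict[str, str]]:
    """Emulate: TABLE title, date FROM folder SORT date DESC."""
    rows = [read_frontmatter(p.read_text(encoding="utf-8")) for p in folder.glob("*.md")]
    return sorted((r for r in rows if r), key=lambda r: r.get("date", ""), reverse=True)
```

Dataview does the same kind of scan continuously and renders the result as a live table inside your note.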

Step 5: (Advanced) Turn Notes into Slides with Marp

Marp is a Markdown-based presentation format. With the Marp Slides plugin installed in Obsidian, you can preview and export structured Markdown notes as PDF, HTML, or PPTX slideshows.

Installation: Settings → Community Plugins → Search “Marp Slides” → Install and enable.

Usage: Add marp: true to the top of your Markdown file, separate slides with ---, preview directly in Obsidian, and export with one click.
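
A minimal Marp-ready note might look like this (slide content is illustrative):

```markdown
---
marp: true
---

# LLM Wiki: Key Ideas

- Build knowledge, don't just retrieve it
- AI handles the maintenance burden

---

# Division of Labor

- Human: direction, questions, judgment
- AI: organizing, cross-references, consistency
```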

With LLM Wiki: When preparing a presentation, have your AI extract relevant content from wiki pages and generate a Marp slide deck draft. You polish and present.

Step 6: Safety Net — Version Control with Git

When AI can batch-modify files, version control isn’t optional — it’s essential. Git is the industry-standard distributed version control system.

Steps:

  1. Install the Obsidian Git plugin: Settings → Community Plugins → Community Plugin Browser → Search “git” → Install and enable.

  2. Initialize a Git repository (if your vault isn’t one yet):

    • Open your terminal (PowerShell on Windows, Terminal on Mac).
    • Navigate to your Obsidian vault directory using cd.
    • Run git init.
  3. Connect to a remote repository (e.g., GitHub): Create a private repository — your knowledge base is personal data. Then run:

    git branch -M main
    git remote add origin https://github.com/your-username/your-knowledge-base.git
    git add .
    git commit -m "init: initialize knowledge base"
    git push -u origin main
    
  4. Enable automatic syncing: In the Obsidian Git plugin settings, set Auto commit-and-sync interval to something like 10 minutes. The plugin handles commit and push automatically.

From this point on, you don’t need to manage anything manually. Automatic backups happen on a schedule — giving you both real-time redundancy and a complete history. AI made a bad change? Roll back to any previous version (for example, `git revert HEAD` undoes the most recent commit while keeping the history intact).

Step 7: (Optional) Large-Scale Search with qmd

When your wiki is small (a few hundred pages or fewer), a manual or AI-maintained index.md is sufficient for navigation. But as page counts grow into the thousands, you may need more powerful search.

Karpathy recommends qmd, a fully local Markdown search engine. For most users, though, you won’t need this until you notice your AI clearly slowing down when searching. Nail the fundamentals first.

Part 3: Why This Approach Works — Freeing Humans to Do What They Do Best

Karpathy put it perfectly: “The hardest part of maintaining a knowledge base isn’t reading or thinking — it’s the recording. Updating cross-references, keeping summaries current, flagging contradictions, maintaining consistency across dozens of pages… Humans abandon wikis because maintenance costs grow faster than the value they provide.”

AI solves this: it doesn’t get tired, doesn’t forget to update cross-references, and can touch fifteen files in a single operation. When maintenance costs drop to near zero, knowledge bases can finally survive — and thrive.

The core philosophy is a redefined division of labor. You invest your precious cognitive resources in what truly matters: selecting sources, setting direction, asking good questions, and deriving meaning. All the repetitive, administrative work of organizing and maintaining goes to your tireless AI assistant.

For most people, combining Obsidian Web Clipper + image localization shortcuts + Git version control + a capable LLM (such as Claude) is more than enough to build a powerful, durable personal LLM Wiki system.

Frequently Asked Questions

Q: Do I need programming skills for this?
A: Not at all. The setup involves installing Obsidian plugins, configuring settings, and using graphical interfaces. The only command-line step is initializing Git, which requires copying and pasting a few commands — and you likely only do it once.

Q: How does this compare to Notion, Logseq, or other tools?
A: The core advantages are data sovereignty and AI collaboration depth. All your knowledge lives as plain-text Markdown files on your local machine, fully under your control — no platform lock-in. Because the files are local and open, you can have LLMs directly and deeply interact with your file system, enabling more automated and complex knowledge tasks that are hard to achieve with closed, cloud-based platforms.

Q: Which LLM should I use as the “AI assistant”?
A: Both Karpathy and practitioners have mentioned Claude, primarily for its strong long-text understanding and instruction-following abilities. In practice, any LLM with good file-operation capabilities (via plugins or APIs) and a large context window can work — GPT-4 series (with appropriate tools), Gemini, and others are all viable. The key requirement is that the model can “understand” your Obsidian vault structure and “execute” file operation instructions.

Q: How do I get started without feeling overwhelmed?
A: Begin with the simplest core loop:

  1. Install Obsidian.
  2. Install the Web Clipper plugin.
  3. Configure image localization (set the attachment path and download shortcut).
  4. Install the Git plugin and set up automatic backups.

Run the “capture → organize → back up” loop first. Explore Dataview, Marp, and other advanced features only as your needs grow.

Q: What types of knowledge work best with this system?
A: It excels at declarative knowledge that benefits from long-term accumulation, frequent cross-referencing, and clear structure. Examples include:

  • Study notes (programming, languages, academic subjects)
  • Research material compilation
  • Project documentation and post-mortems
  • Book notes and thought excerpts
  • Personal writing and creative material

Highly dynamic or unstructured information (like real-time chat logs) may require supplementary tools.


By building an AI-driven LLM Wiki, you stop being merely a consumer of knowledge and become the architect of your own cognitive infrastructure. Information no longer scatters and fades with time — instead, it grows, interconnects, and compounds within your digital garden, becoming a permanent and uniquely personal intellectual asset.
