OpenViking: An Open-Source Context Database for Smarter AI Agents

As artificial intelligence evolves at breakneck speed, we are entering an era where AI agents—autonomous programs that can reason, plan, and execute tasks—are becoming increasingly central to how we work and build software. Imagine a personal assistant that doesn’t just answer simple questions but can manage a complex project over several days, or a coding agent that understands your entire codebase and your personal preferences.

However, as these agents take on more ambitious roles, a fundamental challenge emerges: How do we efficiently manage the vast amount of contextual information they need?

Whether it’s a personal helper, a code generation tool, or an automated decision-making system, an AI agent needs to access a wide range of information during its operation: long-term memories of past interactions, external knowledge resources, and its own set of skills. Traditional solutions often lead to a fragmented mess—memories scattered in code, documents stored in vector databases, and tools defined in separate modules. This makes agents complex to build, expensive to run, and difficult to debug.

Today, we introduce an open-source project designed specifically to tackle this problem: OpenViking. It’s a purpose-built context database for AI agents that uses a novel “file system paradigm” to organize an agent’s world. Think of it as a way to build your agent’s brain by managing files and folders, just like you do on your local computer.


The Overlooked “Context Crisis” in Agent Development

If you’ve ever tried to build a sophisticated AI agent, you’ve likely encountered these frustrating problems:

  • Context Fragmentation: Important pieces of information live in different places. A user’s preference might be hardcoded in a Python file. A crucial project document sits in a separate vector database. A tool for searching the web is a function call in your code. The agent has to juggle all these disparate sources, making its logic complex and the system difficult to maintain.

  • Exploding Context Needs: An agent running a long task, like researching a topic or managing a project, generates new context with every interaction. Simply truncating the chat history or naively compressing it leads to information loss. This breaks the agent’s ability to maintain coherence over long periods.

  • Poor Retrieval Quality: Traditional RAG (Retrieval-Augmented Generation) systems treat information as flat chunks of text. They look for semantic similarity—finding sentences that “mean” the same thing. But they completely ignore the structure of the information. It’s like trying to understand a book by grabbing random sentences from different chapters, with no idea how they relate to each other or the overall narrative.

  • Invisible Context: The retrieval process is a black box. When an agent gives a wrong answer, you have no way of knowing why. Was it because the model misunderstood the question? Or was it because the context it retrieved was irrelevant or incorrect? This makes debugging incredibly hard.

  • Limited Memory Evolution: Most systems treat “memory” as nothing more than a raw log of past conversations. They lack the ability to learn from task executions, to extract valuable insights, and to refine their own knowledge over time. The agent doesn’t get smarter with use.

The root of these problems is simple: we lack a dedicated, structured infrastructure for managing an agent’s context. This is the gap OpenViking aims to fill.


OpenViking: Redefining the Context Database

OpenViking is an open-source context database, but it’s not a database in the traditional sense. It’s a system designed from the ground up for AI agents. Its core idea is elegant yet powerful: unify all the context an agent needs—long-term memories, external knowledge bases, and even its own skills—into a single, organized virtual file system.

Developers and agents alike can interact with this context using familiar file system commands like ls, find, grep, and cat. This provides the agent with a structured, observable, and self-improving “brain.”

The Core Goals of OpenViking

  • Unified Management: Replace fragmented storage with a single, consistent abstraction: the virtual file system.
  • Cost Optimization: Drastically reduce the token consumption of large language models (LLMs) by storing information in layers and loading only what’s necessary.
  • Precise Retrieval: Combine the power of semantic search with the logical structure of a directory tree. Find information by first locating the right “folder” and then exploring its contents.
  • Complete Observability: Record the entire path of every retrieval, allowing developers to trace the agent’s “thought process” back to the source.
  • Self-Evolution: Automatically extract and refine long-term memories from conversations, enabling the agent to learn and improve over time.

Core Concepts: How the File System Paradigm Solves Real Problems

OpenViking’s design is built on five core concepts, each directly addressing one of the challenges we discussed earlier.

1. The File System Paradigm → Solving Fragmentation

OpenViking maps all context to virtual directories under the viking:// protocol. Each piece of information has a unique URI, just like a file path. This creates a familiar and navigable structure.

viking://
├── resources/              # External knowledge: project docs, code repos, web pages
│   ├── my_project/
│   │   ├── docs/
│   │   └── src/
├── user/                   # Memories related to the user
│   └── memories/
│       ├── preferences/
│       └── habits/
└── agent/                  # The agent's own skills and experiences
    ├── skills/
    ├── memories/
    └── instructions/

This structure allows an agent to locate information deterministically. Need to find the project documentation? Go to viking://resources/my_project/docs/. Want to recall the user’s preferred coding style? Check viking://user/memories/preferences/coding_style.

Developers can also use simple commands to manage this context:

ov ls viking://resources/                 # List all resources
ov tree viking://resources/my_project -L 2 # View a directory as a tree
ov cat viking://user/memories/preferences  # View the content of a "file"

This solves fragmentation: Memories, resources, and skills are no longer isolated. They are unified within a single, navigable file system.
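Because viking:// URIs are ordinary hierarchical paths, they can be handled with standard URL tooling. Here is a minimal, illustrative sketch of addressing context this way — the helper name is hypothetical and this is not the OpenViking client API:

```python
from urllib.parse import urlparse

def parse_viking_uri(uri: str):
    """Split a viking:// URI into its top-level namespace and inner path.

    Illustrative helper only; OpenViking's real client may expose a
    different interface.
    """
    parts = urlparse(uri)
    if parts.scheme != "viking":
        raise ValueError(f"not a viking URI: {uri}")
    # For scheme://a/b/c URIs, urlparse puts the first segment
    # ("resources", "user", "agent") in netloc and the rest in path.
    namespace = parts.netloc
    segments = [s for s in parts.path.split("/") if s]
    return namespace, segments

# namespace "user", segments ["memories", "preferences", "coding_style"]
print(parse_viking_uri("viking://user/memories/preferences/coding_style"))
```

The key point is that every memory, resource, or skill gets one stable, parseable address, so both agents and tools can navigate context deterministically.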

2. Layered Context Loading → Reducing Token Consumption

To avoid loading massive amounts of irrelevant information for every query, OpenViking automatically generates three layers for each piece of context when it’s first stored:

  • L0 (Summary): A one-sentence summary, perfect for quick filtering and relevance checks. For a document, the L0 might be “OpenViking installation guide.”
  • L1 (Overview): Contains the core information and use cases, roughly 2K tokens. An agent reads the L1 layer during its planning phase to decide if it needs to go deeper.
  • L2 (Details): The complete, raw data. This could be the full text of a document, the entire code file, or the high-resolution image. This layer is only loaded when the agent is absolutely sure it needs the specifics.

These layers are stored as special “hidden” files within the same virtual directory (e.g., .abstract, .overview). An agent can navigate down through the layers, paying the cost for full details only when necessary, instead of for every single interaction.
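To make the cost model concrete, here is a small sketch (illustrative only, not OpenViking's internal representation) in which L0 and L1 stay cheaply resident while the expensive L2 payload is loaded lazily, on first request:

```python
class LayeredContext:
    """Toy model of L0/L1/L2 layered context (illustrative only).

    l0: one-sentence summary; l1: short overview; load_l2: callable
    that fetches the full payload only when it is actually needed.
    """

    def __init__(self, l0, l1, load_l2):
        self.l0 = l0
        self.l1 = l1
        self._load_l2 = load_l2
        self._l2 = None
        self.l2_loads = 0  # how many times we paid the full cost

    def details(self):
        if self._l2 is None:
            self._l2 = self._load_l2()
            self.l2_loads += 1
        return self._l2

doc = LayeredContext(
    l0="OpenViking installation guide.",
    l1="Covers pip install, CLI setup, and model configuration.",
    load_l2=lambda: "...full document text...",
)

# Planning phase: the agent filters on L0/L1 without touching L2.
assert doc.l2_loads == 0
# Only a confirmed need triggers the expensive load, exactly once.
full = doc.details()
assert doc.l2_loads == 1
```

In the real system the per-layer content is produced by models at ingestion time; the sketch only shows why most interactions never pay the L2 token cost.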

3. Recursive Directory Retrieval → Improving Retrieval Quality

Traditional vector search is good at finding semantically similar chunks, but it often misses the bigger picture. OpenViking introduces a recursive directory retrieval strategy that mimics how a human would search for information: first, figure out which folder might contain the answer; then, explore inside that folder.

The retrieval process works like this:

  1. Intent Analysis: The system analyzes the user’s query to generate multiple search criteria (like keywords and entities).
  2. Initial Targeting: It uses fast vector search to find a few highly relevant “chunks” and then identifies the high-scoring directories (folders) they belong to.
  3. Refined Exploration: It performs a more focused search within those promising directories, adding the best results to a candidate list.
  4. Recursive Deepening: If those directories have subdirectories, it repeats the refined exploration step on them, drilling down layer by layer.
  5. Result Aggregation: Finally, it compiles the most relevant context from all these steps and returns it to the agent.

This method doesn’t just find semantically matching sentences; it ensures the information is retrieved within its full structural context. For example, if an agent asks “How do I configure OpenViking’s embedding models?”, the system might first navigate to the viking://resources/openviking/docs/configuration/ directory. Then, it would search inside that specific directory for “embedding” and return the entire configuration section, not just a few scattered sentences.
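The five steps above can be sketched on a toy directory tree. This is a deliberately naive stand-in for OpenViking's retriever: keyword overlap replaces vector similarity, and the toy walks the whole tree where the real system prunes low-scoring directories.

```python
def retrieve(tree, query_terms, path="viking:/", trail=None):
    """Toy recursive directory retrieval (illustrative only).

    tree: nested dicts are directories, strings are document contents.
    Returns (hits, trail): scored matches plus every directory visited.
    """
    if trail is None:
        trail = []
    trail.append(path)  # record the observable retrieval trail
    hits = []
    for name, node in tree.items():
        uri = f"{path}/{name}"
        if isinstance(node, dict):
            # Recursive deepening: explore the subdirectory.
            sub_hits, _ = retrieve(node, query_terms, uri, trail)
            hits.extend(sub_hits)
        else:
            # Leaf document: naive term overlap stands in for similarity.
            score = sum(t in node.lower() for t in query_terms)
            if score:
                hits.append((score, uri))
    hits.sort(reverse=True)  # result aggregation, best matches first
    return hits, trail

tree = {
    "resources": {
        "openviking": {
            "docs": {
                "configuration": {
                    "embedding.md": "how to configure embedding models",
                    "vlm.md": "choosing a vision-language model provider",
                },
                "install.md": "installing openviking with pip",
            }
        }
    }
}

hits, trail = retrieve(tree, ["embedding", "configure"])
assert hits[0][1].endswith("docs/configuration/embedding.md")
```

Note that the results carry their full URIs, so each answer arrives with its structural context attached rather than as an orphaned text chunk.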

4. Visualizing Retrieval Trails → Making Context Observable

Because the retrieval process is based on navigating a directory tree, every step of the search path is recorded. Developers can use an API or the CLI to view this “retrieval trail”—the exact path the system took to find the information (e.g., from viking://resources/ down to viking://resources/openviking/docs/configuration/embedding.md).

When an agent gives a wrong answer, you can now clearly see whether the fault was in the retrieval path (it looked in the wrong place) or in the model’s reasoning (it had the right info but used it incorrectly). This level of observability is a game-changer for debugging complex agent behaviors.

5. Automatic Session Management → Context Self-Iteration

OpenViking has a built-in memory self-iteration loop. At the end of a session (or when triggered by a developer), the system can analyze the interaction. It asynchronously processes the user’s feedback, the task’s outcome, the tools that were called, and the conversation history. It then automatically updates the user and agent memory directories:

  • User Memory Update: Extracts new preferences, like a preferred writing style or a frequently used code library, making the agent more personalized for future interactions.
  • Agent Experience Accumulation: Learns from successful (or failed) task executions. It can distill operational tips or tool-usage patterns and store them as “skills” in its own memory, aiding future decision-making.

This means an agent is no longer a static piece of code. It can continuously learn and evolve through its interactions with the world, truly becoming “smarter with use.”
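As a rough illustration of that loop, here is a sketch with simple keyword rules standing in for the LLM-driven extraction — every name here is hypothetical, and the real pipeline runs asynchronously:

```python
def update_memories(session, user_memory, agent_memory):
    """Toy post-session memory update (illustrative only; OpenViking
    performs this asynchronously with an LLM, not keyword rules)."""
    # User memory update: keep explicitly stated preferences.
    for turn in session["messages"]:
        if turn["role"] == "user" and "i prefer" in turn["content"].lower():
            user_memory.setdefault("preferences", []).append(turn["content"])
    # Agent experience: remember which tool sequence completed the task.
    if session.get("outcome") == "success":
        agent_memory.setdefault("skills", []).append(
            {"task": session["task"], "tools": session["tools_called"]}
        )

user_mem, agent_mem = {}, {}
update_memories(
    {
        "task": "summarize repo",
        "outcome": "success",
        "tools_called": ["ov find", "ov cat"],
        "messages": [
            {"role": "user", "content": "I prefer concise bullet summaries"}
        ],
    },
    user_mem,
    agent_mem,
)
```

After the update, the extracted preference and the successful tool sequence would live under the user and agent memory directories respectively, ready for the next session.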


Quick Start: Build Your First Context Database in 10 Minutes

Let’s move from concepts to practice. This guide will walk you through installing OpenViking and running your first example.

Prerequisites

  • Python: Version 3.10 or higher.
  • Go: Version 1.22 or higher (only needed if building certain components from source).
  • C++ Compiler: GCC 9+ or Clang 11+ (required for building core extensions, must support C++17).
  • Operating System: Linux, macOS, or Windows.
  • Network Connection: A stable internet connection for downloading dependencies and accessing model APIs.

1. Installation

The easiest way to install OpenViking is via pip:

pip install openviking --upgrade

If you also want the command-line tool ov (recommended), you can install it with Rust’s Cargo package manager:

cargo install --git https://github.com/volcengine/OpenViking ov_cli

Or use a one-liner installation script:

curl -fsSL https://raw.githubusercontent.com/volcengine/OpenViking/main/crates/ov_cli/install.sh | bash

2. Preparing Your Models

OpenViking relies on two types of AI models to function:

  • VLM (Vision-Language Model): For understanding images and complex content (e.g., visual question answering, document parsing).
  • Embedding Model: For generating vector representations of text, which is how semantic search works.

Supported VLM Providers

You have flexibility in choosing your model provider. OpenViking supports three main options:

| Provider | Description | Example Models |
| --- | --- | --- |
| volcengine | Volcano Engine’s Doubao models | doubao-seed-2-0-pro-260215 |
| openai | OpenAI’s official API | gpt-4-vision-preview, gpt-4o |
| litellm | A unified interface for many models (Anthropic, DeepSeek, Gemini, local models via vLLM/Ollama, and more) | claude-3-5-sonnet-20240620, deepseek-chat, gemini-pro, ollama/llama3.1 |

Note: litellm is a powerful option that lets you use a single API format for dozens of model providers. The model field must follow LiteLLM’s naming conventions.

Supported Embedding Providers

| Provider | Example Model | Dimension |
| --- | --- | --- |
| volcengine | doubao-embedding-vision-250615 | 1024 |
| openai | text-embedding-3-large | 3072 |

3. Configuration

Creating the Server Config File ~/.openviking/ov.conf

Here’s a complete example configuration using Volcano Engine’s Doubao models. You’ll need to replace the placeholders with your own API key and endpoint.

{
  "storage": {
    "workspace": "/home/your-username/openviking_workspace"
  },
  "log": {
    "level": "INFO",
    "output": "stdout"
  },
  "embedding": {
    "dense": {
      "api_base": "https://ark.cn-beijing.volces.com/api/v3",
      "api_key": "YOUR_VOLCENGINE_API_KEY",
      "provider": "volcengine",
      "dimension": 1024,
      "model": "doubao-embedding-vision-250615"
    },
    "max_concurrent": 10
  },
  "vlm": {
    "api_base": "https://ark.cn-beijing.volces.com/api/v3",
    "api_key": "YOUR_VOLCENGINE_API_KEY",
    "provider": "volcengine",
    "model": "doubao-seed-2-0-pro-260215",
    "max_concurrent": 100
  }
}

If you prefer to use OpenAI, the configuration would look like this:

{
  "embedding": {
    "dense": {
      "api_base": "https://api.openai.com/v1",
      "api_key": "YOUR_OPENAI_API_KEY",
      "provider": "openai",
      "dimension": 3072,
      "model": "text-embedding-3-large"
    }
  },
  "vlm": {
    "api_base": "https://api.openai.com/v1",
    "api_key": "YOUR_OPENAI_API_KEY",
    "provider": "openai",
    "model": "gpt-4-vision-preview"
  }
}

Setting the Environment Variable

On Linux/macOS:

export OPENVIKING_CONFIG_FILE=~/.openviking/ov.conf

On Windows PowerShell:

$env:OPENVIKING_CONFIG_FILE = "$HOME/.openviking/ov.conf"

4. Running Your First Example

First, start the OpenViking server:

openviking-server

The server will start and listen on the default port 1933. To run it in the background:

nohup openviking-server > /data/log/openviking.log 2>&1 &

Now, open another terminal. Let’s configure the CLI client. Create the file ~/.openviking/ovcli.conf:

{
  "url": "http://localhost:1933",
  "timeout": 60.0,
  "output": "table"
}

Set the environment variable for the client config (optional, as it defaults to this path):

export OPENVIKING_CLI_CONFIG_FILE=~/.openviking/ovcli.conf

Now you can start interacting with your new context database:

# 1. Check the server status
ov status

# 2. Add an external resource (e.g., the OpenViking GitHub repo)
ov add-resource https://github.com/volcengine/OpenViking --wait

# 3. List all the resources you've added
ov ls viking://resources/

# 4. View the directory tree of a specific resource (up to 2 levels deep)
ov tree viking://resources/volcengine -L 2

# 5. Wait a moment for the semantic processing to complete, then search for information
ov find "what is openviking"

# 6. Perform a full-text search within a specific directory
ov grep "context database" --uri viking://resources/volcengine/OpenViking/docs

Congratulations! You’ve successfully used OpenViking to ingest, organize, and search an external GitHub repository.

5. Trying VikingBot (Optional)

VikingBot is a lightweight agent framework built on top of OpenViking. It lets you quickly start an interactive chat session with an agent that can use your context database.

Install the bot component:

pip install "openviking[bot]"

Start the server with the bot enabled:

openviking-server --with-bot

In another terminal, start an interactive chat:

ov chat

Now you can have a conversation with the agent. It will automatically use the context stored in OpenViking to answer your questions.


Deeper Dive: Model Provider Configuration

OpenViking’s flexibility comes from its support for multiple model providers. Here’s a more detailed look at how to configure them.

Volcengine (Doubao)

Volcengine’s Doubao models are a strong choice, especially for Chinese-language tasks. You can configure your VLM using either the model name or a specific endpoint ID from the Volcengine console.

{
  "vlm": {
    "provider": "volcengine",
    "model": "doubao-seed-2-0-pro-260215",
    "api_key": "your-api-key",
    "api_base": "https://ark.cn-beijing.volces.com/api/v3"
  }
}

If you’ve created a dedicated inference endpoint in the Volcengine ARK console, you can use its ID as the model:

{
  "vlm": {
    "provider": "volcengine",
    "model": "ep-20241220174930-xxxxx",
    "api_key": "your-api-key",
    "api_base": "https://ark.cn-beijing.volces.com/api/v3"
  }
}

OpenAI

Using the official OpenAI API is straightforward. You can use models like gpt-4o or gpt-4-vision-preview.

{
  "vlm": {
    "provider": "openai",
    "model": "gpt-4o",
    "api_key": "your-api-key",
    "api_base": "https://api.openai.com/v1"
  }
}

You can also use a custom api_base if you are accessing OpenAI through a proxy or a compatible local server.

LiteLLM (The Universal Connector)

LiteLLM is incredibly powerful. It allows you to use a single configuration format for dozens of model providers. Here are a few typical examples.

Anthropic’s Claude:

{
  "vlm": {
    "provider": "litellm",
    "model": "claude-3-5-sonnet-20240620",
    "api_key": "your-anthropic-api-key"
  }
}

DeepSeek:

{
  "vlm": {
    "provider": "litellm",
    "model": "deepseek-chat",
    "api_key": "your-deepseek-api-key"
  }
}

Qwen (Tongyi Qianwen) via DashScope:

For users in mainland China, use this api_base:

{
  "vlm": {
    "provider": "litellm",
    "model": "dashscope/qwen-turbo",
    "api_key": "your-dashscope-api-key",
    "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1"
  }
}

For international users, the endpoint is different:

{
  "vlm": {
    "provider": "litellm",
    "model": "dashscope/qwen-turbo",
    "api_key": "your-dashscope-api-key",
    "api_base": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
  }
}

Running Models Locally with Ollama:

First, make sure Ollama is running.

ollama serve

Then configure OpenViking to use it:

{
  "vlm": {
    "provider": "litellm",
    "model": "ollama/llama3.1",
    "api_base": "http://localhost:11434"
  }
}

Running Models Locally with vLLM:

Assuming you have a vLLM server running, the configuration is similar:

{
  "vlm": {
    "provider": "litellm",
    "model": "hosted_vllm/llama-3.1-8b",
    "api_base": "http://your-vllm-server:8000"
  }
}

Performance & Evaluation: The OpenClaw Memory Plugin

OpenViking is not just a theoretical concept. Its practical value has been validated by integrating it as a memory plugin for the OpenClaw agent framework. The team ran tests on the LoCoMo long-context conversation dataset. Here are the key results, using the seed-2.0-code model.

| Experimental Setup | Task Completion Rate | Total Input Tokens |
| --- | --- | --- |
| OpenClaw (with its default memory-core) | 35.65% | 24,611,530 |
| OpenClaw + LanceDB (without memory-core) | 44.55% | 51,574,530 |
| OpenClaw + OpenViking Plugin (without memory-core) | 52.08% | 4,264,396 |
| OpenClaw + OpenViking Plugin (with memory-core) | 51.23% | 2,099,622 |

What this means:

  • By integrating OpenViking and disabling OpenClaw’s native memory, the task completion rate jumped from 35.65% to 52.08%, a gain of 16.4 percentage points and a 46% relative improvement.
  • At the same time, the total input token cost was cut by 83% relative to the default setup.
  • Even with OpenClaw’s native memory-core left on, OpenViking still reduced token costs by 91% while significantly boosting task completion.
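These percentages follow directly from the table above; a quick arithmetic check:

```python
# Figures taken from the LoCoMo benchmark table above.
base_rate, ov_rate = 35.65, 52.08       # task completion, %
base_tokens = 24_611_530                # OpenClaw default memory-core
ov_tokens_no_core = 4_264_396           # OpenViking, memory-core off
ov_tokens_with_core = 2_099_622         # OpenViking, memory-core on

rel_improvement = (ov_rate - base_rate) / base_rate          # ~0.46
token_cut_no_core = 1 - ov_tokens_no_core / base_tokens      # ~0.83
token_cut_with_core = 1 - ov_tokens_with_core / base_tokens  # ~0.91
```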

This data powerfully demonstrates OpenViking’s potential to make AI agents both smarter and dramatically more cost-effective.


Frequently Asked Questions

How is OpenViking different from traditional RAG?

Traditional RAG systems typically chop documents into flat chunks and store them in a vector database. Retrieval is based solely on semantic similarity, ignoring the document’s original structure (like chapters, sections, and headings). OpenViking uses a file system paradigm. It organizes context into a hierarchical tree of directories. Retrieval starts by identifying the right “folder” and then drills down into its contents. This structural approach is more aligned with how humans organize and find information, and it makes the retrieval process transparent and debuggable.

What language models does OpenViking support?

OpenViking itself doesn’t provide the language models. Instead, it acts as a smart context manager that connects to external models. It currently supports providers like Volcengine, OpenAI, and any provider compatible with LiteLLM (which includes Anthropic, DeepSeek, Google Gemini, Cohere, and many local options). This list is expected to grow.

Do I need a GPU to run OpenViking?

The OpenViking server itself does not require a GPU. It’s a lightweight service that manages metadata, the virtual file system, and coordinates retrieval. However, the VLM and Embedding models it calls do need significant computational resources. You have two choices:

  1. Use Cloud APIs: This is the easiest path. Connect to services like OpenAI or Volcengine. Your local machine doesn’t need a GPU.
  2. Run Models Locally: If you use LiteLLM to connect to a local setup like Ollama or vLLM, you will need a machine with a capable GPU to run those models.

Is OpenViking suitable for personal projects or only for enterprises?

Both! For a personal developer, OpenViking can be a fantastic tool to build a “second brain” or a highly personalized AI assistant that remembers everything. For an enterprise, it can serve as the central context infrastructure for multiple agents, unifying company knowledge, project documentation, and shared tools. Its open-source nature means it’s free to use and can be customized for any scale.

How can I contribute to OpenViking?

We welcome contributions of all kinds! You can:

  • Report bugs or suggest features by opening an issue on our GitHub repository.
  • Improve the documentation.
  • Submit code changes by forking the repository and creating a pull request.

Please refer to our Contributing Guide for more details.


Community and Getting Involved

OpenViking is still in its early stages, and there’s a lot of exciting work to be done. We warmly invite every developer passionate about the future of AI agents to join us.

If you find this project interesting, giving us a star on GitHub is a huge encouragement!


License

OpenViking is licensed under the Apache License 2.0. This means you are free to use, modify, and distribute it, as long as you retain the copyright notice and license text. For the full terms, please see the LICENSE file.



The journey to build a better way for AI agents to remember and learn has begun. We hope you’ll join us in shaping the future of context management.