UltraRAG 2.0: Building High-Performance Retrieval-Augmented Generation Systems with Minimal Code

Implement complex reasoning pipelines like Search-o1 in a few dozen lines of code, and focus on research innovation instead of engineering overhead.

Have you ever struggled with the engineering work involved in building retrieval-augmented generation (RAG) systems? As RAG systems evolve from simple “retrieve + generate” approaches into complex knowledge systems that incorporate adaptive knowledge organization, multi-step reasoning, and dynamic retrieval, researchers face a growing engineering burden. Traditional approaches require substantial code for workflow control, module integration, and experimental evaluation, which is both time-consuming and error-prone.

Now, there’s a new solution: UltraRAG 2.0.

What is UltraRAG 2.0?

UltraRAG 2.0 (UR-2.0) is a RAG framework built on the Model Context Protocol (MCP) architecture, jointly developed by Tsinghua University’s THUNLP Lab, Northeastern University’s NEUIR Lab, OpenBMB, and AI9stars. Its distinguishing feature is that complex logic, including serial processing, loops, and conditional branching, is declared directly in YAML files, which significantly lowers the technical barrier and learning cost of building complex RAG systems.

Imagine being able to implement multi-round iterative retrieval and generation pipelines similar to Search-o1 with just dozens of lines of code. UltraRAG 2.0 makes this possible.

Core Highlights: Why Choose UltraRAG 2.0?

🚀 Low-Code Complex Pipeline Construction

UltraRAG 2.0 natively supports reasoning control structures including serial processing, loops, and conditional branching. You don’t need to write complex program logic—simply declare the execution flow through YAML files to build powerful iterative RAG systems.

⚡ Rapid Reproduction and Functional Expansion

Based on the MCP architecture, all modules are encapsulated as independent, reusable Servers:

Customize Servers as needed or directly reuse existing modules
Each Server’s functionality is registered through function-level Tool interfaces—adding new features only requires adding a single function
Supports calling external MCP Servers, easily expanding pipeline capabilities and application scenarios
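
The idea of registering each capability as a function-level Tool can be illustrated with a small sketch. This is not UltraRAG's or MCP's actual API: the Server class, the tool decorator, and the search and count_passages functions below are hypothetical names chosen for illustration, and the retrieval logic is a toy keyword match.

```python
from typing import Callable, Dict, List


class Server:
    """Hypothetical stand-in for a Server: a named collection of tools."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable] = {}

    def tool(self, func: Callable) -> Callable:
        """Register a plain function as a tool under its own name."""
        self.tools[func.__name__] = func
        return func

    def call(self, tool_name: str, **kwargs):
        """Invoke a registered tool by name, as a client would."""
        if tool_name not in self.tools:
            raise KeyError(f"{self.name} has no tool named {tool_name!r}")
        return self.tools[tool_name](**kwargs)


retriever = Server("retriever")


@retriever.tool
def search(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Toy keyword retrieval: rank passages by shared lowercase words."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


# Adding a feature means adding one more decorated function.
@retriever.tool
def count_passages(corpus: List[str]) -> int:
    """Return the number of passages in the corpus."""
    return len(corpus)
```

A caller only needs the tool name and its arguments, for example retriever.call("search", query="capital of France", corpus=passages, top_k=1), which is what makes such modules easy to swap or reuse.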

📊 Unified Evaluation and Comparison

UltraRAG 2.0 provides built-in standardized evaluation processes and metric management, with out-of-the-box support for 17 mainstream research benchmarks:

Continuously integrates the latest baselines
Provides Leaderboard results
Facilitates systematic comparison and optimization experiments
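
To make "standardized evaluation and metric management" concrete, here is a minimal sketch of a metric registry. The article does not list which metrics UltraRAG 2.0 computes, so exact match and token-level F1 (both common in QA evaluation) are assumptions, and the names METRICS and evaluate are hypothetical.

```python
import re
import string
from collections import Counter
from typing import Callable, Dict, List


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical registry: every metric shares one signature, so any method's
# predictions can be scored in the same way.
METRICS: Dict[str, Callable[[str, str], float]] = {
    "em": exact_match,
    "f1": token_f1,
}


def evaluate(predictions: List[str], golds: List[str], metric_names: List[str]) -> Dict[str, float]:
    """Average each requested metric over aligned prediction/gold pairs."""
    if not golds or len(predictions) != len(golds):
        raise ValueError("predictions and golds must be non-empty and aligned")
    return {
        name: sum(METRICS[name](p, g) for p, g in zip(predictions, golds)) / len(golds)
        for name in metric_names
    }
```

Because every baseline is scored by the same functions on the same samples, differences in the resulting numbers reflect the methods rather than the evaluation code.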

Technical Principles: MCP Architecture and Native Process Control

Core functions such as retrieval and generation are similar across RAG systems, but implementation strategies vary significantly. As a result, modules often lack unified interfaces and are difficult to reuse across projects. The 👉Model Context Protocol (MCP) is an open protocol that standardizes how context is provided to large language models (LLMs). It uses a Client-Server architecture, so Server components developed against the protocol can be reused across different systems.

UltraRAG 2.0 builds upon the MCP architecture, abstracting and encapsulating core RAG functions such as retrieval, generation, and evaluation into independent MCP Servers, invoked through standardized function-level Tool interfaces. This design keeps module functionality flexible to extend and allows new modules to be integrated in a “hot-plug” manner without invasive modifications to the global codebase.

Developing complex RAG reasoning frameworks is challenging. UltraRAG 2.0 can support the construction of complex systems with little code because its core natively supports multi-structure Pipeline process control. All control logic can be defined and scheduled at the YAML level, covering the forms of process expression that complex reasoning tasks require.

At runtime, the reasoning workflow is executed by the built-in Client, whose logic is described entirely by externally written Pipeline YAML scripts and is therefore decoupled from the underlying implementation. Developers can use instructions such as loop and step much like programming-language keywords, quickly constructing multi-stage reasoning processes in a declarative manner.
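
How a client might interpret a declarative pipeline can be sketched in a few lines. This is an illustration only, not UltraRAG's implementation or YAML schema: the pipeline is written as plain Python data standing in for a parsed YAML file, and the keys "loop", "times", "branch", "router", and "branches", as well as the toy tools, are assumptions.

```python
from typing import Any, Callable, Dict, List, Union

State = Dict[str, Any]
Step = Union[str, Dict[str, Any]]


# Hypothetical tools; in a real system these would be calls to Servers.
def retrieve(state: State) -> None:
    state["docs"].append(f"doc-{state['round']}")


def generate(state: State) -> None:
    state["round"] += 1
    state["answer"] = f"answer from {len(state['docs'])} docs"


def finish(state: State) -> None:
    state["done"] = True


def has_enough(state: State) -> str:
    """Router: returns the label of the branch to take."""
    return "yes" if len(state["docs"]) >= 3 else "no"


TOOLS: Dict[str, Callable[[State], None]] = {
    "retrieve": retrieve,
    "generate": generate,
    "finish": finish,
}
ROUTERS: Dict[str, Callable[[State], str]] = {"has_enough": has_enough}

# Plain data mirroring what a parsed pipeline file could look like.
PIPELINE: List[Step] = [
    {"loop": {"times": 3, "steps": ["retrieve", "generate"]}},
    {"branch": {
        "router": "has_enough",
        "branches": {"yes": ["finish"], "no": ["retrieve", "generate", "finish"]},
    }},
]


def run(steps: List[Step], state: State) -> State:
    """Execute steps in order, recursing into loop and branch blocks."""
    for step in steps:
        if isinstance(step, str):
            TOOLS[step](state)  # a single serial step
        elif "loop" in step:
            for _ in range(step["loop"]["times"]):
                run(step["loop"]["steps"], state)
        elif "branch" in step:
            spec = step["branch"]
            label = ROUTERS[spec["router"]](state)
            run(spec["branches"][label], state)
        else:
            raise ValueError(f"unknown step: {step!r}")
    return state
```

Calling run(PIPELINE, {"docs": [], "round": 0}) performs three retrieve/generate rounds and then takes the "yes" branch. Changing the workflow means editing the data in PIPELINE, not the interpreter, which is the decoupling the paragraph above describes.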

Installation Guide: Getting Started with UltraRAG 2.0

Environment Preparation

Create a virtual environment using Conda:

conda create -n ultrarag python=3.11
conda activate ultrarag

Clone the project to your local machine or server via git:

git clone https://github.com/OpenBMB/UltraRAG.git
cd UltraRAG

Dependency Installation

We recommend using uv for package management, as it provides faster and more reliable Python dependency management:

pip install uv
uv pip install -e .

If you prefer pip, run the following instead:

pip install -e .

Optional Dependency Installation

UR-2.0 supports rich Server components—you can flexibly install required dependencies based on actual tasks:

# If you need to use faiss for vector indexing:
# FAISS is not included in the extras below; install the CPU or GPU build that matches your hardware:
# CPU version:
uv pip install faiss-cpu
# GPU version (example: CUDA 12.x)
uv pip install faiss-gpu-cu12

# If you need to use infinity_emb for corpus encoding and indexing:
uv pip install -e ."[infinity_emb]"

# If you need to use lancedb vector database:
uv pip install -e ."[lancedb]"

# If you need to use vLLM service deployment models:
uv pip install -e ."[vllm]"

# If you need to use corpus document parsing functionality:
uv pip install -e ."[corpus]"

# Install all dependencies (except faiss)
uv pip install -e ."[all]"

Installation Verification

Run the following command to verify successful installation:

# A successful run displays the welcome message 'Hello, UltraRAG 2.0!'
ultrarag run examples/sayhello.yaml

Quick Start: Three Steps to Build Your First RAG System

Running a pipeline with UltraRAG 2.0 takes just three steps:

Compile the Pipeline file to generate its parameter configuration
Modify the parameter file
Run the Pipeline file

We provide complete tutorial examples from beginner to advanced levels. See the 👉tutorial documentation to get started quickly with UltraRAG 2.0.

Comprehensive Support: Datasets, Corpora, and Baseline Methods

UltraRAG 2.0 ships with built-in support for the most commonly used public evaluation datasets, large-scale corpora, and typical baseline methods in the RAG field, so researchers can quickly reproduce and extend experiments.

Supported Datasets

UltraRAG 2.0 supports various types of evaluation datasets, covering task types including QA, multi-hop QA, multiple-choice, long-form QA, fact verification, dialogue, and slot filling:

Task Type	Dataset Name	Original Sample Count	Evaluation Sample Count
QA	👉NQ	3,610	1,000
QA	👉TriviaQA	11,313	1,000
QA	👉PopQA	14,267	1,000
QA	👉AmbigQA	2,002	1,000
QA	👉MarcoQA	55,636	1,000
QA	👉WebQuestions	2,032	1,000
Multi-hop QA	👉HotpotQA	7,405	1,000
Multi-hop QA	👉2WikiMultiHopQA	12,576	1,000
Multi-hop QA	👉Musique	2,417	1,000
Multi-hop QA	👉Bamboogle	125	125
Multi-hop QA	👉StrategyQA	2,290	1,000
Multiple-choice	👉ARC	3,548	1,000
Multiple-choice	👉MMLU	14,042	1,000
Long-form QA	👉ASQA	948	948
Fact-verification	👉FEVER	13,332	1,000
Dialogue	👉WoW	3,054	1,000
Slot-filling	👉T-REx	5,000	1,000
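
Every row in the table is consistent with capping the evaluation set at 1,000 samples: datasets with fewer originals (Bamboogle with 125, ASQA with 948) are kept whole. The article does not say how the 1,000 samples are chosen, so the seeded random sampling below is an assumption, and evaluation_subset is a hypothetical helper name.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def evaluation_subset(examples: Sequence[T], cap: int = 1000, seed: int = 0) -> List[T]:
    """Return all examples if there are at most `cap`, else a seeded random sample of size `cap`."""
    if len(examples) <= cap:
        return list(examples)
    # A fixed seed keeps the subset identical across runs and across methods.
    return random.Random(seed).sample(list(examples), cap)
```

Fixing the seed matters for comparison: two baselines evaluated on different random subsets of the same dataset would not be directly comparable.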

Supported Corpora

Corpus Name	Document Count
👉wiki-2018	21,015,324
wiki-2024	In preparation; coming soon

The complete 👉dataset is available for download. You can also refer to the 👉data format description to customize and add your own datasets or corpora.

Supported Baseline Methods (Continuously Updated)

UltraRAG 2.0 continuously integrates the latest baseline methods, facilitating comparative experiments for researchers:

Baseline Name	Script
Vanilla LLM	examples/vanilla.yaml
Vanilla RAG	examples/rag.yaml
👉IRCoT	examples/IRCoT.yaml
👉IterRetGen	examples/IterRetGen.yaml
👉RankCoT	examples/RankCoT.yaml
👉R1-searcher	examples/r1_searcher.yaml
👉Search-o1	examples/search_o1.yaml
👉Search-r1	examples/search_r1.yaml
WebNote	examples/webnote.yaml

Frequently Asked Questions

Who is UltraRAG 2.0 Suitable For?

UltraRAG 2.0 is particularly suitable for the following user groups:

Researchers: those who want to quickly implement and validate new RAG algorithm ideas without getting bogged down in engineering details
Students and beginners: learners who want to study RAG technology but are intimidated by complex implementations
Engineers: developers who need to rapidly prototype and iterate on RAG applications

How Much Programming Experience Do I Need to Use UltraRAG 2.0?

UltraRAG 2.0 is designed to lower the barrier to entry. Even with only basic Python and YAML knowledge, you can quickly get started. Through its declarative approach, the framework lets you focus on pipeline design rather than implementation details.

What About UltraRAG 2.0’s Performance?

Thanks to the standardized MCP architecture, UltraRAG 2.0 maintains ease of use without sacrificing performance. The modular design allows each component to be independently optimized and supports distributed deployment to meet large-scale application requirements.

How Do I Extend UltraRAG 2.0’s Functionality?

Extending functionality is very simple: just implement a new MCP Server, register the corresponding Tool function, and then call it in the YAML configuration file. This hot-plug design makes functional expansion easy without affecting existing systems.

Is There Community Support?

Yes, UltraRAG has active community support. You can submit technical questions via 👉GitHub Issues, or join the 👉WeChat group, the 👉Feishu group, or 👉Discord to communicate with other users and developers.

Conclusion

UltraRAG 2.0 represents a new direction in RAG technology development—reducing the implementation barriers of complex systems through standardization and modularization. Whether you’re a researcher, student, or engineer, UltraRAG 2.0 can help you focus more on algorithm innovation and experimental design rather than engineering implementation details.

Start using UltraRAG 2.0 now and experience the convenience of building high-performance RAG systems with just dozens of lines of code! If you have any questions or suggestions, please feel free to contact us through our community channels.
