UltraRAG 2.0: Building High-Performance Retrieval-Augmented Generation Systems with Minimal Code
Implement complex reasoning pipelines like Search-o1 in dozens of lines of code, and focus on research innovation instead of engineering burden.
Building a retrieval-augmented generation (RAG) system often means wrestling with complex engineering. As RAG systems evolve from simple “retrieve + generate” approaches into complex knowledge systems that incorporate adaptive knowledge organization, multi-step reasoning, and dynamic retrieval, researchers face growing engineering challenges. Traditional methods require substantial code for workflow control, module integration, and experimental evaluation, which is both time-consuming and error-prone.
Now, there’s a new solution: UltraRAG 2.0.
What is UltraRAG 2.0?
UltraRAG 2.0 (UR-2.0) is a RAG framework based on the Model Context Protocol (MCP) architecture, jointly developed by Tsinghua University’s THUNLP Lab, Northeastern University’s NEUIR Lab, OpenBMB, and AI9stars. Its key idea is that complex logic, including serial processing, loops, and conditional branching, can be declared directly in YAML files, which significantly lowers the technical barrier and learning cost of building complex RAG systems.
Imagine being able to implement multi-round iterative retrieval and generation pipelines similar to Search-o1 with just dozens of lines of code. UltraRAG 2.0 makes this possible.
Core Highlights: Why Choose UltraRAG 2.0?
🚀 Low-Code Complex Pipeline Construction
UltraRAG 2.0 natively supports reasoning control structures including serial processing, loops, and conditional branching. You don’t need to write complex program logic—simply declare the execution flow through YAML files to build powerful iterative RAG systems.
⚡ Rapid Reproduction and Functional Expansion
Based on the MCP architecture, all modules are encapsulated as independent, reusable Servers:
- Customize Servers as needed or directly reuse existing modules
- Each Server’s functionality is registered through function-level Tool interfaces; adding new features only requires adding a single function
- Supports calling external MCP Servers, easily expanding pipeline capabilities and application scenarios
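To make the "one function per feature" idea concrete, here is a minimal pure-Python sketch of function-level tool registration. It is not UltraRAG's code and does not use the MCP SDK: the `ToolServer` class, its `tool` decorator, and the toy `search` and `count_hits` tools are all hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, List


class ToolServer:
    """Hypothetical stand-in for a Server: a named collection of tool functions."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tools: Dict[str, Callable[..., Any]] = {}

    def tool(self, func: Callable[..., Any]) -> Callable[..., Any]:
        """Decorator that registers a plain function under its own name."""
        self._tools[func.__name__] = func
        return func

    def call(self, tool_name: str, **kwargs: Any) -> Any:
        """Look up a registered tool by name and invoke it."""
        if tool_name not in self._tools:
            raise KeyError(f"{self.name} has no tool named {tool_name!r}")
        return self._tools[tool_name](**kwargs)

    def list_tools(self) -> List[str]:
        return sorted(self._tools)


retriever = ToolServer("retriever")

# A tiny in-memory corpus so the example runs without any external service.
CORPUS = [
    "UltraRAG 2.0 is built on MCP.",
    "FAISS is a vector index library.",
    "MCP uses a client-server design.",
]


@retriever.tool
def search(query: str, top_k: int = 2) -> List[str]:
    """Toy keyword retriever: rank documents by how many query words they contain."""
    words = query.lower().split()
    scored = [(sum(w in doc.lower() for w in words), doc) for doc in CORPUS]
    ranked = [doc for score, doc in sorted(scored, key=lambda p: -p[0]) if score > 0]
    return ranked[:top_k]


# Adding a feature is one more decorated function; existing tools are untouched.
@retriever.tool
def count_hits(query: str) -> int:
    """Number of corpus documents matching at least one query word."""
    return len(search(query, top_k=len(CORPUS)))


if __name__ == "__main__":
    print(retriever.list_tools())
    print(retriever.call("search", query="mcp"))
    print(retriever.call("count_hits", query="mcp"))
```

The caller only needs a tool name and keyword arguments, which is what lets a client drive many Servers through one uniform interface.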
📊 Unified Evaluation and Comparison
Built-in standardized evaluation processes and metric management, out-of-the-box support for 17 mainstream research benchmarks:
- Continuous integration of the latest baselines
- Provides Leaderboard results
- Facilitates systematic comparison and optimization experiments
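As an illustration of what a standardized evaluation step computes, the sketch below implements two metrics widely used for QA benchmarks, exact match and token-level F1, with SQuAD-style answer normalization. This is an assumption about the kind of metric involved, not UltraRAG's own implementation; the function names are hypothetical.

```python
import re
import string
from collections import Counter
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, golds: List[str]) -> float:
    """1.0 if the normalized prediction equals any normalized gold answer."""
    return float(any(normalize(prediction) == normalize(g) for g in golds))


def token_f1(prediction: str, golds: List[str]) -> float:
    """Best token-overlap F1 between the prediction and any gold answer."""

    def f1(pred: str, gold: str) -> float:
        p, g = normalize(pred).split(), normalize(gold).split()
        common = sum((Counter(p) & Counter(g)).values())
        if common == 0:
            return 0.0
        precision, recall = common / len(p), common / len(g)
        return 2 * precision * recall / (precision + recall)

    return max(f1(prediction, g) for g in golds)


def evaluate(predictions: List[str], references: List[List[str]]) -> Dict[str, float]:
    """Average both metrics over a dataset of (prediction, gold answers) pairs."""
    n = len(predictions)
    pairs = list(zip(predictions, references))
    return {
        "em": sum(exact_match(p, r) for p, r in pairs) / n,
        "f1": sum(token_f1(p, r) for p, r in pairs) / n,
    }


if __name__ == "__main__":
    print(evaluate(["The Eiffel Tower.", "Berlin"], [["Eiffel Tower"], ["Rome"]]))
```

Keeping normalization in a single function means every baseline is scored under the same rules, which is the point of a unified evaluation process.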
Technical Principles: MCP Architecture and Native Process Control
Across different RAG systems, while core functions like retrieval and generation are similar, implementation strategies vary significantly, often resulting in modules lacking unified interfaces that are difficult to reuse across projects. The 👉Model Context Protocol (MCP), as an open protocol, standardizes how to provide context for large language models (LLMs) using a Client-Server architecture, enabling Server components developed according to this protocol to be seamlessly reused across different systems.
UltraRAG 2.0 builds upon the MCP architecture, abstracting and encapsulating core RAG functions such as retrieval, generation, and evaluation into independent MCP Servers, invoked through standardized function-level Tool interfaces. This design keeps module functionality flexible to extend and allows new modules to be integrated in a “hot-plug” manner without invasive modifications to the global codebase.

Developing complex RAG reasoning frameworks is challenging. UltraRAG 2.0 can support complex system construction with little code because it natively supports multi-structure Pipeline process control at its core. All control logic can be defined and scheduled at the YAML level, covering the forms of process expression that complex reasoning tasks require.
During actual operation, the reasoning workflow is executed by the built-in Client, with its logic entirely described by externally written Pipeline YAML scripts, achieving decoupling from the underlying implementation. Developers can invoke instructions like loop and step as if using programming language keywords, quickly constructing multi-stage reasoning processes in a declarative manner.
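The idea of a client interpreting a declarative pipeline can be sketched in a few lines of Python. The nested list below stands in for what a YAML parser would produce. The schema (`loop` with `times` and `steps`, `branch` with `router` and `branches`), the `done` flag, and the tool names are all hypothetical and are not UltraRAG's actual YAML format.

```python
from typing import Any, Callable, Dict, List, Union

State = Dict[str, Any]
Step = Union[str, Dict[str, Any]]


def run_pipeline(steps: List[Step], tools: Dict[str, Callable[[State], Any]], state: State) -> State:
    """Interpret a declarative pipeline: plain steps, bounded loops, and branches."""
    for step in steps:
        if isinstance(step, str):
            # Serial step: call the named tool on the shared state.
            tools[step](state)
        elif "loop" in step:
            spec = step["loop"]
            for _ in range(spec["times"]):
                run_pipeline(spec["steps"], tools, state)
                if state.get("done"):  # a tool may request an early exit
                    break
        elif "branch" in step:
            spec = step["branch"]
            label = tools[spec["router"]](state)  # the router returns a branch label
            run_pipeline(spec["branches"].get(label, []), tools, state)
        else:
            raise ValueError(f"unknown step: {step!r}")
    return state


# Toy tools standing in for retrieval and generation servers.
def retrieve(state: State) -> None:
    state["docs"].append(f"doc-{len(state['docs']) + 1}")


def check(state: State) -> str:
    return "enough" if len(state["docs"]) >= 2 else "more"


def stop(state: State) -> None:
    state["done"] = True


def generate(state: State) -> None:
    state["answer"] = f"answer built from {len(state['docs'])} docs"


TOOLS = {"retrieve": retrieve, "check": check, "stop": stop, "generate": generate}

# Retrieve up to 5 times, stop as soon as the router says "enough", then generate.
PIPELINE: List[Step] = [
    {"loop": {"times": 5, "steps": [
        "retrieve",
        {"branch": {"router": "check", "branches": {"enough": ["stop"], "more": []}}},
    ]}},
    "generate",
]

if __name__ == "__main__":
    final = run_pipeline(PIPELINE, TOOLS, {"question": "example", "docs": []})
    print(final["answer"])
```

Because the control flow lives in data rather than code, changing the number of rounds or swapping the stopping rule only requires editing the pipeline description, not the interpreter.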
Installation Guide: Getting Started with UltraRAG 2.0
Environment Preparation
Create a virtual environment using Conda:
conda create -n ultrarag python=3.11
conda activate ultrarag
Clone the project to your local machine or server via git:
git clone https://github.com/OpenBMB/UltraRAG.git
cd UltraRAG
Dependency Installation
We recommend uv for package management; it provides a faster and more reliable Python dependency management experience:
pip install uv
uv pip install -e .
If you prefer pip, you can also run directly:
pip install -e .
Optional Dependency Installation
UR-2.0 supports rich Server components—you can flexibly install required dependencies based on actual tasks:
# If you need to use faiss for vector indexing:
# Manually install the CPU or GPU build of FAISS that matches your hardware environment:
# CPU version:
uv pip install faiss-cpu
# GPU version (example: CUDA 12.x)
uv pip install faiss-gpu-cu12
# If you need to use infinity_emb for corpus encoding and indexing:
uv pip install -e ."[infinity_emb]"
# If you need to use lancedb vector database:
uv pip install -e ."[lancedb]"
# If you need to use vLLM service deployment models:
uv pip install -e ."[vllm]"
# If you need to use corpus document parsing functionality:
uv pip install -e ."[corpus]"
# Install all dependencies (except faiss)
uv pip install -e ."[all]"
Verifying the Installation
Run the following command to verify successful installation:
# A successful run displays the 'Hello, UltraRAG 2.0!' welcome message
ultrarag run examples/sayhello.yaml
Quick Start: Three Steps to Build Your First RAG System
Using UltraRAG 2.0 is extremely simple, requiring just three steps:
- Compile Pipeline files to generate parameter configuration
- Modify parameter files
- Run Pipeline files
We provide complete tutorial examples from beginner to advanced levels. See the 👉tutorial documentation to get started quickly with UltraRAG 2.0.
Common Function Directory
Here’s a quick navigation to commonly used functions in research:
- 👉Using retrievers for corpus encoding and indexing
- 👉Deploying retrievers
- 👉Deploying LLMs
- 👉Baseline reproduction
- 👉Experimental result case analysis
- 👉Debugging tutorial
Comprehensive Support: Datasets, Corpora, and Baseline Methods
UltraRAG 2.0 works out-of-the-box with built-in support for the most commonly used public evaluation datasets, large-scale corpora, and typical baseline methods in the current RAG field, facilitating rapid reproduction and expansion of experiments for researchers.
Supported Datasets
UltraRAG 2.0 supports various types of evaluation datasets, covering task types including QA, multi-hop QA, multiple-choice, long-form QA, fact verification, dialogue, and slot filling:
Task Type | Dataset Name | Original Data Quantity | Evaluation Sample Quantity |
---|---|---|---|
QA | 👉NQ | 3,610 | 1,000 |
QA | 👉TriviaQA | 11,313 | 1,000 |
QA | 👉PopQA | 14,267 | 1,000 |
QA | 👉AmbigQA | 2,002 | 1,000 |
QA | 👉MarcoQA | 55,636 | 1,000 |
QA | 👉WebQuestions | 2,032 | 1,000 |
Multi-hop QA | 👉HotpotQA | 7,405 | 1,000 |
Multi-hop QA | 👉2WikiMultiHopQA | 12,576 | 1,000 |
Multi-hop QA | 👉Musique | 2,417 | 1,000 |
Multi-hop QA | 👉Bamboogle | 125 | 125 |
Multi-hop QA | 👉StrategyQA | 2,290 | 1,000 |
Multiple-choice | 👉ARC | 3,548 | 1,000 |
Multiple-choice | 👉MMLU | 14,042 | 1,000 |
Long-form QA | 👉ASQA | 948 | 948 |
Fact-verification | 👉FEVER | 13,332 | 1,000 |
Dialogue | 👉WoW | 3,054 | 1,000 |
Slot-filling | 👉T-REx | 5,000 | 1,000 |
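In the table above, every evaluation set is capped at 1,000 samples, and smaller datasets such as Bamboogle and ASQA are used in full. The source does not say how the samples were selected. The sketch below shows one common way to build such a capped subset reproducibly, assuming uniform random sampling with a fixed seed; the `subsample` function and its parameters are hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def subsample(records: Sequence[T], max_samples: int = 1000, seed: int = 0) -> List[T]:
    """Return at most max_samples records, chosen reproducibly.

    Datasets at or below the cap are returned unchanged so that small
    benchmarks are evaluated in full.
    """
    if len(records) <= max_samples:
        return list(records)
    rng = random.Random(seed)  # local generator: does not touch global random state
    return rng.sample(list(records), max_samples)


if __name__ == "__main__":
    nq_like = list(range(3610))        # same size as the NQ row in the table
    bamboogle_like = list(range(125))  # same size as the Bamboogle row
    print(len(subsample(nq_like)), len(subsample(bamboogle_like)))
```

Fixing the seed keeps the evaluation subset identical across runs, so scores from different baselines remain comparable.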
Supported Corpora
Corpus Name | Document Count |
---|---|
👉wiki-2018 | 21,015,324 |
wiki-2024 | Being organized, coming soon |
The complete 👉dataset is available for download. You can also refer to the 👉data format description to customize and add any dataset or corpus.
Supported Baseline Methods (Continuously Updated)
UltraRAG 2.0 continuously integrates the latest baseline methods, facilitating comparative experiments for researchers:
Baseline Name | Script |
---|---|
Vanilla LLM | examples/vanilla.yaml |
Vanilla RAG | examples/rag.yaml |
👉IRCoT | examples/IRCoT.yaml |
👉IterRetGen | examples/IterRetGen.yaml |
👉RankCoT | examples/RankCoT.yaml |
👉R1-searcher | examples/r1_searcher.yaml |
👉Search-o1 | examples/search_o1.yaml |
👉Search-r1 | examples/search_r1.yaml |
WebNote | examples/webnote.yaml |
Frequently Asked Questions
Who is UltraRAG 2.0 Suitable For?
UltraRAG 2.0 is particularly suitable for the following user groups:
- Researchers: those who want to quickly implement and validate new RAG algorithm ideas without getting bogged down in engineering details
- Students and beginners: learners who want to study RAG technology but are intimidated by complex implementations
- Engineers: developers who need to rapidly prototype and iterate on RAG applications
How Much Programming Experience Do I Need to Use UltraRAG 2.0?
UltraRAG 2.0 is designed to lower the barrier to entry. Even with only basic Python and YAML knowledge, you can get started quickly. Its declarative approach lets you focus on process design rather than implementation details.
What About UltraRAG 2.0’s Performance?
Thanks to the standardized MCP architecture, UltraRAG 2.0 maintains ease of use without sacrificing performance. The modular design allows each component to be independently optimized and supports distributed deployment to meet large-scale application requirements.
How Do I Extend UltraRAG 2.0’s Functionality?
Extending functionality is very simple: just implement a new MCP Server, register the corresponding Tool function, and then call it in the YAML configuration file. This hot-plug design makes functional expansion easy without affecting existing systems.
Is There Community Support?
Yes, UltraRAG has active community support. You can submit technical questions via 👉GitHub Issues, or join the 👉WeChat group, 👉Feishu group, or 👉Discord to communicate with other users and developers.
Conclusion
UltraRAG 2.0 represents a new direction in RAG technology development—reducing the implementation barriers of complex systems through standardization and modularization. Whether you’re a researcher, student, or engineer, UltraRAG 2.0 can help you focus more on algorithm innovation and experimental design rather than engineering implementation details.
Start using UltraRAG 2.0 now and experience the convenience of building high-performance RAG systems with just dozens of lines of code! If you have any questions or suggestions, please feel free to contact us through our community channels.