Intelligent Search & Deep Research: Building a Local AI-Powered Efficient Data Collection Platform
In an age of information overload, merely listing dozens of web links no longer suffices for true research. DeepRearch is a Python-based project combining AI-driven retrieval and multi-model collaboration to help you sift valuable insights from massive datasets—and its transparent, visual pipeline ensures full control over the research process.
“Prioritizing search quality beats mindlessly stacking hundreds of pages.”
Table of Contents
- Core Principles
- Key Features
- System Architecture Overview
- External Service Integration
- Deep Research Mode
- Getting Started: Environment Setup
- Configuration Details
- API Usage Examples
- Python Dependencies
- Demonstration of Results
- Known Issues & Solutions
- Roadmap & How to Contribute
Core Principles
Traditional search engines focus on quantity—returning massive lists of URLs that users must manually filter. DeepRearch flips this approach on its head by:
- **Quality First**: AI models evaluate each webpage’s value, selecting only high-relevance, high-utility results.
- **Transparent Workflow**: every step, from keyword generation to final summary, is visualized in real time, giving you clear insight into AI decision-making.
- **Multi-Model Collaboration**: dedicated models handle specific tasks (keyword planning, result evaluation, content compression, extraction, and summary), ensuring each phase benefits from specialized AI expertise.
This three-pronged strategy boosts efficiency and drives deeper, more accurate research outcomes.
Key Features
1. Fully Local Deployment
All service modules—excluding external large-model APIs—run locally.
- **Security & Control**: sensitive data and search flows remain within your local network.
- **Customizability**: tailor or extend functionality to fit your organization’s needs.
2. Visualized Research Pipeline
From initial planning through dynamic search, evaluation, and iterative refinement, the entire process is rendered in an interactive, step-by-step view.
Users can instantly observe the AI’s:

- Task decomposition
- Search strategy adjustments
- Selection of top results
This transparency fosters trust and helps pinpoint bottlenecks quickly.
3. OpenAI-Compatible API Service
Built on Flask, DeepRearch provides standard `/v1/chat/completions` and `/v1/models` endpoints, directly compatible with most LLM clients.
- **Streaming Responses**: partial results stream back in real time to enhance interactivity.
- **Smart Mode Switching**: automatically selects “standard search” or “deep research” mode based on request content.
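The mode switch can be illustrated with a small heuristic. The function name and trigger phrases below are illustrative assumptions, not the project's actual routing code, but they mirror the behavior described above and the advice given later in Known Issues & Solutions:

```python
def select_mode(user_message: str) -> str:
    """Pick a research mode from the request content.

    Hypothetical heuristic: phrases signalling a request for depth
    route to deep research; everything else uses standard search.
    """
    triggers = ("deep research", "detailed analysis", "deeply analyze", "in-depth")
    text = user_message.lower()
    return "deep-research" if any(t in text for t in triggers) else "standard-search"
```

A request such as "Please deeply analyze ..." would thus route to deep research, while a plain factual question stays in standard search.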
4. Deep Research Mode
The `deep-research` mode iterates through multiple rounds of search, evaluation, extraction, and planning, making it ideal for tackling complex topics. Detailed mechanics follow in the next section.
5. Flexible Search Engine & Crawler Integration
Supports SearXNG and Tavily search APIs as well as FireCrawl and Crawl4AI web crawlers. Auto-switching between services ensures high availability and robust data collection.
6. Intelligent Content Compression
AI-driven compression prunes out redundant content from fetched pages, boosting processing efficiency and context density.
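As a rough illustration of this step, a compression call boils down to clipping the fetched page to a character budget and prompting the compression model to drop irrelevant text. The prompt wording and function below are hypothetical, not the project's actual prompt:

```python
def build_compress_prompt(page_text: str, query: str, max_chars: int = 4000) -> str:
    """Build the instruction sent to the compression model (illustrative)."""
    clipped = page_text[:max_chars]  # hard budget applied before the model call
    return (
        "Remove everything irrelevant to the question below. "
        "Keep facts, numbers, commands, and code.\n"
        f"Question: {query}\n"
        f"Page content:\n{clipped}"
    )
```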
7. Seamless Multi-Model Orchestration
- **Base Chat Model**: handles user interaction and tool coordination.
- **Keyword Planning Model**: breaks down user queries into optimized search terms.
- **Evaluation Model**: scores each page for relevance and value.
- **Compression & Extraction Models**: distill core insights.
- **Summary Model**: synthesizes findings into cohesive conclusions.
This pipeline maximizes each model’s strengths for superior research reports.
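Wiring each stage to its own model can be as simple as a routing table. The sketch below reads the environment variables from the configuration table later in this article; the dictionary keys and helper are illustrative, not the project's internals:

```python
import os

# Stage-to-model routing table. The environment variable names come from the
# project's configuration table; the fallback defaults are its example values.
MODEL_ROUTING = {
    "chat": os.getenv("BASE_CHAT_MODEL", "gpt-4o"),
    "keywords": os.getenv("SEARCH_KEYWORD_MODEL", "gpt-4o-search"),
    "evaluate": os.getenv("EVALUATE_MODEL", "gpt-4o-eval"),
    "compress": os.getenv("COMPRESS_MODEL", "gpt-4o-compress"),
    "summary": os.getenv("SUMMARY_MODEL", "gpt-4o-summary"),
}

def model_for(stage: str) -> str:
    """Return the model configured for a given pipeline stage."""
    return MODEL_ROUTING[stage]
```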
System Architecture Overview
Below is a high-level view of DeepRearch’s components and data flow:
- **User Request Layer**: receives queries and routes them to the appropriate research mode.
- **Search Engine Module**: interfaces with SearXNG/Tavily to fetch raw search results.
- **Crawler Services**: uses FireCrawl/Crawl4AI to retrieve full webpage content.
- **Model Orchestration Layer**: calls specialized AI models for keyword generation, evaluation, compression, extraction, and summarization.
- **Output Layer**: delivers structured JSON or streamed responses back to the client.
This horizontally scalable architecture lets you spin up multiple instances to handle high loads.
External Service Integration
DeepRearch offers two interchangeable service stacks to maximize flexibility and fault tolerance.
Search Engine APIs
- **SearXNG**
  - Self-hostable via Docker, or use available public instances.
  - JSON output simplifies parsing.
- **Tavily**
  - Commercial API requiring a `TAVILY_KEY`.
  - Allows advanced sorting strategies.
By default, the system attempts SearXNG first, falling back to Tavily on failure.
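The failover behavior reduces to a small wrapper. In this sketch the engines are passed in as callables so the strategy stays engine-agnostic; the function and parameter names are illustrative, not the project's API:

```python
def search_with_fallback(query, primary, fallback):
    """Try the primary engine (e.g. SearXNG) first; on any failure,
    switch to the fallback (e.g. Tavily).

    Both callables take a query string and return a list of result dicts.
    """
    try:
        return primary(query)
    except Exception:
        # Primary unreachable or errored out: degrade gracefully.
        return fallback(query)
```

The same wrapper works for the crawler pair (FireCrawl primary, Crawl4AI backup) described below.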
Web Crawlers
- **FireCrawl**
  - High-performance, self-hostable API.
  - Ideal for concurrent fetching.
- **Crawl4AI**
  - Docker-compatible backup crawler.
A priority strategy ensures continuous operation even if the primary service goes down.
Deep Research Mode
Deep Research Mode isn’t a single search—it’s an iterative exploration framework:
1. **Initial Query Planning**: the Keyword Planning Model breaks down the user’s question into multiple high-impact search terms.
2. **Multi-Round Search & Evaluation**: each round:
   - Fetches the top N pages via the search engine and crawler.
   - Scores each page’s relevance and value through the Evaluation Model.
   - Selects the top candidates for further analysis.
3. **Content Compression & Extraction**:
   - The Compression Model eliminates noise.
   - The Extraction Model pulls out core insights.
4. **Dynamic Planning for Next Steps**: based on the extracted data, the system refines its search strategy, adding or tweaking keywords.
5. **Final Summarization**: the Summary Model weaves the extracted points into a coherent, in-depth report.
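The iteration above can be sketched as a plain loop. All model calls are stubbed out as injected callables, and the defaults mirror the recommended settings (`MAX_STEPS_NUM=12`, `MAX_DEEPRESEARCH_RESULTS=3`); the names, signatures, and 0.5 score threshold are illustrative assumptions, not the project's API:

```python
def deep_research(question, plan, search, evaluate, extract, summarize,
                  max_steps=12, per_round=3):
    """Iterative research loop: plan -> search -> evaluate -> extract,
    re-planning each round until the planner returns no new keywords."""
    notes = []
    keywords = plan(question, notes)
    for _ in range(max_steps):
        pages = search(keywords)[:per_round]            # top-N pages this round
        best = [p for p in pages if evaluate(question, p) >= 0.5]
        notes.extend(extract(p) for p in best)          # keep only core insights
        keywords = plan(question, notes)                # dynamic re-planning
        if not keywords:                                # nothing left to explore
            break
    return summarize(question, notes)
```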
Recommended settings:

- `MAX_DEEPRESEARCH_RESULTS`: pages per round (default: 3)
- `MAX_STEPS_NUM`: maximum iterations (default: 12)
Use this mode for deep dives into complex subjects—market analysis, technical whitepapers, competitive intelligence—without drowning in noise.
Getting Started: Environment Setup
1. Operating System & Python
- OS: Linux / macOS / Windows
- Python: v3.8+
- Virtual environment: `venv` or `conda` recommended
```bash
# Clone the repo
git clone https://your.repo/DeepRearch.git
cd DeepRearch

# Create & activate a virtual environment
python -m venv venv
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows

# Install dependencies
pip install -r requirements.txt
```
2. Environment Variables
Copy the template and configure `.env` with your keys and URLs:

```bash
cp .env.template .env
```
| Variable | Purpose | Example |
|---|---|---|
| `API_KEY` | Project API authorization key | `your_project_secret` |
| `SEARXNG_URL` | SearXNG instance endpoint | `http://localhost:8080` |
| `TAVILY_KEY` | Tavily service API key | `your_tavily_key` |
| `FIRECRAWL_API_URL` | FireCrawl endpoint | `https://api.firecrawl.dev` |
| `FIRECRAWL_API_KEY` | FireCrawl API key | `your_firecrawl_key` |
| `CRAWL4AI_API_URL` | Crawl4AI endpoint | `https://api.crawl4ai.com` |
| `BASE_CHAT_MODEL` | Base chat model name | `gpt-4o` |
| `SEARCH_KEYWORD_MODEL` | Keyword planning model | `gpt-4o-search` |
| `EVALUATE_MODEL` | Page evaluation model | `gpt-4o-eval` |
| `COMPRESS_MODEL` | Content compression model | `gpt-4o-compress` |
| `SUMMARY_MODEL` | Summary generation model | `gpt-4o-summary` |
| `MAX_SEARCH_RESULTS` | Pages per standard search | `10` |
| `MAX_DEEPRESEARCH_RESULTS` | Pages per deep research round | `3` |
| `MAX_STEPS_NUM` | Deep research max iterations | `12` |
Once configured, test and launch:
```bash
# Test all APIs
python main.py --test

# Start the service
python main.py
```

By default, DeepRearch listens on `http://0.0.0.0:5000`.
API Usage Examples
Below are sample calls using any OpenAI-compatible client.
Standard Search Mode
```bash
curl -X POST http://localhost:5000/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role":"user","content":"How to enable QEMU KVM acceleration for RK3399 on Linux?"}]
  }'
```
Response streams back keyword plans, evaluation scores, and a concise answer.
Deep Research Mode
```bash
curl -X POST http://localhost:5000/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-deep-research",
    "messages": [{"role":"user","content":"Please deeply analyze how to use QEMU KVM acceleration for RK3399 on Linux."}]
  }'
```
Triggers multi-round iterations, finally outputting:
- Keyword breakdown
- Top pages per round
- Core command examples & configuration steps
- Comprehensive conclusions & caveats
Python Dependencies
All essential libraries are declared in `requirements.txt`:

- Flask – lightweight web framework
- openai – OpenAI Python SDK
- requests – HTTP client
- beautifulsoup4 – HTML/XML parser
- PyMuPDF, python-docx, openpyxl – document format handlers
- python-dotenv – environment variable loader
Install them via:
```bash
pip install Flask openai requests beautifulsoup4 PyMuPDF python-docx openpyxl python-dotenv
```
Demonstration of Results
Here’s how DeepRearch shines in real-world scenarios.
Multi-Model Parameter Table
User Prompt:
“Fetch parameters and official API pricing for Gemini, Claude, DeepSeek, GLM, Qwen, and present them in a table.”
Sample Output:
Niche Technical Query Resolution
User Prompt:
“How to leverage QEMU KVM acceleration for RK3399 on Linux?”
Key Takeaways:
- Use `taskset` to bind processes to the big.LITTLE cores
- Load the KVM modules & configure permissions
- Analyze performance benchmarks and considerations
Known Issues & Solutions
While DeepRearch is stable, certain edge cases exist:
- **Premature End of Research**
  - Cause: the initial prompt lacks “deep” or “detailed” keywords.
  - Fix: include phrases like “deep research” or “detailed analysis” in your request.
- **Client Timeouts**
  - Cause: excessive deep research rounds or long fetch times.
  - Fix:
    - Reduce `MAX_STEPS_NUM` (≤ 8).
    - Limit crawler concurrency via `CRAWL_THREAD_NUM`.
- **Third-Party Service Downtime**
  - Cause: the SearXNG or FireCrawl instance is offline.
  - Fix: verify your `.env` settings; the redundant services (Tavily/Crawl4AI) auto-activate.
Roadmap & How to Contribute
🛠️ Roadmap
- **Asynchronous Orchestration**: introduce async scheduling to boost throughput for large tasks.
- **Plugin Architecture**: add support for more external services (image search, academic APIs).
- **Intelligent Q&A**: integrate knowledge graphs for richer contextual answers.
- **Visualization Dashboard**: real-time monitoring of research tasks and model calls.
🤝 Contributing
1. Fork the repo & create a branch: `feature/your-feature`.
2. Submit a Pull Request describing your enhancements or bug fixes.
3. Open an Issue to discuss ideas or report problems.
We welcome all contributors passionate about AI-driven research!