In the digital age, the ability to conduct in-depth research quickly and accurately is crucial. The Shandu Deep Research System combines multiple search engines with LangChain integration to support this kind of research. This article explores the key features, components, and usage scenarios of the Shandu system.
1. Overview of the Shandu Deep Research System
The Shandu Deep Research System is designed to help users perform complex web searches and in-depth analysis. It is built around a unified searcher that can query multiple search engines, including Google, DuckDuckGo, Bing, and Wikipedia. Its caching mechanism avoids repeating identical searches, which reduces search time for repeated queries.
Key Features
- Multi-engine Search: The UnifiedSearcher class allows users to search across multiple search engines simultaneously, providing a comprehensive set of results.
- AI-enhanced Analysis: The AISearcher class combines search results with AI analysis, offering detailed summaries and source citations.
- Structured Output: Various structured output models, such as SearchQueries, SearchResultAnalysis, and ContentAnalysis, help in organizing and presenting research findings in a clear and concise manner.
2. Components of the Shandu System
2.1 Search Module
The search module is the core of the Shandu system. It includes the UnifiedSearcher and SearchResult classes.
UnifiedSearcher
The UnifiedSearcher class is responsible for performing searches across multiple engines. It supports caching to improve performance and can be configured to limit the number of results per engine.
class UnifiedSearcher:
    def __init__(self, max_results: int = 10, cache_enabled: bool = True, cache_ttl: int = 86400):
        # Initialize the unified searcher
        pass

    async def search(self, query: str, engines: Optional[List[str]] = None, force_refresh: bool = False) -> List[SearchResult]:
        # Search for a query using multiple engines
        pass
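
As a rough illustration of how a cached multi-engine search could behave, the sketch below fans a query out to several engines with asyncio.gather and keeps the merged results in an in-memory cache with a time-to-live. The CachedSearcher class, the fake_google and fake_wikipedia functions, and the cache layout are all assumptions made for this example; they are not Shandu's actual implementation.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, List, Optional, Tuple

# Hypothetical engine type: takes a query, returns a list of result URLs.
Engine = Callable[[str], Awaitable[List[str]]]


async def fake_google(query: str) -> List[str]:
    # Stand-in for a real engine call; returns a canned URL.
    return [f"https://example.com/google/{query}"]


async def fake_wikipedia(query: str) -> List[str]:
    # Stand-in for a real engine call; returns a canned URL.
    return [f"https://example.com/wiki/{query}"]


class CachedSearcher:
    """Illustrative only: fan out to engines and cache merged results with a TTL."""

    def __init__(self, engines: Dict[str, Engine], cache_ttl: int = 86400):
        self.engines = engines
        self.cache_ttl = cache_ttl
        # Maps (query, engine names) -> (timestamp, results).
        self._cache: Dict[Tuple[str, Tuple[str, ...]], Tuple[float, List[str]]] = {}
        self.engine_calls = 0  # counts engine invocations, for demonstration

    async def _call(self, name: str, query: str) -> List[str]:
        self.engine_calls += 1
        return await self.engines[name](query)

    async def search(
        self,
        query: str,
        engines: Optional[List[str]] = None,
        force_refresh: bool = False,
    ) -> List[str]:
        names = tuple(sorted(engines or self.engines))
        key = (query, names)
        cached = self._cache.get(key)
        # Serve from cache unless the entry expired or a refresh was forced.
        if cached and not force_refresh and time.monotonic() - cached[0] < self.cache_ttl:
            return cached[1]
        # Query all selected engines concurrently and flatten their results.
        batches = await asyncio.gather(*(self._call(n, query) for n in names))
        results = [url for batch in batches for url in batch]
        self._cache[key] = (time.monotonic(), results)
        return results
```

In this sketch a second call with the same query and engine set is answered from the cache without touching any engine, while force_refresh=True bypasses the cache and overwrites the stored entry.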
SearchResult
The SearchResult class stores the information about each search result, including the URL, title, snippet, and source.
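from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class SearchResult:
    url: str
    title: str
    snippet: str
    source: str

    def __str__(self) -> str:
        # Human-readable, multi-line representation of the result
        return f"Title: {self.title}\nURL: {self.url}\nSnippet: {self.snippet}\nSource: {self.source}"

    def to_dict(self) -> Dict[str, Any]:
        # Plain-dict form, convenient for serialization
        return {
            "url": self.url,
            "title": self.title,
            "snippet": self.snippet,
            "source": self.source
        }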
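
When several engines are queried, the same page can come back more than once. The following sketch shows one way results with these four fields might be deduplicated by URL and serialized to JSON. The dedupe_by_url helper and the sample data are hypothetical, and the dataclass is redefined here only so the example runs on its own.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class SearchResult:
    # Same four fields as the class above, redefined so the example runs alone.
    url: str
    title: str
    snippet: str
    source: str


def dedupe_by_url(results: List[SearchResult]) -> List[SearchResult]:
    """Keep the first result seen for each URL (hypothetical helper)."""
    seen = set()
    unique = []
    for result in results:
        if result.url not in seen:
            seen.add(result.url)
            unique.append(result)
    return unique


results = [
    SearchResult("https://example.com/a", "A", "First hit", "google"),
    SearchResult("https://example.com/a", "A (dup)", "Same page", "bing"),
    SearchResult("https://example.com/b", "B", "Second hit", "wikipedia"),
]
unique = dedupe_by_url(results)

# asdict() produces the same keys as the to_dict() method shown above.
payload = json.dumps([asdict(r) for r in unique], indent=2)
print(payload)
```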
2.2 AI Searcher
The AISearcher class enhances the search results with AI analysis. It can generate detailed summaries, track citations, and extract learnings from the sources.
class AISearcher:
    def __init__(
        self,
        llm: Optional[ChatOpenAI] = None,
        searcher: Optional[UnifiedSearcher] = None,
        scraper: Optional[WebScraper] = None,
        citation_manager: Optional[CitationManager] = None,
        max_results: int = 10,
        max_pages_to_scrape: int = 3
    ):
        # Initialize the AI searcher
        pass

    async def search(
        self,
        query: str,
        engines: Optional[List[str]] = None,
        detailed: bool = False,
        enable_scraping: bool = True,
        use_ddg_tools: bool = True
    ) -> AISearchResult:
        # Perform AI-enhanced search
        pass
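
To make the idea of citation tracking more concrete, here is a minimal sketch of how a summary could refer to its sources by stable numbers. SimpleCitationTracker is a hypothetical stand-in written for this article; it does not reproduce the interface of Shandu's CitationManager, which is not shown here.

```python
from typing import Dict, List


class SimpleCitationTracker:
    """Hypothetical stand-in for a citation manager: one stable number per URL."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def cite(self, url: str) -> str:
        # Reuse the existing number if this URL was cited before.
        if url not in self._ids:
            self._ids[url] = len(self._ids) + 1
        return f"[{self._ids[url]}]"

    def references(self) -> List[str]:
        # Dicts preserve insertion order, so entries come out in citation order.
        return [f"[{n}] {url}" for url, n in self._ids.items()]


tracker = SimpleCitationTracker()
summary = (
    f"Claim one {tracker.cite('https://example.com/a')}. "
    f"Claim two {tracker.cite('https://example.com/b')}. "
    f"Claim one again {tracker.cite('https://example.com/a')}."
)
print(summary)
print("\n".join(tracker.references()))
```

Citing the same URL twice yields the same marker, so the reference list stays free of duplicates.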
2.3 Content Processing and Report Generation
The content processing and report generation modules help in analyzing the search results and generating professional reports.
ContentProcessor
The ContentProcessor class provides functions for checking URL relevance, processing scraped items, and analyzing content.
async def is_relevant_url(llm: ChatOpenAI, url: str, title: str, snippet: str, query: str) -> bool:
    # Check if a URL is relevant to the query
    pass

async def process_scraped_item(llm: ChatOpenAI, item: ScrapedContent, subquery: str, main_content: str) -> Dict[str, Any]:
    # Process a scraped item to evaluate reliability and extract content
    pass

async def analyze_content(llm: ChatOpenAI, subquery: str, content_text: str) -> str:
    # Analyze content from multiple sources and synthesize the information
    pass
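
The relevance check is asynchronous, so many candidate URLs can be judged concurrently before any scraping happens. The sketch below illustrates that pattern only: it replaces the LLM judgment with a simple keyword match, and both is_relevant_stub and filter_relevant are hypothetical names that do not appear in Shandu.

```python
import asyncio
from typing import List, Tuple


async def is_relevant_stub(url: str, title: str, snippet: str, query: str) -> bool:
    # Placeholder for an LLM judgment: relevant if any query word occurs
    # as a substring of the title or snippet (case-insensitive).
    text = f"{title} {snippet}".lower()
    return any(word in text for word in query.lower().split())


async def filter_relevant(candidates: List[Tuple[str, str, str]], query: str) -> List[str]:
    # Run every check concurrently, then keep URLs whose check returned True.
    verdicts = await asyncio.gather(
        *(is_relevant_stub(url, title, snippet, query) for url, title, snippet in candidates)
    )
    return [candidate[0] for candidate, ok in zip(candidates, verdicts) if ok]


candidates = [
    ("https://example.com/rag", "Retrieval-augmented generation", "How RAG pipelines work"),
    ("https://example.com/cake", "Chocolate cake", "A baking recipe"),
]
kept = asyncio.run(filter_relevant(candidates, "RAG pipelines"))
print(kept)
```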
ReportGenerator
The ReportGenerator class can generate professional titles for research reports and format citations.
async def generate_title(llm: ChatOpenAI, query: str) -> str:
    # Generate a professional title for the report
    pass
3. Usage Scenarios
3.1 AI-powered Search
The Shandu system can be used to perform AI-powered searches. The aisearch command in the CLI allows users to specify search engines, maximum results, and output options.
@cli.command()
@click.argument("query")
@click.option("--engines", "-e", default=None, help="Comma-separated list of search engines to use")
@click.option("--max-results", "-m", default=10, type=int, help="Maximum number of results to return")
@click.option("--output", "-o", help="Save results to file")
@click.option("--detailed", "-d", is_flag=True, help="Generate a detailed analysis")
def aisearch(query: str, engines: Optional[str], max_results: int, output: Optional[str], detailed: bool):
    # Perform AI-powered search with analysis of results
    pass
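
Because the --engines option arrives as a single comma-separated string, the command has to split it into a list before passing it to a searcher. A minimal sketch of that step follows. The parse_engines helper and the KNOWN_ENGINES tuple are assumptions for illustration: the engine identifiers are guessed from the engines named in this article and may differ from the strings Shandu actually accepts.

```python
from typing import List, Optional

# Assumed identifiers, based on the engines named in this article.
KNOWN_ENGINES = ("google", "duckduckgo", "bing", "wikipedia")


def parse_engines(raw: Optional[str]) -> Optional[List[str]]:
    """Turn 'google, Bing' into ['google', 'bing']; None means 'use defaults'."""
    if raw is None:
        return None
    # Split on commas, trim whitespace, normalize case, drop empty pieces.
    names = [part.strip().lower() for part in raw.split(",") if part.strip()]
    unknown = [name for name in names if name not in KNOWN_ENGINES]
    if unknown:
        raise ValueError(f"Unknown engines: {', '.join(unknown)}")
    return names


print(parse_engines("google, Bing"))
```

Rejecting unknown names early gives the user a clear error at the command line instead of a failure deep inside the search step.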
3.2 Research Agent
The research agent in the Shandu system can generate targeted search queries based on current findings, extract relevant URLs, and analyze search results.
async def generate_queries_node(llm, progress_callback, state: AgentState) -> AgentState:
    # Generate targeted search queries based on current findings
    pass

async def _extract_urls_from_results(
    self,
    search_results: List[SearchResult],
    max_urls: int = 3
) -> List[str]:
    # Extract top URLs from search results
    pass
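
The URL extraction step can be pictured as taking the first few distinct, well-formed URLs from the ranked results. The function below is a simplified, synchronous sketch under that assumption. extract_top_urls is a hypothetical name, it works on plain URL strings rather than SearchResult objects, and the actual selection logic in Shandu may weigh results differently.

```python
from typing import List


def extract_top_urls(urls: List[str], max_urls: int = 3) -> List[str]:
    """Return up to max_urls distinct http(s) URLs, preserving result order."""
    picked: List[str] = []
    for url in urls:
        # Stop as soon as the limit is reached (also handles max_urls <= 0).
        if len(picked) >= max_urls:
            break
        # Skip non-web schemes and URLs that were already selected.
        if url.startswith(("http://", "https://")) and url not in picked:
            picked.append(url)
    return picked


ranked = [
    "https://example.com/a",
    "https://example.com/a",
    "ftp://example.com/file",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
]
print(extract_top_urls(ranked))
```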
4. Conclusion
The Shandu Deep Research System is a versatile and powerful tool for web research. Its combination of multi-engine search, AI-enhanced analysis, and structured output makes it an ideal choice for researchers, analysts, and anyone who needs to conduct in-depth web searches. Whether you are looking for detailed information on a specific topic or generating professional research reports, Shandu can help you achieve your goals efficiently.
By leveraging the capabilities of the Shandu system, users can save time, improve the accuracy of their research, and gain valuable insights from the vast amount of information available on the web. So, if you are in need of a reliable research tool, give Shandu a try!