Building a Multi-Agent Public Opinion Analysis System from Scratch: The BettaFish (Weiyu) Technical Deep Dive
Core Question: How can you build a fully automated, multi-agent system that analyzes social media sentiment and generates comprehensive public opinion reports?
In the age of information overload, understanding what people truly think across millions of social media posts is no easy task.
The Weibo Public Opinion Analysis System, codenamed BettaFish (Weiyu), tackles this challenge through a multi-agent AI framework that automates data collection, analysis, and report generation across multiple modalities and platforms.
This article walks you through its architecture, setup, operational workflow, and practical extensions — strictly based on the project documentation. You’ll see not just how it works, but also how it reflects a new direction for AI-powered analytics systems.
🧭 Overview: What Is BettaFish and Why It Matters
Core Question: What is the BettaFish (Weiyu) system, and what core problems does it solve?
BettaFish is a ground-up, AI-driven, multi-agent public opinion analysis system. It helps users break free from echo chambers, reconstruct real sentiment landscapes, and forecast opinion trends.
The system accepts simple natural language prompts — users can type requests conversationally (“analyze public sentiment on Wuhan University”) — and intelligent agents automatically collect, process, and synthesize insights from 30+ major global social platforms.
Key Advantages of BettaFish
- AI-driven all-domain monitoring
  Continuous 24/7 operation with an AI crawler cluster. It covers multiple platforms such as Weibo, Xiaohongshu, Douyin, and Kuaishou, capturing not only hot topics but also millions of user comments.
- Beyond LLMs: A composite analysis engine
  Combines five types of specialized Agents, fine-tuned models, and statistical tools for cross-validation and multi-dimensional reasoning.
- Multimodal understanding
  Supports text, video, and structured data (e.g., weather cards, stock info), enabling complete situational awareness.
- Agent “Forum Collaboration” mechanism
  Each Agent has unique tools and thinking modes. A debate-style “ForumEngine” orchestrates discussions to produce more balanced, reasoned conclusions.
- Public-private data fusion
  Integrates external social media data with internal enterprise databases for end-to-end intelligence.
- Lightweight and extensible architecture
  Built entirely in Python, designed for modular expansion and one-click deployment.
Author’s reflection:
Traditional monitoring tools stop at data visualization. BettaFish goes further — from listening to understanding. It treats data as a conversation rather than a static report.
🏗 System Architecture: Inside the Multi-Agent Brain
Core Question: How do multiple Agents collaborate to perform complex sentiment and trend analysis?
The BettaFish system is built around four primary Agents and one collaboration engine, each serving a distinct analytical role.
| Component | Description | Purpose |
|---|---|---|
| Insight Agent | Mines private databases for in-depth insights | Internal data analysis |
| Media Agent | Analyzes multimodal content (images, videos) | Visual & cross-modal understanding |
| Query Agent | Searches domestic and international sources | Broad content discovery |
| Report Agent | Generates structured reports | Automated documentation |
| ForumEngine | Coordinates debates between Agents | Collaborative reasoning |
Full Workflow
| Step | Phase | Key Operation | Participants |
|---|---|---|---|
| 1 | User Query | Receive the user’s question | Flask Main App |
| 2 | Parallel Start | Launch the three analysis Agents simultaneously | Query / Media / Insight |
| 3 | Preliminary Analysis | Agents use their toolkits to explore data | Agents + Tools |
| 4 | Strategy Formation | Define focused research paths | Decision Modules |
| 5 | Forum Loop | Iterative collaboration and reflection | ForumEngine + Agents |
| 6 | Result Consolidation | Collect and merge findings | Report Agent |
| 7 | Report Generation | Format and render the final report | Template Engine |

Image source: BettaFish Documentation
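Steps 2 and 6 of the workflow table (parallel start and result consolidation) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the three agent functions, `run_parallel`, and `consolidate` are hypothetical stand-ins, and the real engines call LLMs, crawlers, and databases instead of returning fixed strings.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the three analysis Agents (assumption:
# each takes a topic string and returns its findings as text).
def query_agent(topic: str) -> str:
    return f"[query] news and web coverage of {topic}"

def media_agent(topic: str) -> str:
    return f"[media] image and video signals for {topic}"

def insight_agent(topic: str) -> str:
    return f"[insight] database findings on {topic}"

AGENTS = {"query": query_agent, "media": media_agent, "insight": insight_agent}

def run_parallel(topic: str) -> dict[str, str]:
    """Step 2: launch every analysis Agent at once and collect the results."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(fn, topic) for name, fn in AGENTS.items()}
        return {name: fut.result() for name, fut in futures.items()}

def consolidate(findings: dict[str, str]) -> str:
    """Step 6: merge per-Agent findings into one report body, sorted by name."""
    return "\n".join(findings[name] for name in sorted(findings))

findings = run_parallel("Wuhan University")
report = consolidate(findings)
print(report)
```

A thread pool fits this sketch because Agent work of this kind is mostly I/O-bound (network and database calls), so threads can overlap the waiting time.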
Author’s reflection:
The ForumEngine concept is transformative. It turns analysis into a team debate rather than a single model’s monologue.
In practical testing, this significantly improves reasoning depth and reduces model bias.
🚀 Quick Start Guide
Core Question: How can you deploy and run BettaFish on your own system?
This section provides a step-by-step process for installing and launching the BettaFish system.
Environment Requirements
| Component | Requirement |
|---|---|
| OS | Windows / Linux / macOS |
| Python | 3.9 or higher (recommended 3.11) |
| Conda | Anaconda or Miniconda |
| Database | MySQL (cloud or local) |
| Memory | ≥ 2 GB |
Step 1: Create a Conda Environment
conda create -n weiyu_env python=3.11
conda activate weiyu_env
Step 2: Install Dependencies
pip install -r requirements.txt
You can comment out “machine learning” dependencies in the requirements file if you prefer lightweight CPU-only execution.
Step 3: Install Playwright Driver
BettaFish uses Playwright for browser automation in its crawler system.
playwright install chromium
Step 4: Configure System Settings
Edit config.py to set database and API credentials:
# MySQL settings
DB_HOST = "localhost"
DB_PORT = 3306
DB_USER = "your_username"
DB_PASSWORD = "your_password"
DB_NAME = "your_db_name"
# LLM settings
INSIGHT_ENGINE_API_KEY = "your_api_key"
INSIGHT_ENGINE_BASE_URL = "https://api.moonshot.cn/v1"
INSIGHT_ENGINE_MODEL_NAME = "kimi-k2-0711-preview"
The system supports any LLM provider compatible with the OpenAI API format: simply change the BASE_URL and MODEL_NAME values.
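Before launching, it can help to confirm that no setting still holds a template placeholder. The helper below is a hypothetical sketch and not part of BettaFish; it assumes placeholders start with `your_`, as they do in the sample config above.

```python
# Hypothetical helper: flags settings that still hold template placeholders.
PLACEHOLDER_PREFIX = "your_"

def find_unset(settings: dict) -> list:
    """Return the names of string settings that are empty or still placeholders."""
    return sorted(
        name
        for name, value in settings.items()
        if isinstance(value, str)
        and (not value or value.startswith(PLACEHOLDER_PREFIX))
    )

# A subset of the sample values from config.py.
example = {
    "DB_HOST": "localhost",
    "DB_PORT": 3306,
    "DB_USER": "your_username",
    "DB_PASSWORD": "your_password",
    "INSIGHT_ENGINE_API_KEY": "your_api_key",
    "INSIGHT_ENGINE_BASE_URL": "https://api.moonshot.cn/v1",
}
print(find_unset(example))
```

With the sample values, the check reports `DB_PASSWORD`, `DB_USER`, and `INSIGHT_ENGINE_API_KEY` as still unset.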
Step 5: Initialize the Database
Option 1: Local Database Setup
cd MindSpider
python schema/init_database.py
Option 2: Cloud Database (Recommended)
The project team offers a cloud database service that supplies over 100,000 real social media posts daily. Note that the service has been temporarily closed to new users since October 2025.
Step 6: Launch the System
conda activate weiyu_env
python app.py
Access the interface at http://localhost:5000.
Tips:
- Streamlit apps may keep occupying ports after termination; kill lingering processes if necessary.
- For remote deployment issues, see the PR#45 reference in the original repository.
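To see whether a port is still occupied before restarting, a small standard-library check like the one below may be enough. The function name `port_in_use` is an illustrative choice; 5000 is the Flask port from Step 6, and 8501 is Streamlit's default port, which may differ from the ports BettaFish assigns to its individual Agents.

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0

if __name__ == "__main__":
    for port in (5000, 8501):
        print(port, "in use" if port_in_use(port) else "free")
```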
🧠 Agent Design and Collaboration
Core Question: How do individual Agents process information and communicate during analysis?
BettaFish organizes its logic into five major engine modules under the project root:
Weibo_PublicOpinion_AnalysisSystem/
├── QueryEngine/
├── MediaEngine/
├── InsightEngine/
├── ReportEngine/
└── ForumEngine/
🔍 QueryEngine: Breadth-first Search
- Handles large-scale content discovery across news and social networks.
- Parameters such as max reflections, content length, and search depth are configurable.
class Config:
    max_reflections = 2
    max_search_results = 15
    max_content_length = 8000
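One plausible reading of these three parameters is that they bound a search-and-reflect loop. The sketch below assumes that interpretation: `fake_search` and `research` are hypothetical, and treating `max_reflections` as the number of follow-up rounds after an initial search is an assumption, not something the documentation states.

```python
from dataclasses import dataclass

@dataclass
class Config:
    max_reflections: int = 2
    max_search_results: int = 15
    max_content_length: int = 8000

def fake_search(query: str) -> list[str]:
    """Hypothetical search backend returning 40 oversized documents."""
    return [f"{query} result {i} " + "x" * 10_000 for i in range(40)]

def research(topic: str, cfg: Config) -> list[list[str]]:
    """Run one initial search plus cfg.max_reflections follow-up rounds."""
    rounds = []
    query = topic
    for round_no in range(cfg.max_reflections + 1):
        # Cap the number of results, then cap the length of each result.
        hits = fake_search(query)[: cfg.max_search_results]
        rounds.append([doc[: cfg.max_content_length] for doc in hits])
        # A real Agent would ask the LLM what is still missing; this
        # sketch only tags the query with the round number.
        query = f"{topic} (follow-up {round_no + 1})"
    return rounds

rounds = research("campus safety", Config())
print(len(rounds), len(rounds[0]), len(rounds[0][0]))
```

Raising `max_reflections` buys depth at the cost of more LLM and search calls, while the other two limits keep each round's context size predictable.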
🖼 MediaEngine: Multimodal Interpreter
- Extracts and understands text, images, and video data.
- Useful for analyzing short videos or mixed-media social posts.
💡 InsightEngine: Deep Analytical Core
- Connects to local or business databases for high-value mining.
- Integrates Qwen-based keyword optimization and multilingual sentiment analysis.
Example sentiment analyzer config:
SENTIMENT_CONFIG = {
    'model_type': 'multilingual',
    'confidence_threshold': 0.8,
    'batch_size': 32,
    'max_sequence_length': 512,
}
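As a rough illustration of how such settings might be consumed, the sketch below batches the input and marks low-confidence predictions as uncertain. `stub_model` and `classify` are hypothetical, the handling of sub-threshold predictions is an assumption, and character truncation stands in for the token-level truncation a real model would apply.

```python
SENTIMENT_CONFIG = {
    "model_type": "multilingual",
    "confidence_threshold": 0.8,
    "batch_size": 32,
    "max_sequence_length": 512,
}

def stub_model(batch):
    """Hypothetical classifier: returns one (label, confidence) pair per text."""
    return [("positive", 0.95) if "amazing" in t else ("neutral", 0.55) for t in batch]

def classify(texts, config, model=stub_model):
    """Classify texts in batches; low-confidence predictions become 'uncertain'."""
    size = config["batch_size"]
    limit = config["max_sequence_length"]
    results = []
    for start in range(0, len(texts), size):
        # Character truncation stands in for token-level truncation here.
        batch = [t[:limit] for t in texts[start : start + size]]
        for label, score in model(batch):
            keep = score >= config["confidence_threshold"]
            results.append((label if keep else "uncertain", score))
    return results

print(classify(["This product is amazing!", "It arrived on Tuesday."], SENTIMENT_CONFIG))
```

In this sketch the first text keeps its `positive` label at 0.95 confidence, while the second falls below the 0.8 threshold and is reported as `uncertain`.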
📑 ReportEngine: Structured Output Generator
- Converts analytical results into multi-section reports.
- Automatically selects from Markdown or HTML templates based on topic.
🧭 ForumEngine: The Collaborative Debate Platform
ForumEngine acts as a moderator. It guides discussion cycles, synthesizes perspectives, and prompts Agents to reflect and refine.
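The moderated discussion cycle described above can be pictured with a toy loop. Everything here is hypothetical: `make_agent`, `moderate`, and `forum_loop` are illustrative names, and plain string formatting stands in for the LLM calls that the real Agents and moderator would make.

```python
from typing import Callable

# Hypothetical Agent: takes the moderator's latest guidance, returns a statement.
Agent = Callable[[str], str]

def make_agent(name: str) -> Agent:
    return lambda guidance: f"{name} responding to: {guidance}"

def moderate(statements: dict[str, str], round_no: int) -> str:
    """Stand-in for the moderator LLM: summarize and issue the next prompt."""
    return f"round {round_no} summary of {len(statements)} views; refine your weakest claim"

def forum_loop(agents: dict[str, Agent], topic: str, rounds: int = 2) -> list[str]:
    """Alternate between Agent statements and moderator guidance."""
    guidance = f"give an initial analysis of {topic}"
    transcript = []
    for round_no in range(1, rounds + 1):
        # Every Agent answers the same guidance, then the moderator responds.
        statements = {name: agent(guidance) for name, agent in agents.items()}
        transcript.extend(statements.values())
        guidance = moderate(statements, round_no)
        transcript.append(f"moderator: {guidance}")
    return transcript

agents = {n: make_agent(n) for n in ("query", "media", "insight")}
for line in forum_loop(agents, "brand reputation"):
    print(line)
```

The key property is that each round's Agent statements are conditioned on the moderator's previous summary, which is what lets the Agents reflect and refine rather than repeat themselves.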
Author’s insight:
Instead of one monolithic model, BettaFish uses a moderated committee of models — an elegant analogy to how human research teams operate.
💬 Sentiment Analysis and Model Choices
Core Question: What sentiment models are supported, and how are they used in practice?
BettaFish includes five sentiment analysis options, each suited to different contexts.
| Model Type | Description | Typical Use |
|---|---|---|
| Multilingual | Auto-detects languages | Cross-border public opinion |
| Small Qwen3 | Lightweight, efficient | Local deployments |
| Fine-tuned BERT | Optimized for Chinese | Domestic platforms |
| GPT-2 LoRA | Strong generative power | Opinion text generation |
| Machine Learning | Transparent classical models | Education / testing |
Example usage:
cd SentimentAnalysisModel/WeiboMultilingualSentiment
python predict.py --text "This product is amazing!" --lang "en"
Author’s reflection:
The modular model setup makes BettaFish educational as well as practical.
Switching between algorithms gives a rare hands-on comparison of AI techniques in one framework.
🔌 Integrating Business Databases
Core Question: How can organizations connect their internal data with the BettaFish framework?
Step 1: Add Business Database Config
BUSINESS_DB_HOST = "your_business_db_host"
BUSINESS_DB_PORT = 3306
BUSINESS_DB_USER = "your_business_user"
BUSINESS_DB_PASSWORD = "your_business_password"
BUSINESS_DB_NAME = "your_business_database"
Step 2: Build a Custom Database Tool
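class CustomBusinessDBTool:
    """Custom tool for querying the business database."""

    def search_business_data(self, query, table):
        # Placeholder: implement the lookup against your own schema here.
        pass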
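For a sense of what a filled-in version of this tool could look like, here is a minimal runnable sketch. It makes several assumptions: an in-memory SQLite database stands in for the business MySQL database, the `customer_feedback` table and its columns are invented for illustration, and the constructor takes a connection argument that the skeleton above does not have.

```python
import sqlite3

class CustomBusinessDBTool:
    """Illustrative tool; SQLite stands in for the business MySQL database."""

    ALLOWED_TABLES = {"customer_feedback"}  # hypothetical table name

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def search_business_data(self, query: str, table: str, limit: int = 20):
        # Table names cannot be bound as SQL parameters, so check a whitelist.
        if table not in self.ALLOWED_TABLES:
            raise ValueError(f"table not allowed: {table}")
        sql = f"SELECT id, content FROM {table} WHERE content LIKE ? ORDER BY id LIMIT ?"
        return self.conn.execute(sql, (f"%{query}%", limit)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_feedback (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO customer_feedback (content) VALUES (?)",
    [("Battery life is great",), ("Delivery was late",), ("Great support team",)],
)
tool = CustomBusinessDBTool(conn)
print(tool.search_business_data("great", "customer_feedback"))
```

Binding the search term as a parameter and whitelisting table names keeps a query that originates from an LLM or a user prompt from being used for SQL injection.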
Step 3: Integrate into InsightEngine
from .tools.custom_db_tool import CustomBusinessDBTool

class DeepSearchAgent:
    def __init__(self):
        self.custom_db_tool = CustomBusinessDBTool()
This allows enterprises to combine public sentiment data with private metrics such as customer feedback or brand mentions.
🧩 Custom Report Templates
Core Question: How can users customize or extend report templates?
BettaFish allows both in-app uploads and manual template creation.
Option 1: Upload via Web UI
Users can directly upload .md or .txt files through the report generation interface.
Option 2: Add Templates Manually
Create a file in ReportEngine/report_template/, for example:
ReportEngine/report_template/Brand_Reputation.md
The system automatically matches the best template for each analysis topic.
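The documentation does not spell out how the matching works. As a simple stand-in, the sketch below picks the template whose file name shares the most words with the topic. This keyword-overlap approach, the `pick_template` function, and every template name other than Brand_Reputation.md are assumptions for illustration only; the tokenizer also handles ASCII text only, so Chinese topics would need a different one.

```python
import re
from pathlib import Path

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, splitting on anything that is not a letter or digit."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}

def pick_template(topic: str, template_names: list[str], default: str) -> str:
    """Pick the template whose file name shares the most words with the topic."""
    topic_words = tokens(topic)
    best, best_score = default, 0
    for name in template_names:
        score = len(topic_words & tokens(Path(name).stem))
        if score > best_score:
            best, best_score = name, score
    return best

# Only Brand_Reputation.md comes from the article; the other names are made up.
names = ["Brand_Reputation.md", "Crisis_Response.md", "General.md"]
print(pick_template("Brand reputation of a phone maker", names, default="General.md"))
```

When no file name overlaps with the topic, the function falls back to the default template instead of guessing.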
⚙️ Advanced Configuration and Model Replacement
Core Question: How can developers adjust key parameters or integrate other LLM providers?
You can fine-tune parameters such as reflection rounds, content limits, and model thresholds to control precision and performance.
Example of using any OpenAI-compatible API:
from openai import OpenAI
client = OpenAI(api_key="your_api_key",
base_url="https://api.siliconflow.cn/v1")
response = client.chat.completions.create(
model="Qwen/Qwen2.5-72B-Instruct",
messages=[{"role": "user", "content": "What are the public opinion trends this month?"}]
)
Author’s reflection:
This simple compatibility makes BettaFish flexible — a universal orchestration layer that can evolve with any future LLM ecosystem.
🧱 Contribution and Development Guidelines
Core Question: How can developers participate and contribute improvements?
The project welcomes open-source collaboration.
Contribution Workflow
1. Fork the repository
2. Create a feature branch
3. Commit with clear bilingual messages
4. Push changes
5. Submit a Pull Request
Development Standards
- Follow PEP8 coding style
- Include tests and updated docs for new features
- Use descriptive commit messages
🔮 Future Roadmap: Toward Predictive Public Opinion Modeling
Core Question: What’s next for BettaFish?
The system currently completes the pipeline of input → multi-agent analysis → report generation.
The next milestone focuses on trend prediction — using time-series data, graph neural networks, and multimodal fusion to forecast public sentiment shifts.
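To make the time-series part of that idea concrete, here is a deliberately simple baseline: simple exponential smoothing over daily sentiment scores. This is not the method the project plans to use (the roadmap names graph neural networks and multimodal fusion), and the scores below are made up for illustration.

```python
def exponential_smoothing(series, alpha=0.5):
    """Return the smoothed level after each observation (simple exponential smoothing)."""
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]
    levels = [level]
    for value in series[1:]:
        # New level blends the latest observation with the previous level.
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels

# Made-up daily average sentiment scores in [-1, 1] for one topic.
daily_sentiment = [0.2, 0.4, 0.0, -0.2]
levels = exponential_smoothing(daily_sentiment, alpha=0.5)
forecast = levels[-1]  # the one-step-ahead forecast is the last smoothed level
print(levels, forecast)
```

A baseline like this is mainly useful as a yardstick: any more elaborate forecasting model should beat it on held-out data before it is trusted.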

Author’s note:
Prediction requires trustable, data-driven reasoning.
BettaFish’s long-term dataset of topic heatmaps already lays the foundation for machine learning–based forecasting.
🧭 One-Page Overview
| Category | Description |
|---|---|
| System Name | Weibo Public Opinion Analysis System (BettaFish / Weiyu) |
| Core Function | Automated public opinion monitoring, analysis, and reporting |
| Distinguishing Features | Multi-agent collaboration, multimodal analysis, debate-style reasoning |
| Key Components | Query / Media / Insight / Report / Forum Engines |
| Tech Stack | Python 3.9+, Flask, Streamlit, MySQL, Playwright |
| Deployment Command | python app.py |
| Output | HTML or Markdown analytical reports |
| Extensibility | Supports custom databases, templates, and LLM models |
🧰 Action Checklist
- ✅ Create and activate Conda environment
- ✅ Install dependencies and browser drivers
- ✅ Configure config.py (API keys, database info)
- ✅ Initialize MySQL or connect to cloud DB
- ✅ Run python app.py to start the system
- ✅ (Optional) Launch individual Agents with Streamlit
- ✅ Analyze data and export reports
💡 FAQ
Q1: Does BettaFish require GPU hardware?
No. It supports CPU-only execution for lightweight setups.
Q2: Can it analyze multilingual content?
Yes. The multilingual sentiment model handles multiple languages seamlessly.
Q3: How can I customize report designs?
Add Markdown templates under ReportEngine/report_template/ or upload via UI.
Q4: What is the ForumEngine moderator model?
A large language model that summarizes and guides Agent discussions.
Q5: Is PostgreSQL supported?
The default is MySQL, but the architecture allows adaptation to other databases.
Q6: Where are crawler results stored?
In the MindSpider database schema, accessible for visualization and reporting.
Q7: How do I resolve report generation errors?
Ensure all template paths are valid and the Flask API is running.
Q8: Can it be deployed on a remote server?
Yes. BettaFish supports full remote deployment via Conda and Flask.
Author’s closing thought:
BettaFish isn’t just an app — it’s a glimpse into how AI systems can think together.
When multiple agents collaborate, reflect, and debate, analysis becomes more than computation — it becomes collective intelligence.

