Building a Multi-Agent Public Opinion Analysis System from Scratch: The BettaFish (Weiyu) Technical Deep Dive

Core Question: How can you build a fully automated, multi-agent system that analyzes social media sentiment and generates comprehensive public opinion reports?

In the age of information overload, understanding what people truly think across millions of social media posts is no easy task.
The Weibo Public Opinion Analysis System, codenamed BettaFish (Weiyu), tackles this challenge through a multi-agent AI framework that automates data collection, analysis, and report generation across multiple modalities and platforms.

This article walks you through its architecture, setup, operational workflow, and practical extensions — strictly based on the project documentation. You’ll see not just how it works, but also how it reflects a new direction for AI-powered analytics systems.


🧭 Overview: What Is BettaFish and Why It Matters

Core Question: What is the BettaFish (Weiyu) system, and what core problems does it solve?

BettaFish is an AI-driven, multi-agent public opinion analysis system built from scratch. It helps users break free from echo chambers, reconstruct real sentiment landscapes, and forecast opinion trends.

The system accepts simple natural language prompts — users can type requests conversationally (“analyze public sentiment on Wuhan University”) — and intelligent agents automatically collect, process, and synthesize insights from 30+ major global social platforms.

Key Advantages of BettaFish

  1. AI-driven all-domain monitoring
    Continuous 24/7 operation with an AI crawler cluster. It covers multiple platforms such as Weibo, Xiaohongshu, Douyin, and Kuaishou, capturing not only hot topics but also millions of user comments.

  2. Beyond LLMs: A composite analysis engine
    Combines five types of specialized Agents, fine-tuned models, and statistical tools for cross-validation and multi-dimensional reasoning.

  3. Multimodal understanding
    Supports text, video, and structured data (e.g., weather cards, stock info) — enabling complete situational awareness.

  4. Agent “Forum Collaboration” mechanism
    Each Agent has unique tools and thinking modes. A debate-style “ForumEngine” orchestrates discussions to produce more balanced, reasoned conclusions.

  5. Public-private data fusion
    Integrates external social media data with internal enterprise databases for end-to-end intelligence.

  6. Lightweight and extensible architecture
    Built entirely in Python, designed for modular expansion and one-click deployment.

Author’s reflection:
Traditional monitoring tools stop at data visualization. BettaFish goes further — from listening to understanding. It treats data as a conversation rather than a static report.


🏗 System Architecture: Inside the Multi-Agent Brain

Core Question: How do multiple Agents collaborate to perform complex sentiment and trend analysis?

The BettaFish system is built around four primary Agents and one collaboration engine, each serving a distinct analytical role.

Component     | Description                                   | Purpose
------------- | --------------------------------------------- | ----------------------------------
Insight Agent | Mines private databases for in-depth insights | Internal data analysis
Media Agent   | Analyzes multimodal content (images, videos)  | Visual & cross-modal understanding
Query Agent   | Searches domestic and international sources   | Broad content discovery
Report Agent  | Generates structured reports                  | Automated documentation
ForumEngine   | Coordinates debates between Agents            | Collaborative reasoning

Full Workflow

Step | Phase                 | Key Operation                            | Participants
---- | --------------------- | ---------------------------------------- | ------------------------
1    | User Query            | Receive the user’s question              | Flask Main App
2    | Parallel Start        | Launch all Agents simultaneously         | Query / Media / Insight
3    | Preliminary Analysis  | Agents use their toolkits to explore data | Agents + Tools
4    | Strategy Formation    | Define focused research paths            | Decision Modules
5    | Forum Loop            | Iterative collaboration and reflection   | ForumEngine + Agents
6    | Result Consolidation  | Collect and merge findings               | Report Agent
7    | Report Generation     | Format and render the final report       | Template Engine
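
The workflow above can be outlined in a few lines of Python. This is only an illustrative sketch, not the project's code: run_agent, forum_round, build_report, and analyze are hypothetical stand-ins, and the fixed number of forum rounds is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

AGENTS = ["query", "media", "insight"]  # the three Agents started in step 2


def run_agent(name: str, question: str) -> str:
    # Hypothetical stand-in for an Agent's preliminary analysis (steps 3-4).
    return f"[{name}] preliminary findings on: {question}"


def forum_round(findings: dict[str, str], round_no: int) -> dict[str, str]:
    # Hypothetical stand-in for one ForumEngine cycle (step 5): every
    # Agent's text is tagged with the round in which it was revisited.
    return {name: f"{text} | reflected in round {round_no}"
            for name, text in findings.items()}


def build_report(question: str, findings: dict[str, str]) -> str:
    # Hypothetical stand-in for the Report Agent (steps 6-7).
    lines = [f"Report: {question}"]
    lines += [findings[name] for name in AGENTS]
    return "\n".join(lines)


def analyze(question: str, forum_rounds: int = 2) -> str:
    # Step 2: launch the three Agents in parallel threads.
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(run_agent, name, question) for name in AGENTS}
        findings = {name: fut.result() for name, fut in futures.items()}
    # Step 5: run the forum loop a fixed number of times.
    for round_no in range(1, forum_rounds + 1):
        findings = forum_round(findings, round_no)
    return build_report(question, findings)
```

Calling analyze("public sentiment on Wuhan University") returns a four-line string: a title line followed by one line per Agent. In the real system each stand-in would be replaced by LLM calls and tool invocations.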

[Figure: System architecture diagram. Image source: BettaFish Documentation]

Author’s reflection:
The ForumEngine concept is transformative. It turns analysis into a team debate rather than a single model’s monologue.
In practical testing, this significantly improves reasoning depth and reduces model bias.


🚀 Quick Start Guide

Core Question: How can you deploy and run BettaFish on your own system?

This section provides a step-by-step process for installing and launching the BettaFish system.

Environment Requirements

Component | Requirement
--------- | ------------------------------------
OS        | Windows / Linux / macOS
Python    | 3.9 or higher (recommended 3.11)
Conda     | Anaconda or Miniconda
Database  | MySQL (cloud or local)
Memory    | ≥ 2 GB

Step 1: Create a Conda Environment

conda create -n weiyu_env python=3.11
conda activate weiyu_env

Step 2: Install Dependencies

pip install -r requirements.txt

You can comment out “machine learning” dependencies in the requirements file if you prefer lightweight CPU-only execution.


Step 3: Install Playwright Driver

BettaFish uses Playwright for browser automation in its crawler system.

playwright install chromium

Step 4: Configure System Settings

Edit config.py to set database and API credentials:

# MySQL settings
DB_HOST = "localhost"
DB_PORT = 3306
DB_USER = "your_username"
DB_PASSWORD = "your_password"
DB_NAME = "your_db_name"

# LLM settings
INSIGHT_ENGINE_API_KEY = "your_api_key"
INSIGHT_ENGINE_BASE_URL = "https://api.moonshot.cn/v1"
INSIGHT_ENGINE_MODEL_NAME = "kimi-k2-0711-preview"

The system supports any LLM provider compatible with the OpenAI API format — simply change the BASE_URL and MODEL_NAME.
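
Before launching, it can help to confirm that no placeholder values are left in the settings. The sketch below is not part of the project: find_placeholders is a hypothetical helper, and it assumes placeholders follow the "your_..." pattern used in the example values above.

```python
def find_placeholders(settings: dict) -> list[str]:
    """Return names of string settings that are empty or still start with 'your_'."""
    missing = []
    for name, value in settings.items():
        if isinstance(value, str) and (not value or value.startswith("your_")):
            missing.append(name)
    return sorted(missing)


# A few values copied from the configuration example for illustration.
example_settings = {
    "DB_HOST": "localhost",
    "DB_PORT": 3306,
    "DB_USER": "your_username",
    "INSIGHT_ENGINE_API_KEY": "your_api_key",
    "INSIGHT_ENGINE_BASE_URL": "https://api.moonshot.cn/v1",
}
unfilled = find_placeholders(example_settings)
```

Here unfilled lists DB_USER and INSIGHT_ENGINE_API_KEY, the two entries that still hold template values. How you collect the real settings into a dictionary depends on how config.py is loaded in your setup.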


Step 5: Initialize the Database

Option 1: Local Database Setup

cd MindSpider
python schema/init_database.py

Option 2: Cloud Database (Recommended)

The project team provides a cloud service with over 100,000 real social posts daily (service temporarily closed for new users since Oct 2025).


Step 6: Launch the System

conda activate weiyu_env
python app.py

Access the interface at http://localhost:5000.

Tips:


  • Streamlit apps may occupy ports after termination — kill lingering processes if necessary.

  • For remote deployment issues, see the PR#45 reference in the original repository.
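
To see whether a port is still occupied before restarting, a small standard-library check is enough. This is a minimal sketch: port_in_use is a hypothetical helper, and it only tells you that something is listening, not which process owns the port.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

For example, port_in_use(5000) checks the Flask port mentioned above; the same call works for whichever ports your Streamlit apps use.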

🧠 Agent Design and Collaboration

Core Question: How do individual Agents process information and communicate during analysis?

BettaFish organizes its logic into five major engine modules under the project root:

Weibo_PublicOpinion_AnalysisSystem/
├── QueryEngine/
├── MediaEngine/
├── InsightEngine/
├── ReportEngine/
└── ForumEngine/

🔍 QueryEngine: Breadth-first Search


  • Handles large-scale content discovery across news and social networks.

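  • Parameters such as max reflections, content length, and search depth are configurable.

Example configuration:

class Config:
    max_reflections = 2        # number of reflection rounds
    max_search_results = 15    # maximum number of search results
    max_content_length = 8000  # maximum content length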
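
One way to picture how these limits could bound a search loop is sketched below. The article does not show the QueryEngine internals, so everything here is an assumption: search is a fake backend, research is a hypothetical function, and reading max_reflections as "follow-up rounds after one initial search" is only one plausible interpretation.

```python
class Config:
    max_reflections = 2
    max_search_results = 15
    max_content_length = 8000


def search(query: str) -> list[str]:
    # Fake search backend: always returns 40 hits of roughly 65 characters.
    return [f"{query} result {i} " + "x" * 50 for i in range(40)]


def research(topic: str, cfg=Config) -> dict:
    collected = []
    query = topic
    searches = 0
    # One initial search plus at most `max_reflections` follow-up rounds.
    for round_no in range(cfg.max_reflections + 1):
        hits = search(query)[: cfg.max_search_results]  # cap results per search
        collected.extend(hits)
        searches += 1
        query = f"{topic} follow-up {round_no + 1}"
    # Cap the total amount of text carried forward.
    notes = "\n".join(collected)[: cfg.max_content_length]
    return {"searches": searches, "kept_results": len(collected), "notes": notes}
```

Raising max_reflections trades latency and API cost for more follow-up searches, while max_content_length keeps the accumulated text within what a model prompt can hold.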

🖼 MediaEngine: Multimodal Interpreter


  • Extracts and understands text, images, and video data.

  • Useful for analyzing short videos or mixed-media social posts.

💡 InsightEngine: Deep Analytical Core


  • Connects to local or business databases for high-value mining.

  • Integrates Qwen-based keyword optimization and multilingual sentiment analysis.

Example sentiment analyzer config:

SENTIMENT_CONFIG = {
    'model_type': 'multilingual',
    'confidence_threshold': 0.8,
    'batch_size': 32,
    'max_sequence_length': 512,
}
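
The sketch below shows how two of these settings might be applied around a model call. It is an illustration only: batched and keep_confident are hypothetical helpers, treating the threshold as inclusive is an assumption, and max_sequence_length is left out because it normally applies to tokenizer output rather than raw characters.

```python
SENTIMENT_CONFIG = {
    "model_type": "multilingual",
    "confidence_threshold": 0.8,
    "batch_size": 32,
    "max_sequence_length": 512,
}


def batched(items: list, size: int):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def keep_confident(predictions: list[tuple[str, float]], threshold: float):
    """Keep (label, score) pairs whose score meets the threshold."""
    return [(label, score) for label, score in predictions if score >= threshold]
```

With 70 posts and a batch size of 32, batched yields groups of 32, 32, and 6. Predictions scoring below 0.8 would then be dropped or routed to manual review, depending on your workflow.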

📑 ReportEngine: Structured Output Generator


  • Converts analytical results into multi-section reports.

  • Automatically selects from Markdown or HTML templates based on topic.

🧭 ForumEngine: The Collaborative Debate Platform

ForumEngine acts as a moderator. It guides discussion cycles, synthesizes perspectives, and prompts Agents to reflect and refine.
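
A moderated discussion cycle of this kind can be sketched as follows. The Forum class and discuss function are hypothetical and much simpler than a real moderator, which would call an LLM to summarize; here the summary only counts the viewpoints posted in a round.

```python
from dataclasses import dataclass, field


@dataclass
class Forum:
    """Hypothetical moderator that collects posts and summarizes each round."""
    log: list = field(default_factory=list)

    def post(self, speaker: str, message: str) -> None:
        self.log.append((speaker, message))

    def summarize(self, round_no: int) -> str:
        # Count the Agent posts made in this round (moderator posts excluded).
        posts = [m for s, m in self.log
                 if s != "moderator" and m.startswith(f"r{round_no}:")]
        summary = f"round {round_no}: {len(posts)} viewpoints; please refine"
        self.post("moderator", summary)
        return summary


def discuss(agents: dict, topic: str, rounds: int = 2) -> Forum:
    forum = Forum()
    guidance = topic
    for round_no in range(1, rounds + 1):
        # Each Agent responds to the latest guidance from the moderator.
        for name, respond in agents.items():
            forum.post(name, f"r{round_no}: " + respond(guidance))
        # The moderator's summary becomes the guidance for the next round.
        guidance = forum.summarize(round_no)
    return forum
```

Each entry in agents maps a name to a callable that takes the current guidance and returns text, so the loop works with any mix of stubs or real model-backed functions.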

Author’s insight:
Instead of one monolithic model, BettaFish uses a moderated committee of models — an elegant analogy to how human research teams operate.


💬 Sentiment Analysis and Model Choices

Core Question: What sentiment models are supported, and how are they used in practice?

BettaFish includes five sentiment analysis options, each suited to different contexts.

Model Type       | Description                  | Typical Use
---------------- | ---------------------------- | ---------------------------
Multilingual     | Auto-detects languages       | Cross-border public opinion
Small Qwen3      | Lightweight, efficient       | Local deployments
Fine-tuned BERT  | Optimized for Chinese        | Domestic platforms
GPT-2 LoRA       | Strong generative power      | Opinion text generation
Machine Learning | Transparent classical models | Education / testing

Example usage:

cd SentimentAnalysisModel/WeiboMultilingualSentiment
python predict.py --text "This product is amazing!" --lang "en"

Author’s reflection:
The modular model setup makes BettaFish educational as well as practical.
Switching between algorithms gives a rare hands-on comparison of AI techniques in one framework.


🔌 Integrating Business Databases

Core Question: How can organizations connect their internal data with the BettaFish framework?

Step 1: Add Business Database Config

BUSINESS_DB_HOST = "your_business_db_host"
BUSINESS_DB_PORT = 3306
BUSINESS_DB_USER = "your_business_user"
BUSINESS_DB_PASSWORD = "your_business_password"
BUSINESS_DB_NAME = "your_business_database"

Step 2: Build a Custom Database Tool

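class CustomBusinessDBTool:
    """Custom query tool for the business database."""

    def search_business_data(self, query, table):
        # Placeholder: implement your own business query logic here.
        pass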
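
As a rough idea of what a filled-in tool could look like, the sketch below uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs without a server. The customer_feedback table, its columns, the allow-list, and the constructor argument are all assumptions for illustration; in a real deployment the connection would be opened from the BUSINESS_DB_* settings.

```python
import sqlite3


class CustomBusinessDBTool:
    """Illustrative business-data lookup; sqlite3 stands in for MySQL."""

    ALLOWED_TABLES = {"customer_feedback"}  # hypothetical table name

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection

    def search_business_data(self, query: str, table: str) -> list[tuple]:
        # Table names cannot be bound as parameters, so check an allow-list.
        if table not in self.ALLOWED_TABLES:
            raise ValueError(f"table not allowed: {table}")
        # The search term itself is passed as a bound parameter.
        sql = f"SELECT id, content FROM {table} WHERE content LIKE ? ORDER BY id"
        return self.connection.execute(sql, (f"%{query}%",)).fetchall()


# Build a tiny in-memory database to try the tool against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_feedback (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO customer_feedback (content) VALUES (?)",
    [("Delivery was slow",), ("Great battery life",), ("Battery drains fast",)],
)
tool = CustomBusinessDBTool(conn)
rows = tool.search_business_data("battery", "customer_feedback")
```

The allow-list plus bound parameter keeps user-supplied text out of the SQL string, which matters when queries originate from an LLM-driven Agent.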

Step 3: Integrate into InsightEngine

from .tools.custom_db_tool import CustomBusinessDBTool

class DeepSearchAgent:
    def __init__(self):
        self.custom_db_tool = CustomBusinessDBTool()

This allows enterprises to combine public sentiment data with private metrics such as customer feedback or brand mentions.


🧩 Custom Report Templates

Core Question: How can users customize or extend report templates?

BettaFish allows both in-app uploads and manual template creation.

Option 1: Upload via Web UI

Users can directly upload .md or .txt files through the report generation interface.

Option 2: Add Templates Manually

Create a file in ReportEngine/report_template/, for example:

ReportEngine/report_template/Brand_Reputation.md

The system automatically matches the best template for each analysis topic.
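
The article does not describe how the matching is done. Purely as an illustration of the idea, the sketch below picks a template by counting words shared between the topic and the file name; pick_template and the file name Daily_Hot_Topics.md are hypothetical, and this keyword heuristic should not be read as the project's actual method.

```python
import re
from pathlib import Path


def pick_template(topic: str, template_names: list[str]) -> str:
    """Return the template whose file name shares the most words with the topic."""
    topic_words = set(re.findall(r"[a-z0-9]+", topic.lower()))

    def score(name: str) -> int:
        # "Brand_Reputation.md" becomes {"brand", "reputation"}.
        name_words = set(re.findall(r"[a-z0-9]+", Path(name).stem.lower()))
        return len(topic_words & name_words)

    # max() returns the first name when several share the top score.
    return max(template_names, key=score)
```

Because the pattern only matches ASCII letters and digits, topics written in Chinese would score zero everywhere and fall back to the first template, which is one reason a production system would likely use something more capable.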


⚙️ Advanced Configuration and Model Replacement

Core Question: How can developers adjust key parameters or integrate other LLM providers?

You can fine-tune parameters such as reflection rounds, content limits, and model thresholds to control precision and performance.

Example of using any OpenAI-compatible API:

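from openai import OpenAI

# Point the client at any OpenAI-compatible endpoint.
client = OpenAI(api_key="your_api_key",
                base_url="https://api.siliconflow.cn/v1")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-72B-Instruct",
    messages=[{"role": "user", "content": "What are the public opinion trends this month?"}]
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)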

Author’s reflection:
This simple compatibility makes BettaFish flexible — a universal orchestration layer that can evolve with any future LLM ecosystem.


🧱 Contribution and Development Guidelines

Core Question: How can developers participate and contribute improvements?

The project welcomes open-source collaboration.

Contribution Workflow

  1. Fork the repository
  2. Create a feature branch
  3. Commit with clear bilingual messages
  4. Push changes
  5. Submit a Pull Request

Development Standards


  • Follow PEP8 coding style

  • Include tests and updated docs for new features

  • Use descriptive commit messages

🔮 Future Roadmap: Toward Predictive Public Opinion Modeling

Core Question: What’s next for BettaFish?

The system currently completes the pipeline of input → multi-agent analysis → report generation.
The next milestone focuses on trend prediction — using time-series data, graph neural networks, and multimodal fusion to forecast public sentiment shifts.


Author’s note:
Prediction requires trustworthy, data-driven reasoning.
BettaFish’s long-term dataset of topic heatmaps already lays the foundation for machine learning–based forecasting.


🧭 One-Page Overview

Category                | Description
----------------------- | ----------------------------------------------------------------------
System Name             | Weibo Public Opinion Analysis System (BettaFish / Weiyu)
Core Function           | Automated public opinion monitoring, analysis, and reporting
Distinguishing Features | Multi-agent collaboration, multimodal analysis, debate-style reasoning
Key Components          | Query / Media / Insight / Report / Forum Engines
Tech Stack              | Python 3.9+, Flask, Streamlit, MySQL, Playwright
Deployment Command      | python app.py
Output                  | HTML or Markdown analytical reports
Extensibility           | Supports custom databases, templates, and LLM models

🧰 Action Checklist


  • ✅ Create and activate Conda environment

  • ✅ Install dependencies and browser drivers

  • ✅ Configure config.py (API keys, database info)

  • ✅ Initialize MySQL or connect to cloud DB

  • ✅ Run python app.py to start the system

  • ✅ (Optional) Launch individual Agents with Streamlit

  • ✅ Analyze data and export reports

💡 FAQ

Q1: Does BettaFish require GPU hardware?
No. It supports CPU-only execution for lightweight setups.

Q2: Can it analyze multilingual content?
Yes. The multilingual sentiment model handles multiple languages seamlessly.

Q3: How can I customize report designs?
Add Markdown templates under ReportEngine/report_template/ or upload via UI.

Q4: What is the ForumEngine moderator model?
A large language model that summarizes and guides Agent discussions.

Q5: Is PostgreSQL supported?
The default is MySQL, but the architecture allows adaptation to other databases.

Q6: Where are crawler results stored?
In the MindSpider database schema, accessible for visualization and reporting.

Q7: How do I resolve report generation errors?
Ensure all template paths are valid and the Flask API is running.

Q8: Can it be deployed on a remote server?
Yes. BettaFish supports full remote deployment via Conda and Flask.


Author’s closing thought:
BettaFish isn’t just an app — it’s a glimpse into how AI systems can think together.
When multiple agents collaborate, reflect, and debate, analysis becomes more than computation — it becomes collective intelligence.