戒酒的李白

Update readme.

1 <div align="center"> 1 <div align="center">
2 2
3 -# 📊 Weibo Public Opinion Multi-Agent Analysis System  
4 -  
5 <img src="static/image/logo_compressed.png" alt="Weibo Public Opinion Analysis System Logo" width="600"> 3 <img src="static/image/logo_compressed.png" alt="Weibo Public Opinion Analysis System Logo" width="600">
6 4
7 [![GitHub Stars](https://img.shields.io/github/stars/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/stargazers) 5 [![GitHub Stars](https://img.shields.io/github/stars/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/stargazers)
  6 +[![GitHub Watchers](https://img.shields.io/github/watchers/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/watchers)
8 [![GitHub Forks](https://img.shields.io/github/forks/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/network) 7 [![GitHub Forks](https://img.shields.io/github/forks/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/network)
9 [![GitHub Issues](https://img.shields.io/github/issues/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues) 8 [![GitHub Issues](https://img.shields.io/github/issues/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues)
10 [![GitHub License](https://img.shields.io/github/license/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE) 9 [![GitHub License](https://img.shields.io/github/license/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE)
@@ -13,104 +12,57 @@ @@ -13,104 +12,57 @@
13 12
14 </div> 13 </div>
15 14
16 -<div align="center">  
17 -<img src="static/image/banner_compressed.png" alt="banner" width="800">  
18 -</div>  
19 -  
20 ## 📝 Project Overview 15 ## 📝 Project Overview
21 16
22 -**Weibo Public Opinion Multi-Agent Analysis System** is an innovative public opinion analysis platform built from scratch, utilizing multi-agent collaborative architecture to provide accurate, real-time, and comprehensive Weibo public opinion monitoring and analysis services. The system achieves full-process automation from data collection and sentiment analysis to report generation through the collaboration of five specialized AI agents. 17 +**"WeiYu"** is an innovative multi-agent public opinion analysis system built from scratch, designed to be simple and general-purpose across all platforms.
23 18
24 -### 🚀 Key Features 19 +See the system-generated research report on "Wuhan University Public Opinion": [In-depth Analysis Report on Wuhan University's Brand Reputation](./final_reports/final_report__20250827_131630.html)
25 20
26 -- **Multi-Agent Collaborative Architecture**: 5 specialized agents working together to complete the full process of public opinion analysis  
27 -- **Comprehensive Data Collection**: Integrating Weibo crawlers, news search, multimedia content, and other multi-dimensional data sources  
28 -- **Deep Sentiment Analysis**: Precise multilingual sentiment recognition based on fine-tuned BERT/GPT-2/Qwen models  
29 -- **Intelligent Report Generation**: Automatically generate structured HTML analysis reports with custom template support  
30 -- **Agent Forum Communication**: ForumEngine provides information sharing and collaborative decision-making platform for agents  
31 -- **High-Performance Asynchronous Processing**: Support concurrent processing of multiple public opinion tasks with real-time status monitoring  
32 -- **Cloud Data Support**: Convenient cloud database service with 100,000+ daily real data 21 +Beyond just report quality, compared to similar products, we have 🚀 six major advantages:
33 22
34 -## 🏗️ System Architecture 23 +1. **AI-Driven Comprehensive Monitoring**: AI crawler clusters operate 24/7 non-stop, comprehensively covering 10+ key domestic and international social media platforms including Weibo, Xiaohongshu, TikTok, Kuaishou, etc. Not only capturing trending content in real-time, but also drilling down to massive user comments, letting you hear the most authentic and widespread public voice.
35 24
36 -### Overall Architecture Diagram 25 +2. **Composite Analysis Engine Beyond LLM**: We not only rely on 5 types of professionally designed Agents, but also integrate middleware such as fine-tuned models and statistical models. Through multi-model collaborative work, we ensure the depth, accuracy, and multi-dimensional perspective of analysis results.
37 26
38 -```mermaid  
39 -graph TB  
40 - subgraph "Frontend Display Layer"  
41 - UI[Web Interface<br/>Flask + Streamlit]  
42 - end  
43 -  
44 - subgraph "Multi-Agent Collaboration Layer"  
45 - QE[QueryEngine<br/>News Search Agent]  
46 - ME[MediaEngine<br/>Multimedia Search Agent]  
47 - IE[InsightEngine<br/>Deep Insight Agent]  
48 - RE[ReportEngine<br/>Report Generation Agent]  
49 - Forum[ForumEngine<br/>Agent Forum Communication Center]  
50 - end  
51 -  
52 - subgraph "Data Processing Layer"  
53 - MS[MindSpider<br/>Weibo Crawler System]  
54 - SA[SentimentAnalysis<br/>Sentiment Analysis Model Collection]  
55 - DB[(MySQL<br/>Database)]  
56 - end  
57 -  
58 - subgraph "External Service Layer"  
59 - LLM[LLM API<br/>DeepSeek/Kimi/Gemini]  
60 - Search[Search API<br/>Tavily/Bocha]  
61 - end  
62 -  
63 - UI --> QE  
64 - UI --> ME  
65 - UI --> IE  
66 - UI --> RE  
67 -  
68 - QE --> Search  
69 - ME --> Search  
70 - IE --> MS  
71 - IE --> SA  
72 -  
73 - QE --> LLM  
74 - ME --> LLM  
75 - IE --> LLM  
76 - RE --> LLM  
77 -  
78 - MS --> DB  
79 - SA --> DB  
80 -  
81 - %% Agent Forum Communication Mechanism  
82 - QE <--> Forum  
83 - ME <--> Forum  
84 - IE <--> Forum  
85 - RE <--> Forum  
86 -``` 27 +3. **Powerful Multimodal Capabilities**: Breaking through text and image limitations, capable of deep analysis of short video content from TikTok, Kuaishou, etc., and precisely extracting structured multimodal information cards such as weather, calendar, stocks from modern search engines, giving you comprehensive control over public opinion dynamics.
87 28
88 -### Agent Collaboration Workflow 29 +4. **Agent "Forum" Collaboration Mechanism**: Endowing different Agents with unique toolsets and thinking patterns, conducting chain-of-thought collision and debate through the "forum" mechanism. This not only avoids the thinking limitations of single models and homogenization caused by communication, but also catalyzes higher-quality collective intelligence and decision support.
89 30
90 -The system's core workflow is based on multi-agent collaboration: 31 +5. **Seamless Integration of Public and Private Domain Data**: The platform not only analyzes public opinion, but also provides high-security interfaces supporting seamless integration of your internal business databases with public opinion data. Breaking through data barriers, providing powerful analysis capabilities of "external trends + internal insights" for vertical businesses.
91 32
92 -1. **QueryEngine (News Query Agent)**: Uses Tavily API to search authoritative news reports, providing official information sources  
93 -2. **MediaEngine (Multimedia Search Agent)**: Conducts multimodal content search through Bocha API to gather social media perspectives  
94 -3. **InsightEngine (Deep Insight Agent)**: Queries local Weibo database, combines multiple sentiment analysis models for deep analysis  
95 -4. **ForumEngine (Forum Monitoring Agent)**: Real-time monitoring of agent log outputs, extracts key information and promotes collaboration  
96 -5. **ReportEngine (Report Generation Agent)**: Based on analysis results from all agents, uses Gemini LLM to generate comprehensive HTML reports 33 +6. **Lightweight and Highly Extensible Framework**: Based on pure Python modular design, achieving lightweight, one-click deployment. Clear code structure allows developers to easily integrate custom models and business logic, enabling rapid platform expansion and deep customization.
97 34
98 -### Project Code Structure 35 +**Starting with public opinion, but not limited to public opinion**. The goal of "WeiYu" is to become a simple and universal data analysis engine that drives all business scenarios.
  36 +
  37 +<div align="center">
  38 +<img src="static/image/system_schematic.png" alt="banner" width="800">
  39 +
  40 +Say goodbye to traditional data dashboards. In "WeiYu", everything starts with a simple question: just describe your analysis needs as you would in a conversation
  41 +</div>
  42 +
  43 +## 🏗️ System Architecture
  44 +
  45 +### Overall Architecture Diagram
  46 +
  47 +Still drawing...
  48 +
  49 +### Project Code Structure Tree
99 50
100 ``` 51 ```
101 Weibo_PublicOpinion_AnalysisSystem/ 52 Weibo_PublicOpinion_AnalysisSystem/
102 -├── QueryEngine/ # News Query Engine Agent 53 +├── QueryEngine/ # Domestic and international news breadth search Agent
103 │ ├── agent.py # Agent main logic 54 │ ├── agent.py # Agent main logic
104 │ ├── llms/ # LLM interface wrapper 55 │ ├── llms/ # LLM interface wrapper
105 │ ├── nodes/ # Processing nodes 56 │ ├── nodes/ # Processing nodes
106 │ ├── tools/ # Search tools 57 │ ├── tools/ # Search tools
107 -│ └── utils/ # Utility functions  
108 -├── MediaEngine/ # Multimedia Search Engine Agent 58 +│ ├── utils/ # Utility functions
  59 +│ └── ... # Other modules
  60 +├── MediaEngine/ # Powerful multimodal understanding Agent
109 │ ├── agent.py # Agent main logic 61 │ ├── agent.py # Agent main logic
110 │ ├── llms/ # LLM interfaces 62 │ ├── llms/ # LLM interfaces
111 │ ├── tools/ # Search tools 63 │ ├── tools/ # Search tools
112 │ └── ... # Other modules 64 │ └── ... # Other modules
113 -├── InsightEngine/ # Data Insight Engine Agent 65 +├── InsightEngine/ # Private database mining Agent
114 │ ├── agent.py # Agent main logic 66 │ ├── agent.py # Agent main logic
115 │ ├── llms/ # LLM interface wrapper 67 │ ├── llms/ # LLM interface wrapper
116 │ │ ├── deepseek.py # DeepSeek API 68 │ │ ├── deepseek.py # DeepSeek API
@@ -120,7 +72,7 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -120,7 +72,7 @@ Weibo_PublicOpinion_AnalysisSystem/
120 │ ├── nodes/ # Processing nodes 72 │ ├── nodes/ # Processing nodes
121 │ │ ├── first_search_node.py # First search node 73 │ │ ├── first_search_node.py # First search node
122 │ │ ├── reflection_node.py # Reflection node 74 │ │ ├── reflection_node.py # Reflection node
123 -│ │ ├── summary_nodes.py # Summary nodes 75 +│ │ ├── summary_nodes.py # Summary node
124 │ │ ├── search_node.py # Search node 76 │ │ ├── search_node.py # Search node
125 │ │ ├── sentiment_node.py # Sentiment analysis node 77 │ │ ├── sentiment_node.py # Sentiment analysis node
126 │ │ └── insight_node.py # Insight generation node 78 │ │ └── insight_node.py # Insight generation node
@@ -137,7 +89,7 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -137,7 +89,7 @@ Weibo_PublicOpinion_AnalysisSystem/
137 │ ├── __init__.py 89 │ ├── __init__.py
138 │ ├── config.py # Configuration management 90 │ ├── config.py # Configuration management
139 │ └── helpers.py # Helper functions 91 │ └── helpers.py # Helper functions
140 -├── ReportEngine/ # Report Generation Engine Agent 92 +├── ReportEngine/ # Multi-round report generation Agent
141 │ ├── agent.py # Agent main logic 93 │ ├── agent.py # Agent main logic
142 │ ├── llms/ # LLM interfaces 94 │ ├── llms/ # LLM interfaces
143 │ │ └── gemini.py # Gemini API dedicated 95 │ │ └── gemini.py # Gemini API dedicated
@@ -149,9 +101,9 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -149,9 +101,9 @@ Weibo_PublicOpinion_AnalysisSystem/
149 │ │ ├── 商业品牌舆情监测.md 101 │ │ ├── 商业品牌舆情监测.md
150 │ │ └── ... # More templates 102 │ │ └── ... # More templates
151 │ └── flask_interface.py # Flask API interface 103 │ └── flask_interface.py # Flask API interface
152 -├── ForumEngine/ # Forum Communication Engine Agent 104 +├── ForumEngine/ # Forum engine simple implementation
153 │ └── monitor.py # Log monitoring and forum management 105 │ └── monitor.py # Log monitoring and forum management
154 -├── MindSpider/ # Weibo Crawler System 106 +├── MindSpider/ # Weibo crawler system
155 │ ├── main.py # Crawler main program 107 │ ├── main.py # Crawler main program
156 │ ├── BroadTopicExtraction/ # Topic extraction module 108 │ ├── BroadTopicExtraction/ # Topic extraction module
157 │ │ ├── get_today_news.py # Today's news fetching 109 │ │ ├── get_today_news.py # Today's news fetching
@@ -161,19 +113,21 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -161,19 +113,21 @@ Weibo_PublicOpinion_AnalysisSystem/
161 │ │ └── platform_crawler.py # Platform crawler management 113 │ │ └── platform_crawler.py # Platform crawler management
162 │ └── schema/ # Database schema 114 │ └── schema/ # Database schema
163 │ └── init_database.py # Database initialization 115 │ └── init_database.py # Database initialization
164 -├── SentimentAnalysisModel/ # Sentiment Analysis Model Collection 116 +├── SentimentAnalysisModel/ # Sentiment analysis model collection
165 │ ├── WeiboSentiment_Finetuned/ # Fine-tuned BERT/GPT-2 models 117 │ ├── WeiboSentiment_Finetuned/ # Fine-tuned BERT/GPT-2 models
166 -│ ├── WeiboMultilingualSentiment/ # Multilingual sentiment analysis  
167 -│ ├── WeiboSentiment_SmallQwen/ # Small Qwen model 118 +│ ├── WeiboMultilingualSentiment/# Multilingual sentiment analysis (recommended)
  119 +│ ├── WeiboSentiment_SmallQwen/ # Small parameter Qwen3 fine-tuning
168 │ └── WeiboSentiment_MachineLearning/ # Traditional machine learning methods 120 │ └── WeiboSentiment_MachineLearning/ # Traditional machine learning methods
169 -├── SingleEngineApp/ # Individual Agent Streamlit apps 121 +├── SingleEngineApp/ # Individual Agent Streamlit applications
170 │ ├── query_engine_streamlit_app.py 122 │ ├── query_engine_streamlit_app.py
171 │ ├── media_engine_streamlit_app.py 123 │ ├── media_engine_streamlit_app.py
172 │ └── insight_engine_streamlit_app.py 124 │ └── insight_engine_streamlit_app.py
173 ├── templates/ # Flask templates 125 ├── templates/ # Flask templates
174 -│ └── index.html # Main interface template 126 +│ └── index.html # Main interface frontend
175 ├── static/ # Static resources 127 ├── static/ # Static resources
176 ├── logs/ # Runtime log directory 128 ├── logs/ # Runtime log directory
  129 +├── final_reports/ # Final generated HTML report files
  130 +├── utils/ # Common utility functions
177 ├── app.py # Flask main application entry 131 ├── app.py # Flask main application entry
178 ├── config.py # Global configuration file 132 ├── config.py # Global configuration file
179 └── requirements.txt # Python dependency list 133 └── requirements.txt # Python dependency list
@@ -183,26 +137,27 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -183,26 +137,27 @@ Weibo_PublicOpinion_AnalysisSystem/
183 137
184 ### System Requirements 138 ### System Requirements
185 139
186 -- **Operating System**: Windows 10/11 (Linux/macOS also supported)  
187 -- **Python Version**: 3.11+ 140 +- **Operating System**: Windows, Linux, macOS
  141 +- **Python Version**: 3.9+
188 - **Conda**: Anaconda or Miniconda 142 - **Conda**: Anaconda or Miniconda
189 -- **Database**: MySQL 8.0+ (or choose our cloud database service)  
190 -- **Memory**: 8GB+ recommended 143 +- **Database**: MySQL (optional, you can choose our cloud database service)
  144 +- **Memory**: 2GB+ recommended
191 145
192 ### 1. Create Conda Environment 146 ### 1. Create Conda Environment
193 147
194 ```bash 148 ```bash
195 -# Create conda environment named pytorch_python11  
196 -conda create -n pytorch_python11 python=3.11  
197 -conda activate pytorch_python11 149 +# Create conda environment
  150 +conda create -n your_conda_name python=3.11
  151 +conda activate your_conda_name
198 ``` 152 ```
199 153
200 ### 2. Install Dependencies 154 ### 2. Install Dependencies
201 155
202 ```bash 156 ```bash
203 -# Install basic dependencies 157 +# Basic dependency installation
204 pip install -r requirements.txt 158 pip install -r requirements.txt
205 159
  160 +#========Below are optional========
206 # If you need local sentiment analysis functionality, install PyTorch 161 # If you need local sentiment analysis functionality, install PyTorch
207 # CPU version 162 # CPU version
208 pip install torch torchvision torchaudio 163 pip install torch torchvision torchaudio
@@ -225,7 +180,7 @@ playwright install chromium @@ -225,7 +180,7 @@ playwright install chromium
225 180
226 #### 4.1 Configure API Keys 181 #### 4.1 Configure API Keys
227 182
228 -Edit the `config.py` file and fill in your API keys: 183 +Edit the `config.py` file and fill in your API keys (you can also choose your own models and search proxies):
229 184
230 ```python 185 ```python
231 # MySQL Database Configuration 186 # MySQL Database Configuration
@@ -266,10 +221,9 @@ python schema/init_database.py @@ -266,10 +221,9 @@ python schema/init_database.py
266 221
267 **Option 2: Use Cloud Database Service (Recommended)** 222 **Option 2: Use Cloud Database Service (Recommended)**
268 223
269 -We provide convenient cloud database service with 100,000+ daily real Weibo data, currently **free application** during the promotion period! 224 +We provide a convenient cloud database service with 100,000+ real public opinion records daily, currently **free to apply for** during the promotion period!
270 225
271 -- Real Weibo data, updated in real-time  
272 -- Pre-processed sentiment annotation data 226 +- Real public opinion data, updated in real-time
273 - Multi-dimensional tag classification 227 - Multi-dimensional tag classification
274 - High-availability cloud service 228 - High-availability cloud service
275 - Professional technical support 229 - Professional technical support
@@ -282,12 +236,14 @@ We provide convenient cloud database service with 100,000+ daily real Weibo data @@ -282,12 +236,14 @@ We provide convenient cloud database service with 100,000+ daily real Weibo data
282 236
283 ```bash 237 ```bash
284 # In project root directory, activate conda environment 238 # In project root directory, activate conda environment
285 -conda activate pytorch_python11 239 +conda activate your_conda_name
286 240
287 -# Start main application (automatically starts all agents) 241 +# Start main application
288 python app.py 242 python app.py
289 ``` 243 ```
290 244
  245 +> Note: Data crawling must be run separately; see section 5.3 for guidance
  246 +
291 Visit http://localhost:5000 to use the complete system 247 Visit http://localhost:5000 to use the complete system
292 248
293 #### 5.2 Launch Individual Agents 249 #### 5.2 Launch Individual Agents
@@ -303,7 +259,9 @@ streamlit run SingleEngineApp/media_engine_streamlit_app.py --server.port 8502 @@ -303,7 +259,9 @@ streamlit run SingleEngineApp/media_engine_streamlit_app.py --server.port 8502
303 streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501 259 streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501
304 ``` 260 ```
305 261
306 -#### 5.3 Standalone Crawler System 262 +#### 5.3 Crawler System Standalone Use
  263 +
  264 +This section has detailed configuration documentation: [MindSpider Usage Guide](./MindSpider/README.md)
307 265
308 ```bash 266 ```bash
309 # Enter crawler directory 267 # Enter crawler directory
@@ -322,58 +280,6 @@ python main.py --broad-topic --date 2024-01-20 @@ -322,58 +280,6 @@ python main.py --broad-topic --date 2024-01-20
322 python main.py --deep-sentiment --platforms xhs dy wb 280 python main.py --deep-sentiment --platforms xhs dy wb
323 ``` 281 ```
324 282
325 -## 💾 Database Configuration  
326 -  
327 -### Local Database Configuration  
328 -  
329 -1. **Install MySQL 8.0+**  
330 -2. **Create Database**:  
331 - ```sql  
332 - CREATE DATABASE weibo_analysis CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;  
333 - ```  
334 -3. **Run Initialization Script**:  
335 - ```bash  
336 - cd MindSpider  
337 - python schema/init_database.py  
338 - ```  
339 -  
340 -### Auto-Crawling Configuration  
341 -  
342 -Configure automatic crawling tasks for continuous data updates:  
343 -  
344 -```python  
345 -# Configure crawler parameters in MindSpider/config.py  
346 -CRAWLER_CONFIG = {  
347 - 'max_pages': 200, # Maximum pages to crawl  
348 - 'delay': 1, # Request delay (seconds)  
349 - 'timeout': 30, # Timeout (seconds)  
350 - 'platforms': ['xhs', 'dy', 'wb', 'bili'], # Crawling platforms  
351 - 'daily_keywords': 100, # Daily keywords count  
352 - 'max_notes_per_keyword': 50, # Max content per keyword  
353 - 'use_proxy': False, # Whether to use proxy  
354 -}  
355 -```  
356 -  
357 -### Cloud Database Service (Recommended)  
358 -  
359 -**Why Choose Our Cloud Database Service?**  
360 -  
361 -- **Rich Data Sources**: 100,000+ daily real Weibo data covering hot topics across all industries  
362 -- **High-Quality Annotations**: Professional team manually annotated sentiment data with 95%+ accuracy  
363 -- **Multi-Dimensional Analysis**: Including topic classification, sentiment tendency, influence scoring and other multi-dimensional tags  
364 -- **Real-Time Updates**: 24/7 continuous data collection ensuring timeliness  
365 -- **Technical Support**: Professional team providing technical support and customization services  
366 -  
367 -**Application Method**:  
368 -📧 Email Contact: 670939375@qq.com  
369 -📝 Email Subject: Apply for Weibo Public Opinion Cloud Database Access  
370 -📝 Email Content: Please describe your use case and expected data volume requirements  
371 -  
372 -**Promotion Period Benefits**:  
373 -- Free basic cloud database access  
374 -- Free technical support and deployment guidance  
375 -- Priority access to new features  
376 -  
377 ## ⚙️ Advanced Configuration 283 ## ⚙️ Advanced Configuration
378 284
379 ### Modify Key Parameters 285 ### Modify Key Parameters
@@ -420,7 +326,7 @@ The system supports multiple LLM providers, switchable in each agent's configura @@ -420,7 +326,7 @@ The system supports multiple LLM providers, switchable in each agent's configura
420 ```python 326 ```python
421 # Configure in each Engine's utils/config.py 327 # Configure in each Engine's utils/config.py
422 class Config: 328 class Config:
423 - default_llm_provider = "deepseek" # Options: "deepseek", "openai", "kimi", "gemini" 329 + default_llm_provider = "deepseek" # Options: "deepseek", "openai", "kimi", "gemini", "qwen"
424 330
425 # DeepSeek configuration 331 # DeepSeek configuration
426 deepseek_api_key = "your_api_key" 332 deepseek_api_key = "your_api_key"
@@ -444,40 +350,40 @@ class Config: @@ -444,40 +350,40 @@ class Config:
444 350
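The provider switch in the `Config` hunk above can be illustrated with a minimal sketch. It assumes that only the five provider names from the README's "Options" comment are valid. `SUPPORTED_PROVIDERS` and `resolve_provider` are hypothetical names introduced here for illustration and are not taken from the repository.

```python
from dataclasses import dataclass

# Provider names listed in the README's "Options" comment.
SUPPORTED_PROVIDERS = ("deepseek", "openai", "kimi", "gemini", "qwen")


@dataclass
class Config:
    # Mirrors the two fields visible in the README snippet.
    default_llm_provider: str = "deepseek"
    deepseek_api_key: str = "your_api_key"


def resolve_provider(config: Config) -> str:
    """Hypothetical helper: normalize the provider name and reject unknown values."""
    provider = config.default_llm_provider.strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(
            f"Unknown LLM provider {provider!r}; expected one of {SUPPORTED_PROVIDERS}"
        )
    return provider


print(resolve_provider(Config(default_llm_provider="qwen")))
```

Validating the name once at startup makes a typo in the configuration fail immediately rather than at the first LLM call.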
445 The system integrates multiple sentiment analysis methods, selectable based on needs: 351 The system integrates multiple sentiment analysis methods, selectable based on needs:
446 352
447 -#### 1. BERT-based Fine-tuned Model (Highest Accuracy) 353 +#### 1. Multilingual Sentiment Analysis
448 354
449 ```bash 355 ```bash
450 -# Use BERT Chinese model  
451 -cd SentimentAnalysisModel/WeiboSentiment_Finetuned/BertChinese-Lora  
452 -python predict.py --text "This product is really great" 356 +cd SentimentAnalysisModel/WeiboMultilingualSentiment
  357 +python predict.py --text "This product is amazing!" --lang "en"
453 ``` 358 ```
454 359
455 -#### 2. GPT-2 LoRA Fine-tuned Model (Faster Speed) 360 +#### 2. Small Parameter Qwen3 Fine-tuning
456 361
457 ```bash 362 ```bash
458 -cd SentimentAnalysisModel/WeiboSentiment_Finetuned/GPT2-Lora  
459 -python predict.py --text "I'm not feeling great today" 363 +cd SentimentAnalysisModel/WeiboSentiment_SmallQwen
  364 +python predict_universal.py --text "This event was very successful"
460 ``` 365 ```
461 366
462 -#### 3. Small Qwen Model (Balanced) 367 +#### 3. BERT-based Fine-tuned Model
463 368
464 ```bash 369 ```bash
465 -cd SentimentAnalysisModel/WeiboSentiment_SmallQwen  
466 -python predict_universal.py --text "This event was very successful" 370 +# Use BERT Chinese model
  371 +cd SentimentAnalysisModel/WeiboSentiment_Finetuned/BertChinese-Lora
  372 +python predict.py --text "This product is really great"
467 ``` 373 ```
468 374
469 -#### 4. Traditional Machine Learning Methods (Lightweight) 375 +#### 4. GPT-2 LoRA Fine-tuned Model
470 376
471 ```bash 377 ```bash
472 -cd SentimentAnalysisModel/WeiboSentiment_MachineLearning  
473 -python predict.py --model_type "svm" --text "Service attitude needs improvement" 378 +cd SentimentAnalysisModel/WeiboSentiment_Finetuned/GPT2-Lora
  379 +python predict.py --text "I'm not feeling great today"
474 ``` 380 ```
475 381
476 -#### 5. Multilingual Sentiment Analysis (Supports 22 Languages) 382 +#### 5. Traditional Machine Learning Methods
477 383
478 ```bash 384 ```bash
479 -cd SentimentAnalysisModel/WeiboMultilingualSentiment  
480 -python predict.py --text "This product is amazing!" --lang "en" 385 +cd SentimentAnalysisModel/WeiboSentiment_MachineLearning
  386 +python predict.py --model_type "svm" --text "Service attitude needs improvement"
481 ``` 387 ```
482 388
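The five commands above differ only in working directory, script name, and flags, so they can be collected into one small helper. This is a sketch under stated assumptions: the directories, script names, and flags are copied from the README commands, while the dictionary keys (`"multilingual"`, `"small_qwen"`, and so on) and the function `build_predict_command` are hypothetical names introduced here. The helper only builds the command and does not run any model.

```python
import shlex

# Directory and script for each model, copied from the README commands.
# The dictionary keys are hypothetical labels chosen for this sketch.
MODEL_COMMANDS = {
    "multilingual": ("SentimentAnalysisModel/WeiboMultilingualSentiment", "predict.py"),
    "small_qwen": ("SentimentAnalysisModel/WeiboSentiment_SmallQwen", "predict_universal.py"),
    "bert": ("SentimentAnalysisModel/WeiboSentiment_Finetuned/BertChinese-Lora", "predict.py"),
    "gpt2": ("SentimentAnalysisModel/WeiboSentiment_Finetuned/GPT2-Lora", "predict.py"),
    "machine_learning": ("SentimentAnalysisModel/WeiboSentiment_MachineLearning", "predict.py"),
}


def build_predict_command(model, text, lang=None, model_type=None):
    """Return (working_directory, argv) for one prediction run."""
    directory, script = MODEL_COMMANDS[model]
    argv = ["python", script]
    if model_type is not None:  # the README shows this flag only for the machine-learning script
        argv += ["--model_type", model_type]
    argv += ["--text", text]
    if lang is not None:  # the README shows this flag only for the multilingual script
        argv += ["--lang", lang]
    return directory, argv


directory, argv = build_predict_command("multilingual", "This product is amazing!", lang="en")
print(f"cd {directory} && {shlex.join(argv)}")
```

Returning the working directory separately from the argument list lets a caller pass both to `subprocess.run(argv, cwd=directory)` without building a shell string.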
483 ### Integrate Custom Business Database 389 ### Integrate Custom Business Database
@@ -538,45 +444,13 @@ class DeepSearchAgent: @@ -538,45 +444,13 @@ class DeepSearchAgent:
538 444
539 ### Custom Report Templates 445 ### Custom Report Templates
540 446
541 -#### 1. Create Template Files  
542 -  
543 -Create new Markdown templates in the `ReportEngine/report_template/` directory:  
544 -  
545 -```markdown  
546 -<!-- Enterprise Brand Monitoring Report.md -->  
547 -# Enterprise Brand Public Opinion Monitoring Report  
548 -  
549 -## 📊 Executive Summary  
550 -{executive_summary}  
551 -  
552 -## 🔍 Brand Mention Analysis  
553 -### Mention Volume Trends  
554 -{mention_trend}  
555 -  
556 -### Sentiment Distribution  
557 -{sentiment_distribution}  
558 -  
559 -## 📈 Competitor Analysis  
560 -{competitor_analysis}  
561 -  
562 -## 🎯 Key Insights Summary  
563 -{key_insights} 447 +#### 1. Upload in Web Interface
564 448
565 -## ⚠️ Risk Alerts  
566 -{risk_alerts} 449 +The system supports uploading custom template files (.md or .txt format), selectable when generating reports.
567 450
568 -## 📋 Improvement Recommendations  
569 -{recommendations} 451 +#### 2. Create Template Files
570 452
571 ----  
572 -*Report Type: Enterprise Brand Public Opinion Monitoring*  
573 -*Generation Time: {generation_time}*  
574 -*Data Sources: {data_sources}*  
575 -```  
576 -  
577 -#### 2. Use in Web Interface  
578 -  
579 -The system supports uploading custom template files (.md or .txt format), selectable when generating reports. 453 +Create new templates in the `ReportEngine/report_template/` directory, and our Agent will automatically select the most appropriate template.
580 454
581 ## 🤝 Contributing Guide 455 ## 🤝 Contributing Guide
582 456
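Before relying on automatic template selection, it can help to check which files in the template directory would be candidates. The sketch below assumes that templates placed in `ReportEngine/report_template/` use the same `.md` and `.txt` extensions the README names for web uploads. `list_report_templates` is a hypothetical helper, not part of ReportEngine, and it does not reproduce how the Agent chooses a template.

```python
from pathlib import Path

# Extensions the README names for uploaded templates; assumed to apply to this directory too.
TEMPLATE_SUFFIXES = {".md", ".txt"}


def list_report_templates(template_dir="ReportEngine/report_template"):
    """Return the template files in the directory, sorted by path."""
    root = Path(template_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in TEMPLATE_SUFFIXES
    )


for template in list_report_templates():
    print(template.name)
```

Run from the project root, this prints one file name per template and prints nothing if the directory is missing.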
@@ -590,15 +464,6 @@ We welcome all forms of contributions! @@ -590,15 +464,6 @@ We welcome all forms of contributions!
590 4. **Push to branch**: `git push origin feature/AmazingFeature` 464 4. **Push to branch**: `git push origin feature/AmazingFeature`
591 5. **Open Pull Request** 465 5. **Open Pull Request**
592 466
593 -### Contribution Types  
594 -  
595 -- 🐛 Bug fixes  
596 -- ✨ New feature development  
597 -- 📚 Documentation improvements  
598 -- 🎨 UI/UX improvements  
599 -- ⚡ Performance optimization  
600 -- 🧪 Test case additions  
601 -  
602 ### Development Standards 467 ### Development Standards
603 468
604 - Code follows PEP8 standards 469 - Code follows PEP8 standards
@@ -608,7 +473,7 @@ We welcome all forms of contributions! @@ -608,7 +473,7 @@ We welcome all forms of contributions!
608 473
609 ## 📄 License 474 ## 📄 License
610 475
611 -This project is licensed under the [MIT License](LICENSE). Please see the LICENSE file for details. 476 +This project is licensed under the [GPL-2.0 License](LICENSE). Please see the LICENSE file for details.
612 477
613 ## 🎉 Support & Contact 478 ## 🎉 Support & Contact
614 479
@@ -621,8 +486,6 @@ This project is licensed under the [MIT License](LICENSE). Please see the LICENS @@ -621,8 +486,6 @@ This project is licensed under the [MIT License](LICENSE). Please see the LICENS
621 ### Contact Information 486 ### Contact Information
622 487
623 - 📧 **Email**: 670939375@qq.com 488 - 📧 **Email**: 670939375@qq.com
624 -- 💬 **QQ Group**: [Join Technical Discussion Group]  
625 -- 🐦 **WeChat**: [Scan QR Code for Technical Support]  
626 489
627 ### Business Cooperation 490 ### Business Cooperation
628 491
@@ -635,7 +498,7 @@ This project is licensed under the [MIT License](LICENSE). Please see the LICENS @@ -635,7 +498,7 @@ This project is licensed under the [MIT License](LICENSE). Please see the LICENS
635 498
636 **Free Cloud Database Service Application**: 499 **Free Cloud Database Service Application**:
637 📧 Send email to: 670939375@qq.com 500 📧 Send email to: 670939375@qq.com
638 -📝 Subject: Weibo Public Opinion Cloud Database Application 501 +📝 Subject: WeiYu Cloud Database Application
639 📝 Description: Your use case and requirements 502 📝 Description: Your use case and requirements
640 503
641 ## 👥 Contributors 504 ## 👥 Contributors
@@ -650,6 +513,4 @@ Thanks to these excellent contributors: @@ -650,6 +513,4 @@ Thanks to these excellent contributors:
650 513
651 **⭐ If this project helps you, please give us a star!** 514 **⭐ If this project helps you, please give us a star!**
652 515
653 -Made with ❤️ by [Weibo Public Opinion Analysis Team](https://github.com/666ghj)  
654 -  
655 </div> 516 </div>
@@ -2,9 +2,8 @@ @@ -2,9 +2,8 @@
2 2
3 <img src="static/image/logo_compressed.png" alt="Weibo Public Opinion Analysis System Logo" width="600"> 3 <img src="static/image/logo_compressed.png" alt="Weibo Public Opinion Analysis System Logo" width="600">
4 4
5 -# WeiYu - Committed to Building a Simple, General-Purpose Public Opinion Analysis Platform
6 -  
7 [![GitHub Stars](https://img.shields.io/github/stars/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/stargazers) 5 [![GitHub Stars](https://img.shields.io/github/stars/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/stargazers)
  6 +[![GitHub Watchers](https://img.shields.io/github/watchers/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/watchers)
8 [![GitHub Forks](https://img.shields.io/github/forks/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/network) 7 [![GitHub Forks](https://img.shields.io/github/forks/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/network)
9 [![GitHub Issues](https://img.shields.io/github/issues/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues) 8 [![GitHub Issues](https://img.shields.io/github/issues/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues)
10 [![GitHub License](https://img.shields.io/github/license/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE) 9 [![GitHub License](https://img.shields.io/github/license/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE)
@@ -13,104 +12,57 @@ @@ -13,104 +12,57 @@
13 12
14 </div> 13 </div>
15 14
16 -<div align="center">  
17 -<img src="static/image/system_schematic.png" alt="banner" width="800">  
18 -</div>  
19 -  
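20 ## 📝 Project Overview 15 ## 📝 Project Overview
21 16
22 -**Weibo Public Opinion Multi-Agent Analysis System** is an innovative public opinion analysis platform built from scratch on a multi-agent collaboration architecture, aiming to provide accurate, real-time, and comprehensive Weibo public opinion monitoring and analysis. Five specialized AI agents work together to automate the full pipeline, from data collection and sentiment analysis to report generation 17 +“**微舆**” is an innovative multi-agent public opinion analysis system implemented from scratch, designed to be simple and general-purpose across all platforms
23 18
24 -### 🚀 Core Highlights 19 +See a research report the system generated for the example topic “Wuhan University public opinion”: [In-Depth Analysis Report on Wuhan University's Brand Reputation](./final_reports/final_report__20250827_131630.html)
25 20
26 -- **Multi-agent collaboration architecture**: five specialized agents each handle their own role and work together to complete the full public opinion analysis pipeline  
27 -- **Comprehensive data collection**: integrates multiple data sources, including a Weibo crawler, news search, and multimedia content  
28 -- **Deep sentiment analysis**: accurate multilingual sentiment recognition based on fine-tuned BERT/GPT-2/Qwen models  
29 -- **Intelligent report generation**: automatically generates structured HTML analysis reports, with support for custom templates  
30 -- **Agent forum**: ForumEngine gives agents a platform for sharing information and making decisions collaboratively  
31 -- **High-performance asynchronous processing**: handles multiple public opinion tasks concurrently, with real-time status monitoring  
32 -- **Cloud data support**: a convenient cloud database service with 100,000+ real data records per day 21 +The advantages go beyond report quality. Compared with similar products, we offer 🚀 six key advantages:
33 22
34 -## 🏗️ System Architecture 23 +1. **AI-driven, full-coverage monitoring**: an AI crawler cluster runs 24/7 and covers 10+ key social media platforms in China and abroad, including Weibo, Xiaohongshu, Douyin, and Kuaishou. It captures trending content in real time and also drills down into large volumes of user comments, so you hear the most authentic and broadest range of public voices.
35 24
36 -### Overall Architecture Diagram 25 +2. **A composite analysis engine that goes beyond LLMs**: in addition to the five types of specialized agents we designed, the system integrates middleware such as fine-tuned models and statistical models. Multiple models work together to give the analysis depth, accuracy, and multiple perspectives.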
37 26
38 -```mermaid  
39 -graph TB  
40 - subgraph "Frontend Presentation Layer"  
41 - UI[Web UI<br/>Flask + Streamlit]  
42 - end  
43 -  
44 - subgraph "Multi-Agent Collaboration Layer"  
45 - QE[QueryEngine<br/>News Search Agent]  
46 - ME[MediaEngine<br/>Multimedia Search Agent]  
47 - IE[InsightEngine<br/>Deep Insight Agent]  
48 - RE[ReportEngine<br/>Report Generation Agent]  
49 - Forum[ForumEngine<br/>Agent Forum Hub]  
50 - end  
51 -  
52 - subgraph "Data Processing Layer"  
53 - MS[MindSpider<br/>Weibo Crawler System]  
54 - SA[SentimentAnalysis<br/>Sentiment Analysis Model Collection]  
55 - DB[(MySQL<br/>Database)]  
56 - end  
57 -  
58 - subgraph "External Services Layer"  
59 - LLM[LLM API<br/>DeepSeek/Kimi/Gemini]  
60 - Search[Search API<br/>Tavily/Bocha]  
61 - end  
62 -  
63 - UI --> QE  
64 - UI --> ME  
65 - UI --> IE  
66 - UI --> RE  
67 -  
68 - QE --> Search  
69 - ME --> Search  
70 - IE --> MS  
71 - IE --> SA  
72 -  
73 - QE --> LLM  
74 - ME --> LLM  
75 - IE --> LLM  
76 - RE --> LLM  
77 -  
78 - MS --> DB  
79 - SA --> DB  
80 -  
81 - %% Agent forum communication mechanism  
82 - QE <--> Forum  
83 - ME <--> Forum  
84 - IE <--> Forum  
85 - RE <--> Forum  
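86 -``` 27 +3. **Strong multimodal capabilities**: goes beyond text and images to deeply parse short-video content from Douyin, Kuaishou, and similar platforms, and accurately extracts structured multimodal information cards (weather, calendar, stocks, and so on) from modern search engines, giving you a full picture of public opinion dynamics.
  28 +
  29 +4. **Agent “forum” collaboration mechanism**: each agent is given its own toolset and way of thinking, and the agents exchange and debate chains of thought through the “forum” mechanism. This avoids both the blind spots of a single model and the homogenization that communication can cause, and it produces higher-quality collective intelligence and decision support.
  30 +
  31 +5. **Seamless fusion of public and private data**: the platform analyzes public opinion data and also provides high-security interfaces that let you integrate your internal business databases with it. Breaking down data silos gives vertical businesses strong “external trends + internal insights” analysis capabilities.
87 32
88 -### Agent Collaboration Workflow 33 +6. **Lightweight, highly extensible framework**: a modular, pure-Python design enables lightweight, one-click deployment. The code structure is clear, so developers can easily integrate custom models and business logic to extend and deeply customize the platform.
89 34
90 -The system's core workflow is based on a multi-agent collaboration model: 35 +**It starts with public opinion, but does not stop there.** The goal of “微舆” is to become a simple, general-purpose data analysis engine that powers any business scenario.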
  36 +
  37 +<div align="center">
  38 +<img src="static/image/system_schematic.png" alt="banner" width="800">
  39 +
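  40 +Say goodbye to traditional data dashboards. In “微舆”, everything starts with a simple question: just state your analysis needs as you would in a conversation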
  41 +</div>
  42 +
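  43 +## 🏗️ System Architecture
  44 +
  45 +### Overall Architecture Diagram
91 46
92 -1. **QueryEngine (news query agent)**: searches authoritative news reports via the Tavily API to provide official information sources  
93 -2. **MediaEngine (multimedia search agent)**: performs multimodal content search via the Bocha API to gather social media viewpoints  
94 -3. **InsightEngine (deep insight agent)**: queries the local Weibo database and combines multiple sentiment analysis models for in-depth analysis  
95 -4. **ForumEngine (forum monitoring agent)**: monitors each agent's log output in real time, extracts key information, and facilitates collaboration  
96 -5. **ReportEngine (report generation agent)**: uses the Gemini LLM to generate a comprehensive HTML report from all agents' analysis results 47 +Still being drawn...
97 48
98 -### Project Code Structure 49 +### Project Code Structure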
99 50
100 ``` 51 ```
101 Weibo_PublicOpinion_AnalysisSystem/ 52 Weibo_PublicOpinion_AnalysisSystem/
102 -├── QueryEngine/ # News query engine agent 53 +├── QueryEngine/ # Agent for broad search of domestic and international news
103 │ ├── agent.py # Main agent logic 54 │ ├── agent.py # Main agent logic
104 │ ├── llms/ # LLM interface wrappers 55 │ ├── llms/ # LLM interface wrappers
105 │ ├── nodes/ # Processing nodes 56 │ ├── nodes/ # Processing nodes
106 │ ├── tools/ # Search tools 57 │ ├── tools/ # Search tools
107 -│ └── utils/ # Utility functions  
108 -├── MediaEngine/ # Multimedia search engine agent 58 +│ ├── utils/ # Utility functions
  59 +│ └── ... # Other modules
  60 +├── MediaEngine/ # Agent with strong multimodal understanding
109 │ ├── agent.py # Main agent logic 61 │ ├── agent.py # Main agent logic
110 │ ├── llms/ # LLM interfaces 62 │ ├── llms/ # LLM interfaces
111 │ ├── tools/ # Search tools 63 │ ├── tools/ # Search tools
112 │ └── ... # Other modules 64 │ └── ... # Other modules
113 -├── InsightEngine/ # Data insight engine agent 65 +├── InsightEngine/ # Agent for mining private databases
114 │ ├── agent.py # Main agent logic 66 │ ├── agent.py # Main agent logic
115 │ ├── llms/ # LLM interface wrappers 67 │ ├── llms/ # LLM interface wrappers
116 │ │ ├── deepseek.py # DeepSeek API 68 │ │ ├── deepseek.py # DeepSeek API
@@ -137,7 +89,7 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -137,7 +89,7 @@ Weibo_PublicOpinion_AnalysisSystem/
137 │ ├── __init__.py 89 │ ├── __init__.py
138 │ ├── config.py # Configuration management 90 │ ├── config.py # Configuration management
139 │ └── helpers.py # Helper functions 91 │ └── helpers.py # Helper functions
140 -├── ReportEngine/ # Report generation engine agent 92 +├── ReportEngine/ # Multi-round report generation agent
141 │ ├── agent.py # Main agent logic 93 │ ├── agent.py # Main agent logic
142 │ ├── llms/ # LLM interfaces 94 │ ├── llms/ # LLM interfaces
143 │ │ └── gemini.py # Gemini API only 95 │ │ └── gemini.py # Gemini API only
@@ -149,31 +101,33 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -149,31 +101,33 @@ Weibo_PublicOpinion_AnalysisSystem/
149 │ │ ├── 商业品牌舆情监测.md 101 │ │ ├── 商业品牌舆情监测.md
150 │ │ └── ... # More templates 102 │ │ └── ... # More templates
151 │ └── flask_interface.py # Flask API interface 103 │ └── flask_interface.py # Flask API interface
152 -├── ForumEngine/ # Forum engine agent 104 +├── ForumEngine/ # Simple implementation of the forum engine
153 │ └── monitor.py # Log monitoring and forum management 105 │ └── monitor.py # Log monitoring and forum management
154 ├── MindSpider/ # Weibo crawler system 106 ├── MindSpider/ # Weibo crawler system
155 │ ├── main.py # Crawler main program 107 │ ├── main.py # Crawler main program
156 │ ├── BroadTopicExtraction/ # Topic extraction module 108 │ ├── BroadTopicExtraction/ # Topic extraction module
157 │ │ ├── get_today_news.py # Fetches today's news 109 │ │ ├── get_today_news.py # Fetches today's news
158 │ │ └── topic_extractor.py # Topic extractor 110 │ │ └── topic_extractor.py # Topic extractor
159 -│ ├── DeepSentimentCrawling/ # Deep sentiment crawling 111 +│ ├── DeepSentimentCrawling/ # Deep public opinion crawling
160 │ │ ├── MediaCrawler/ # Media crawler core 112 │ │ ├── MediaCrawler/ # Media crawler core
161 │ │ └── platform_crawler.py # Platform crawler management 113 │ │ └── platform_crawler.py # Platform crawler management
162 │ └── schema/ # Database schema 114 │ └── schema/ # Database schema
163 │ └── init_database.py # Database initialization 115 │ └── init_database.py # Database initialization
164 ├── SentimentAnalysisModel/ # Collection of sentiment analysis models 116 ├── SentimentAnalysisModel/ # Collection of sentiment analysis models
165 │ ├── WeiboSentiment_Finetuned/ # Fine-tuned BERT/GPT-2 models 117 │ ├── WeiboSentiment_Finetuned/ # Fine-tuned BERT/GPT-2 models
166 -│ ├── WeiboMultilingualSentiment/ # Multilingual sentiment analysis  
167 -│ ├── WeiboSentiment_SmallQwen/ # Small Qwen model 118 +│ ├── WeiboMultilingualSentiment/# Multilingual sentiment analysis (recommended)
  119 +│ ├── WeiboSentiment_SmallQwen/ # Small-parameter Qwen3 fine-tune
168 │ └── WeiboSentiment_MachineLearning/ # Traditional machine learning methods 120 │ └── WeiboSentiment_MachineLearning/ # Traditional machine learning methods
169 ├── SingleEngineApp/ # Streamlit apps for individual agents 121 ├── SingleEngineApp/ # Streamlit apps for individual agents
170 │ ├── query_engine_streamlit_app.py 122 │ ├── query_engine_streamlit_app.py
171 │ ├── media_engine_streamlit_app.py 123 │ ├── media_engine_streamlit_app.py
172 │ └── insight_engine_streamlit_app.py 124 │ └── insight_engine_streamlit_app.py
173 ├── templates/ # Flask templates 125 ├── templates/ # Flask templates
174 -│ └── index.html # Main page template 126 +│ └── index.html # Main page frontend
175 ├── static/ # Static assets 127 ├── static/ # Static assets
176 ├── logs/ # Runtime log directory 128 ├── logs/ # Runtime log directory
  129 +├── final_reports/ # Final generated HTML report files
  130 +├── utils/ # Shared utility functions
177 ├── app.py # Flask main application entry point 131 ├── app.py # Flask main application entry point
178 ├── config.py # Global configuration file 132 ├── config.py # Global configuration file
179 └── requirements.txt # Python dependency list 133 └── requirements.txt # Python dependency list
@@ -183,18 +137,18 @@ Weibo_PublicOpinion_AnalysisSystem/ @@ -183,18 +137,18 @@ Weibo_PublicOpinion_AnalysisSystem/
183 137
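184 ### Environment Requirements 138 ### Environment Requirements
185 139
186 -- **Operating system**: Windows 10/11 (Linux/macOS also supported)  
187 -- **Python version**: 3.11+ 140 +- **Operating system**: Windows, Linux, macOS
  141 +- **Python version**: 3.9+
188 - **Conda**: Anaconda or Miniconda 142 - **Conda**: Anaconda or Miniconda
189 -- **Database**: MySQL 8.0+ (or use our cloud database service)  
190 -- **Memory**: 8 GB or more recommended 143 +- **Database**: MySQL (or use our cloud database service)
  144 +- **Memory**: 2 GB or more recommended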
191 145
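As a quick pre-flight check, the sketch below verifies that the running interpreter meets the Python 3.9+ requirement listed above. The helper name `meets_min_python` is hypothetical and is not part of the project.

```python
import sys

# Minimum version taken from the requirements list above (Python 3.9+).
MIN_PYTHON = (3, 9)


def meets_min_python(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (or current) interpreter version is >= minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_min_python():
        print("Python version OK:", sys.version.split()[0])
    else:
        print("Python %d.%d+ is required" % MIN_PYTHON)
```

Running it inside the activated conda environment confirms that the environment was created with a suitable interpreter.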
192 ### 1. Create the Conda Environment 146 ### 1. Create the Conda Environment
193 147
194 ```bash 148 ```bash
195 -# Create a conda environment named pytorch_python11  
196 -conda create -n pytorch_python11 python=3.11  
197 -conda activate pytorch_python11 149 +# Create the conda environment
  150 +conda create -n your_conda_name python=3.11
  151 +conda activate your_conda_name
198 ``` 152 ```
199 153
200 ### 2. Install Dependencies 154 ### 2. Install Dependencies
@@ -203,6 +157,7 @@ conda activate pytorch_python11 @@ -203,6 +157,7 @@ conda activate pytorch_python11
203 # Install base dependencies 157 # Install base dependencies
204 pip install -r requirements.txt 158 pip install -r requirements.txt
205 159
  160 +#======== Optional steps below ========
206 # Install PyTorch if you need local sentiment analysis 161 # Install PyTorch if you need local sentiment analysis
207 # CPU version 162 # CPU version
208 pip install torch torchvision torchaudio 163 pip install torch torchvision torchaudio
@@ -225,7 +180,7 @@ playwright install chromium @@ -225,7 +180,7 @@ playwright install chromium
225 180
226 #### 4.1 Configure API Keys 181 #### 4.1 Configure API Keys
227 182
228 -Edit `config.py` and fill in your API keys: 183 +Edit `config.py` and fill in your API keys (you can also choose your own models and search providers)
229 184
230 ```python 185 ```python
231 # MySQL database configuration 186 # MySQL database configuration
@@ -266,10 +221,9 @@ python schema/init_database.py @@ -266,10 +221,9 @@ python schema/init_database.py
266 221
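267 **Option 2: use the cloud database service (recommended)** 222 **Option 2: use the cloud database service (recommended)**
268 223
269 -We provide a convenient cloud database service with 100,000+ real Weibo data records per day, currently **free to apply for** during the promotion period 224 +We provide a convenient cloud database service with 100,000+ real public opinion data records per day, currently **free to apply for** during the promotion period
270 225
271 -- Real Weibo data, updated in real time  
272 -- Preprocessed, sentiment-labeled data 226 +- Real public opinion data, updated in real time
273 - Multi-dimensional tag classification 227 - Multi-dimensional tag classification
274 - Highly available cloud service 228 - Highly available cloud service
275 - Professional technical support 229 - Professional technical support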
@@ -282,12 +236,14 @@ python schema/init_database.py @@ -282,12 +236,14 @@ python schema/init_database.py
282 236
283 ```bash 237 ```bash
284 # From the project root, activate the conda environment 238 # From the project root, activate the conda environment
285 -conda activate pytorch_python11 239 +conda activate your_conda_name
286 240
287 -# Start the main application (automatically starts all agents) 241 +# Just start the main application
288 python app.py 242 python app.py
289 ``` 243 ```
290 244
  245 +> Note: data crawling must be run separately; see the guide in section 5.3
  246 +
291 Visit http://localhost:5000 to use the full system 247 Visit http://localhost:5000 to use the full system
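
To confirm that the main application has finished starting, a small probe like the one below can be used. This is a minimal sketch that assumes the default address `http://localhost:5000` mentioned above; `is_app_up` is a hypothetical helper, not part of the project.

```python
import urllib.error
import urllib.request


def is_app_up(url="http://localhost:5000", timeout=3.0):
    """Return True if an HTTP server answers at `url` with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except urllib.error.HTTPError:
        # The server answered, but with an error status (4xx/5xx).
        return False
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar problems.
        return False


if __name__ == "__main__":
    print("app reachable:", is_app_up())
```

If the probe reports `False` right after `python app.py`, waiting a few seconds and retrying is usually enough, since the server may still be initializing.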
292 248
293 #### 5.2 Start a Single Agent on Its Own 249 #### 5.2 Start a Single Agent on Its Own
@@ -305,6 +261,8 @@ streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501 @@ -305,6 +261,8 @@ streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501
305 261
306 #### 5.3 Use the Crawler System on Its Own 262 #### 5.3 Use the Crawler System on Its Own
307 263
  264 +This part has detailed configuration documentation: [MindSpider usage guide](./MindSpider/README.md)
  265 +
308 ```bash 266 ```bash
309 # Enter the crawler directory 267 # Enter the crawler directory
310 cd MindSpider 268 cd MindSpider
@@ -322,65 +280,13 @@ python main.py --broad-topic --date 2024-01-20 @@ -322,65 +280,13 @@ python main.py --broad-topic --date 2024-01-20
322 python main.py --deep-sentiment --platforms xhs dy wb 280 python main.py --deep-sentiment --platforms xhs dy wb
323 ``` 281 ```
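
For scheduled runs, the two crawler invocations shown above can be assembled programmatically. The sketch below only builds the argument lists for the `--broad-topic --date` and `--deep-sentiment --platforms` forms that appear in this section; the helper names are hypothetical, and other `main.py` options are not covered.

```python
import shlex
import sys
from datetime import date


def broad_topic_cmd(day=None):
    """Build argv for `python main.py --broad-topic --date YYYY-MM-DD`."""
    day = day or date.today()
    return [sys.executable, "main.py", "--broad-topic", "--date", day.isoformat()]


def deep_sentiment_cmd(platforms):
    """Build argv for `python main.py --deep-sentiment --platforms ...`."""
    if not platforms:
        raise ValueError("at least one platform code is required")
    return [sys.executable, "main.py", "--deep-sentiment", "--platforms", *platforms]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    print(shlex.join(broad_topic_cmd(date(2024, 1, 20))))
    print(shlex.join(deep_sentiment_cmd(["xhs", "dy", "wb"])))
```

Each list can be passed to `subprocess.run(..., cwd="MindSpider", check=True)` from the repository root, which mirrors the `cd MindSpider` step above.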
324 282
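325 -## 💾 Database Configuration  
326 -  
327 -### Local Database Configuration  
328 -  
329 -1. **Install MySQL 8.0+**  
330 -2. **Create the database**  
331 - ```sql  
332 - CREATE DATABASE weibo_analysis CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;  
333 - ```  
334 -3. **Run the initialization script**  
335 - ```bash  
336 - cd MindSpider  
337 - python schema/init_database.py  
338 - ```  
339 -  
340 -### Automatic Crawling Configuration  
341 -  
342 -Configure automatic crawling tasks to keep the data continuously updated:  
343 -  
344 -```python  
345 -# Configure crawler parameters in MindSpider/config.py  
346 -CRAWLER_CONFIG = {  
347 - 'max_pages': 200, # Maximum number of pages to crawl  
348 - 'delay': 1, # Request delay (seconds)  
349 - 'timeout': 30, # Timeout (seconds)  
350 - 'platforms': ['xhs', 'dy', 'wb', 'bili'], # Platforms to crawl  
351 - 'daily_keywords': 100, # Number of keywords per day  
352 - 'max_notes_per_keyword': 50, # Maximum items per keyword  
353 - 'use_proxy': False, # Whether to use a proxy  
354 -}  
355 -```  
356 -  
357 -### Cloud Database Service (Recommended)  
358 -  
359 -**Why choose our cloud database service?**  
360 -  
361 -- **Rich data sources**: 100,000+ real Weibo data records per day, covering trending topics across industries  
362 -- **High-quality labeling**: sentiment data manually labeled by a professional team, with 95%+ accuracy  
363 -- **Multi-dimensional analysis**: includes multi-dimensional tags such as topic classification, sentiment polarity, and influence score  
364 -- **Real-time updates**: round-the-clock data collection keeps the data timely  
365 -- **Technical support**: a professional team provides technical support and customization services  
366 -  
367 -**How to apply**  
368 -📧 Email: 670939375@qq.com  
369 -📝 Email subject: 申请微博舆情云数据库访问  
370 -📝 Email body: please describe your use case and expected data volume  
371 -  
372 -**Promotion-period benefits**  
373 -- Free access to the basic cloud database tier  
374 -- Free technical support and deployment guidance  
375 -- Early access to new features  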
376 -  
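377 ## ⚙️ Advanced Configuration 283 ## ⚙️ Advanced Configuration
378 284
379 ### Modify Key Parameters 285 ### Modify Key Parameters
380 286
381 #### Agent Configuration Parameters 287 #### Agent Configuration Parameters
382 288
383 -Each agent has its own configuration file, which you can adjust as needed: 289 +Each agent has its own configuration file, which you can adjust as needed. Some examples are shown below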
384 290
385 ```python 291 ```python
386 # QueryEngine/utils/config.py 292 # QueryEngine/utils/config.py
@@ -406,7 +312,7 @@ class Config: @@ -406,7 +312,7 @@ class Config:
406 ```python 312 ```python
407 # InsightEngine/tools/sentiment_analyzer.py 313 # InsightEngine/tools/sentiment_analyzer.py
408 SENTIMENT_CONFIG = { 314 SENTIMENT_CONFIG = {
409 - 'model_type': 'multilingual', # Options: 'bert', 'multilingual', 'qwen' 315 + 'model_type': 'multilingual', # Options: 'bert', 'multilingual', 'qwen'
410 'confidence_threshold': 0.8, # Confidence threshold 316 'confidence_threshold': 0.8, # Confidence threshold
411 'batch_size': 32, # Batch size 317 'batch_size': 32, # Batch size
412 'max_sequence_length': 512, # Maximum sequence length 318 'max_sequence_length': 512, # Maximum sequence length
@@ -420,7 +326,7 @@ SENTIMENT_CONFIG = { @@ -420,7 +326,7 @@ SENTIMENT_CONFIG = {
420 ```python 326 ```python
421 # Configure in each engine's utils/config.py 327 # Configure in each engine's utils/config.py
422 class Config: 328 class Config:
423 - default_llm_provider = "deepseek" # Options: "deepseek", "openai", "kimi", "gemini" 329 + default_llm_provider = "deepseek" # Options: "deepseek", "openai", "kimi", "gemini", "qwen", etc.
424 330
425 # DeepSeek configuration 331 # DeepSeek configuration
426 deepseek_api_key = "your_api_key" 332 deepseek_api_key = "your_api_key"
@@ -444,40 +350,40 @@ class Config: @@ -444,40 +350,40 @@ class Config:
444 350
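445 The system integrates several sentiment analysis methods; choose one based on your needs: 351 The system integrates several sentiment analysis methods; choose one based on your needs:
446 352
447 -#### 1. BERT-based fine-tuned model (highest accuracy) 353 +#### 1. Multilingual sentiment analysis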
448 354
449 ```bash 355 ```bash
450 -# Use the Chinese BERT model  
451 -cd SentimentAnalysisModel/WeiboSentiment_Finetuned/BertChinese-Lora  
452 -python predict.py --text "这个产品真的很不错" 356 +cd SentimentAnalysisModel/WeiboMultilingualSentiment
  357 +python predict.py --text "This product is amazing!" --lang "en"
453 ``` 358 ```
454 359
455 -#### 2. GPT-2 LoRA fine-tuned model (faster) 360 +#### 2. Small-parameter Qwen3 fine-tune
456 361
457 ```bash 362 ```bash
458 -cd SentimentAnalysisModel/WeiboSentiment_Finetuned/GPT2-Lora  
459 -python predict.py --text "今天心情不太好" 363 +cd SentimentAnalysisModel/WeiboSentiment_SmallQwen
  364 +python predict_universal.py --text "这次活动办得很成功"
460 ``` 365 ```
461 366
462 -#### 3. Small Qwen model (balanced) 367 +#### 3. BERT-based fine-tuned model
463 368
464 ```bash 369 ```bash
465 -cd SentimentAnalysisModel/WeiboSentiment_SmallQwen  
466 -python predict_universal.py --text "这次活动办得很成功" 370 +# Use the Chinese BERT model
  371 +cd SentimentAnalysisModel/WeiboSentiment_Finetuned/BertChinese-Lora
  372 +python predict.py --text "这个产品真的很不错"
467 ``` 373 ```
468 374
469 -#### 4. Traditional machine learning methods (lightweight) 375 +#### 4. GPT-2 LoRA fine-tuned model
470 376
471 ```bash 377 ```bash
472 -cd SentimentAnalysisModel/WeiboSentiment_MachineLearning  
473 -python predict.py --model_type "svm" --text "服务态度需要改进" 378 +cd SentimentAnalysisModel/WeiboSentiment_Finetuned/GPT2-Lora
  379 +python predict.py --text "今天心情不太好"
474 ``` 380 ```
475 381
476 -#### 5. Multilingual sentiment analysis (supports 22 languages) 382 +#### 5. Traditional machine learning methods
477 383
478 ```bash 384 ```bash
479 -cd SentimentAnalysisModel/WeiboMultilingualSentiment  
480 -python predict.py --text "This product is amazing!" --lang "en" 385 +cd SentimentAnalysisModel/WeiboSentiment_MachineLearning
  386 +python predict.py --model_type "svm" --text "服务态度需要改进"
481 ``` 387 ```
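
The five commands above differ only in working directory, script name, and a few fixed flags, so they can be described by a small lookup table. The sketch below is one way to do that; the dictionary keys (`"multilingual"`, `"qwen"`, and so on) and the helper `build_predict_cmd` are hypothetical names chosen for illustration, and it assumes the scripts accept their flags in any order.

```python
import sys
from pathlib import Path

MODEL_ROOT = Path("SentimentAnalysisModel")

# (subdirectory, script, fixed extra arguments), copied from the commands above.
MODELS = {
    "multilingual": ("WeiboMultilingualSentiment", "predict.py", ["--lang", "en"]),
    "qwen": ("WeiboSentiment_SmallQwen", "predict_universal.py", []),
    "bert": ("WeiboSentiment_Finetuned/BertChinese-Lora", "predict.py", []),
    "gpt2": ("WeiboSentiment_Finetuned/GPT2-Lora", "predict.py", []),
    "ml": ("WeiboSentiment_MachineLearning", "predict.py", ["--model_type", "svm"]),
}


def build_predict_cmd(model, text):
    """Return (cwd, argv) for running the chosen model's predict script on `text`."""
    try:
        subdir, script, extra = MODELS[model]
    except KeyError:
        raise ValueError(f"unknown model {model!r}; choose from {sorted(MODELS)}") from None
    argv = [sys.executable, script, *extra, "--text", text]
    return MODEL_ROOT / subdir, argv


if __name__ == "__main__":
    cwd, argv = build_predict_cmd("qwen", "这次活动办得很成功")
    print(cwd, argv[1:])
```

The returned pair maps directly onto `subprocess.run(argv, cwd=cwd)`, which keeps the per-model `cd` step out of calling code.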
482 388
483 ### Connect a Custom Business Database 389 ### Connect a Custom Business Database
@@ -538,45 +444,13 @@ class DeepSearchAgent: @@ -538,45 +444,13 @@ class DeepSearchAgent:
538 444
539 ### Custom Report Templates 445 ### Custom Report Templates
540 446
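541 -#### 1. Create a template file  
542 -  
543 -Create a new Markdown template in the `ReportEngine/report_template/` directory:  
544 -  
545 -```markdown  
546 -<!-- 企业品牌监测报告.md -->  
547 -# Corporate Brand Public Opinion Monitoring Report  
548 -  
549 -## 📊 Executive Summary  
550 -{executive_summary}  
551 -  
552 -## 🔍 Brand Mention Analysis  
553 -### Mention Volume Trend  
554 -{mention_trend}  
555 -  
556 -### Sentiment Distribution  
557 -{sentiment_distribution}  
558 -  
559 -## 📈 Competitor Comparison  
560 -{competitor_analysis}  
561 -  
562 -## 🎯 Key Insights Summary  
563 -{key_insights} 447 +#### 1. Upload in the web interface
564 448
565 -## ⚠️ Risk Alerts  
566 -{risk_alerts} 449 +The system supports uploading custom template files (.md or .txt), which you can select when generating a report.
567 450
568 -## 📋 Recommendations  
569 -{recommendations} 451 +#### 2. Create a template file
570 452
571 ----  
572 -*Report type: corporate brand public opinion monitoring*  
573 -*Generated at: {generation_time}*  
574 -*Data sources: {data_sources}*  
575 -```  
576 -  
577 -#### 2. Use in the web interface  
578 -  
579 -The system supports uploading custom template files (.md or .txt), which you can select when generating a report. 453 +Create a new template in the `ReportEngine/report_template/` directory; our agent automatically selects the most suitable template.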
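
A template file can also be generated from a script. The sketch below writes a plain Markdown outline into the template directory and accepts only the `.md` and `.txt` extensions named above. The outline layout (a title followed by one heading per section) is an assumption for illustration, since the exact structure the agent expects is not specified here, and `save_template` is a hypothetical helper.

```python
from pathlib import Path

TEMPLATE_DIR = Path("ReportEngine/report_template")
ALLOWED_SUFFIXES = {".md", ".txt"}  # formats named in the upload instructions


def save_template(filename, title, sections, template_dir=TEMPLATE_DIR):
    """Write a Markdown outline (title plus one heading per section); return its path."""
    path = Path(template_dir) / filename
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"template must end in one of {sorted(ALLOWED_SUFFIXES)}")
    lines = [f"# {title}", ""]
    for section in sections:
        lines += [f"## {section}", ""]
    # Create the directory if needed, then write the outline as UTF-8.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

For example, `save_template("brand_monitoring.md", "Brand Monitoring Report", ["Executive Summary", "Risk Alerts"])` creates a two-section outline that can then be edited by hand.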
580 454
581 ## 🤝 Contributing 455 ## 🤝 Contributing
582 456
@@ -590,15 +464,6 @@ class DeepSearchAgent: @@ -590,15 +464,6 @@ class DeepSearchAgent:
590 4. **Push to the branch**: `git push origin feature/AmazingFeature` 464 4. **Push to the branch**: `git push origin feature/AmazingFeature`
591 5. **Open a Pull Request** 465 5. **Open a Pull Request**
592 466
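593 -### Contribution Types  
594 -  
595 -- 🐛 Bug fixes  
596 -- ✨ New features  
597 -- 📚 Documentation improvements  
598 -- 🎨 UI/UX improvements  
599 -- ⚡ Performance optimizations  
600 -- 🧪 Additional test cases  
601 -  
602 ### Development Guidelines 467 ### Development Guidelines
603 468
604 - Code follows PEP 8 469 - Code follows PEP 8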
@@ -608,7 +473,7 @@ class DeepSearchAgent: @@ -608,7 +473,7 @@ class DeepSearchAgent:
608 473
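609 ## 📄 License 474 ## 📄 License
610 475
611 -This project is licensed under the [MIT License](LICENSE). See the LICENSE file for details. 476 +This project is licensed under the [GPL-2.0 License](LICENSE). See the LICENSE file for details.
612 477
613 ## 🎉 Support and Contact 478 ## 🎉 Support and Contact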
614 479
@@ -621,8 +486,6 @@ class DeepSearchAgent: @@ -621,8 +486,6 @@ class DeepSearchAgent:
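621 ### Contact 486 ### Contact
622 487
623 - 📧 **Email**: 670939375@qq.com 488 - 📧 **Email**: 670939375@qq.com
624 -- 💬 **QQ group**: [Join the technical discussion group]  
625 -- 🐦 **WeChat**: [Scan the QR code to add technical support]  
626 489
627 ### Business Cooperation 490 ### Business Cooperation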
628 491
@@ -635,7 +498,7 @@ class DeepSearchAgent: @@ -635,7 +498,7 @@ class DeepSearchAgent:
635 498
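636 **Apply for the free cloud database service** 499 **Apply for the free cloud database service**
637 📧 Send an email to: 670939375@qq.com 500 📧 Send an email to: 670939375@qq.com
638 -📝 Subject: 微博舆情云数据库申请 501 +📝 Subject: 微云数据库申请
639 📝 Description: your use case and requirements 502 📝 Description: your use case and requirements
640 503
641 ## 👥 Contributors 504 ## 👥 Contributors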
@@ -650,6 +513,4 @@ class DeepSearchAgent: @@ -650,6 +513,4 @@ class DeepSearchAgent:
650 513
651 **⭐ If this project helps you, please give us a star!** 514 **⭐ If this project helps you, please give us a star!**
652 515
653 -Made with ❤️ by [Weibo Public Opinion Analysis Team](https://github.com/666ghj)  
654 -  
655 </div> 516 </div>