luojiyin

doc: add docker usecase

@@ -280,9 +280,168 @@ We provide convenient cloud database service with 100,000+ daily real public opi @@ -280,9 +280,168 @@ We provide convenient cloud database service with 100,000+ daily real public opi
280 280
281 > To conduct a data compliance review and service upgrade, we are suspending new applications for the cloud database, effective October 1, 2025. 281 > To conduct a data compliance review and service upgrade, we are suspending new applications for the cloud database, effective October 1, 2025.
282 282
283 -### 5. Launch System 283 +### 5. Docker Deployment (Recommended)
284 284
285 -#### 5.1 Complete System Launch (Recommended) 285 +The project provides complete Docker support, including application and database services, for easy deployment and environment isolation.
  286 +
  287 +#### 5.1 Docker Requirements
  288 +
  289 +- **Docker**: 20.10+
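  290 +- **Docker Compose**: 2.0+ (when Compose V2 is installed as a Docker CLI plugin, run `docker compose` in place of `docker-compose`)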
  291 +- **Available Memory**: 4GB+ recommended
  292 +- **Available Disk Space**: 10GB+ recommended
  293 +
  294 +#### 5.2 Docker Quick Start
  295 +
  296 +1. **Clone project and enter directory**
  297 +```bash
  298 +git clone https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem.git
  299 +cd Weibo_PublicOpinion_AnalysisSystem
  300 +```
  301 +
  302 +2. **Configure environment variables**
  303 +```bash
  304 +# Copy environment variable template
  305 +cp .env.example .env
  306 +
  307 +# Edit the .env file and fill in the required settings
  308 +vim .env
  309 +```
  310 +
  311 +**Important environment variable configuration**:
  312 +```bash
  313 +# LLM API configuration (required)
  314 +INSIGHT_ENGINE_API_KEY="your_api_key"
  315 +INSIGHT_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  316 +INSIGHT_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  317 +
  318 +# Media Agent configuration
  319 +MEDIA_ENGINE_API_KEY="your_api_key"
  320 +MEDIA_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  321 +MEDIA_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  322 +
  323 +# Query Agent configuration
  324 +QUERY_ENGINE_API_KEY="your_api_key"
  325 +QUERY_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  326 +QUERY_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  327 +
  328 +# Report Agent configuration
  329 +REPORT_ENGINE_API_KEY="your_api_key"
  330 +REPORT_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  331 +REPORT_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  332 +
  333 +# Database configuration (using built-in Docker PostgreSQL)
  334 +POSTGRES_USER=bettafish
  335 +POSTGRES_PASSWORD=bettafish
  336 +POSTGRES_DB=bettafish
  337 +POSTGRES_PORT=5444
  338 +```
  339 +
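A quick pre-flight check can catch an unfilled `.env` before any container starts. The sketch below is not part of the project: `missing_env_keys` is a hypothetical helper, and the list of keys is an assumption taken from the configuration block above (the `*_BASE_URL` and `*_MODEL_NAME` entries are left out for brevity).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the project): print every key that is
# missing, empty, or still set to the "your_api_key" placeholder.
# The key list is an assumption based on the README's configuration block.
missing_env_keys() {
  local env_file="$1" key value
  for key in INSIGHT_ENGINE_API_KEY MEDIA_ENGINE_API_KEY \
             QUERY_ENGINE_API_KEY REPORT_ENGINE_API_KEY \
             POSTGRES_USER POSTGRES_PASSWORD POSTGRES_DB POSTGRES_PORT; do
    # Take the last KEY=value line for this key and strip double quotes.
    value=$(grep -E "^${key}=" "$env_file" 2>/dev/null | tail -n 1 | cut -d= -f2- | tr -d '"')
    if [ -z "$value" ] || [ "$value" = "your_api_key" ]; then
      echo "$key"
    fi
  done
}
```

Running `missing_env_keys .env` prints one key per line; empty output would suggest that every listed key has a real value.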
  340 +3. **Start Docker services**
  341 +```bash
  342 +# Build and start all services
  343 +docker-compose up -d
  344 +
  345 +# Check service status
  346 +docker-compose ps
  347 +
  348 +# View logs
  349 +docker-compose logs -f bettafish
  350 +```
  351 +
  352 +4. **Access applications**
  353 +- **Main Application**: http://localhost:5000
  354 +- **Insight Engine**: http://localhost:8501
  355 +- **Media Engine**: http://localhost:8502
  356 +- **Query Engine**: http://localhost:8503
  357 +
  358 +#### 5.3 Docker Management Commands
  359 +
  360 +```bash
  361 +# Start all services
  362 +docker-compose up -d
  363 +
  364 +# Stop all services
  365 +docker-compose down
  366 +
  367 +# Stop services and remove named and anonymous volumes (use with caution; bind-mounted host directories are not removed)
  368 +docker-compose down -v
  369 +
  370 +# Rebuild and start
  371 +docker-compose up --build -d
  372 +
  373 +# View real-time logs
  374 +docker-compose logs -f
  375 +
  376 +# View specific service logs
  377 +docker-compose logs -f bettafish
  378 +docker-compose logs -f db
  379 +
  380 +# Enter container
  381 +docker-compose exec bettafish bash
  382 +
  383 +# Backup database (-T disables TTY allocation so the redirected dump is written unmodified)
  384 +docker-compose exec -T db pg_dump -U bettafish bettafish > backup.sql
  385 +
  386 +# Restore database
  387 +docker-compose exec -T db psql -U bettafish bettafish < backup.sql
  388 +```
  389 +
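Because the backup command above always writes to `backup.sql`, each run overwrites the previous dump. One possible way to keep several dumps is a timestamped file name. This is only a sketch: `backup_name` is a hypothetical helper, and the database name `bettafish` is taken from the `.env` example. The commented usage line adds `-T`, the same flag the restore command uses, on the assumption that disabling TTY allocation is also desirable when redirecting a dump.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a dump file name of the form <db>_YYYYMMDD_HHMMSS.sql.
backup_name() {
  local db="${1:-bettafish}"
  printf '%s_%s.sql\n' "$db" "$(date +%Y%m%d_%H%M%S)"
}

# Example (needs the running Compose stack, so it is left commented out):
# docker-compose exec -T db pg_dump -U bettafish bettafish > "$(backup_name bettafish)"
```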
  390 +#### 5.4 Docker Data Persistence
  391 +
  392 +The project configures the following data volumes:
  393 +- `./logs`: Application log files
  394 +- `./final_reports`: Generated analysis reports
  395 +- `./insight_engine_streamlit_reports`: Insight Engine reports
  396 +- `./media_engine_streamlit_reports`: Media Engine reports
  397 +- `./query_engine_streamlit_reports`: Query Engine reports
  398 +- `./db_data`: PostgreSQL database data
  399 +
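If the paths above are host bind mounts (an assumption based on their `./` prefix; the Compose file itself is not shown here), creating them before the first start may help avoid directories owned by root, which is one source of the permission issues covered in the troubleshooting section. A minimal sketch, with `ensure_data_dirs` as a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: create the host directories listed in section 5.4
# under a given project root (defaults to the current directory).
ensure_data_dirs() {
  local root="${1:-.}" dir
  for dir in logs final_reports insight_engine_streamlit_reports \
             media_engine_streamlit_reports query_engine_streamlit_reports db_data; do
    mkdir -p "$root/$dir"
  done
}
```

Calling `ensure_data_dirs` from the project root before `docker-compose up -d` would leave the directories owned by the current user. The database container may still change the ownership of `db_data` on its own.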
  400 +#### 5.5 Docker Troubleshooting
  401 +
  402 +**Common issues and solutions**:
  403 +
  404 +1. **Port conflicts**
  405 +```bash
  406 +# Check port usage
  407 +netstat -tulpn | grep :5000
  408 +# Or modify port mapping in docker-compose.yml
  409 +```
  410 +
  411 +2. **Insufficient memory**
  412 +```bash
  413 +# Increase the memory available to Docker (4GB+ recommended, see 5.1)
  414 +# In Docker Desktop, adjust the limit under Settings > Resources
  415 +```
  416 +
  417 +3. **Permission issues**
  418 +```bash
  419 +# Ensure scripts have execute permissions
  420 +chmod +x scripts/*.sh
  421 +
  422 +# Ensure data directory permissions are correct
  423 +sudo chown -R $USER:$USER ./
  424 +```
  425 +
  426 +4. **Build failures**
  427 +```bash
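  428 +# Clear Docker cache and rebuild (caution: prune -a removes ALL unused images and stopped containers on this host, not only this project's)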
  429 +docker system prune -a
  430 +docker-compose build --no-cache
  431 +```
  432 +
  433 +5. **Service won't start**
  434 +```bash
  435 +# Check logs to troubleshoot
  436 +docker-compose logs bettafish
  437 +
  438 +# Validate the Compose file and print it with environment variables resolved
  439 +docker-compose config
  440 +```
  441 +
  442 +### 6. Traditional Deployment
  443 +
  444 +#### 6.1 Complete System Launch
286 445
287 ```bash 446 ```bash
288 # In project root directory, activate conda environment 447 # In project root directory, activate conda environment
@@ -303,13 +462,13 @@ python app.py @@ -303,13 +462,13 @@ python app.py
303 462
304 > Note 1: After a run is terminated, the Streamlit app might not shut down correctly and may still be occupying the port. If this occurs, find the process that is holding the port and kill it. 463 > Note 1: After a run is terminated, the Streamlit app might not shut down correctly and may still be occupying the port. If this occurs, find the process that is holding the port and kill it.
305 464
306 -> Note 2: Data scraping needs to be performed as a separate operation. Please refer to the instructions in section 5.3. 465 +> Note 2: Data scraping needs to be performed as a separate operation. Please refer to the instructions in section 6.3.
307 466
308 > Note 3: If page display issues occur during remote server deployment, see [PR#45](https://github.com/666ghj/BettaFish/pull/45) 467 > Note 3: If page display issues occur during remote server deployment, see [PR#45](https://github.com/666ghj/BettaFish/pull/45)
309 468
310 Visit http://localhost:5000 to use the complete system 469 Visit http://localhost:5000 to use the complete system
311 470
312 -#### 5.2 Launch Individual Agents 471 +#### 6.2 Launch Individual Agents
313 472
314 ```bash 473 ```bash
315 # Start QueryEngine 474 # Start QueryEngine
@@ -322,7 +481,7 @@ streamlit run SingleEngineApp/media_engine_streamlit_app.py --server.port 8502 @@ -322,7 +481,7 @@ streamlit run SingleEngineApp/media_engine_streamlit_app.py --server.port 8502
322 streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501 481 streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501
323 ``` 482 ```
324 483
325 -#### 5.3 Crawler System Standalone Use 484 +#### 6.3 Crawler System Standalone Use
326 485
327 This section has detailed configuration documentation: [MindSpider Usage Guide](./MindSpider/README.md) 486 This section has detailed configuration documentation: [MindSpider Usage Guide](./MindSpider/README.md)
328 487
@@ -294,9 +294,168 @@ python main.py --setup @@ -294,9 +294,168 @@ python main.py --setup
294 294
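295 > To conduct a data compliance review and service upgrade, we are suspending new applications for the cloud database, effective October 1, 2025. 295 > To conduct a data compliance review and service upgrade, we are suspending new applications for the cloud database, effective October 1, 2025.
296 296
297 -### 5. Launch System 297 +### 5. Docker Deployment (Recommended)
298 298
299 -#### 5.1 Complete System Launch (Recommended) 299 +The project provides complete Docker support, including application and database services, for fast deployment and environment isolation.
  300 +
  301 +#### 5.1 Docker Requirements
  302 +
  303 +- **Docker**: 20.10+
  304 +- **Docker Compose**: 2.0+
  305 +- **Available Memory**: 4GB+ recommended
  306 +- **Available Disk Space**: 10GB+ recommended
  307 +
  308 +#### 5.2 Docker Quick Start
  309 +
  310 +1. **Clone project and enter directory**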
  311 +```bash
  312 +git clone https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem.git
  313 +cd Weibo_PublicOpinion_AnalysisSystem
  314 +```
  315 +
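  316 +2. **Configure environment variables**
  317 +```bash
  318 +# Copy environment variable template
  319 +cp .env.example .env
  320 +
  321 +# Edit the .env file and fill in the required settings
  322 +vim .env
  323 +```
  324 +
  325 +**Important environment variable configuration**
  326 +```bash
  327 +# LLM API configuration (required)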
  328 +INSIGHT_ENGINE_API_KEY="your_api_key"
  329 +INSIGHT_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  330 +INSIGHT_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  331 +
  332 +# Media Agent configuration
  333 +MEDIA_ENGINE_API_KEY="your_api_key"
  334 +MEDIA_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  335 +MEDIA_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  336 +
  337 +# Query Agent configuration
  338 +QUERY_ENGINE_API_KEY="your_api_key"
  339 +QUERY_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  340 +QUERY_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  341 +
  342 +# Report Agent configuration
  343 +REPORT_ENGINE_API_KEY="your_api_key"
  344 +REPORT_ENGINE_BASE_URL="https://api.moonshot.cn/v1"
  345 +REPORT_ENGINE_MODEL_NAME="kimi-k2-0711-preview"
  346 +
  347 +# Database configuration (using built-in Docker PostgreSQL)
  348 +POSTGRES_USER=bettafish
  349 +POSTGRES_PASSWORD=bettafish
  350 +POSTGRES_DB=bettafish
  351 +POSTGRES_PORT=5444
  352 +```
  353 +
  354 +3. **Start Docker services**
  355 +```bash
  356 +# Build and start all services
  357 +docker-compose up -d
  358 +
  359 +# Check service status
  360 +docker-compose ps
  361 +
  362 +# View logs
  363 +docker-compose logs -f bettafish
  364 +```
  365 +
  366 +4. **Access applications**
  367 +- **Main Application**: http://localhost:5000
  368 +- **Insight Engine**: http://localhost:8501
  369 +- **Media Engine**: http://localhost:8502
  370 +- **Query Engine**: http://localhost:8503
  371 +
  372 +#### 5.3 Docker Management Commands
  373 +
  374 +```bash
  375 +# Start all services
  376 +docker-compose up -d
  377 +
  378 +# Stop all services
  379 +docker-compose down
  380 +
  381 +# Stop services and remove named and anonymous volumes (use with caution; bind-mounted host directories are not removed)
  382 +docker-compose down -v
  383 +
  384 +# Rebuild and start
  385 +docker-compose up --build -d
  386 +
  387 +# View real-time logs
  388 +docker-compose logs -f
  389 +
  390 +# View specific service logs
  391 +docker-compose logs -f bettafish
  392 +docker-compose logs -f db
  393 +
  394 +# Enter container
  395 +docker-compose exec bettafish bash
  396 +
  397 +# Backup database (-T disables TTY allocation so the redirected dump is written unmodified)
  398 +docker-compose exec -T db pg_dump -U bettafish bettafish > backup.sql
  399 +
  400 +# Restore database
  401 +docker-compose exec -T db psql -U bettafish bettafish < backup.sql
  402 +```
  403 +
  404 +#### 5.4 Docker Data Persistence
  405 +
  406 +The project configures the following data volumes:
  407 +- `./logs`: Application log files
  408 +- `./final_reports`: Generated analysis reports
  409 +- `./insight_engine_streamlit_reports`: Insight Engine reports
  410 +- `./media_engine_streamlit_reports`: Media Engine reports
  411 +- `./query_engine_streamlit_reports`: Query Engine reports
  412 +- `./db_data`: PostgreSQL database data
  413 +
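  414 +#### 5.5 Docker Troubleshooting
  415 +
  416 +**Common issues and solutions**
  417 +
  418 +1. **Port conflicts**
  419 +```bash
  420 +# Check port usage
  421 +netstat -tulpn | grep :5000
  422 +# Or modify port mapping in docker-compose.yml
  423 +```
  424 +
  425 +2. **Insufficient memory**
  426 +```bash
  427 +# Increase Docker memory limits
  428 +# Adjust resource allocation in Docker Desktop
  429 +```
  430 +
  431 +3. **Permission issues**
  432 +```bash
  433 +# Ensure scripts have execute permissions
  434 +chmod +x scripts/*.sh
  435 +
  436 +# Ensure data directory permissions are correct
  437 +sudo chown -R $USER:$USER ./
  438 +```
  439 +
  440 +4. **Build failures**
  441 +```bash
  442 +# Clear Docker cache and rebuild (caution: prune -a removes ALL unused images and stopped containers on this host, not only this project's)
  443 +docker system prune -a
  444 +docker-compose build --no-cache
  445 +```
  446 +
  447 +5. **Service won't start**
  448 +```bash
  449 +# Check logs to troubleshoot
  450 +docker-compose logs bettafish
  451 +
  452 +# Check environment variable configuration
  453 +docker-compose config
  454 +```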
  455 +
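  456 +### 6. Traditional Deployment
  457 +
  458 +#### 6.1 Complete System Launch
300 459
301 ```bash 460 ```bash
302 # In project root directory, activate conda environment 461 # In project root directory, activate conda environment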
@@ -306,7 +465,7 @@ conda activate your_conda_name @@ -306,7 +465,7 @@ conda activate your_conda_name
306 python app.py 465 python app.py
307 ``` 466 ```
308 467
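309 -Launch command for the uv version 468 +Launch command for the uv version
310 ```bash 469 ```bash
311 # In project root directory, activate uv environment 470 # In project root directory, activate uv environment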
312 .venv\Scripts\activate 471 .venv\Scripts\activate
@@ -317,13 +476,13 @@ python app.py @@ -317,13 +476,13 @@ python app.py
317 476
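318 > Note 1: After a run is terminated, the Streamlit app might not shut down correctly and may still be occupying the port. If this occurs, find the process that is holding the port and kill it. 477 > Note 1: After a run is terminated, the Streamlit app might not shut down correctly and may still be occupying the port. If this occurs, find the process that is holding the port and kill it.
319 478
320 -> Note 2: Data scraping needs to be performed as a separate operation. Please refer to the instructions in section 5.3. 479 +> Note 2: Data scraping needs to be performed as a separate operation. Please refer to the instructions in section 6.3.
321 480
322 > Note 3: If page display issues occur during remote server deployment, see [PR#45](https://github.com/666ghj/BettaFish/pull/45) 481 > Note 3: If page display issues occur during remote server deployment, see [PR#45](https://github.com/666ghj/BettaFish/pull/45)
323 482
324 Visit http://localhost:5000 to use the complete system 483 Visit http://localhost:5000 to use the complete system
325 484
326 -#### 5.2 Launch Individual Agents 485 +#### 6.2 Launch Individual Agents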
327 486
328 ```bash 487 ```bash
329 # Start QueryEngine 488 # Start QueryEngine
@@ -336,7 +495,7 @@ streamlit run SingleEngineApp/media_engine_streamlit_app.py --server.port 8502 @@ -336,7 +495,7 @@ streamlit run SingleEngineApp/media_engine_streamlit_app.py --server.port 8502
336 streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501 495 streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501
337 ``` 496 ```
338 497
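339 -#### 5.3 Crawler System Standalone Use 498 +#### 6.3 Crawler System Standalone Use
340 499
341 This section has detailed configuration documentation: [MindSpider Usage Guide](./MindSpider/README.md) 500 This section has detailed configuration documentation: [MindSpider Usage Guide](./MindSpider/README.md)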
342 501