Committed by GitHub
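Feature: rewrite the Xiaohongshu Skills as a full CDP-based Python implementation (#1)

## Main changes

### Core module rewrite
- Create the scripts/xhs/ package with 18 specialized modules (3728 lines of code)
- Full implementation based on the xiaohongshu-mcp Go source
- Direct CDP WebSocket communication, replacing the third-party library dependency

### Module list
- cdp.py: Browser/Page/Element classes, full CDP protocol implementation
- stealth.py: anti-detection JS injection + Chrome launch flags
- login.py: login check and QR-code login (the QR code is saved to a temporary file for the agent to display)
- publish.py: full image-and-text publishing flow
- publish_video.py: full video publishing flow
- search.py: search and content filtering
- feed_detail.py: note detail and comment loading
- comment.py: comments and replies
- like_favorite.py: likes and favorites
- user_profile.py: user profile page
- cookies.py: cookie persistence
- types.py: complete dataclass-based data type system
- errors.py: custom exception hierarchy
- human.py: human-behavior simulation (delays, scrolling)
- selectors.py: CSS selector constants
- urls.py: URL builder functions

### Unified CLI
- scripts/cli.py: 13 subcommands, fully compatible with the xiaohongshu-mcp MCP tools
- check-login: check login status
- login: get the login QR code
- switch-account/delete-cookies: account switching
- publish-content: image-and-text publishing
- publish-with-video: video publishing
- list-feeds: feed list
- search-feeds: feed search
- get-feed-detail: note detail
- user-profile: user profile page
- post-comment: post a comment
- like-feed: like a note
- favorite-feed: favorite a note

### Supporting scripts rewritten
- chrome_launcher.py: Chrome process management (cross-platform)
- account_manager.py: multi-account profile isolation
- image_downloader.py: image/video download (SHA256 cache)
- title_utils.py: UTF-16 title length calculation
- run_lock.py: single-instance lock
- publish_pipeline.py: publish-flow orchestration CLI

### Documentation and configuration
- SKILL.md: unified skill entry point (routes to 5 sub-skills)
- skills/xhs-auth/SKILL.md: authentication skill
- skills/xhs-publish/SKILL.md: content publishing skill (image-and-text + video)
- skills/xhs-explore/SKILL.md: content discovery and analysis skill
- skills/xhs-interact/SKILL.md: social interaction skill (comment/like/favorite)
- skills/xhs-content-ops/SKILL.md: composite content-operations workflow skill
- CLAUDE.md: project development guide
- PROMPT.md: Ralph Loop driver file
- pyproject.toml: uv project configuration (uv.lock)
- README.md: full project documentation

### Tech stack
- Python 3.11+ with uv for package management
- requests + websockets: CDP WebSocket communication
- Code style: ruff lint + format

## Mapping
All 13 subcommands map one-to-one to the xiaohongshu-mcp MCP tools
Can be called directly from the OpenClaw agent framework

## Groundwork
- Create the scripts/xhs/ package layout
- Implement the CDP WebSocket protocol
- Complete type system and error handling
- CLI subcommand system

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>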
Showing 37 changed files with 5947 additions and 1 deletion
| @@ -205,3 +205,15 @@ cython_debug/ | @@ -205,3 +205,15 @@ cython_debug/ | ||
| 205 | marimo/_static/ | 205 | marimo/_static/ |
| 206 | marimo/_lsp/ | 206 | marimo/_lsp/ |
| 207 | __marimo__/ | 207 | __marimo__/ |
| 208 | + | ||
| 209 | +# Project specific | ||
| 210 | +tmp/ | ||
| 211 | +*.txt | ||
| 212 | +!requirements.txt | ||
| 213 | +config/accounts.json | ||
| 214 | +title.txt | ||
| 215 | +content.txt | ||
| 216 | +comment.txt | ||
| 217 | + | ||
| 218 | +# Ralph Loop state | ||
| 219 | +.claude/.ralph-loop.local.md |
CLAUDE.md
0 → 100644
| 1 | +# xiaohongshu-skills | ||
| 2 | + | ||
| 3 | +Xiaohongshu automation Skills for Claude Code, built on a Python CDP browser-automation engine. | ||
| 4 | +Provides Xiaohongshu operations for the OpenClaw ecosystem and also supports the Claude Code skills format. | ||
| 5 | + | ||
| 6 | +## Project structure | ||
| 7 | + | ||
| 8 | +``` | ||
| 9 | +xiaohongshu-skills/ | ||
| 10 | +├── scripts/ # Python CDP automation engine | ||
| 11 | +│ ├── xhs/ # Core XHS automation package | ||
| 12 | +│ │ ├── __init__.py | ||
| 13 | +│ │ ├── cdp.py # CDP WebSocket client (Browser, Page, Element) | ||
| 14 | +│ │ ├── stealth.py # Anti-detection JS injection + Chrome launch flags | ||
| 15 | +│ │ ├── cookies.py # Cookie file persistence | ||
| 16 | +│ │ ├── types.py # Data types (dataclass) | ||
| 17 | +│ │ ├── errors.py # Exception hierarchy | ||
| 18 | +│ │ ├── selectors.py # CSS selector constants | ||
| 19 | +│ │ ├── urls.py # URL constants and builder functions | ||
| 20 | +│ │ ├── human.py # Human-behavior simulation (delays, scrolling) | ||
| 21 | +│ │ ├── login.py # Login check, QR-code login | ||
| 22 | +│ │ ├── feeds.py # Home feed list | ||
| 23 | +│ │ ├── search.py # Search + filters | ||
| 24 | +│ │ ├── feed_detail.py # Note detail + comment loading | ||
| 25 | +│ │ ├── user_profile.py # User profile page | ||
| 26 | +│ │ ├── comment.py # Comments, replies | ||
| 27 | +│ │ ├── like_favorite.py # Likes, favorites | ||
| 28 | +│ │ ├── publish.py # Image-and-text publishing | ||
| 29 | +│ │ └── publish_video.py # Video publishing | ||
| 30 | +│ ├── cli.py # Unified CLI entry point (13 subcommands) | ||
| 31 | +│ ├── chrome_launcher.py # Chrome process management | ||
| 32 | +│ ├── account_manager.py # Multi-account management | ||
| 33 | +│ ├── image_downloader.py # Media download (SHA256 cache) | ||
| 34 | +│ ├── title_utils.py # UTF-16 title length calculation | ||
| 35 | +│ ├── run_lock.py # Single-instance lock | ||
| 36 | +│ └── publish_pipeline.py # Publish orchestrator | ||
| 37 | +├── skills/ # Claude Code Skills definitions | ||
| 38 | +│ ├── xhs-auth/SKILL.md # Authentication | ||
| 39 | +│ ├── xhs-publish/SKILL.md # Content publishing (image-and-text + video) | ||
| 40 | +│ ├── xhs-explore/SKILL.md # Content discovery and analysis | ||
| 41 | +│ ├── xhs-interact/SKILL.md # Social interaction (comment/like/favorite) | ||
| 42 | +│ └── xhs-content-ops/SKILL.md # Composite content-operations workflow | ||
| 43 | +├── pyproject.toml # uv project configuration | ||
| 44 | +├── SKILL.md # Unified entry point (routes to sub-skills) | ||
| 45 | +├── CLAUDE.md # This file | ||
| 46 | +├── PROMPT.md # Ralph Loop driver file | ||
| 47 | +└── README.md | ||
| 48 | +``` | ||
| 49 | + | ||
| 50 | +## Tech stack | ||
| 51 | + | ||
| 52 | +- **Python**: >=3.11 | ||
| 53 | +- **Package manager**: uv | ||
| 54 | +- **Dependencies**: requests + websockets (direct CDP WebSocket communication) | ||
| 55 | +- **Browser**: Chrome (controlled through the CDP remote debugging protocol) | ||
| 56 | +- **Code style**: ruff (lint + format) | ||
| 57 | +- **Data extraction**: `window.__INITIAL_STATE__` (same as the Go source) | ||
| 58 | + | ||
| 59 | +## Development commands | ||
| 60 | + | ||
| 61 | +```bash | ||
| 62 | +uv sync --extra dev # Install dependencies plus the dev extra (ruff, pytest) | ||
| 63 | +uv run ruff check . # Lint | ||
| 64 | +uv run ruff format . # Format code | ||
| 65 | +uv run pytest # Run tests | ||
| 66 | +``` | ||
| 67 | + | ||
| 68 | +## Architecture | ||
| 69 | + | ||
| 70 | +### Two-layer structure | ||
| 71 | + | ||
| 72 | +1. **scripts/: Python CDP engine** | ||
| 73 | +   - Rewritten from scratch, based on the xiaohongshu-mcp Go source | ||
| 74 | +   - `xhs/` package: a modular core automation library | ||
| 75 | +   - `cli.py`: unified CLI entry point; its 13 subcommands map to the MCP tools | ||
| 76 | +   - Structured JSON output that agents can parse easily | ||
| 77 | +   - Multi-account support, isolated through separate Chrome profiles | ||
| 78 | +   - Anti-detection protection (stealth flags + JS injection) | ||
| 79 | + | ||
| 80 | +2. **skills/: Claude Code Skills definitions** | ||
| 81 | +   - SKILL.md format, telling Claude how to call scripts/ | ||
| 82 | +   - Covers input triage, constraints, workflow, and failure handling | ||
| 83 | + | ||
| 84 | +### Invocation | ||
| 85 | + | ||
| 86 | +```bash | ||
| 87 | +# Unified CLI entry point | ||
| 88 | +python scripts/cli.py check-login | ||
| 89 | +python scripts/cli.py search-feeds --keyword "keyword" | ||
| 90 | +python scripts/cli.py publish --title-file t.txt --content-file c.txt --images pic.jpg | ||
| 91 | + | ||
| 92 | +# Publish pipeline (includes image download and login check) | ||
| 93 | +python scripts/publish_pipeline.py --title-file t.txt --content-file c.txt --images URL1 | ||
| 94 | +``` | ||
| 95 | + | ||
| 96 | +## Code style | ||
| 97 | + | ||
| 98 | +### Python style | ||
| 99 | +- Follow PEP 8, enforced with ruff | ||
| 100 | +- Full type hints (PEP 484), using the `str | None` syntax | ||
| 101 | +- Public functions and classes must have docstrings | ||
| 102 | +- Maximum line length: 100 characters | ||
| 103 | +- Use `from __future__ import annotations` to enable postponed evaluation of annotations | ||
| 104 | + | ||
| 105 | +### Naming conventions | ||
| 106 | +- File names: snake_case | ||
| 107 | +- Class names: PascalCase | ||
| 108 | +- Functions/variables: snake_case | ||
| 109 | +- Constants: UPPER_SNAKE_CASE | ||
| 110 | + | ||
| 111 | +### Error handling | ||
| 112 | +- Custom exception classes inherit from the `XHSError` base class (`xhs/errors.py`) | ||
| 113 | +- CLI commands use structured exit codes: 0 = success, 1 = not logged in, 2 = error | ||
| 114 | +- All user-visible error messages are written in Chinese | ||
| 115 | + | ||
| 116 | +### Security constraints | ||
| 117 | +- Publishing operations must have a user-confirmation step | ||
| 118 | +- File paths must be absolute | ||
| 119 | +- Do not inline sensitive content in command-line arguments (pass it through files) | ||
| 120 | +- Chrome profile directories isolate each account's cookies | ||
| 121 | + | ||
| 122 | +## References | ||
| 123 | + | ||
| 124 | +- **xiaohongshu-mcp Go source**: /Users/zy/src/zy/xiaohongshu-mcp/ | ||
| 125 | + | ||
| 126 | +## MCP tool mapping | ||
| 127 | + | ||
| 128 | +The 13 subcommands of scripts/cli.py map to the xiaohongshu-mcp MCP tools: | ||
| 129 | + | ||
| 130 | +| CLI subcommand | MCP tool | Category | | ||
| 131 | +|--|--|--| | ||
| 132 | +| `check-login` | check_login_status | Auth | | ||
| 133 | +| `login` | get_login_qrcode | Auth | | ||
| 134 | +| `delete-cookies` | delete_cookies | Auth | | ||
| 135 | +| `list-feeds` | list_feeds | Browse | | ||
| 136 | +| `search-feeds` | search_feeds | Browse | | ||
| 137 | +| `get-feed-detail` | get_feed_detail | Browse | | ||
| 138 | +| `user-profile` | user_profile | Browse | | ||
| 139 | +| `post-comment` | post_comment_to_feed | Interact | | ||
| 140 | +| `reply-comment` | reply_comment_in_feed | Interact | | ||
| 141 | +| `like-feed` | like_feed | Interact | | ||
| 142 | +| `favorite-feed` | favorite_feed | Interact | | ||
| 143 | +| `publish` | publish_content | Publish | | ||
| 144 | +| `publish-video` | publish_with_video | Publish | | ||
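CLAUDE.md describes `title_utils.py` as a UTF-16 title-length calculator, but that file is not shown in this part of the diff. A minimal sketch of the idea, assuming the length is counted in UTF-16 code units; the name `utf16_length` is hypothetical and not taken from the repository:

```python
def utf16_length(text: str) -> int:
    """Count UTF-16 code units (hypothetical helper, not the repository's API)."""
    # Every UTF-16 code unit is 2 bytes; characters outside the Basic
    # Multilingual Plane (most emoji) are encoded as two code units.
    return len(text.encode("utf-16-le")) // 2
```

Under this counting, a CJK character is one unit and most emoji are two, whereas Python's built-in `len()` counts each of them once.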
PROMPT.md
0 → 100644
| 1 | +# Xiaohongshu Skills development task | ||
| 2 | + | ||
| 3 | +## Goal | ||
| 4 | + | ||
| 5 | +Rewrite the Python CDP engine from scratch, based on the xiaohongshu-mcp Go source, to build a complete set of Xiaohongshu automation Skills for the OpenClaw ecosystem. | ||
| 6 | + | ||
| 7 | +## References | ||
| 8 | + | ||
| 9 | +- **xiaohongshu-mcp Go source**: `/Users/zy/src/zy/xiaohongshu-mcp/` (10k stars, 13 MCP tools) | ||
| 10 | +- **xiaohongshu-mcp data structures**: `/Users/zy/src/zy/xiaohongshu-mcp/xiaohongshu/types.go` | ||
| 11 | +- **xiaohongshu-mcp tool definitions**: `/Users/zy/src/zy/xiaohongshu-mcp/mcp_server.go` | ||
| 12 | + | ||
| 13 | +## Architecture | ||
| 14 | + | ||
| 15 | +### Module structure | ||
| 16 | + | ||
| 17 | +``` | ||
| 18 | +scripts/ | ||
| 19 | +├── xhs/ # Core XHS automation package | ||
| 20 | +│ ├── cdp.py # CDP WebSocket client | ||
| 21 | +│ ├── stealth.py # Anti-detection JS injection + Chrome launch flags | ||
| 22 | +│ ├── cookies.py # Cookie file persistence | ||
| 23 | +│ ├── types.py # Data types (dataclass) | ||
| 24 | +│ ├── errors.py # Exception hierarchy | ||
| 25 | +│ ├── selectors.py # CSS selector constants | ||
| 26 | +│ ├── urls.py # URL constants | ||
| 27 | +│ ├── human.py # Human-behavior simulation | ||
| 28 | +│ ├── login.py # Login | ||
| 29 | +│ ├── feeds.py # Home feed | ||
| 30 | +│ ├── search.py # Search + filters | ||
| 31 | +│ ├── feed_detail.py # Note detail + comment loading | ||
| 32 | +│ ├── user_profile.py # User profile page | ||
| 33 | +│ ├── comment.py # Comments, replies | ||
| 34 | +│ ├── like_favorite.py # Likes, favorites | ||
| 35 | +│ ├── publish.py # Image-and-text publishing | ||
| 36 | +│ └── publish_video.py # Video publishing | ||
| 37 | +├── cli.py # Unified CLI entry point (13 subcommands) | ||
| 38 | +├── chrome_launcher.py # Chrome process management | ||
| 39 | +├── account_manager.py # Multi-account management | ||
| 40 | +├── image_downloader.py # Media download (SHA256 cache) | ||
| 41 | +├── title_utils.py # UTF-16 title length calculation | ||
| 42 | +├── run_lock.py # Single-instance lock | ||
| 43 | +└── publish_pipeline.py # Publish orchestrator | ||
| 44 | +``` | ||
| 45 | + | ||
| 46 | +### CLI interface (maps to the 13 MCP tools of the Go project) | ||
| 47 | + | ||
| 48 | +```bash | ||
| 49 | +python scripts/cli.py check-login | ||
| 50 | +python scripts/cli.py login | ||
| 51 | +python scripts/cli.py delete-cookies | ||
| 52 | +python scripts/cli.py list-feeds | ||
| 53 | +python scripts/cli.py search-feeds --keyword "keyword" [--sort-by --note-type ...] | ||
| 54 | +python scripts/cli.py get-feed-detail --feed-id ID --xsec-token TOKEN [--load-all-comments] | ||
| 55 | +python scripts/cli.py user-profile --user-id ID --xsec-token TOKEN | ||
| 56 | +python scripts/cli.py post-comment --feed-id ID --xsec-token TOKEN --content "text" | ||
| 57 | +python scripts/cli.py reply-comment --feed-id ID --xsec-token TOKEN --content "text" [--comment-id | --user-id] | ||
| 58 | +python scripts/cli.py like-feed --feed-id ID --xsec-token TOKEN [--unlike] | ||
| 59 | +python scripts/cli.py favorite-feed --feed-id ID --xsec-token TOKEN [--unfavorite] | ||
| 60 | +python scripts/cli.py publish --title-file T --content-file C --images P1 P2 [--tags --schedule-at --visibility] | ||
| 61 | +python scripts/cli.py publish-video --title-file T --content-file C --video P [--tags --schedule-at] | ||
| 62 | +``` | ||
| 63 | + | ||
| 64 | +Global options: `--host`, `--port`, `--account` | ||
| 65 | +Output: JSON (`ensure_ascii=False`) | ||
| 66 | +Exit codes: 0 = success, 1 = not logged in, 2 = error | ||
| 67 | + | ||
| 68 | +## Code style requirements | ||
| 69 | + | ||
| 70 | +- Python code must pass `ruff check` and `ruff format` | ||
| 71 | +- Full type hints (PEP 484), using `str | None` rather than `Optional[str]` | ||
| 72 | +- Public functions and classes must have docstrings | ||
| 73 | +- Maximum line length: 100 characters | ||
| 74 | +- Use `from __future__ import annotations` to enable postponed evaluation of annotations | ||
| 75 | +- All exception classes inherit from `XHSError` | ||
| 76 | +- The CLI uses argparse; exit codes: 0 = success, 1 = not logged in, 2 = error | ||
| 77 | +- JSON output uses `ensure_ascii=False` to keep Chinese text readable | ||
| 78 | + | ||
| 79 | +## Completion criteria | ||
| 80 | + | ||
| 81 | +Output the completion marker once all of the following conditions hold: | ||
| 82 | +1. All 17 modules of the `xhs/` package have been created | ||
| 83 | +2. All 13 `cli.py` subcommands are implemented | ||
| 84 | +3. The 5 supporting scripts have been rewritten | ||
| 85 | +4. The 5 `skills/*/SKILL.md` files have been updated | ||
| 86 | +5. The root `SKILL.md`, `CLAUDE.md`, and `README.md` have been updated | ||
| 87 | +6. `uv run ruff check .` reports no errors | ||
| 88 | +7. `uv run ruff format --check .` reports no differences | ||
| 89 | + | ||
| 90 | +<promise>ALL SKILLS COMPLETE</promise> | ||
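PROMPT.md specifies argparse, three global options, and a `reply-comment` subcommand that takes `--comment-id` or `--user-id`. The sketch below shows one way that layout could be expressed. It is a reduced illustration, not the contents of `scripts/cli.py`: the `build_parser` name and the empty default for `--account` are assumptions, and the host/port defaults are the ones quoted in the README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with the global options and two subcommands."""
    parser = argparse.ArgumentParser(prog="cli.py")
    # Global options are given before the subcommand name.
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=9222)
    parser.add_argument("--account", default="")  # assumed default

    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("check-login")

    reply = sub.add_parser("reply-comment")
    reply.add_argument("--feed-id", required=True)
    reply.add_argument("--xsec-token", required=True)
    reply.add_argument("--content", required=True)
    # At most one of the two target selectors may be passed.
    target = reply.add_mutually_exclusive_group()
    target.add_argument("--comment-id")
    target.add_argument("--user-id")
    return parser
```

With this layout, passing both `--comment-id` and `--user-id` makes argparse exit with a usage error before any browser work starts.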
| 1 | # xiaohongshu-skills | 1 | # xiaohongshu-skills |
| 2 | -xiaohongshu-skills | 2 | + |
| 3 | +Xiaohongshu automation Skills for Claude Code, built on a Python CDP browser-automation engine. | ||
| 4 | + | ||
| 5 | +Provides Xiaohongshu operations for the OpenClaw ecosystem and is also compatible with the Claude Code Skills format. | ||
| 6 | + | ||
| 7 | +## Feature overview | ||
| 8 | + | ||
| 9 | +| Skill | Description | Core commands | | ||
| 10 | +|------|------|----------| | ||
| 11 | +| **xhs-auth** | Authentication | `check-login`, `login`, `delete-cookies` | | ||
| 12 | +| **xhs-publish** | Content publishing | `publish`, `publish-video` | | ||
| 13 | +| **xhs-explore** | Content discovery | `list-feeds`, `search-feeds`, `get-feed-detail`, `user-profile` | | ||
| 14 | +| **xhs-interact** | Social interaction | `post-comment`, `reply-comment`, `like-feed`, `favorite-feed` | | ||
| 15 | +| **xhs-content-ops** | Composite operations | Competitor analysis, trend tracking, content creation, engagement management | | ||
| 16 | + | ||
| 17 | +## Installation | ||
| 18 | + | ||
| 19 | +```bash | ||
| 20 | +# Clone the project | ||
| 21 | +git clone https://github.com/autoclaw-cc/xiaohongshu-skills.git | ||
| 22 | +cd xiaohongshu-skills | ||
| 23 | + | ||
| 24 | +# Install dependencies (requires uv) | ||
| 25 | +uv sync | ||
| 26 | +``` | ||
| 27 | + | ||
| 28 | +### Prerequisites | ||
| 29 | + | ||
| 30 | +- Python >= 3.11 | ||
| 31 | +- The [uv](https://docs.astral.sh/uv/) package manager | ||
| 32 | +- Google Chrome | ||
| 33 | + | ||
| 34 | +## Quick start | ||
| 35 | + | ||
| 36 | +### 1. Start Chrome | ||
| 37 | + | ||
| 38 | +```bash | ||
| 39 | +# Windowed mode (recommended for the first login) | ||
| 40 | +python scripts/chrome_launcher.py | ||
| 41 | + | ||
| 42 | +# Headless mode | ||
| 43 | +python scripts/chrome_launcher.py --headless | ||
| 44 | +``` | ||
| 45 | + | ||
| 46 | +### 2. Log in to Xiaohongshu | ||
| 47 | + | ||
| 48 | +```bash | ||
| 49 | +# Check login status | ||
| 50 | +python scripts/cli.py check-login | ||
| 51 | + | ||
| 52 | +# Log in (scan the QR code) | ||
| 53 | +python scripts/cli.py login | ||
| 54 | +``` | ||
| 55 | + | ||
| 56 | +### 3. Search notes | ||
| 57 | + | ||
| 58 | +```bash | ||
| 59 | +python scripts/cli.py search-feeds --keyword "keyword" | ||
| 60 | + | ||
| 61 | +# With filters | ||
| 62 | +python scripts/cli.py search-feeds \ | ||
| 63 | +  --keyword "keyword" --sort-by 最新 --note-type 图文 | ||
| 64 | +``` | ||
| 65 | + | ||
| 66 | +### 4. View note details | ||
| 67 | + | ||
| 68 | +```bash | ||
| 69 | +python scripts/cli.py get-feed-detail \ | ||
| 70 | +  --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 71 | +``` | ||
| 72 | + | ||
| 73 | +### 5. Publish content | ||
| 74 | + | ||
| 75 | +```bash | ||
| 76 | +# Image-and-text post | ||
| 77 | +python scripts/cli.py publish \ | ||
| 78 | +  --title-file title.txt \ | ||
| 79 | +  --content-file content.txt \ | ||
| 80 | +  --images "/abs/path/pic1.jpg" "/abs/path/pic2.jpg" | ||
| 81 | + | ||
| 82 | +# Video post | ||
| 83 | +python scripts/cli.py publish-video \ | ||
| 84 | +  --title-file title.txt \ | ||
| 85 | +  --content-file content.txt \ | ||
| 86 | +  --video "/abs/path/video.mp4" | ||
| 87 | +``` | ||
| 88 | + | ||
| 89 | +### 6. Social interaction | ||
| 90 | + | ||
| 91 | +```bash | ||
| 92 | +# Post a comment | ||
| 93 | +python scripts/cli.py post-comment \ | ||
| 94 | +  --feed-id FEED_ID \ | ||
| 95 | +  --xsec-token XSEC_TOKEN \ | ||
| 96 | +  --content "comment text" | ||
| 97 | + | ||
| 98 | +# Like | ||
| 99 | +python scripts/cli.py like-feed \ | ||
| 100 | +  --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 101 | + | ||
| 102 | +# Favorite | ||
| 103 | +python scripts/cli.py favorite-feed \ | ||
| 104 | +  --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 105 | +``` | ||
| 106 | + | ||
| 107 | +## CLI command reference | ||
| 108 | + | ||
| 109 | +All commands go through the single `scripts/cli.py` entry point and print JSON. | ||
| 110 | + | ||
| 111 | +Global options: | ||
| 112 | +- `--host HOST`: Chrome debugging host (default 127.0.0.1) | ||
| 113 | +- `--port PORT`: Chrome debugging port (default 9222) | ||
| 114 | +- `--account NAME`: account to use | ||
| 115 | + | ||
| 116 | +| Subcommand | Description | | ||
| 117 | +|--------|------| | ||
| 118 | +| `check-login` | Check login status | | ||
| 119 | +| `login` | Get the login QR code and wait for the scan | | ||
| 120 | +| `delete-cookies` | Clear cookies | | ||
| 121 | +| `list-feeds` | Get the recommended home feed | | ||
| 122 | +| `search-feeds` | Search notes by keyword | | ||
| 123 | +| `get-feed-detail` | Get note details and comments | | ||
| 124 | +| `user-profile` | Get user profile information | | ||
| 125 | +| `post-comment` | Comment on a note | | ||
| 126 | +| `reply-comment` | Reply to a specific comment | | ||
| 127 | +| `like-feed` | Like / unlike | | ||
| 128 | +| `favorite-feed` | Favorite / unfavorite | | ||
| 129 | +| `publish` | Publish an image-and-text post | | ||
| 130 | +| `publish-video` | Publish a video post | | ||
| 131 | + | ||
| 132 | +Exit codes: 0 = success, 1 = not logged in, 2 = error | ||
| 133 | + | ||
| 134 | +## Project structure | ||
| 135 | + | ||
| 136 | +``` | ||
| 137 | +xiaohongshu-skills/ | ||
| 138 | +├── scripts/ # Python CDP automation engine | ||
| 139 | +│ ├── xhs/ # Core automation package (modular) | ||
| 140 | +│ │ ├── cdp.py # CDP WebSocket client | ||
| 141 | +│ │ ├── stealth.py # Anti-detection protection | ||
| 142 | +│ │ ├── cookies.py # Cookie persistence | ||
| 143 | +│ │ ├── types.py # Data types | ||
| 144 | +│ │ ├── errors.py # Exception hierarchy | ||
| 145 | +│ │ ├── selectors.py # CSS selectors | ||
| 146 | +│ │ ├── urls.py # URL constants | ||
| 147 | +│ │ ├── human.py # Human-behavior simulation | ||
| 148 | +│ │ ├── login.py # Login | ||
| 149 | +│ │ ├── feeds.py # Home feed | ||
| 150 | +│ │ ├── search.py # Search | ||
| 151 | +│ │ ├── feed_detail.py # Note detail | ||
| 152 | +│ │ ├── user_profile.py # User profile page | ||
| 153 | +│ │ ├── comment.py # Comments | ||
| 154 | +│ │ ├── like_favorite.py # Likes/favorites | ||
| 155 | +│ │ ├── publish.py # Image-and-text publishing | ||
| 156 | +│ │ └── publish_video.py # Video publishing | ||
| 157 | +│ ├── cli.py # Unified CLI (13 subcommands) | ||
| 158 | +│ ├── chrome_launcher.py # Chrome process management | ||
| 159 | +│ ├── account_manager.py # Multi-account management | ||
| 160 | +│ ├── image_downloader.py # Media download | ||
| 161 | +│ ├── title_utils.py # Title length calculation | ||
| 162 | +│ ├── run_lock.py # Single-instance lock | ||
| 163 | +│ └── publish_pipeline.py # Publish orchestrator | ||
| 164 | +├── skills/ # Claude Code Skills definitions | ||
| 165 | +│ ├── xhs-auth/SKILL.md # Authentication | ||
| 166 | +│ ├── xhs-publish/SKILL.md # Content publishing | ||
| 167 | +│ ├── xhs-explore/SKILL.md # Content discovery | ||
| 168 | +│ ├── xhs-interact/SKILL.md # Social interaction | ||
| 169 | +│ └── xhs-content-ops/SKILL.md # Composite operations | ||
| 170 | +├── SKILL.md # Unified entry point | ||
| 171 | +├── CLAUDE.md # Project development guide | ||
| 172 | +├── pyproject.toml # uv project configuration | ||
| 173 | +└── README.md | ||
| 174 | +``` | ||
| 175 | + | ||
| 176 | +## Technical architecture | ||
| 177 | + | ||
| 178 | +### Two-layer structure | ||
| 179 | + | ||
| 180 | +1. **scripts/: Python CDP engine** | ||
| 181 | +   - Rewritten from scratch, based on the xiaohongshu-mcp Go source | ||
| 182 | +   - Controls the browser directly through the Chrome DevTools Protocol (CDP) | ||
| 183 | +   - Extracts data from `window.__INITIAL_STATE__` | ||
| 184 | +   - Built-in anti-detection protection (stealth flags + JS injection) | ||
| 185 | +   - Structured JSON output | ||
| 186 | + | ||
| 187 | +2. **skills/: Claude Code Skills definitions** | ||
| 188 | +   - SKILL.md format, telling the AI agent how to call scripts/ | ||
| 189 | +   - Covers input triage, constraints, workflow, and failure handling | ||
| 190 | + | ||
| 191 | +## Development | ||
| 192 | + | ||
| 193 | +```bash | ||
| 194 | +uv sync --extra dev # Install dependencies plus the dev extra (ruff, pytest) | ||
| 195 | +uv run ruff check . # Lint | ||
| 196 | +uv run ruff format . # Format code | ||
| 197 | +uv run pytest # Run tests | ||
| 198 | +``` | ||
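The README states that every command prints JSON and exits with 0 (success), 1 (not logged in), or 2 (error). A caller such as an agent wrapper could fold those two signals into one result. The sketch below is an assumption about how a consumer might do that; `CliResult` and `interpret` are hypothetical names, and the shape of the JSON payload is not specified in this diff.

```python
import json
from dataclasses import dataclass
from typing import Any

# Exit codes as documented in the README: 0 = success, 1 = not logged in, 2 = error.
EXIT_STATUS = {0: "ok", 1: "not_logged_in", 2: "error"}


@dataclass
class CliResult:
    """Parsed outcome of one CLI call (hypothetical wrapper type)."""

    status: str
    payload: Any


def interpret(returncode: int, stdout: str) -> CliResult:
    """Map a process exit code and its stdout to a CliResult."""
    # Any undocumented exit code is treated as a generic error.
    status = EXIT_STATUS.get(returncode, "error")
    try:
        payload = json.loads(stdout) if stdout.strip() else None
    except json.JSONDecodeError:
        # Keep the raw text when stdout is not valid JSON.
        payload = stdout
    return CliResult(status=status, payload=payload)
```

A caller would typically feed it from `subprocess.run([...], capture_output=True, text=True)`, passing `proc.returncode` and `proc.stdout`.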
SKILL.md
0 → 100644
| 1 | +--- | ||
| 2 | +name: xiaohongshu-skills | ||
| 3 | +description: | | ||
| 4 | +  A collection of Xiaohongshu automation skills. Supports login, content publishing, search and discovery, social interaction, and composite operations. | ||
| 5 | +  Triggers when the user asks to operate Xiaohongshu (publish, search, comment, log in, analyze, like, favorite). | ||
| 6 | +--- | ||
| 7 | + | ||
| 8 | +# Xiaohongshu automation Skills | ||
| 9 | + | ||
| 10 | +You are the "Xiaohongshu automation assistant". Route each task to the matching sub-skill based on the user's intent. | ||
| 11 | + | ||
| 12 | +## Input triage | ||
| 13 | + | ||
| 14 | +Determine the user's intent in priority order and route to the matching sub-skill: | ||
| 15 | + | ||
| 16 | +1. **Authentication** ("log in / check login / switch account") → run the `xhs-auth` skill. | ||
| 17 | +2. **Content publishing** ("publish / post / upload images and text / upload video") → run the `xhs-publish` skill. | ||
| 18 | +3. **Search and discovery** ("search notes / view details / browse the home feed / view a user") → run the `xhs-explore` skill. | ||
| 19 | +4. **Social interaction** ("comment / reply / like / favorite") → run the `xhs-interact` skill. | ||
| 20 | +5. **Composite operations** ("competitor analysis / trend tracking / batch interaction / one-click creation") → run the `xhs-content-ops` skill. | ||
| 21 | + | ||
| 22 | +## Global constraints | ||
| 23 | + | ||
| 24 | +- Confirm the login status before any operation (via `check-login`). | ||
| 25 | +- Publishing and commenting must be confirmed by the user before they run. | ||
| 26 | +- File paths must be absolute. | ||
| 27 | +- CLI output is JSON; present it to the user in a structured form. | ||
| 28 | +- Keep the operation rate moderate, with reasonable intervals between actions. | ||
| 29 | + | ||
| 30 | +## Sub-skill overview | ||
| 31 | + | ||
| 32 | +### xhs-auth: Authentication | ||
| 33 | + | ||
| 34 | +Manages Xiaohongshu login state and multi-account switching. | ||
| 35 | + | ||
| 36 | +| Command | Function | | ||
| 37 | +|------|------| | ||
| 38 | +| `cli.py check-login` | Check login status | | ||
| 39 | +| `cli.py login` | Get the login QR code and wait for the scan | | ||
| 40 | +| `cli.py delete-cookies` | Clear cookies (log out / switch account) | | ||
| 41 | + | ||
| 42 | +### xhs-publish: Content publishing | ||
| 43 | + | ||
| 44 | +Publishes image-and-text or video content to Xiaohongshu. | ||
| 45 | + | ||
| 46 | +| Command | Function | | ||
| 47 | +|------|------| | ||
| 48 | +| `cli.py publish` | Image-and-text publishing (local images or URLs) | | ||
| 49 | +| `cli.py publish-video` | Video publishing | | ||
| 50 | +| `publish_pipeline.py` | Publish pipeline (includes image download and login check) | | ||
| 51 | + | ||
| 52 | +### xhs-explore: Content discovery | ||
| 53 | + | ||
| 54 | +Searches notes, views details, and fetches user profiles. | ||
| 55 | + | ||
| 56 | +| Command | Function | | ||
| 57 | +|------|------| | ||
| 58 | +| `cli.py list-feeds` | Get the recommended home feed | | ||
| 59 | +| `cli.py search-feeds` | Search notes by keyword | | ||
| 60 | +| `cli.py get-feed-detail` | Get the full note content and comments | | ||
| 61 | +| `cli.py user-profile` | Get user profile information | | ||
| 62 | + | ||
| 63 | +### xhs-interact: Social interaction | ||
| 64 | + | ||
| 65 | +Posts comments and replies, likes, and favorites. | ||
| 66 | + | ||
| 67 | +| Command | Function | | ||
| 68 | +|------|------| | ||
| 69 | +| `cli.py post-comment` | Comment on a note | | ||
| 70 | +| `cli.py reply-comment` | Reply to a specific comment | | ||
| 71 | +| `cli.py like-feed` | Like / unlike | | ||
| 72 | +| `cli.py favorite-feed` | Favorite / unfavorite | | ||
| 73 | + | ||
| 74 | +### xhs-content-ops: Composite operations | ||
| 75 | + | ||
| 76 | +Combines multiple steps into operations workflows: competitor analysis, trend tracking, content creation, engagement management. | ||
| 77 | + | ||
| 78 | +## Quick start | ||
| 79 | + | ||
| 80 | +```bash | ||
| 81 | +# 1. Start Chrome | ||
| 82 | +python scripts/chrome_launcher.py | ||
| 83 | + | ||
| 84 | +# 2. Check login status | ||
| 85 | +python scripts/cli.py check-login | ||
| 86 | + | ||
| 87 | +# 3. Log in (if needed) | ||
| 88 | +python scripts/cli.py login | ||
| 89 | + | ||
| 90 | +# 4. Search notes | ||
| 91 | +python scripts/cli.py search-feeds --keyword "keyword" | ||
| 92 | + | ||
| 93 | +# 5. View note details | ||
| 94 | +python scripts/cli.py get-feed-detail \ | ||
| 95 | +  --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 96 | + | ||
| 97 | +# 6. Publish an image-and-text post | ||
| 98 | +python scripts/cli.py publish \ | ||
| 99 | +  --title-file title.txt \ | ||
| 100 | +  --content-file content.txt \ | ||
| 101 | +  --images "/abs/path/pic1.jpg" | ||
| 102 | + | ||
| 103 | +# 7. Post a comment | ||
| 104 | +python scripts/cli.py post-comment \ | ||
| 105 | +  --feed-id FEED_ID \ | ||
| 106 | +  --xsec-token XSEC_TOKEN \ | ||
| 107 | +  --content "comment text" | ||
| 108 | + | ||
| 109 | +# 8. Like | ||
| 110 | +python scripts/cli.py like-feed \ | ||
| 111 | +  --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 112 | +``` | ||
| 113 | + | ||
| 114 | +## Failure handling | ||
| 115 | + | ||
| 116 | +- **Not logged in**: prompt the user to run the login flow (xhs-auth). | ||
| 117 | +- **Chrome not running**: start the browser with `chrome_launcher.py`. | ||
| 118 | +- **Operation timed out**: check the network connection and allow a longer wait. | ||
| 119 | +- **Rate limited**: lower the operation rate and widen the intervals. | ||
pyproject.toml
0 → 100644
| 1 | +[project] | ||
| 2 | +name = "xiaohongshu-skills" | ||
| 3 | +version = "0.1.0" | ||
| 4 | +description = "Xiaohongshu automation Skills based on CDP browser automation" | ||
| 5 | +readme = "README.md" | ||
| 6 | +license = { text = "MIT" } | ||
| 7 | +requires-python = ">=3.11" | ||
| 8 | +dependencies = [ | ||
| 9 | + "requests>=2.28.0", | ||
| 10 | + "websockets>=12.0", | ||
| 11 | +] | ||
| 12 | + | ||
| 13 | +[project.optional-dependencies] | ||
| 14 | +dev = [ | ||
| 15 | + "ruff>=0.9.0", | ||
| 16 | + "pytest>=8.0", | ||
| 17 | +] | ||
| 18 | + | ||
| 19 | +[tool.ruff] | ||
| 20 | +target-version = "py311" | ||
| 21 | +line-length = 100 | ||
| 22 | + | ||
| 23 | +[tool.ruff.lint] | ||
| 24 | +select = [ | ||
| 25 | + "E", # pycodestyle errors | ||
| 26 | + "W", # pycodestyle warnings | ||
| 27 | + "F", # pyflakes | ||
| 28 | + "I", # isort | ||
| 29 | + "N", # pep8-naming | ||
| 30 | + "UP", # pyupgrade | ||
| 31 | + "B", # flake8-bugbear | ||
| 32 | + "SIM", # flake8-simplify | ||
| 33 | + "RUF", # ruff-specific rules | ||
| 34 | +] | ||
| 35 | +ignore = [ | ||
| 36 | + "E402", # module-level imports not at top (needed for sys.path manipulation) | ||
| 37 | + "RUF001", # ambiguous unicode characters (Chinese punctuation is intentional) | ||
| 38 | + "RUF002", # ambiguous unicode in docstrings (Chinese punctuation is intentional) | ||
| 39 | + "RUF003", # ambiguous unicode in comments (Chinese punctuation is intentional) | ||
| 40 | +] | ||
| 41 | + | ||
| 42 | +[tool.ruff.lint.per-file-ignores] | ||
| 43 | + | ||
| 44 | +[tool.ruff.lint.isort] | ||
| 45 | +known-first-party = ["xiaohongshu_skills"] | ||
| 46 | + | ||
| 47 | +[tool.pytest.ini_options] | ||
| 48 | +testpaths = ["tests"] |
scripts/account_manager.py
0 → 100644
| 1 | +"""Multi-account management: standalone handling of the account configuration.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import os | ||
| 8 | +from pathlib import Path | ||
| 9 | + | ||
| 10 | +logger = logging.getLogger(__name__) | ||
| 11 | + | ||
| 12 | +# Location of the account configuration file | ||
| 13 | +_CONFIG_DIR = Path.home() / ".xhs" | ||
| 14 | +_ACCOUNTS_FILE = _CONFIG_DIR / "accounts.json" | ||
| 15 | + | ||
| 16 | + | ||
| 17 | +def _load_config() -> dict: | ||
| 18 | +    """Load the account configuration.""" | ||
| 19 | +    if not _ACCOUNTS_FILE.exists(): | ||
| 20 | +        return {"default": "", "accounts": {}} | ||
| 21 | +    with open(_ACCOUNTS_FILE, encoding="utf-8") as f: | ||
| 22 | +        return json.load(f) | ||
| 23 | + | ||
| 24 | + | ||
| 25 | +def _save_config(config: dict) -> None: | ||
| 26 | +    """Save the account configuration.""" | ||
| 27 | +    _CONFIG_DIR.mkdir(parents=True, exist_ok=True) | ||
| 28 | +    with open(_ACCOUNTS_FILE, "w", encoding="utf-8") as f: | ||
| 29 | +        json.dump(config, f, ensure_ascii=False, indent=2) | ||
| 30 | + | ||
| 31 | + | ||
| 32 | +def list_accounts() -> list[dict]: | ||
| 33 | +    """List all accounts.""" | ||
| 34 | +    config = _load_config() | ||
| 35 | +    default = config.get("default", "") | ||
| 36 | +    accounts = config.get("accounts", {}) | ||
| 37 | +    result = [] | ||
| 38 | +    for name, info in accounts.items(): | ||
| 39 | +        result.append( | ||
| 40 | +            { | ||
| 41 | +                "name": name, | ||
| 42 | +                "description": info.get("description", ""), | ||
| 43 | +                "is_default": name == default, | ||
| 44 | +                "profile_dir": _get_profile_dir(name), | ||
| 45 | +            } | ||
| 46 | +        ) | ||
| 47 | +    return result | ||
| 48 | + | ||
| 49 | + | ||
| 50 | +def add_account(name: str, description: str = "") -> None: | ||
| 51 | +    """Add an account.""" | ||
| 52 | +    config = _load_config() | ||
| 53 | +    accounts = config.setdefault("accounts", {}) | ||
| 54 | +    if name in accounts: | ||
| 55 | +        raise ValueError(f"账号 '{name}' 已存在") | ||
| 56 | + | ||
| 57 | +    accounts[name] = {"description": description} | ||
| 58 | + | ||
| 59 | +    # The first account added becomes the default | ||
| 60 | +    if not config.get("default"): | ||
| 61 | +        config["default"] = name | ||
| 62 | + | ||
| 63 | +    _save_config(config) | ||
| 64 | + | ||
| 65 | +    # Create the profile directory | ||
| 66 | +    profile_dir = _get_profile_dir(name) | ||
| 67 | +    os.makedirs(profile_dir, exist_ok=True) | ||
| 68 | + | ||
| 69 | +    logger.info("添加账号: %s", name) | ||
| 70 | + | ||
| 71 | + | ||
| 72 | +def remove_account(name: str) -> None: | ||
| 73 | +    """Remove an account.""" | ||
| 74 | +    config = _load_config() | ||
| 75 | +    accounts = config.get("accounts", {}) | ||
| 76 | +    if name not in accounts: | ||
| 77 | +        raise ValueError(f"账号 '{name}' 不存在") | ||
| 78 | + | ||
| 79 | +    del accounts[name] | ||
| 80 | + | ||
| 81 | +    # If the removed account was the default, fall back to another account, or clear it | ||
| 82 | +    if config.get("default") == name: | ||
| 83 | +        config["default"] = next(iter(accounts), "") | ||
| 84 | + | ||
| 85 | +    _save_config(config) | ||
| 86 | +    logger.info("删除账号: %s", name) | ||
| 87 | + | ||
| 88 | + | ||
| 89 | +def set_default_account(name: str) -> None: | ||
| 90 | +    """Set the default account.""" | ||
| 91 | +    config = _load_config() | ||
| 92 | +    accounts = config.get("accounts", {}) | ||
| 93 | +    if name not in accounts: | ||
| 94 | +        raise ValueError(f"账号 '{name}' 不存在") | ||
| 95 | + | ||
| 96 | +    config["default"] = name | ||
| 97 | +    _save_config(config) | ||
| 98 | +    logger.info("默认账号设置为: %s", name) | ||
| 99 | + | ||
| 100 | + | ||
| 101 | +def get_default_account() -> str: | ||
| 102 | +    """Return the name of the default account.""" | ||
| 103 | +    config = _load_config() | ||
| 104 | +    return config.get("default", "") | ||
| 105 | + | ||
| 106 | + | ||
| 107 | +def _get_profile_dir(account: str) -> str: | ||
| 108 | +    """Return the Chrome profile directory for an account.""" | ||
| 109 | +    return str(_CONFIG_DIR / "accounts" / account / "chrome-profile") | ||
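`_save_config` writes `accounts.json` in place, so a process killed mid-write could leave a truncated file behind. One possible hardening, which is not part of this commit, is to write to a temporary file in the same directory and swap it in with `os.replace`. The helper name `save_json_atomic` is hypothetical:

```python
import json
import os
import tempfile
from pathlib import Path


def save_json_atomic(path: Path, data: dict) -> None:
    """Write data as JSON to path via a temp file in the same directory."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem as the target.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        # os.replace swaps the new file over the old one in a single step.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Readers then see either the old file or the new one, never a partial write.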
scripts/chrome_launcher.py
0 → 100644
| 1 | +"""Chrome 进程管理(跨平台),对应 Go browser/browser.go 的进程管理部分。""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import logging | ||
| 6 | +import os | ||
| 7 | +import platform | ||
| 8 | +import shutil | ||
| 9 | +import signal | ||
| 10 | +import subprocess | ||
| 11 | +import time | ||
| 12 | + | ||
| 13 | +from xhs.stealth import STEALTH_ARGS | ||
| 14 | + | ||
| 15 | +logger = logging.getLogger(__name__) | ||
| 16 | + | ||
| 17 | +# 默认远程调试端口 | ||
| 18 | +DEFAULT_PORT = 9222 | ||
| 19 | + | ||
| 20 | +# 各平台 Chrome 默认路径 | ||
| 21 | +_CHROME_PATHS: dict[str, list[str]] = { | ||
| 22 | + "Darwin": [ | ||
| 23 | + "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome", | ||
| 24 | + "/Applications/Chromium.app/Contents/MacOS/Chromium", | ||
| 25 | + ], | ||
| 26 | + "Linux": [ | ||
| 27 | + "/usr/bin/google-chrome", | ||
| 28 | + "/usr/bin/google-chrome-stable", | ||
| 29 | + "/usr/bin/chromium", | ||
| 30 | + "/usr/bin/chromium-browser", | ||
| 31 | + "/snap/bin/chromium", | ||
| 32 | + ], | ||
| 33 | + "Windows": [ | ||
| 34 | + r"C:\Program Files\Google\Chrome\Application\chrome.exe", | ||
| 35 | + r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe", | ||
| 36 | + ], | ||
| 37 | +} | ||
| 38 | + | ||
| 39 | + | ||
| 40 | +def find_chrome() -> str | None: | ||
| 41 | + """查找 Chrome 可执行文件路径。""" | ||
| 42 | + # 环境变量优先 | ||
| 43 | + env_path = os.getenv("CHROME_BIN") | ||
| 44 | + if env_path and os.path.isfile(env_path): | ||
| 45 | + return env_path | ||
| 46 | + | ||
| 47 | + # which/where 查找 | ||
| 48 | + chrome = shutil.which("google-chrome") or shutil.which("chromium") | ||
| 49 | + if chrome: | ||
| 50 | + return chrome | ||
| 51 | + | ||
| 52 | + # 平台默认路径 | ||
| 53 | + system = platform.system() | ||
| 54 | + for path in _CHROME_PATHS.get(system, []): | ||
| 55 | + if os.path.isfile(path): | ||
| 56 | + return path | ||
| 57 | + | ||
| 58 | + return None | ||
| 59 | + | ||
| 60 | + | ||
| 61 | +def launch_chrome( | ||
| 62 | + port: int = DEFAULT_PORT, | ||
| 63 | + headless: bool = False, | ||
| 64 | + user_data_dir: str | None = None, | ||
| 65 | + chrome_bin: str | None = None, | ||
| 66 | +) -> subprocess.Popen: | ||
| 67 | + """启动 Chrome 进程(带远程调试端口)。 | ||
| 68 | + | ||
| 69 | + Args: | ||
| 70 | + port: 远程调试端口。 | ||
| 71 | + headless: 是否无头模式。 | ||
| 72 | + user_data_dir: 用户数据目录(Profile 隔离)。 | ||
| 73 | + chrome_bin: Chrome 可执行文件路径。 | ||
| 74 | + | ||
| 75 | + Returns: | ||
| 76 | + Chrome 子进程。 | ||
| 77 | + | ||
| 78 | + Raises: | ||
| 79 | + FileNotFoundError: 未找到 Chrome。 | ||
| 80 | + """ | ||
| 81 | + if not chrome_bin: | ||
| 82 | + chrome_bin = find_chrome() | ||
| 83 | + if not chrome_bin: | ||
| 84 | + raise FileNotFoundError("未找到 Chrome,请设置 CHROME_BIN 环境变量或安装 Chrome") | ||
| 85 | + | ||
| 86 | + args = [ | ||
| 87 | + chrome_bin, | ||
| 88 | + f"--remote-debugging-port={port}", | ||
| 89 | + *STEALTH_ARGS, | ||
| 90 | + ] | ||
| 91 | + | ||
| 92 | + if headless: | ||
| 93 | + args.append("--headless=new") | ||
| 94 | + | ||
| 95 | + if user_data_dir: | ||
| 96 | + args.append(f"--user-data-dir={user_data_dir}") | ||
| 97 | + | ||
| 98 | + # 代理 | ||
| 99 | + proxy = os.getenv("XHS_PROXY") | ||
| 100 | + if proxy: | ||
| 101 | + args.append(f"--proxy-server={proxy}") | ||
| 102 | + logger.info("使用代理: %s", _mask_proxy(proxy)) | ||
| 103 | + | ||
| 104 | + logger.info("启动 Chrome: port=%d, headless=%s", port, headless) | ||
| 105 | + process = subprocess.Popen( | ||
| 106 | + args, | ||
| 107 | + stdout=subprocess.DEVNULL, | ||
| 108 | + stderr=subprocess.DEVNULL, | ||
| 109 | + ) | ||
| 110 | + | ||
| 111 | + # 等待 Chrome 准备就绪 | ||
| 112 | + _wait_for_chrome(port) | ||
| 113 | + return process | ||
| 114 | + | ||
| 115 | + | ||
| 116 | +def close_chrome(process: subprocess.Popen) -> None: | ||
| 117 | +    """Close the Chrome process.""" | ||
| 118 | + if process.poll() is not None: | ||
| 119 | + return | ||
| 120 | + | ||
| 121 | + try: | ||
| 122 | + process.send_signal(signal.SIGTERM) | ||
| 123 | + process.wait(timeout=5) | ||
| 124 | + except (subprocess.TimeoutExpired, OSError): | ||
| 125 | + process.kill() | ||
| 126 | + process.wait(timeout=3) | ||
| 127 | + | ||
| 128 | + logger.info("Chrome 进程已关闭") | ||
| 129 | + | ||
| 130 | + | ||
| 131 | +def is_chrome_running(port: int = DEFAULT_PORT) -> bool: | ||
| 132 | +    """Check whether Chrome is running on the given port.""" | ||
| 133 | + import requests | ||
| 134 | + | ||
| 135 | + try: | ||
| 136 | + resp = requests.get(f"http://127.0.0.1:{port}/json/version", timeout=2) | ||
| 137 | + return resp.status_code == 200 | ||
| 138 | + except (requests.ConnectionError, requests.Timeout): | ||
| 139 | + return False | ||
| 140 | + | ||
| 141 | + | ||
| 142 | +def _wait_for_chrome(port: int, timeout: float = 15.0) -> None: | ||
| 143 | +    """Wait for the Chrome debugging port to become ready.""" | ||
| 144 | + deadline = time.monotonic() + timeout | ||
| 145 | + while time.monotonic() < deadline: | ||
| 146 | + if is_chrome_running(port): | ||
| 147 | + logger.info("Chrome 已就绪 (port=%d)", port) | ||
| 148 | + return | ||
| 149 | + time.sleep(0.5) | ||
| 150 | + logger.warning("等待 Chrome 就绪超时 (port=%d)", port) | ||
| 151 | + | ||
| 152 | + | ||
| 153 | +def _mask_proxy(proxy_url: str) -> str: | ||
| 154 | +    """Hide the credentials in a proxy URL before it is logged.""" | ||
| 155 | +    from urllib.parse import urlparse | ||
| 156 | + | ||
| 157 | +    try: | ||
| 158 | +        parsed = urlparse(proxy_url) | ||
| 159 | +        if parsed.username: | ||
| 160 | +            # Rebuild the netloc: replace("", "***") would mangle a URL that has no password. | ||
| 161 | +            host = parsed.netloc.rpartition("@")[2] | ||
| 162 | +            return proxy_url.replace(parsed.netloc, f"***:***@{host}", 1) | ||
| 163 | +    except Exception: | ||
| 164 | +        pass | ||
| 165 | +    return proxy_url |
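Review note: chained `str.replace` calls are fragile for masking credentials, because replacing an empty password string inserts the mask between every character of the URL. A minimal sketch of a sturdier approach, assuming it is acceptable to rebuild the URL from its parsed parts; `mask_proxy` is a hypothetical standalone helper written for illustration, not a function from this commit:

```python
from urllib.parse import urlsplit, urlunsplit


def mask_proxy(proxy_url: str) -> str:
    """Return proxy_url with any user info replaced by asterisks."""
    parts = urlsplit(proxy_url)
    if parts.username is None:
        # No credentials present, nothing to hide.
        return proxy_url
    creds = "***:***" if parts.password is not None else "***"
    # Everything after the last "@" in the netloc is host[:port].
    host = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=f"{creds}@{host}"))


print(mask_proxy("http://alice:s3cret@10.0.0.1:8080"))  # http://***:***@10.0.0.1:8080
print(mask_proxy("http://alice@10.0.0.1:8080"))         # http://***@10.0.0.1:8080
print(mask_proxy("http://10.0.0.1:8080"))               # http://10.0.0.1:8080
```

A proxy string without a scheme (for example `user:pass@host:8080`) is not parsed as having a netloc, so neither this sketch nor a `urlparse`-based check would mask it; callers would need to require a scheme.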
scripts/cli.py
0 → 100644
| 1 | +"""Unified CLI entry point, mirroring the 13 subcommands of the Go MCP tools. | ||
| 2 | + | ||
| 3 | +Global options: --host, --port, --account | ||
| 4 | +Output: JSON (ensure_ascii=False) | ||
| 5 | +Exit codes: 0 = success, 1 = not logged in, 2 = error | ||
| 6 | +""" | ||
| 7 | + | ||
| 8 | +from __future__ import annotations | ||
| 9 | + | ||
| 10 | +import argparse | ||
| 11 | +import json | ||
| 12 | +import logging | ||
| 13 | +import sys | ||
| 14 | + | ||
| 15 | +logging.basicConfig( | ||
| 16 | + level=logging.INFO, | ||
| 17 | + format="%(asctime)s %(levelname)s %(name)s: %(message)s", | ||
| 18 | +) | ||
| 19 | +logger = logging.getLogger("xhs-cli") | ||
| 20 | + | ||
| 21 | + | ||
| 22 | +def _output(data: dict, exit_code: int = 0) -> None: | ||
| 23 | +    """Print the data as JSON and exit.""" | ||
| 24 | + print(json.dumps(data, ensure_ascii=False, indent=2)) | ||
| 25 | + sys.exit(exit_code) | ||
| 26 | + | ||
| 27 | + | ||
| 28 | +def _connect(args: argparse.Namespace): | ||
| 29 | +    """Connect to Chrome and return (browser, page).""" | ||
| 30 | + from xhs.cdp import Browser | ||
| 31 | + | ||
| 32 | + browser = Browser(host=args.host, port=args.port) | ||
| 33 | + browser.connect() | ||
| 34 | + page = browser.new_page() | ||
| 35 | + return browser, page | ||
| 36 | + | ||
| 37 | + | ||
| 38 | +# ========== 子命令实现 ========== | ||
| 39 | + | ||
| 40 | + | ||
| 41 | +def cmd_check_login(args: argparse.Namespace) -> None: | ||
| 42 | + """检查登录状态。""" | ||
| 43 | + from xhs.login import check_login_status | ||
| 44 | + | ||
| 45 | + browser, page = _connect(args) | ||
| 46 | + try: | ||
| 47 | + logged_in = check_login_status(page) | ||
| 48 | + _output({"logged_in": logged_in}, exit_code=0 if logged_in else 1) | ||
| 49 | + finally: | ||
| 50 | + browser.close_page(page) | ||
| 51 | + browser.close() | ||
| 52 | + | ||
| 53 | + | ||
| 54 | +def cmd_login(args: argparse.Namespace) -> None: | ||
| 55 | + """获取登录二维码并等待扫码。""" | ||
| 56 | + from xhs.login import fetch_qrcode, save_qrcode_to_file, wait_for_login | ||
| 57 | + | ||
| 58 | + browser, page = _connect(args) | ||
| 59 | + try: | ||
| 60 | + src, already = fetch_qrcode(page) | ||
| 61 | + if already: | ||
| 62 | + _output({"logged_in": True, "message": "已登录"}) | ||
| 63 | + else: | ||
| 64 | + # 保存二维码到临时文件 | ||
| 65 | + qrcode_path = save_qrcode_to_file(src) | ||
| 66 | + print( | ||
| 67 | + json.dumps( | ||
| 68 | + { | ||
| 69 | + "qrcode_path": qrcode_path, | ||
| 70 | + "message": "请扫码登录,二维码已保存到文件", | ||
| 71 | + }, | ||
| 72 | + ensure_ascii=False, | ||
| 73 | + ) | ||
| 74 | + ) | ||
| 75 | + success = wait_for_login(page, timeout=120) | ||
| 76 | + _output( | ||
| 77 | + {"logged_in": success, "message": "登录成功" if success else "登录超时"}, | ||
| 78 | + exit_code=0 if success else 2, | ||
| 79 | + ) | ||
| 80 | + finally: | ||
| 81 | + browser.close_page(page) | ||
| 82 | + browser.close() | ||
| 83 | + | ||
| 84 | + | ||
| 85 | +def cmd_delete_cookies(args: argparse.Namespace) -> None: | ||
| 86 | + """删除 cookies。""" | ||
| 87 | + from xhs.cookies import delete_cookies, get_cookies_file_path | ||
| 88 | + | ||
| 89 | + path = get_cookies_file_path(args.account) | ||
| 90 | + delete_cookies(path) | ||
| 91 | + _output({"success": True, "message": f"已删除 cookies: {path}"}) | ||
| 92 | + | ||
| 93 | + | ||
| 94 | +def cmd_list_feeds(args: argparse.Namespace) -> None: | ||
| 95 | + """获取首页 Feed 列表。""" | ||
| 96 | + from xhs.feeds import list_feeds | ||
| 97 | + | ||
| 98 | + browser, page = _connect(args) | ||
| 99 | + try: | ||
| 100 | + feeds = list_feeds(page) | ||
| 101 | + _output({"feeds": [f.to_dict() for f in feeds], "count": len(feeds)}) | ||
| 102 | + finally: | ||
| 103 | + browser.close_page(page) | ||
| 104 | + browser.close() | ||
| 105 | + | ||
| 106 | + | ||
| 107 | +def cmd_search_feeds(args: argparse.Namespace) -> None: | ||
| 108 | + """搜索 Feeds。""" | ||
| 109 | + from xhs.search import search_feeds | ||
| 110 | + from xhs.types import FilterOption | ||
| 111 | + | ||
| 112 | + filter_opt = FilterOption( | ||
| 113 | + sort_by=args.sort_by or "", | ||
| 114 | + note_type=args.note_type or "", | ||
| 115 | + publish_time=args.publish_time or "", | ||
| 116 | + search_scope=args.search_scope or "", | ||
| 117 | + location=args.location or "", | ||
| 118 | + ) | ||
| 119 | + | ||
| 120 | + browser, page = _connect(args) | ||
| 121 | + try: | ||
| 122 | + feeds = search_feeds(page, args.keyword, filter_opt) | ||
| 123 | + _output({"feeds": [f.to_dict() for f in feeds], "count": len(feeds)}) | ||
| 124 | + finally: | ||
| 125 | + browser.close_page(page) | ||
| 126 | + browser.close() | ||
| 127 | + | ||
| 128 | + | ||
| 129 | +def cmd_get_feed_detail(args: argparse.Namespace) -> None: | ||
| 130 | + """获取 Feed 详情。""" | ||
| 131 | + from xhs.feed_detail import get_feed_detail | ||
| 132 | + from xhs.types import CommentLoadConfig | ||
| 133 | + | ||
| 134 | + config = CommentLoadConfig( | ||
| 135 | + click_more_replies=args.click_more_replies, | ||
| 136 | + max_replies_threshold=args.max_replies_threshold, | ||
| 137 | + max_comment_items=args.max_comment_items, | ||
| 138 | + scroll_speed=args.scroll_speed, | ||
| 139 | + ) | ||
| 140 | + | ||
| 141 | + browser, page = _connect(args) | ||
| 142 | + try: | ||
| 143 | + detail = get_feed_detail( | ||
| 144 | + page, | ||
| 145 | + args.feed_id, | ||
| 146 | + args.xsec_token, | ||
| 147 | + load_all_comments=args.load_all_comments, | ||
| 148 | + config=config, | ||
| 149 | + ) | ||
| 150 | + _output(detail.to_dict()) | ||
| 151 | + finally: | ||
| 152 | + browser.close_page(page) | ||
| 153 | + browser.close() | ||
| 154 | + | ||
| 155 | + | ||
| 156 | +def cmd_user_profile(args: argparse.Namespace) -> None: | ||
| 157 | + """获取用户主页。""" | ||
| 158 | + from xhs.user_profile import get_user_profile | ||
| 159 | + | ||
| 160 | + browser, page = _connect(args) | ||
| 161 | + try: | ||
| 162 | + profile = get_user_profile(page, args.user_id, args.xsec_token) | ||
| 163 | + _output(profile.to_dict()) | ||
| 164 | + finally: | ||
| 165 | + browser.close_page(page) | ||
| 166 | + browser.close() | ||
| 167 | + | ||
| 168 | + | ||
| 169 | +def cmd_post_comment(args: argparse.Namespace) -> None: | ||
| 170 | + """发表评论。""" | ||
| 171 | + from xhs.comment import post_comment | ||
| 172 | + | ||
| 173 | + browser, page = _connect(args) | ||
| 174 | + try: | ||
| 175 | + post_comment(page, args.feed_id, args.xsec_token, args.content) | ||
| 176 | + _output({"success": True, "message": "评论发送成功"}) | ||
| 177 | + finally: | ||
| 178 | + browser.close_page(page) | ||
| 179 | + browser.close() | ||
| 180 | + | ||
| 181 | + | ||
| 182 | +def cmd_reply_comment(args: argparse.Namespace) -> None: | ||
| 183 | + """回复评论。""" | ||
| 184 | + from xhs.comment import reply_comment | ||
| 185 | + | ||
| 186 | + browser, page = _connect(args) | ||
| 187 | + try: | ||
| 188 | + reply_comment( | ||
| 189 | + page, | ||
| 190 | + args.feed_id, | ||
| 191 | + args.xsec_token, | ||
| 192 | + args.content, | ||
| 193 | + comment_id=args.comment_id or "", | ||
| 194 | + user_id=args.user_id or "", | ||
| 195 | + ) | ||
| 196 | + _output({"success": True, "message": "回复成功"}) | ||
| 197 | + finally: | ||
| 198 | + browser.close_page(page) | ||
| 199 | + browser.close() | ||
| 200 | + | ||
| 201 | + | ||
| 202 | +def cmd_like_feed(args: argparse.Namespace) -> None: | ||
| 203 | + """点赞/取消点赞。""" | ||
| 204 | + from xhs.like_favorite import like_feed, unlike_feed | ||
| 205 | + | ||
| 206 | + browser, page = _connect(args) | ||
| 207 | + try: | ||
| 208 | + if args.unlike: | ||
| 209 | + result = unlike_feed(page, args.feed_id, args.xsec_token) | ||
| 210 | + else: | ||
| 211 | + result = like_feed(page, args.feed_id, args.xsec_token) | ||
| 212 | + _output(result.to_dict()) | ||
| 213 | + finally: | ||
| 214 | + browser.close_page(page) | ||
| 215 | + browser.close() | ||
| 216 | + | ||
| 217 | + | ||
| 218 | +def cmd_favorite_feed(args: argparse.Namespace) -> None: | ||
| 219 | + """收藏/取消收藏。""" | ||
| 220 | + from xhs.like_favorite import favorite_feed, unfavorite_feed | ||
| 221 | + | ||
| 222 | + browser, page = _connect(args) | ||
| 223 | + try: | ||
| 224 | + if args.unfavorite: | ||
| 225 | + result = unfavorite_feed(page, args.feed_id, args.xsec_token) | ||
| 226 | + else: | ||
| 227 | + result = favorite_feed(page, args.feed_id, args.xsec_token) | ||
| 228 | + _output(result.to_dict()) | ||
| 229 | + finally: | ||
| 230 | + browser.close_page(page) | ||
| 231 | + browser.close() | ||
| 232 | + | ||
| 233 | + | ||
| 234 | +def cmd_publish(args: argparse.Namespace) -> None: | ||
| 235 | + """发布图文内容。""" | ||
| 236 | + from image_downloader import process_images | ||
| 237 | + from xhs.publish import publish_image_content | ||
| 238 | + from xhs.types import PublishImageContent | ||
| 239 | + | ||
| 240 | + # 读取标题和正文 | ||
| 241 | + with open(args.title_file, encoding="utf-8") as f: | ||
| 242 | + title = f.read().strip() | ||
| 243 | + with open(args.content_file, encoding="utf-8") as f: | ||
| 244 | + content = f.read().strip() | ||
| 245 | + | ||
| 246 | + # 处理图片 | ||
| 247 | + image_paths = process_images(args.images) if args.images else [] | ||
| 248 | + if not image_paths: | ||
| 249 | + _output({"success": False, "error": "没有有效的图片"}, exit_code=2) | ||
| 250 | + | ||
| 251 | + browser, page = _connect(args) | ||
| 252 | + try: | ||
| 253 | + publish_image_content( | ||
| 254 | + page, | ||
| 255 | + PublishImageContent( | ||
| 256 | + title=title, | ||
| 257 | + content=content, | ||
| 258 | + tags=args.tags or [], | ||
| 259 | + image_paths=image_paths, | ||
| 260 | + schedule_time=args.schedule_at, | ||
| 261 | + is_original=args.original, | ||
| 262 | + visibility=args.visibility or "", | ||
| 263 | + ), | ||
| 264 | + ) | ||
| 265 | + _output({"success": True, "title": title, "images": len(image_paths), "status": "发布完成"}) | ||
| 266 | + finally: | ||
| 267 | + browser.close_page(page) | ||
| 268 | + browser.close() | ||
| 269 | + | ||
| 270 | + | ||
| 271 | +def cmd_publish_video(args: argparse.Namespace) -> None: | ||
| 272 | + """发布视频内容。""" | ||
| 273 | + from xhs.publish_video import publish_video_content | ||
| 274 | + from xhs.types import PublishVideoContent | ||
| 275 | + | ||
| 276 | + with open(args.title_file, encoding="utf-8") as f: | ||
| 277 | + title = f.read().strip() | ||
| 278 | + with open(args.content_file, encoding="utf-8") as f: | ||
| 279 | + content = f.read().strip() | ||
| 280 | + | ||
| 281 | + browser, page = _connect(args) | ||
| 282 | + try: | ||
| 283 | + publish_video_content( | ||
| 284 | + page, | ||
| 285 | + PublishVideoContent( | ||
| 286 | + title=title, | ||
| 287 | + content=content, | ||
| 288 | + tags=args.tags or [], | ||
| 289 | + video_path=args.video, | ||
| 290 | + schedule_time=args.schedule_at, | ||
| 291 | + visibility=args.visibility or "", | ||
| 292 | + ), | ||
| 293 | + ) | ||
| 294 | + _output({"success": True, "title": title, "video": args.video, "status": "发布完成"}) | ||
| 295 | + finally: | ||
| 296 | + browser.close_page(page) | ||
| 297 | + browser.close() | ||
| 298 | + | ||
| 299 | + | ||
| 300 | +# ========== 参数解析 ========== | ||
| 301 | + | ||
| 302 | + | ||
| 303 | +def build_parser() -> argparse.ArgumentParser: | ||
| 304 | + """构建 CLI 参数解析器。""" | ||
| 305 | + parser = argparse.ArgumentParser( | ||
| 306 | + prog="xhs-cli", | ||
| 307 | + description="小红书自动化 CLI", | ||
| 308 | + ) | ||
| 309 | + | ||
| 310 | + # 全局选项 | ||
| 311 | + parser.add_argument("--host", default="127.0.0.1", help="Chrome 调试主机 (default: 127.0.0.1)") | ||
| 312 | + parser.add_argument("--port", type=int, default=9222, help="Chrome 调试端口 (default: 9222)") | ||
| 313 | + parser.add_argument("--account", default="", help="账号名称") | ||
| 314 | + | ||
| 315 | + subparsers = parser.add_subparsers(dest="command", required=True) | ||
| 316 | + | ||
| 317 | + # check-login | ||
| 318 | + sub = subparsers.add_parser("check-login", help="检查登录状态") | ||
| 319 | + sub.set_defaults(func=cmd_check_login) | ||
| 320 | + | ||
| 321 | + # login | ||
| 322 | + sub = subparsers.add_parser("login", help="登录(扫码)") | ||
| 323 | + sub.set_defaults(func=cmd_login) | ||
| 324 | + | ||
| 325 | + # delete-cookies | ||
| 326 | + sub = subparsers.add_parser("delete-cookies", help="删除 cookies") | ||
| 327 | + sub.set_defaults(func=cmd_delete_cookies) | ||
| 328 | + | ||
| 329 | + # list-feeds | ||
| 330 | + sub = subparsers.add_parser("list-feeds", help="获取首页 Feed 列表") | ||
| 331 | + sub.set_defaults(func=cmd_list_feeds) | ||
| 332 | + | ||
| 333 | + # search-feeds | ||
| 334 | + sub = subparsers.add_parser("search-feeds", help="搜索 Feeds") | ||
| 335 | + sub.add_argument("--keyword", required=True, help="搜索关键词") | ||
| 336 | + sub.add_argument("--sort-by", help="排序: 综合|最新|最多点赞|最多评论|最多收藏") | ||
| 337 | + sub.add_argument("--note-type", help="类型: 不限|视频|图文") | ||
| 338 | + sub.add_argument("--publish-time", help="时间: 不限|一天内|一周内|半年内") | ||
| 339 | + sub.add_argument("--search-scope", help="范围: 不限|已看过|未看过|已关注") | ||
| 340 | + sub.add_argument("--location", help="位置: 不限|同城|附近") | ||
| 341 | + sub.set_defaults(func=cmd_search_feeds) | ||
| 342 | + | ||
| 343 | + # get-feed-detail | ||
| 344 | + sub = subparsers.add_parser("get-feed-detail", help="获取 Feed 详情") | ||
| 345 | + sub.add_argument("--feed-id", required=True, help="Feed ID") | ||
| 346 | + sub.add_argument("--xsec-token", required=True, help="xsec_token") | ||
| 347 | + sub.add_argument("--load-all-comments", action="store_true", help="加载全部评论") | ||
| 348 | + sub.add_argument("--click-more-replies", action="store_true", help="点击展开更多回复") | ||
| 349 | + sub.add_argument("--max-replies-threshold", type=int, default=10, help="展开回复数阈值") | ||
| 350 | + sub.add_argument("--max-comment-items", type=int, default=0, help="最大评论数 (0=不限)") | ||
| 351 | + sub.add_argument("--scroll-speed", default="normal", help="滚动速度: slow|normal|fast") | ||
| 352 | + sub.set_defaults(func=cmd_get_feed_detail) | ||
| 353 | + | ||
| 354 | + # user-profile | ||
| 355 | + sub = subparsers.add_parser("user-profile", help="获取用户主页") | ||
| 356 | + sub.add_argument("--user-id", required=True, help="用户 ID") | ||
| 357 | + sub.add_argument("--xsec-token", required=True, help="xsec_token") | ||
| 358 | + sub.set_defaults(func=cmd_user_profile) | ||
| 359 | + | ||
| 360 | + # post-comment | ||
| 361 | + sub = subparsers.add_parser("post-comment", help="发表评论") | ||
| 362 | + sub.add_argument("--feed-id", required=True, help="Feed ID") | ||
| 363 | + sub.add_argument("--xsec-token", required=True, help="xsec_token") | ||
| 364 | + sub.add_argument("--content", required=True, help="评论内容") | ||
| 365 | + sub.set_defaults(func=cmd_post_comment) | ||
| 366 | + | ||
| 367 | + # reply-comment | ||
| 368 | + sub = subparsers.add_parser("reply-comment", help="回复评论") | ||
| 369 | + sub.add_argument("--feed-id", required=True, help="Feed ID") | ||
| 370 | + sub.add_argument("--xsec-token", required=True, help="xsec_token") | ||
| 371 | + sub.add_argument("--content", required=True, help="回复内容") | ||
| 372 | + sub.add_argument("--comment-id", help="目标评论 ID") | ||
| 373 | + sub.add_argument("--user-id", help="目标用户 ID") | ||
| 374 | + sub.set_defaults(func=cmd_reply_comment) | ||
| 375 | + | ||
| 376 | + # like-feed | ||
| 377 | + sub = subparsers.add_parser("like-feed", help="点赞") | ||
| 378 | + sub.add_argument("--feed-id", required=True, help="Feed ID") | ||
| 379 | + sub.add_argument("--xsec-token", required=True, help="xsec_token") | ||
| 380 | + sub.add_argument("--unlike", action="store_true", help="取消点赞") | ||
| 381 | + sub.set_defaults(func=cmd_like_feed) | ||
| 382 | + | ||
| 383 | + # favorite-feed | ||
| 384 | + sub = subparsers.add_parser("favorite-feed", help="收藏") | ||
| 385 | + sub.add_argument("--feed-id", required=True, help="Feed ID") | ||
| 386 | + sub.add_argument("--xsec-token", required=True, help="xsec_token") | ||
| 387 | + sub.add_argument("--unfavorite", action="store_true", help="取消收藏") | ||
| 388 | + sub.set_defaults(func=cmd_favorite_feed) | ||
| 389 | + | ||
| 390 | + # publish | ||
| 391 | + sub = subparsers.add_parser("publish", help="发布图文") | ||
| 392 | + sub.add_argument("--title-file", required=True, help="标题文件路径") | ||
| 393 | + sub.add_argument("--content-file", required=True, help="正文文件路径") | ||
| 394 | + sub.add_argument("--images", nargs="+", required=True, help="图片路径/URL") | ||
| 395 | + sub.add_argument("--tags", nargs="*", help="标签") | ||
| 396 | + sub.add_argument("--schedule-at", help="定时发布 (ISO8601)") | ||
| 397 | + sub.add_argument("--original", action="store_true", help="声明原创") | ||
| 398 | + sub.add_argument("--visibility", help="可见范围") | ||
| 399 | + sub.set_defaults(func=cmd_publish) | ||
| 400 | + | ||
| 401 | + # publish-video | ||
| 402 | + sub = subparsers.add_parser("publish-video", help="发布视频") | ||
| 403 | + sub.add_argument("--title-file", required=True, help="标题文件路径") | ||
| 404 | + sub.add_argument("--content-file", required=True, help="正文文件路径") | ||
| 405 | + sub.add_argument("--video", required=True, help="视频文件路径") | ||
| 406 | + sub.add_argument("--tags", nargs="*", help="标签") | ||
| 407 | + sub.add_argument("--schedule-at", help="定时发布 (ISO8601)") | ||
| 408 | + sub.add_argument("--visibility", help="可见范围") | ||
| 409 | + sub.set_defaults(func=cmd_publish_video) | ||
| 410 | + | ||
| 411 | + return parser | ||
| 412 | + | ||
| 413 | + | ||
| 414 | +def main() -> None: | ||
| 415 | + """CLI 入口。""" | ||
| 416 | + parser = build_parser() | ||
| 417 | + args = parser.parse_args() | ||
| 418 | + | ||
| 419 | + try: | ||
| 420 | + args.func(args) | ||
| 421 | + except Exception as e: | ||
| 422 | + logger.error("执行失败: %s", e, exc_info=True) | ||
| 423 | + _output({"success": False, "error": str(e)}, exit_code=2) | ||
| 424 | + | ||
| 425 | + | ||
| 426 | +if __name__ == "__main__": | ||
| 427 | + main() |
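Review note: the parser above attaches each handler with `set_defaults(func=...)` and declares `--host`, `--port`, and `--account` on the top-level parser. A minimal sketch of that dispatch pattern, with a hypothetical `echo` subcommand standing in for the real handlers (nothing here talks to Chrome):

```python
import argparse
import json


def cmd_echo(args: argparse.Namespace) -> dict:
    """Hypothetical handler: returns its input instead of driving a browser."""
    return {"text": args.text, "port": args.port}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo-cli")
    # Global option, declared on the top-level parser only.
    parser.add_argument("--port", type=int, default=9222)
    subparsers = parser.add_subparsers(dest="command", required=True)
    sub = subparsers.add_parser("echo")
    sub.add_argument("--text", required=True)
    # The handler travels with the parsed namespace.
    sub.set_defaults(func=cmd_echo)
    return parser


# Global options go before the subcommand name.
args = build_parser().parse_args(["--port", "9333", "echo", "--text", "hi"])
result = args.func(args)
print(json.dumps(result, ensure_ascii=False))
```

Because the global options live on the top-level parser, they need to appear before the subcommand name on the command line; placed after it, they are handed to the subparser, which does not know them.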
scripts/image_downloader.py
0 → 100644
| 1 | +"""Media downloader with a SHA256-keyed cache; mirrors Go pkg/downloader/images.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import hashlib | ||
| 6 | +import logging | ||
| 7 | +import os | ||
| 8 | +import time | ||
| 9 | +from urllib.parse import urlparse | ||
| 10 | + | ||
| 11 | +import requests | ||
| 12 | + | ||
| 13 | +logger = logging.getLogger(__name__) | ||
| 14 | + | ||
| 15 | +_USER_AGENT = ( | ||
| 16 | + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) " | ||
| 17 | + "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" | ||
| 18 | +) | ||
| 19 | + | ||
| 20 | +# 已知图片扩展名 | ||
| 21 | +_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp", ".svg"} | ||
| 22 | + | ||
| 23 | + | ||
| 24 | +def is_image_url(path: str) -> bool: | ||
| 25 | + """判断字符串是否为图片/媒体 URL。""" | ||
| 26 | + return path.lower().startswith(("http://", "https://")) | ||
| 27 | + | ||
| 28 | + | ||
| 29 | +class ImageDownloader: | ||
| 30 | + """图片下载器(带 SHA256 缓存)。""" | ||
| 31 | + | ||
| 32 | + def __init__(self, save_path: str) -> None: | ||
| 33 | + self.save_path = save_path | ||
| 34 | + os.makedirs(save_path, exist_ok=True) | ||
| 35 | + self._session = requests.Session() | ||
| 36 | + self._session.timeout = 30 | ||
| 37 | + | ||
| 38 | + def download_image(self, image_url: str) -> str: | ||
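| 39 | +        """Download a single image and return its local file path. | ||
| 40 | + | ||
| 41 | +        If a file for this URL already exists (matched by URL hash), return its path directly. | ||
| 42 | + | ||
| 43 | +        Raises: | ||
| 44 | +            ValueError: The URL is malformed. | ||
| 45 | +            RuntimeError: The download failed. | ||
| 46 | +        """ | ||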
| 47 | + if not is_image_url(image_url): | ||
| 48 | + raise ValueError(f"无效的图片 URL: {image_url}") | ||
| 49 | + | ||
| 50 | + # 生成文件名 | ||
| 51 | + url_hash = hashlib.sha256(image_url.encode()).hexdigest()[:16] | ||
| 52 | + ext = self._detect_extension(image_url) | ||
| 53 | + filename = f"img_{url_hash}_{int(time.time())}{ext}" | ||
| 54 | + filepath = os.path.join(self.save_path, filename) | ||
| 55 | + | ||
| 56 | + # 检查是否已有同 hash 的文件 | ||
| 57 | + existing = self._find_existing(url_hash) | ||
| 58 | + if existing: | ||
| 59 | + return existing | ||
| 60 | + | ||
| 61 | + # 下载 | ||
| 62 | + parsed = urlparse(image_url) | ||
| 63 | + headers = { | ||
| 64 | + "User-Agent": _USER_AGENT, | ||
| 65 | + "Referer": f"{parsed.scheme}://{parsed.hostname}/", | ||
| 66 | + } | ||
| 67 | + | ||
| 68 | +        resp = self._session.get(image_url, headers=headers, timeout=30) | ||
| 69 | + if resp.status_code != 200: | ||
| 70 | + raise RuntimeError(f"下载失败 (status={resp.status_code}): {image_url}") | ||
| 71 | + | ||
| 72 | + # 保存 | ||
| 73 | + with open(filepath, "wb") as f: | ||
| 74 | + f.write(resp.content) | ||
| 75 | + | ||
| 76 | + logger.info("下载完成: %s -> %s", image_url, filepath) | ||
| 77 | + return filepath | ||
| 78 | + | ||
| 79 | + def download_images(self, image_urls: list[str]) -> list[str]: | ||
| 80 | + """批量下载图片。""" | ||
| 81 | + paths = [] | ||
| 82 | + for url in image_urls: | ||
| 83 | + try: | ||
| 84 | + path = self.download_image(url) | ||
| 85 | + paths.append(path) | ||
| 86 | + except Exception as e: | ||
| 87 | + logger.error("下载失败 %s: %s", url, e) | ||
| 88 | + return paths | ||
| 89 | + | ||
| 90 | + def _detect_extension(self, url: str) -> str: | ||
| 91 | + """从 URL 推断文件扩展名。""" | ||
| 92 | + parsed = urlparse(url) | ||
| 93 | + path = parsed.path.lower() | ||
| 94 | + for ext in _IMAGE_EXTENSIONS: | ||
| 95 | + if path.endswith(ext): | ||
| 96 | + return ext | ||
| 97 | + return ".jpg" # 默认 | ||
| 98 | + | ||
| 99 | + def _find_existing(self, url_hash: str) -> str | None: | ||
| 100 | + """查找已有同 hash 的文件。""" | ||
| 101 | + prefix = f"img_{url_hash}_" | ||
| 102 | + for filename in os.listdir(self.save_path): | ||
| 103 | + if filename.startswith(prefix): | ||
| 104 | + return os.path.join(self.save_path, filename) | ||
| 105 | + return None | ||
| 106 | + | ||
| 107 | + | ||
| 108 | +def process_images(images: list[str], save_dir: str | None = None) -> list[str]: | ||
| 109 | +    """Process a list of images: download URLs, pass existing local paths through.""" | ||
| 110 | + if not save_dir: | ||
| 111 | + save_dir = os.path.join(os.path.expanduser("~"), ".xhs", "images") | ||
| 112 | + | ||
| 113 | + downloader = ImageDownloader(save_dir) | ||
| 114 | + result = [] | ||
| 115 | + | ||
| 116 | + for img in images: | ||
| 117 | + if is_image_url(img): | ||
| 118 | + path = downloader.download_image(img) | ||
| 119 | + result.append(path) | ||
| 120 | + else: | ||
| 121 | + # 本地路径 | ||
| 122 | + if os.path.exists(img): | ||
| 123 | + result.append(os.path.abspath(img)) | ||
| 124 | + else: | ||
| 125 | + logger.warning("文件不存在: %s", img) | ||
| 126 | + | ||
| 127 | + return result |
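Review note: the cache lookup keyed on a URL hash can be sketched on its own. This variant assumes the timestamp is dropped from the filename, so that a cache hit becomes a single `os.path.exists` check rather than a directory scan; `cache_path` and `IMAGE_EXTENSIONS` are illustrative names, not part of this commit:

```python
import hashlib
import os
from urllib.parse import urlparse

# Same extensions as the downloader above, as an ordered tuple.
IMAGE_EXTENSIONS = (".jpeg", ".jpg", ".png", ".gif", ".webp", ".bmp", ".svg")


def cache_path(save_dir: str, url: str) -> str:
    """Map a URL to a deterministic local filename inside save_dir."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    # Only the path component decides the extension; the query string is ignored.
    path = urlparse(url).path.lower()
    ext = next((e for e in IMAGE_EXTENSIONS if path.endswith(e)), ".jpg")
    return os.path.join(save_dir, f"img_{digest}{ext}")


target = cache_path("/tmp/xhs-images", "https://example.com/photos/cat.PNG?size=large")
print(target)
print("cached" if os.path.exists(target) else "needs download")
```

The trade-off is that a deterministic name no longer records when the file was fetched; if that matters, the file's modification time can serve the same purpose.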
scripts/publish_pipeline.py
0 → 100644
| 1 | +"""Publish orchestrator: download → login check → publish → report.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import sys | ||
| 8 | + | ||
| 9 | +from image_downloader import process_images | ||
| 10 | +from title_utils import calc_title_length | ||
| 11 | +from xhs.cdp import Browser | ||
| 12 | +from xhs.login import check_login_status | ||
| 13 | +from xhs.publish import publish_image_content | ||
| 14 | +from xhs.publish_video import publish_video_content | ||
| 15 | +from xhs.types import PublishImageContent, PublishVideoContent | ||
| 16 | + | ||
| 17 | +logger = logging.getLogger(__name__) | ||
| 18 | + | ||
| 19 | + | ||
| 20 | +def run_publish_pipeline( | ||
| 21 | + title: str, | ||
| 22 | + content: str, | ||
| 23 | + images: list[str] | None = None, | ||
| 24 | + video: str | None = None, | ||
| 25 | + tags: list[str] | None = None, | ||
| 26 | + schedule_time: str | None = None, | ||
| 27 | + is_original: bool = False, | ||
| 28 | + visibility: str = "", | ||
| 29 | + host: str = "127.0.0.1", | ||
| 30 | + port: int = 9222, | ||
| 31 | + account: str = "", | ||
| 32 | +) -> dict: | ||
| 33 | +    """Run the full publish pipeline. | ||
| 34 | + | ||
| 35 | +    Returns: | ||
| 36 | +        A dict describing the publish result. | ||
| 37 | +    """ | ||
| 38 | + # 标题长度校验 | ||
| 39 | + title_len = calc_title_length(title) | ||
| 40 | + if title_len > 20: | ||
| 41 | + return {"success": False, "error": f"标题长度超限: {title_len}/20"} | ||
| 42 | + | ||
| 43 | + # 处理图片(下载 URL / 验证本地路径) | ||
| 44 | + local_images: list[str] = [] | ||
| 45 | + if images: | ||
| 46 | + local_images = process_images(images) | ||
| 47 | + if not local_images: | ||
| 48 | + return {"success": False, "error": "没有有效的图片"} | ||
| 49 | + | ||
| 50 | + # 连接浏览器 | ||
| 51 | + browser = Browser(host=host, port=port) | ||
| 52 | + browser.connect() | ||
| 53 | + | ||
| 54 | + try: | ||
| 55 | + page = browser.new_page() | ||
| 56 | + try: | ||
| 57 | + # 登录检查 | ||
| 58 | + if not check_login_status(page): | ||
| 59 | + return {"success": False, "error": "未登录", "exit_code": 1} | ||
| 60 | + | ||
| 61 | + # 发布 | ||
| 62 | + if video: | ||
| 63 | + publish_video_content( | ||
| 64 | + page, | ||
| 65 | + PublishVideoContent( | ||
| 66 | + title=title, | ||
| 67 | + content=content, | ||
| 68 | + tags=tags or [], | ||
| 69 | + video_path=video, | ||
| 70 | + schedule_time=schedule_time, | ||
| 71 | + visibility=visibility, | ||
| 72 | + ), | ||
| 73 | + ) | ||
| 74 | + else: | ||
| 75 | + publish_image_content( | ||
| 76 | + page, | ||
| 77 | + PublishImageContent( | ||
| 78 | + title=title, | ||
| 79 | + content=content, | ||
| 80 | + tags=tags or [], | ||
| 81 | + image_paths=local_images, | ||
| 82 | + schedule_time=schedule_time, | ||
| 83 | + is_original=is_original, | ||
| 84 | + visibility=visibility, | ||
| 85 | + ), | ||
| 86 | + ) | ||
| 87 | + | ||
| 88 | + return { | ||
| 89 | + "success": True, | ||
| 90 | + "title": title, | ||
| 91 | + "content_length": len(content), | ||
| 92 | + "images": len(local_images), | ||
| 93 | + "video": video or "", | ||
| 94 | + "status": "发布完成", | ||
| 95 | + } | ||
| 96 | + | ||
| 97 | + finally: | ||
| 98 | + browser.close_page(page) | ||
| 99 | + finally: | ||
| 100 | + browser.close() | ||
| 101 | + | ||
| 102 | + | ||
| 103 | +def main() -> None: | ||
| 104 | + """CLI 入口(被 cli.py 的 publish/publish-video 子命令调用时使用)。""" | ||
| 105 | + import argparse | ||
| 106 | + | ||
| 107 | + parser = argparse.ArgumentParser(description="小红书发布流水线") | ||
| 108 | + parser.add_argument("--title-file", required=True, help="标题文件路径") | ||
| 109 | + parser.add_argument("--content-file", required=True, help="正文文件路径") | ||
| 110 | + parser.add_argument("--images", nargs="*", help="图片路径或 URL 列表") | ||
| 111 | + parser.add_argument("--video", help="视频文件路径") | ||
| 112 | + parser.add_argument("--tags", nargs="*", help="标签列表") | ||
| 113 | + parser.add_argument("--schedule-at", help="定时发布时间 (ISO8601)") | ||
| 114 | + parser.add_argument("--original", action="store_true", help="声明原创") | ||
| 115 | + parser.add_argument("--visibility", default="", help="可见范围") | ||
| 116 | + parser.add_argument("--host", default="127.0.0.1") | ||
| 117 | + parser.add_argument("--port", type=int, default=9222) | ||
| 118 | + parser.add_argument("--account", default="") | ||
| 119 | + args = parser.parse_args() | ||
| 120 | + | ||
| 121 | + # 读取标题和正文 | ||
| 122 | + with open(args.title_file, encoding="utf-8") as f: | ||
| 123 | + title = f.read().strip() | ||
| 124 | + with open(args.content_file, encoding="utf-8") as f: | ||
| 125 | + content = f.read().strip() | ||
| 126 | + | ||
| 127 | + result = run_publish_pipeline( | ||
| 128 | + title=title, | ||
| 129 | + content=content, | ||
| 130 | + images=args.images, | ||
| 131 | + video=args.video, | ||
| 132 | + tags=args.tags, | ||
| 133 | + schedule_time=args.schedule_at, | ||
| 134 | + is_original=args.original, | ||
| 135 | + visibility=args.visibility, | ||
| 136 | + host=args.host, | ||
| 137 | + port=args.port, | ||
| 138 | + account=args.account, | ||
| 139 | + ) | ||
| 140 | + | ||
| 141 | + print(json.dumps(result, ensure_ascii=False, indent=2)) | ||
| 142 | +    sys.exit(0 if result["success"] else result.get("exit_code", 2)) | ||
| 143 | + | ||
| 144 | + | ||
| 145 | +if __name__ == "__main__": | ||
| 146 | + main() |
scripts/run_lock.py
0 → 100644
| 1 | +"""Single-instance lock that keeps multiple processes from driving the browser at once.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import contextlib | ||
| 6 | +import logging | ||
| 7 | +import os | ||
| 8 | +import time | ||
| 9 | + | ||
| 10 | +logger = logging.getLogger(__name__) | ||
| 11 | + | ||
| 12 | +_DEFAULT_LOCK_FILE = os.path.join(os.path.expanduser("~"), ".xhs", "run.lock") | ||
| 13 | + | ||
| 14 | + | ||
| 15 | +class RunLock: | ||
| 16 | +    """File lock that ensures only one process operates at a time.""" | ||
| 17 | + | ||
| 18 | + def __init__(self, lock_file: str = _DEFAULT_LOCK_FILE) -> None: | ||
| 19 | + self.lock_file = lock_file | ||
| 20 | + self._fd: int | None = None | ||
| 21 | + | ||
| 22 | + def acquire(self, timeout: float = 30.0) -> bool: | ||
| 23 | +        """Acquire the lock. | ||
| 24 | + | ||
| 25 | +        Args: | ||
| 26 | +            timeout: Timeout in seconds. | ||
| 27 | + | ||
| 28 | +        Returns: | ||
| 29 | +            True if the lock was acquired, False on timeout. | ||
| 30 | +        """ | ||
| 31 | + os.makedirs(os.path.dirname(self.lock_file), exist_ok=True) | ||
| 32 | + deadline = time.monotonic() + timeout | ||
| 33 | + | ||
| 34 | + while time.monotonic() < deadline: | ||
| 35 | + try: | ||
| 36 | + self._fd = os.open( | ||
| 37 | + self.lock_file, | ||
| 38 | + os.O_CREAT | os.O_EXCL | os.O_WRONLY, | ||
| 39 | + ) | ||
| 40 | + # 写入 PID | ||
| 41 | + os.write(self._fd, str(os.getpid()).encode()) | ||
| 42 | + logger.debug("获取锁成功: %s", self.lock_file) | ||
| 43 | + return True | ||
| 44 | + except FileExistsError: | ||
| 45 | + # 检查持有者是否还活着 | ||
| 46 | + if self._is_stale(): | ||
| 47 | + self._force_release() | ||
| 48 | + continue | ||
| 49 | + time.sleep(1) | ||
| 50 | + | ||
| 51 | + logger.warning("获取锁超时: %s", self.lock_file) | ||
| 52 | + return False | ||
| 53 | + | ||
| 54 | + def release(self) -> None: | ||
| 55 | + """释放锁。""" | ||
| 56 | + if self._fd is not None: | ||
| 57 | + with contextlib.suppress(OSError): | ||
| 58 | + os.close(self._fd) | ||
| 59 | + self._fd = None | ||
| 60 | + | ||
| 61 | + with contextlib.suppress(FileNotFoundError): | ||
| 62 | + os.remove(self.lock_file) | ||
| 63 | + | ||
| 64 | + logger.debug("释放锁: %s", self.lock_file) | ||
| 65 | + | ||
| 66 | + def _is_stale(self) -> bool: | ||
| 67 | +        """Check whether the lock file is stale (the holding process has exited).""" | ||
| 68 | +        try: | ||
| 69 | +            with open(self.lock_file) as f: | ||
| 70 | +                os.kill(int(f.read().strip()), 0)  # signal 0 only probes for existence | ||
| 71 | +            return False | ||
| 72 | +        except PermissionError: | ||
| 73 | +            return False  # the holder is alive but owned by another user | ||
| 74 | +        except (FileNotFoundError, ValueError, ProcessLookupError): | ||
| 75 | +            return True | ||
| 76 | + | ||
| 77 | + def _force_release(self) -> None: | ||
| 78 | + """强制释放过时的锁。""" | ||
| 79 | + with contextlib.suppress(FileNotFoundError): | ||
| 80 | + os.remove(self.lock_file) | ||
| 81 | + logger.info("强制释放过时锁: %s", self.lock_file) | ||
| 82 | + | ||
| 83 | + def __enter__(self) -> RunLock: | ||
| 84 | + if not self.acquire(): | ||
| 85 | + raise TimeoutError(f"无法获取锁: {self.lock_file}") | ||
| 86 | + return self | ||
| 87 | + | ||
| 88 | + def __exit__(self, *args: object) -> None: | ||
| 89 | + self.release() |
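
Reviewer note: the lock above relies on `os.open` with `O_CREAT | O_EXCL`, which fails with `FileExistsError` when the file already exists. The following is a minimal standalone sketch of that idea, not the `RunLock` class itself; `try_acquire`, `release`, and the `demo.lock` path are hypothetical names used only for illustration.

```python
from __future__ import annotations

import os
import tempfile


def try_acquire(lock_path: str) -> int | None:
    """Create the lock file atomically; return its fd, or None if it exists."""
    try:
        # O_EXCL makes creation fail if the file is already present.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    # Record the owner PID so another process could check for staleness.
    os.write(fd, str(os.getpid()).encode())
    return fd


def release(lock_path: str, fd: int) -> None:
    """Close the descriptor and remove the lock file."""
    os.close(fd)
    os.remove(lock_path)


lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")
first = try_acquire(lock_path)   # succeeds: the file did not exist
second = try_acquire(lock_path)  # fails: the file is already there
release(lock_path, first)
third = try_acquire(lock_path)   # succeeds again after release
release(lock_path, third)
```

The sketch leaves out the timeout loop and the stale-lock check; it only shows why a second acquirer cannot succeed while the file exists.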
scripts/title_utils.py
0 → 100644
| 1 | +"""UTF-16 title length calculation, corresponding to Go pkg/xhsutil/title.go.""" | ||
| 2 | + | ||
| 3 | + | ||
| 4 | +def calc_title_length(s: str) -> int: | ||
| 5 | +    """Compute the Xiaohongshu title length. | ||
| 6 | + | ||
| 7 | +    Rule: non-ASCII characters (Chinese, full-width symbols, etc.) count as 2 bytes, | ||
| 8 | +    ASCII characters count as 1 byte; the total is divided by 2 and rounded up. | ||
| 9 | + | ||
| 10 | +    Examples: | ||
| 11 | +        >>> calc_title_length("你好世界") | ||
| 12 | +        4 | ||
| 13 | +        >>> calc_title_length("hello") | ||
| 14 | +        3 | ||
| 15 | +        >>> calc_title_length("OOTD穿搭分享") | ||
| 16 | +        6 | ||
| 17 | +    """ | ||
| 18 | +    byte_len = 0 | ||
| 19 | +    # Work on the UTF-16 encoding so surrogate pairs are handled too | ||
| 20 | +    encoded = s.encode("utf-16-le") | ||
| 21 | +    for i in range(0, len(encoded), 2): | ||
| 22 | +        code_unit = int.from_bytes(encoded[i : i + 2], "little") | ||
| 23 | +        if code_unit > 127: | ||
| 24 | +            byte_len += 2 | ||
| 25 | +        else: | ||
| 26 | +            byte_len += 1 | ||
| 27 | +    return (byte_len + 1) // 2 |
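
Reviewer note: because the loop walks UTF-16 code units rather than Python characters, a character outside the Basic Multilingual Plane (an emoji, for example) is seen as two code units and contributes 4 to the weight. The sketch below restates the same rule in a standalone form to make that visible; `utf16_units` and `title_length` are hypothetical helper names, not part of the commit.

```python
def utf16_units(s: str) -> list[int]:
    """Split a string into its UTF-16 code units (little-endian)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i : i + 2], "little") for i in range(0, len(data), 2)]


def title_length(s: str) -> int:
    """Weight 2 per non-ASCII code unit, 1 per ASCII unit, halved rounding up."""
    weight = sum(2 if unit > 127 else 1 for unit in utf16_units(s))
    return (weight + 1) // 2


emoji_units = utf16_units("😀")   # one character, two surrogate code units
emoji_length = title_length("😀")  # weight 4, so length 2
mixed_length = title_length("a😀")  # weight 1 + 4 = 5, so length 3
```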
scripts/xhs/__init__.py
0 → 100644
| 1 | +"""Core package for Xiaohongshu CDP automation.""" |
scripts/xhs/cdp.py
0 → 100644
| 1 | +"""CDP WebSocket client (Browser, Page, Element), corresponding to Go browser/browser.go + the go-rod API. | ||
| 2 | + | ||
| 3 | +Communicates with the Chrome DevTools Protocol over a native WebSocket to drive browser automation. | ||
| 4 | +""" | ||
| 5 | + | ||
| 6 | +from __future__ import annotations | ||
| 7 | + | ||
| 8 | +import json | ||
| 9 | +import logging | ||
| 10 | +import time | ||
| 11 | +from typing import Any | ||
| 12 | + | ||
| 13 | +import requests | ||
| 14 | +import websockets.sync.client as ws_client | ||
| 15 | + | ||
| 16 | +from .errors import CDPError, ElementNotFoundError | ||
| 17 | +from .stealth import STEALTH_JS | ||
| 18 | + | ||
| 19 | +logger = logging.getLogger(__name__) | ||
| 20 | + | ||
| 21 | + | ||
| 22 | +class CDPClient: | ||
| 23 | +    """Low-level CDP WebSocket communication client.""" | ||
| 24 | + | ||
| 25 | +    def __init__(self, ws_url: str) -> None: | ||
| 26 | +        self._ws = ws_client.connect(ws_url, max_size=50 * 1024 * 1024) | ||
| 27 | +        self._id = 0 | ||
| 28 | +        self._callbacks: dict[int, Any] = {} | ||
| 29 | + | ||
| 30 | +    def send(self, method: str, params: dict | None = None) -> dict: | ||
| 31 | +        """Send a CDP command and wait for its result.""" | ||
| 32 | +        self._id += 1 | ||
| 33 | +        msg: dict[str, Any] = {"id": self._id, "method": method} | ||
| 34 | +        if params: | ||
| 35 | +            msg["params"] = params | ||
| 36 | +        self._ws.send(json.dumps(msg)) | ||
| 37 | +        return self._wait_for(self._id) | ||
| 38 | + | ||
| 39 | +    def _wait_for(self, msg_id: int, timeout: float = 30.0) -> dict: | ||
| 40 | +        """Wait for the response with the given id.""" | ||
| 41 | +        deadline = time.monotonic() + timeout | ||
| 42 | +        while time.monotonic() < deadline: | ||
| 43 | +            try: | ||
| 44 | +                raw = self._ws.recv(timeout=max(0.1, deadline - time.monotonic())) | ||
| 45 | +            except TimeoutError: | ||
| 46 | +                break | ||
| 47 | +            data = json.loads(raw) | ||
| 48 | +            if data.get("id") == msg_id: | ||
| 49 | +                if "error" in data: | ||
| 50 | +                    raise CDPError(f"CDP error: {data['error']}") | ||
| 51 | +                return data.get("result", {}) | ||
| 52 | +        raise CDPError(f"Timed out waiting for CDP response (id={msg_id})") | ||
| 53 | + | ||
| 54 | +    def close(self) -> None: | ||
| 55 | +        import contextlib | ||
| 56 | + | ||
| 57 | +        with contextlib.suppress(Exception): | ||
| 58 | +            self._ws.close() | ||
| 59 | + | ||
| 60 | + | ||
| 61 | +class Page: | ||
| 62 | +    """CDP page object wrapping common operations.""" | ||
| 63 | + | ||
| 64 | +    def __init__(self, cdp: CDPClient, target_id: str, session_id: str) -> None: | ||
| 65 | +        self._cdp = cdp | ||
| 66 | +        self.target_id = target_id | ||
| 67 | +        self.session_id = session_id | ||
| 68 | +        self._ws = cdp._ws | ||
| 69 | +        self._id_counter = 1000 | ||
| 70 | + | ||
| 71 | +    def _send_session(self, method: str, params: dict | None = None) -> dict: | ||
| 72 | +        """Send a command to the session.""" | ||
| 73 | +        self._id_counter += 1 | ||
| 74 | +        msg: dict[str, Any] = { | ||
| 75 | +            "id": self._id_counter, | ||
| 76 | +            "method": method, | ||
| 77 | +            "sessionId": self.session_id, | ||
| 78 | +        } | ||
| 79 | +        if params: | ||
| 80 | +            msg["params"] = params | ||
| 81 | +        self._ws.send(json.dumps(msg)) | ||
| 82 | +        return self._wait_session(self._id_counter) | ||
| 83 | + | ||
| 84 | +    def _wait_session(self, msg_id: int, timeout: float = 60.0) -> dict: | ||
| 85 | +        """Wait for the session response.""" | ||
| 86 | +        deadline = time.monotonic() + timeout | ||
| 87 | +        while time.monotonic() < deadline: | ||
| 88 | +            try: | ||
| 89 | +                raw = self._ws.recv(timeout=max(0.1, deadline - time.monotonic())) | ||
| 90 | +            except TimeoutError: | ||
| 91 | +                break | ||
| 92 | +            data = json.loads(raw) | ||
| 93 | +            if data.get("id") == msg_id: | ||
| 94 | +                if "error" in data: | ||
| 95 | +                    raise CDPError(f"CDP error: {data['error']}") | ||
| 96 | +                return data.get("result", {}) | ||
| 97 | +        raise CDPError(f"Timed out waiting for session response (id={msg_id})") | ||
| 98 | + | ||
| 99 | +    def navigate(self, url: str) -> None: | ||
| 100 | +        """Navigate to the given URL.""" | ||
| 101 | +        logger.info("Navigating to: %s", url) | ||
| 102 | +        self._send_session("Page.navigate", {"url": url}) | ||
| 103 | + | ||
| 104 | +    def wait_for_load(self, timeout: float = 60.0) -> None: | ||
| 105 | +        """Wait for the page to finish loading (by polling document.readyState).""" | ||
| 106 | +        deadline = time.monotonic() + timeout | ||
| 107 | +        while time.monotonic() < deadline: | ||
| 108 | +            try: | ||
| 109 | +                state = self.evaluate("document.readyState") | ||
| 110 | +                if state == "complete": | ||
| 111 | +                    return | ||
| 112 | +            except CDPError: | ||
| 113 | +                pass | ||
| 114 | +            time.sleep(0.5) | ||
| 115 | +        logger.warning("Timed out waiting for page load") | ||
| 116 | + | ||
| 117 | +    def wait_dom_stable(self, timeout: float = 10.0, interval: float = 0.5) -> None: | ||
| 118 | +        """Wait for the DOM to settle (two consecutive body.innerHTML lengths match).""" | ||
| 119 | +        last_html = "" | ||
| 120 | +        deadline = time.monotonic() + timeout | ||
| 121 | +        while time.monotonic() < deadline: | ||
| 122 | +            try: | ||
| 123 | +                html = self.evaluate("document.body ? document.body.innerHTML.length : 0") | ||
| 124 | +                if html == last_html and html != "": | ||
| 125 | +                    return | ||
| 126 | +                last_html = html | ||
| 127 | +            except CDPError: | ||
| 128 | +                pass | ||
| 129 | +            time.sleep(interval) | ||
| 130 | + | ||
| 131 | +    def evaluate(self, expression: str, timeout: float = 30.0) -> Any: | ||
| 132 | +        """Evaluate a JavaScript expression and return its result.""" | ||
| 133 | +        result = self._send_session( | ||
| 134 | +            "Runtime.evaluate", | ||
| 135 | +            { | ||
| 136 | +                "expression": expression, | ||
| 137 | +                "returnByValue": True, | ||
| 138 | +                "awaitPromise": False, | ||
| 139 | +            }, | ||
| 140 | +        ) | ||
| 141 | +        if "exceptionDetails" in result: | ||
| 142 | +            raise CDPError(f"JS evaluation exception: {result['exceptionDetails']}") | ||
| 143 | +        remote_obj = result.get("result", {}) | ||
| 144 | +        return remote_obj.get("value") | ||
| 145 | + | ||
| 146 | +    def evaluate_function(self, function_body: str, *args: Any) -> Any: | ||
| 147 | +        """Execute a JavaScript function and return its result. | ||
| 148 | + | ||
| 149 | +        function_body is a complete function expression, e.g. `() => { return 1; }` | ||
| 150 | +        """ | ||
| 151 | +        result = self._send_session( | ||
| 152 | +            "Runtime.evaluate", | ||
| 153 | +            { | ||
| 154 | +                "expression": f"({function_body})()", | ||
| 155 | +                "returnByValue": True, | ||
| 156 | +                "awaitPromise": False, | ||
| 157 | +            }, | ||
| 158 | +        ) | ||
| 159 | +        if "exceptionDetails" in result: | ||
| 160 | +            raise CDPError(f"JS function execution exception: {result['exceptionDetails']}") | ||
| 161 | +        remote_obj = result.get("result", {}) | ||
| 162 | +        return remote_obj.get("value") | ||
| 163 | + | ||
| 164 | +    def query_selector(self, selector: str) -> str | None: | ||
| 165 | +        """Find a single element; return its objectId or None.""" | ||
| 166 | +        result = self._send_session( | ||
| 167 | +            "Runtime.evaluate", | ||
| 168 | +            { | ||
| 169 | +                "expression": f"document.querySelector({json.dumps(selector)})", | ||
| 170 | +                "returnByValue": False, | ||
| 171 | +            }, | ||
| 172 | +        ) | ||
| 173 | +        remote_obj = result.get("result", {}) | ||
| 174 | +        if remote_obj.get("subtype") == "null" or remote_obj.get("type") == "undefined": | ||
| 175 | +            return None | ||
| 176 | +        return remote_obj.get("objectId") | ||
| 177 | + | ||
| 178 | +    def query_selector_all(self, selector: str) -> list[str]: | ||
| 179 | +        """Find multiple elements; return a list of objectIds.""" | ||
| 180 | +        # Get the element count via JS, then fetch each element one by one | ||
| 181 | +        count = self.evaluate(f"document.querySelectorAll({json.dumps(selector)}).length") | ||
| 182 | +        if not count: | ||
| 183 | +            return [] | ||
| 184 | +        object_ids = [] | ||
| 185 | +        for i in range(count): | ||
| 186 | +            result = self._send_session( | ||
| 187 | +                "Runtime.evaluate", | ||
| 188 | +                { | ||
| 189 | +                    "expression": (f"document.querySelectorAll({json.dumps(selector)})[{i}]"), | ||
| 190 | +                    "returnByValue": False, | ||
| 191 | +                }, | ||
| 192 | +            ) | ||
| 193 | +            obj = result.get("result", {}) | ||
| 194 | +            oid = obj.get("objectId") | ||
| 195 | +            if oid: | ||
| 196 | +                object_ids.append(oid) | ||
| 197 | +        return object_ids | ||
| 198 | + | ||
| 199 | +    def has_element(self, selector: str) -> bool: | ||
| 200 | +        """Check whether an element exists.""" | ||
| 201 | +        return self.evaluate(f"document.querySelector({json.dumps(selector)}) !== null") is True | ||
| 202 | + | ||
| 203 | +    def wait_for_element(self, selector: str, timeout: float = 30.0) -> str: | ||
| 204 | +        """Wait for an element to appear; return its objectId.""" | ||
| 205 | +        deadline = time.monotonic() + timeout | ||
| 206 | +        while time.monotonic() < deadline: | ||
| 207 | +            oid = self.query_selector(selector) | ||
| 208 | +            if oid: | ||
| 209 | +                return oid | ||
| 210 | +            time.sleep(0.5) | ||
| 211 | +        raise ElementNotFoundError(selector) | ||
| 212 | + | ||
| 213 | +    def click_element(self, selector: str) -> None: | ||
| 214 | +        """Click the element matching the selector.""" | ||
| 215 | +        self.evaluate( | ||
| 216 | +            f""" | ||
| 217 | +            (() => {{ | ||
| 218 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 219 | +                if (el) el.click(); | ||
| 220 | +            }})() | ||
| 221 | +            """ | ||
| 222 | +        ) | ||
| 223 | + | ||
| 224 | +    def input_text(self, selector: str, text: str) -> None: | ||
| 225 | +        """Set text on the element matching the selector.""" | ||
| 226 | +        self.evaluate( | ||
| 227 | +            f""" | ||
| 228 | +            (() => {{ | ||
| 229 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 230 | +                if (!el) return; | ||
| 231 | +                el.focus(); | ||
| 232 | +                el.value = {json.dumps(text)}; | ||
| 233 | +                el.dispatchEvent(new Event('input', {{bubbles: true}})); | ||
| 234 | +                el.dispatchEvent(new Event('change', {{bubbles: true}})); | ||
| 235 | +            }})() | ||
| 236 | +            """ | ||
| 237 | +        ) | ||
| 238 | + | ||
| 239 | +    def input_content_editable(self, selector: str, text: str) -> None: | ||
| 240 | +        """Set text on a contentEditable element (e.g. div.ql-editor).""" | ||
| 241 | +        self.evaluate( | ||
| 242 | +            f""" | ||
| 243 | +            (() => {{ | ||
| 244 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 245 | +                if (!el) return; | ||
| 246 | +                el.focus(); | ||
| 247 | +                el.textContent = {json.dumps(text)}; | ||
| 248 | +                el.dispatchEvent(new Event('input', {{bubbles: true}})); | ||
| 249 | +            }})() | ||
| 250 | +            """ | ||
| 251 | +        ) | ||
| 252 | + | ||
| 253 | +    def get_element_text(self, selector: str) -> str | None: | ||
| 254 | +        """Get the text content of an element.""" | ||
| 255 | +        return self.evaluate( | ||
| 256 | +            f""" | ||
| 257 | +            (() => {{ | ||
| 258 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 259 | +                return el ? el.textContent : null; | ||
| 260 | +            }})() | ||
| 261 | +            """ | ||
| 262 | +        ) | ||
| 263 | + | ||
| 264 | +    def get_element_attribute(self, selector: str, attr: str) -> str | None: | ||
| 265 | +        """Get the value of an element attribute.""" | ||
| 266 | +        return self.evaluate( | ||
| 267 | +            f""" | ||
| 268 | +            (() => {{ | ||
| 269 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 270 | +                return el ? el.getAttribute({json.dumps(attr)}) : null; | ||
| 271 | +            }})() | ||
| 272 | +            """ | ||
| 273 | +        ) | ||
| 274 | + | ||
| 275 | +    def get_elements_count(self, selector: str) -> int: | ||
| 276 | +        """Get the number of matching elements.""" | ||
| 277 | +        result = self.evaluate(f"document.querySelectorAll({json.dumps(selector)}).length") | ||
| 278 | +        return result if isinstance(result, int) else 0 | ||
| 279 | + | ||
| 280 | +    def scroll_by(self, x: int, y: int) -> None: | ||
| 281 | +        """Scroll the page by an offset.""" | ||
| 282 | +        self.evaluate(f"window.scrollBy({x}, {y})") | ||
| 283 | + | ||
| 284 | +    def scroll_to(self, x: int, y: int) -> None: | ||
| 285 | +        """Scroll to the given position.""" | ||
| 286 | +        self.evaluate(f"window.scrollTo({x}, {y})") | ||
| 287 | + | ||
| 288 | +    def scroll_to_bottom(self) -> None: | ||
| 289 | +        """Scroll to the bottom of the page.""" | ||
| 290 | +        self.evaluate("window.scrollTo(0, document.body.scrollHeight)") | ||
| 291 | + | ||
| 292 | +    def scroll_element_into_view(self, selector: str) -> None: | ||
| 293 | +        """Scroll an element into the visible area.""" | ||
| 294 | +        self.evaluate( | ||
| 295 | +            f""" | ||
| 296 | +            (() => {{ | ||
| 297 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 298 | +                if (el) el.scrollIntoView({{behavior: 'smooth', block: 'center'}}); | ||
| 299 | +            }})() | ||
| 300 | +            """ | ||
| 301 | +        ) | ||
| 302 | + | ||
| 303 | +    def scroll_nth_element_into_view(self, selector: str, index: int) -> None: | ||
| 304 | +        """Scroll the Nth matching element into the visible area.""" | ||
| 305 | +        self.evaluate( | ||
| 306 | +            f""" | ||
| 307 | +            (() => {{ | ||
| 308 | +                const els = document.querySelectorAll({json.dumps(selector)}); | ||
| 309 | +                if (els[{index}]) els[{index}].scrollIntoView( | ||
| 310 | +                    {{behavior: 'smooth', block: 'center'}} | ||
| 311 | +                ); | ||
| 312 | +            }})() | ||
| 313 | +            """ | ||
| 314 | +        ) | ||
| 315 | + | ||
| 316 | +    def get_scroll_top(self) -> int: | ||
| 317 | +        """Get the current scroll position.""" | ||
| 318 | +        result = self.evaluate( | ||
| 319 | +            "window.pageYOffset || document.documentElement.scrollTop" | ||
| 320 | +            " || document.body.scrollTop || 0" | ||
| 321 | +        ) | ||
| 322 | +        return int(result) if result else 0 | ||
| 323 | + | ||
| 324 | +    def get_viewport_height(self) -> int: | ||
| 325 | +        """Get the viewport height.""" | ||
| 326 | +        result = self.evaluate("window.innerHeight") | ||
| 327 | +        return int(result) if result else 768 | ||
| 328 | + | ||
| 329 | +    def set_file_input(self, selector: str, files: list[str]) -> None: | ||
| 330 | +        """Set the files of a file input (via CDP DOM.setFileInputFiles).""" | ||
| 331 | +        # Resolve the nodeId first | ||
| 332 | +        doc = self._send_session("DOM.getDocument", {"depth": 0}) | ||
| 333 | +        root_node_id = doc["root"]["nodeId"] | ||
| 334 | +        result = self._send_session( | ||
| 335 | +            "DOM.querySelector", | ||
| 336 | +            {"nodeId": root_node_id, "selector": selector}, | ||
| 337 | +        ) | ||
| 338 | +        node_id = result.get("nodeId", 0) | ||
| 339 | +        if node_id == 0: | ||
| 340 | +            raise ElementNotFoundError(selector) | ||
| 341 | +        self._send_session( | ||
| 342 | +            "DOM.setFileInputFiles", | ||
| 343 | +            {"nodeId": node_id, "files": files}, | ||
| 344 | +        ) | ||
| 345 | + | ||
| 346 | +    def dispatch_wheel_event(self, delta_y: float) -> None: | ||
| 347 | +        """Dispatch a wheel event to trigger lazy loading.""" | ||
| 348 | +        self.evaluate( | ||
| 349 | +            f""" | ||
| 350 | +            (() => {{ | ||
| 351 | +                let target = document.querySelector('.note-scroller') | ||
| 352 | +                    || document.querySelector('.interaction-container') | ||
| 353 | +                    || document.documentElement; | ||
| 354 | +                const event = new WheelEvent('wheel', {{ | ||
| 355 | +                    deltaY: {delta_y}, | ||
| 356 | +                    deltaMode: 0, | ||
| 357 | +                    bubbles: true, | ||
| 358 | +                    cancelable: true, | ||
| 359 | +                    view: window, | ||
| 360 | +                }}); | ||
| 361 | +                target.dispatchEvent(event); | ||
| 362 | +            }})() | ||
| 363 | +            """ | ||
| 364 | +        ) | ||
| 365 | + | ||
| 366 | +    def mouse_move(self, x: float, y: float) -> None: | ||
| 367 | +        """Move the mouse.""" | ||
| 368 | +        self._send_session( | ||
| 369 | +            "Input.dispatchMouseEvent", | ||
| 370 | +            {"type": "mouseMoved", "x": x, "y": y}, | ||
| 371 | +        ) | ||
| 372 | + | ||
| 373 | +    def mouse_click(self, x: float, y: float, button: str = "left") -> None: | ||
| 374 | +        """Click at the given coordinates.""" | ||
| 375 | +        self._send_session( | ||
| 376 | +            "Input.dispatchMouseEvent", | ||
| 377 | +            {"type": "mousePressed", "x": x, "y": y, "button": button, "clickCount": 1}, | ||
| 378 | +        ) | ||
| 379 | +        self._send_session( | ||
| 380 | +            "Input.dispatchMouseEvent", | ||
| 381 | +            {"type": "mouseReleased", "x": x, "y": y, "button": button, "clickCount": 1}, | ||
| 382 | +        ) | ||
| 383 | + | ||
| 384 | +    def type_text(self, text: str, delay_ms: int = 50) -> None: | ||
| 385 | +        """Type text one character at a time.""" | ||
| 386 | +        for char in text: | ||
| 387 | +            self._send_session( | ||
| 388 | +                "Input.dispatchKeyEvent", | ||
| 389 | +                {"type": "keyDown", "text": char}, | ||
| 390 | +            ) | ||
| 391 | +            self._send_session( | ||
| 392 | +                "Input.dispatchKeyEvent", | ||
| 393 | +                {"type": "keyUp", "text": char}, | ||
| 394 | +            ) | ||
| 395 | +            if delay_ms > 0: | ||
| 396 | +                time.sleep(delay_ms / 1000.0) | ||
| 397 | + | ||
| 398 | +    def press_key(self, key: str) -> None: | ||
| 399 | +        """Press and release the given key.""" | ||
| 400 | +        key_map = { | ||
| 401 | +            "Enter": {"key": "Enter", "code": "Enter", "windowsVirtualKeyCode": 13}, | ||
| 402 | +            "ArrowDown": { | ||
| 403 | +                "key": "ArrowDown", | ||
| 404 | +                "code": "ArrowDown", | ||
| 405 | +                "windowsVirtualKeyCode": 40, | ||
| 406 | +            }, | ||
| 407 | +            "Tab": {"key": "Tab", "code": "Tab", "windowsVirtualKeyCode": 9}, | ||
| 408 | +        } | ||
| 409 | +        info = key_map.get(key, {"key": key, "code": key}) | ||
| 410 | +        self._send_session( | ||
| 411 | +            "Input.dispatchKeyEvent", | ||
| 412 | +            {"type": "keyDown", **info}, | ||
| 413 | +        ) | ||
| 414 | +        self._send_session( | ||
| 415 | +            "Input.dispatchKeyEvent", | ||
| 416 | +            {"type": "keyUp", **info}, | ||
| 417 | +        ) | ||
| 418 | + | ||
| 419 | +    def inject_stealth(self) -> None: | ||
| 420 | +        """Inject the anti-detection script.""" | ||
| 421 | +        self._send_session( | ||
| 422 | +            "Page.addScriptToEvaluateOnNewDocument", | ||
| 423 | +            {"source": STEALTH_JS}, | ||
| 424 | +        ) | ||
| 425 | + | ||
| 426 | +    def remove_element(self, selector: str) -> None: | ||
| 427 | +        """Remove a DOM element.""" | ||
| 428 | +        self.evaluate( | ||
| 429 | +            f""" | ||
| 430 | +            (() => {{ | ||
| 431 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 432 | +                if (el) el.remove(); | ||
| 433 | +            }})() | ||
| 434 | +            """ | ||
| 435 | +        ) | ||
| 436 | + | ||
| 437 | +    def hover_element(self, selector: str) -> None: | ||
| 438 | +        """Hover over the center of an element.""" | ||
| 439 | +        box = self.evaluate( | ||
| 440 | +            f""" | ||
| 441 | +            (() => {{ | ||
| 442 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 443 | +                if (!el) return null; | ||
| 444 | +                const rect = el.getBoundingClientRect(); | ||
| 445 | +                return {{x: rect.left + rect.width / 2, y: rect.top + rect.height / 2}}; | ||
| 446 | +            }})() | ||
| 447 | +            """ | ||
| 448 | +        ) | ||
| 449 | +        if box: | ||
| 450 | +            self.mouse_move(box["x"], box["y"]) | ||
| 451 | + | ||
| 452 | +    def select_all_text(self, selector: str) -> None: | ||
| 453 | +        """Select all text inside an input.""" | ||
| 454 | +        self.evaluate( | ||
| 455 | +            f""" | ||
| 456 | +            (() => {{ | ||
| 457 | +                const el = document.querySelector({json.dumps(selector)}); | ||
| 458 | +                if (!el) return; | ||
| 459 | +                el.focus(); | ||
| 460 | +                el.select ? el.select() : document.execCommand('selectAll'); | ||
| 461 | +            }})() | ||
| 462 | +            """ | ||
| 463 | +        ) | ||
| 464 | + | ||
| 465 | + | ||
| 466 | +class Browser: | ||
| 467 | +    """CDP controller for the Chrome browser.""" | ||
| 468 | + | ||
| 469 | +    def __init__(self, host: str = "127.0.0.1", port: int = 9222) -> None: | ||
| 470 | +        self.host = host | ||
| 471 | +        self.port = port | ||
| 472 | +        self.base_url = f"http://{host}:{port}" | ||
| 473 | +        self._cdp: CDPClient | None = None | ||
| 474 | + | ||
| 475 | +    def connect(self) -> None: | ||
| 476 | +        """Connect to Chrome DevTools.""" | ||
| 477 | +        resp = requests.get(f"{self.base_url}/json/version", timeout=5) | ||
| 478 | +        resp.raise_for_status() | ||
| 479 | +        info = resp.json() | ||
| 480 | +        ws_url = info["webSocketDebuggerUrl"] | ||
| 481 | +        logger.info("Connecting to Chrome: %s", ws_url) | ||
| 482 | +        self._cdp = CDPClient(ws_url) | ||
| 483 | + | ||
| 484 | +    def new_page(self, url: str = "about:blank") -> Page: | ||
| 485 | +        """Create a new page.""" | ||
| 486 | +        if not self._cdp: | ||
| 487 | +            self.connect() | ||
| 488 | +        assert self._cdp is not None | ||
| 489 | + | ||
| 490 | +        # Create the target | ||
| 491 | +        result = self._cdp.send("Target.createTarget", {"url": url}) | ||
| 492 | +        target_id = result["targetId"] | ||
| 493 | + | ||
| 494 | +        # Attach to the target | ||
| 495 | +        result = self._cdp.send( | ||
| 496 | +            "Target.attachToTarget", | ||
| 497 | +            {"targetId": target_id, "flatten": True}, | ||
| 498 | +        ) | ||
| 499 | +        session_id = result["sessionId"] | ||
| 500 | + | ||
| 501 | +        page = Page(self._cdp, target_id, session_id) | ||
| 502 | + | ||
| 503 | +        # Enable the required domains | ||
| 504 | +        page._send_session("Page.enable") | ||
| 505 | +        page._send_session("DOM.enable") | ||
| 506 | +        page._send_session("Runtime.enable") | ||
| 507 | + | ||
| 508 | +        # Inject anti-detection | ||
| 509 | +        page.inject_stealth() | ||
| 510 | + | ||
| 511 | +        return page | ||
| 512 | + | ||
| 513 | +    def get_existing_page(self) -> Page | None: | ||
| 514 | +        """Get an existing page (the first page target that is not about:blank).""" | ||
| 515 | +        if not self._cdp: | ||
| 516 | +            self.connect() | ||
| 517 | +        assert self._cdp is not None | ||
| 518 | + | ||
| 519 | +        resp = requests.get(f"{self.base_url}/json", timeout=5) | ||
| 520 | +        targets = resp.json() | ||
| 521 | + | ||
| 522 | +        for target in targets: | ||
| 523 | +            if target.get("type") == "page" and target.get("url") != "about:blank": | ||
| 524 | +                target_id = target["id"] | ||
| 525 | +                result = self._cdp.send( | ||
| 526 | +                    "Target.attachToTarget", | ||
| 527 | +                    {"targetId": target_id, "flatten": True}, | ||
| 528 | +                ) | ||
| 529 | +                session_id = result["sessionId"] | ||
| 530 | +                page = Page(self._cdp, target_id, session_id) | ||
| 531 | +                page._send_session("Page.enable") | ||
| 532 | +                page._send_session("DOM.enable") | ||
| 533 | +                page._send_session("Runtime.enable") | ||
| 534 | +                page.inject_stealth() | ||
| 535 | +                return page | ||
| 536 | +        return None | ||
| 537 | + | ||
| 538 | +    def close_page(self, page: Page) -> None: | ||
| 539 | +        """Close a page.""" | ||
| 540 | +        import contextlib | ||
| 541 | + | ||
| 542 | +        if self._cdp: | ||
| 543 | +            with contextlib.suppress(CDPError): | ||
| 544 | +                self._cdp.send("Target.closeTarget", {"targetId": page.target_id}) | ||
| 545 | + | ||
| 546 | +    def close(self) -> None: | ||
| 547 | +        """Close the connection.""" | ||
| 548 | +        if self._cdp: | ||
| 549 | +            self._cdp.close() | ||
| 550 | +            self._cdp = None |
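
Reviewer note: both `CDPClient._wait_for` and `Page._wait_session` follow the same pattern: send a JSON command carrying an `id` (plus `sessionId` for an attached target), then read messages until one with that `id` arrives, skipping events and replies to other commands. The sketch below reproduces that matching logic over an in-memory list instead of a real WebSocket; `build_command` and `match_response` are hypothetical names, and the sample messages are made up for illustration.

```python
from __future__ import annotations

import json
from typing import Any, Iterable


def build_command(
    msg_id: int,
    method: str,
    params: dict | None = None,
    session_id: str | None = None,
) -> str:
    """Serialize a CDP command; sessionId routes it to an attached target."""
    msg: dict[str, Any] = {"id": msg_id, "method": method}
    if session_id:
        msg["sessionId"] = session_id
    if params:
        msg["params"] = params
    return json.dumps(msg)


def match_response(raw_messages: Iterable[str], msg_id: int) -> dict:
    """Return the result of the reply whose id matches, ignoring everything else."""
    for raw in raw_messages:
        data = json.loads(raw)
        if data.get("id") != msg_id:
            continue  # an event, or a reply to a different command
        if "error" in data:
            raise RuntimeError(f"CDP error: {data['error']}")
        return data.get("result", {})
    raise TimeoutError(f"no response for id={msg_id}")


command = build_command(7, "Runtime.evaluate", {"expression": "1 + 1"}, session_id="S1")
incoming = [
    json.dumps({"method": "Page.loadEventFired", "params": {"timestamp": 1.0}}),
    json.dumps({"id": 6, "result": {}}),
    json.dumps({"id": 7, "sessionId": "S1", "result": {"result": {"type": "number", "value": 2}}}),
]
result = match_response(incoming, 7)
```

Note that messages skipped this way are dropped, which matches the behaviour in the diff: events such as `Page.loadEventFired` are never delivered to a handler, so page readiness is polled through `document.readyState` instead.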
scripts/xhs/comment.py
0 → 100644
| 1 | +"""Comment operations, corresponding to Go xiaohongshu/comment_feed.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import logging | ||
| 6 | +import time | ||
| 7 | + | ||
| 8 | +from .cdp import Page | ||
| 9 | +from .feed_detail import _check_end_container, _check_page_accessible, _get_comment_count | ||
| 10 | +from .selectors import ( | ||
| 11 | +    COMMENT_INPUT_FIELD, | ||
| 12 | +    COMMENT_INPUT_TRIGGER, | ||
| 13 | +    COMMENT_SUBMIT_BUTTON, | ||
| 14 | +    PARENT_COMMENT, | ||
| 15 | +    REPLY_BUTTON, | ||
| 16 | +) | ||
| 17 | +from .urls import make_feed_detail_url | ||
| 18 | + | ||
| 19 | +logger = logging.getLogger(__name__) | ||
| 20 | + | ||
| 21 | + | ||
| 22 | +def post_comment(page: Page, feed_id: str, xsec_token: str, content: str) -> None: | ||
| 23 | +    """Post a comment on a feed. | ||
| 24 | + | ||
| 25 | +    Args: | ||
| 26 | +        page: CDP page object. | ||
| 27 | +        feed_id: Feed ID. | ||
| 28 | +        xsec_token: xsec_token. | ||
| 29 | +        content: Comment text. | ||
| 30 | + | ||
| 31 | +    Raises: | ||
| 32 | +        RuntimeError: The comment could not be posted. | ||
| 33 | +    """ | ||
| 34 | +    url = make_feed_detail_url(feed_id, xsec_token) | ||
| 35 | +    logger.info("Opening feed detail page: %s", url) | ||
| 36 | + | ||
| 37 | +    page.navigate(url) | ||
| 38 | +    page.wait_for_load() | ||
| 39 | +    page.wait_dom_stable() | ||
| 40 | +    time.sleep(1) | ||
| 41 | + | ||
| 42 | +    _check_page_accessible(page) | ||
| 43 | + | ||
| 44 | +    # Click the area that opens the comment input | ||
| 45 | +    if not page.has_element(COMMENT_INPUT_TRIGGER): | ||
| 46 | +        raise RuntimeError("Comment input not found; the post may not allow comments or may be inaccessible on the web") | ||
| 47 | + | ||
| 48 | +    page.click_element(COMMENT_INPUT_TRIGGER) | ||
| 49 | +    time.sleep(0.5) | ||
| 50 | + | ||
| 51 | +    # Enter the comment text | ||
| 52 | +    page.wait_for_element(COMMENT_INPUT_FIELD, timeout=5) | ||
| 53 | +    page.evaluate( | ||
| 54 | +        f""" | ||
| 55 | +        (() => {{ | ||
| 56 | +            const el = document.querySelector({_js_str(COMMENT_INPUT_FIELD)}); | ||
| 57 | +            if (el) {{ | ||
| 58 | +                el.focus(); | ||
| 59 | +                el.textContent = {_js_str(content)}; | ||
| 60 | +                el.dispatchEvent(new Event('input', {{bubbles: true}})); | ||
| 61 | +            }} | ||
| 62 | +        }})() | ||
| 63 | +        """ | ||
| 64 | +    ) | ||
| 65 | +    time.sleep(1) | ||
| 66 | + | ||
| 67 | +    # Click submit | ||
| 68 | +    page.click_element(COMMENT_SUBMIT_BUTTON) | ||
| 69 | +    time.sleep(1) | ||
| 70 | + | ||
| 71 | +    logger.info("Comment sent: feed=%s", feed_id) | ||
| 72 | + | ||
| 73 | + | ||
| 74 | +def reply_comment( | ||
| 75 | +    page: Page, | ||
| 76 | +    feed_id: str, | ||
| 77 | +    xsec_token: str, | ||
| 78 | +    content: str, | ||
| 79 | +    comment_id: str = "", | ||
| 80 | +    user_id: str = "", | ||
| 81 | +) -> None: | ||
| 82 | +    """Reply to a specific comment. | ||
| 83 | + | ||
| 84 | +    Locates the comment by comment_id or user_id, then replies to it. | ||
| 85 | + | ||
| 86 | +    Args: | ||
| 87 | +        page: CDP page object. | ||
| 88 | +        feed_id: Feed ID. | ||
| 89 | +        xsec_token: xsec_token. | ||
| 90 | +        content: Reply text. | ||
| 91 | +        comment_id: Comment ID (preferred). | ||
| 92 | +        user_id: User ID (fallback). | ||
| 93 | + | ||
| 94 | +    Raises: | ||
| 95 | +        RuntimeError: The reply could not be posted. | ||
| 96 | +    """ | ||
| 97 | +    if not comment_id and not user_id: | ||
| 98 | +        raise ValueError("At least one of comment_id and user_id must be provided") | ||
| 99 | + | ||
| 100 | +    url = make_feed_detail_url(feed_id, xsec_token) | ||
| 101 | +    logger.info("Opening feed detail page to reply: %s", url) | ||
| 102 | + | ||
| 103 | +    page.navigate(url) | ||
| 104 | +    page.wait_for_load() | ||
| 105 | +    page.wait_dom_stable() | ||
| 106 | +    time.sleep(1) | ||
| 107 | + | ||
| 108 | +    _check_page_accessible(page) | ||
| 109 | +    time.sleep(2) | ||
| 110 | + | ||
| 111 | +    # Find the target comment | ||
| 112 | +    comment_found = _find_and_scroll_to_comment(page, comment_id, user_id) | ||
| 113 | +    if not comment_found: | ||
| 114 | +        raise RuntimeError(f"Comment not found (commentID: {comment_id}, userID: {user_id})") | ||
| 115 | + | ||
| 116 | +    time.sleep(1) | ||
| 117 | + | ||
| 118 | +    # Click the reply button | ||
| 119 | +    reply_selector = f"#comment-{comment_id} {REPLY_BUTTON}" if comment_id else REPLY_BUTTON | ||
| 120 | +    page.click_element(reply_selector) | ||
| 121 | +    time.sleep(1) | ||
| 122 | + | ||
| 123 | +    # Enter the reply text | ||
| 124 | +    page.wait_for_element(COMMENT_INPUT_FIELD, timeout=5) | ||
| 125 | +    page.evaluate( | ||
| 126 | +        f""" | ||
| 127 | +        (() => {{ | ||
| 128 | +            const el = document.querySelector({_js_str(COMMENT_INPUT_FIELD)}); | ||
| 129 | +            if (el) {{ | ||
| 130 | +                el.focus(); | ||
| 131 | +                el.textContent = {_js_str(content)}; | ||
| 132 | +                el.dispatchEvent(new Event('input', {{bubbles: true}})); | ||
| 133 | +            }} | ||
| 134 | +        }})() | ||
| 135 | +        """ | ||
| 136 | +    ) | ||
| 137 | +    time.sleep(0.5) | ||
| 138 | + | ||
| 139 | +    # Click submit | ||
| 140 | +    page.click_element(COMMENT_SUBMIT_BUTTON) | ||
| 141 | +    time.sleep(2) | ||
| 142 | + | ||
| 143 | +    logger.info("Reply sent") | ||
| 144 | + | ||
| 145 | + | ||
| 146 | +def _find_and_scroll_to_comment( | ||
| 147 | +    page: Page, | ||
| 148 | +    comment_id: str, | ||
| 149 | +    user_id: str, | ||
| 150 | +    max_attempts: int = 100, | ||
| 151 | +) -> bool: | ||
| 152 | +    """Find the target comment and scroll to it.""" | ||
| 153 | +    logger.info("Searching for comment - commentID: %s, userID: %s", comment_id, user_id) | ||
| 154 | + | ||
| 155 | +    # Scroll to the comment section first | ||
| 156 | +    page.scroll_element_into_view(".comments-container") | ||
| 157 | +    time.sleep(1) | ||
| 158 | + | ||
| 159 | +    last_count = 0 | ||
| 160 | +    stagnant = 0 | ||
| 161 | + | ||
| 162 | +    for attempt in range(max_attempts): | ||
| 163 | +        # Check whether the end has been reached | ||
| 164 | +        if _check_end_container(page): | ||
| 165 | +            logger.info("Reached the end of the comments without finding the target") | ||
| 166 | +            break | ||
| 167 | + | ||
| 168 | +        # Stagnation detection | ||
| 169 | +        current_count = _get_comment_count(page) | ||
| 170 | +        if current_count != last_count: | ||
| 171 | +            last_count = current_count | ||
| 172 | +            stagnant = 0 | ||
| 173 | +        else: | ||
| 174 | +            stagnant += 1 | ||
| 175 | +        if stagnant >= 10: | ||
| 176 | +            logger.info("Comment count unchanged for 10 consecutive checks") | ||
| 177 | +            break | ||
| 178 | + | ||
| 179 | +        # Scroll to the last loaded comment | ||
| 180 | +        if current_count > 0: | ||
| 181 | +            page.scroll_nth_element_into_view(PARENT_COMMENT, current_count - 1) | ||
| 182 | +            time.sleep(0.3) | ||
| 183 | + | ||
| 184 | +        # Keep scrolling | ||
| 185 | +        page.evaluate("window.scrollBy(0, window.innerHeight * 0.8)") | ||
| 186 | +        time.sleep(0.5) | ||
| 187 | + | ||
| 188 | +        # Look up by commentID | ||
| 189 | +        if comment_id: | ||
| 190 | +            selector = f"#comment-{comment_id}" | ||
| 191 | +            if page.has_element(selector): | ||
| 192 | +                logger.info("Found comment by commentID (after %d attempts)", attempt + 1) | ||
| 193 | +                page.scroll_element_into_view(selector) | ||
| 194 | +                return True | ||
| 195 | + | ||
| 196 | +        # Look up by userID | ||
| 197 | +        if user_id: | ||
| 198 | +            found = page.evaluate( | ||
| 199 | +                f""" | ||
| 200 | +                (() => {{ | ||
| 201 | +                    const els = document.querySelectorAll( | ||
| 202 | +                        '.parent-comment, .comment-item, .comment' | ||
| 203 | +                    ); | ||
| 204 | +                    for (const el of els) {{ | ||
| 205 | +                        if (el.querySelector({_js_str(f'[data-user-id="{user_id}"]')})) {{ | ||
| 206 | +                            el.scrollIntoView({{behavior: 'smooth', block: 'center'}}); | ||
| 207 | +                            return true; | ||
| 208 | +                        }} | ||
| 209 | +                    }} | ||
| 210 | +                    return false; | ||
| 211 | +                }})() | ||
| 212 | +                """ | ||
| 213 | +            ) | ||
| 214 | +            if found: | ||
| 215 | +                logger.info("Found comment by userID (after %d attempts)", attempt + 1) | ||
| 216 | +                return True | ||
| 217 | + | ||
| 218 | +        time.sleep(0.8) | ||
| 219 | + | ||
| 220 | +    return False | ||
| 221 | + | ||
| 222 | + | ||
| 223 | +def _js_str(s: str) -> str: | ||
| 224 | +    """Convert a Python string to a JS string literal (quotes included).""" | ||
| 225 | +    import json | ||
| 226 | + | ||
| 227 | +    return json.dumps(s) |
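
Reviewer note: `_js_str` works because a JSON string literal is also a valid JavaScript string literal, so `json.dumps` takes care of quotes, backslashes, and newlines before the value is spliced into an expression. A small standalone sketch of the effect follows; `js_query` is a hypothetical helper and the selector is invented for the example.

```python
import json


def js_query(selector: str) -> str:
    """Build a querySelector expression with the selector safely quoted."""
    return f"document.querySelector({json.dumps(selector)})"


# Double quotes inside the selector are escaped rather than ending the literal.
safe = js_query('div[data-id="x"]')

# A raw newline becomes the two-character escape \n, so the JS stays on one line.
tricky = json.dumps("line1\nline2 'quoted'")
```

Interpolating a value with plain string formatting instead (for example inside `'[data-user-id="..."]'`) would let a quote in the value break out of the literal, which is why every interpolated value should go through this kind of helper.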
scripts/xhs/cookies.py
0 → 100644
| 1 | +"""Cookie file persistence, corresponding to Go cookies/cookies.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import os | ||
| 6 | +from pathlib import Path | ||
| 7 | + | ||
| 8 | + | ||
| 9 | +def get_cookies_file_path(account: str = "") -> str: | ||
| 10 | +    """Get the path of the cookies file. | ||
| 11 | + | ||
| 12 | +    Priority: | ||
| 13 | +    1. Multi-account mode: ~/.xhs/accounts/{account}/cookies.json | ||
| 14 | +    2. cookies.json in the system temp directory, e.g. /tmp/cookies.json (backward compatibility) | ||
| 15 | +    3. COOKIES_PATH environment variable | ||
| 16 | +    4. ./cookies.json (local debugging) | ||
| 17 | +    """ | ||
| 18 | +    if account: | ||
| 19 | +        account_dir = Path.home() / ".xhs" / "accounts" / account | ||
| 20 | +        account_dir.mkdir(parents=True, exist_ok=True) | ||
| 21 | +        return str(account_dir / "cookies.json") | ||
| 22 | + | ||
| 23 | +    # Legacy path | ||
| 24 | +    import tempfile | ||
| 25 | + | ||
| 26 | +    old_path = os.path.join(tempfile.gettempdir(), "cookies.json") | ||
| 27 | +    if os.path.exists(old_path): | ||
| 28 | +        return old_path | ||
| 29 | + | ||
| 30 | +    # Environment variable | ||
| 31 | +    env_path = os.getenv("COOKIES_PATH") | ||
| 32 | +    if env_path: | ||
| 33 | +        return env_path | ||
| 34 | + | ||
| 35 | +    return "cookies.json" | ||
| 36 | + | ||
| 37 | + | ||
| 38 | +def load_cookies(path: str) -> bytes | None: | ||
| 39 | + """Load cookies from a file; return None if the file does not exist.""" | ||
| 40 | + try: | ||
| 41 | + with open(path, "rb") as f: | ||
| 42 | + return f.read() | ||
| 43 | + except FileNotFoundError: | ||
| 44 | + return None | ||
| 45 | + | ||
| 46 | + | ||
| 47 | +def save_cookies(path: str, data: bytes) -> None: | ||
| 48 | + """Save cookies to a file.""" | ||
| 49 | + os.makedirs(os.path.dirname(path) or ".", exist_ok=True) | ||
| 50 | + with open(path, "wb") as f: | ||
| 51 | + f.write(data) | ||
| 52 | + | ||
| 53 | + | ||
| 54 | +def delete_cookies(path: str) -> None: | ||
| 55 | + """Delete the cookies file.""" | ||
| 56 | + import contextlib | ||
| 57 | + | ||
| 58 | + with contextlib.suppress(FileNotFoundError): | ||
| 59 | + os.remove(path) |
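The lookup order implemented by `get_cookies_file_path` can be sketched as a side-effect-free function, which makes the precedence easy to check. This is a minimal sketch, not the module's code: `resolve_cookies_path` and its `home`, `env` and `legacy_path` parameters are assumptions introduced for testability, and unlike the original it creates no directories.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_cookies_path(
    account: str = "",
    *,
    home: Optional[Path] = None,
    env: Optional[Mapping[str, str]] = None,
    legacy_path: Optional[str] = None,
) -> str:
    """Hypothetical pure variant of the lookup (creates no directories)."""
    home = Path.home() if home is None else home
    env = os.environ if env is None else env
    if legacy_path is None:
        legacy_path = os.path.join(tempfile.gettempdir(), "cookies.json")

    # 1. An explicit account always wins.
    if account:
        return str(home / ".xhs" / "accounts" / account / "cookies.json")
    # 2. A legacy file in the temp directory, if one already exists.
    if os.path.exists(legacy_path):
        return legacy_path
    # 3. The COOKIES_PATH environment variable, else 4. the working directory.
    return env.get("COOKIES_PATH") or "cookies.json"
```

Note that in this order a stale `cookies.json` left in the temp directory shadows `COOKIES_PATH`, which may or may not be what a caller expects.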
scripts/xhs/errors.py
0 → 100644
| 1 | +"""Exception hierarchy for Xiaohongshu automation.""" | ||
| 2 | + | ||
| 3 | + | ||
| 4 | +class XHSError(Exception): | ||
| 5 | + """Base exception for Xiaohongshu automation.""" | ||
| 6 | + | ||
| 7 | + | ||
| 8 | +class NoFeedsError(XHSError): | ||
| 9 | + """No feeds data was captured.""" | ||
| 10 | + | ||
| 11 | + def __init__(self) -> None: | ||
| 12 | + super().__init__("没有捕获到 feeds 数据") | ||
| 13 | + | ||
| 14 | + | ||
| 15 | +class NoFeedDetailError(XHSError): | ||
| 16 | + """No feed detail data was captured.""" | ||
| 17 | + | ||
| 18 | + def __init__(self) -> None: | ||
| 19 | + super().__init__("没有捕获到 feed 详情数据") | ||
| 20 | + | ||
| 21 | + | ||
| 22 | +class NotLoggedInError(XHSError): | ||
| 23 | + """Not logged in.""" | ||
| 24 | + | ||
| 25 | + def __init__(self) -> None: | ||
| 26 | + super().__init__("未登录,请先扫码登录") | ||
| 27 | + | ||
| 28 | + | ||
| 29 | +class PageNotAccessibleError(XHSError): | ||
| 30 | + """The page is not accessible.""" | ||
| 31 | + | ||
| 32 | + def __init__(self, reason: str) -> None: | ||
| 33 | + self.reason = reason | ||
| 34 | + super().__init__(f"笔记不可访问: {reason}") | ||
| 35 | + | ||
| 36 | + | ||
| 37 | +class UploadTimeoutError(XHSError): | ||
| 38 | + """Upload timed out.""" | ||
| 39 | + | ||
| 40 | + | ||
| 41 | +class PublishError(XHSError): | ||
| 42 | + """Publishing failed.""" | ||
| 43 | + | ||
| 44 | + | ||
| 45 | +class TitleTooLongError(PublishError): | ||
| 46 | + """The title exceeds the length limit.""" | ||
| 47 | + | ||
| 48 | + def __init__(self, current: str, maximum: str) -> None: | ||
| 49 | + self.current = current | ||
| 50 | + self.maximum = maximum | ||
| 51 | + super().__init__(f"当前输入长度为{current},最大长度为{maximum}") | ||
| 52 | + | ||
| 53 | + | ||
| 54 | +class ContentTooLongError(PublishError): | ||
| 55 | + """The body text exceeds the length limit.""" | ||
| 56 | + | ||
| 57 | + def __init__(self, current: str, maximum: str) -> None: | ||
| 58 | + self.current = current | ||
| 59 | + self.maximum = maximum | ||
| 60 | + super().__init__(f"当前输入长度为{current},最大长度为{maximum}") | ||
| 61 | + | ||
| 62 | + | ||
| 63 | +class CDPError(XHSError): | ||
| 64 | + """CDP communication error.""" | ||
| 65 | + | ||
| 66 | + | ||
| 67 | +class ElementNotFoundError(XHSError): | ||
| 68 | + """Page element not found.""" | ||
| 69 | + | ||
| 70 | + def __init__(self, selector: str) -> None: | ||
| 71 | + self.selector = selector | ||
| 72 | + super().__init__(f"未找到元素: {selector}") |
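Because every exception above derives from `XHSError`, and the length errors derive from `PublishError`, a caller can handle failures at whatever granularity it needs. The sketch below repeats a cut-down copy of three classes so it runs on its own; the English message and the `describe` function are assumptions made for illustration, not part of the module.

```python
class XHSError(Exception):
    """Base class, as in errors.py."""


class PublishError(XHSError):
    """Publishing failed."""


class TitleTooLongError(PublishError):
    def __init__(self, current: str, maximum: str) -> None:
        self.current = current
        self.maximum = maximum
        # Simplified English message; the module uses its own wording.
        super().__init__(f"title length {current} exceeds maximum {maximum}")


def describe(exc: Exception) -> str:
    """Hypothetical handler: check the most specific class first."""
    if isinstance(exc, TitleTooLongError):
        return f"shorten title ({exc.current}/{exc.maximum})"
    if isinstance(exc, PublishError):
        return "publish failed"
    if isinstance(exc, XHSError):
        return "automation error"
    return "unexpected error"
```

The same ordering applies to `except` clauses: a handler for `XHSError` placed first would swallow the more specific publish errors.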
scripts/xhs/feed_detail.py
0 → 100644
| 1 | +"""Feed detail and comment loading; mirrors Go xiaohongshu/feed_detail.go (867 lines).""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import random | ||
| 8 | +import re | ||
| 9 | +import time | ||
| 10 | + | ||
| 11 | +from .cdp import Page | ||
| 12 | +from .errors import NoFeedDetailError, PageNotAccessibleError | ||
| 13 | +from .human import ( | ||
| 14 | + BUTTON_CLICK_INTERVAL, | ||
| 15 | + DEFAULT_MAX_ATTEMPTS, | ||
| 16 | + FINAL_SPRINT_PUSH_COUNT, | ||
| 17 | + HUMAN_DELAY, | ||
| 18 | + LARGE_SCROLL_TRIGGER, | ||
| 19 | + MAX_CLICK_PER_ROUND, | ||
| 20 | + MIN_SCROLL_DELTA, | ||
| 21 | + POST_SCROLL, | ||
| 22 | + REACTION_TIME, | ||
| 23 | + READ_TIME, | ||
| 24 | + SCROLL_WAIT, | ||
| 25 | + SHORT_READ, | ||
| 26 | + STAGNANT_LIMIT, | ||
| 27 | + calculate_scroll_delta, | ||
| 28 | + get_scroll_interval, | ||
| 29 | + get_scroll_ratio, | ||
| 30 | + sleep_random, | ||
| 31 | +) | ||
| 32 | +from .selectors import ( | ||
| 33 | + ACCESS_ERROR_WRAPPER, | ||
| 34 | + END_CONTAINER, | ||
| 35 | + NO_COMMENTS_TEXT, | ||
| 36 | + PARENT_COMMENT, | ||
| 37 | + SHOW_MORE_BUTTON, | ||
| 38 | +) | ||
| 39 | +from .types import ( | ||
| 40 | + CommentList, | ||
| 41 | + CommentLoadConfig, | ||
| 42 | + FeedDetail, | ||
| 43 | + FeedDetailResponse, | ||
| 44 | +) | ||
| 45 | +from .urls import make_feed_detail_url | ||
| 46 | + | ||
| 47 | +logger = logging.getLogger(__name__) | ||
| 48 | + | ||
| 49 | +# Keywords indicating the page is not accessible | ||
| 50 | +_INACCESSIBLE_KEYWORDS = [ | ||
| 51 | + "当前笔记暂时无法浏览", | ||
| 52 | + "该内容因违规已被删除", | ||
| 53 | + "该笔记已被删除", | ||
| 54 | + "内容不存在", | ||
| 55 | + "笔记不存在", | ||
| 56 | + "已失效", | ||
| 57 | + "私密笔记", | ||
| 58 | + "仅作者可见", | ||
| 59 | + "因用户设置,你无法查看", | ||
| 60 | + "因违规无法查看", | ||
| 61 | +] | ||
| 62 | + | ||
| 63 | +_REPLY_COUNT_RE = re.compile(r"展开\s*(\d+)\s*条回复") | ||
| 64 | +_TOTAL_COMMENT_RE = re.compile(r"共(\d+)条评论") | ||
| 65 | + | ||
| 66 | + | ||
| 67 | +def get_feed_detail( | ||
| 68 | + page: Page, | ||
| 69 | + feed_id: str, | ||
| 70 | + xsec_token: str, | ||
| 71 | + load_all_comments: bool = False, | ||
| 72 | + config: CommentLoadConfig | None = None, | ||
| 73 | +) -> FeedDetailResponse: | ||
| 74 | + """Fetch feed detail, including comments. | ||
| 75 | + | ||
| 76 | + Args: | ||
| 77 | + page: CDP page object. | ||
| 78 | + feed_id: Feed ID. | ||
| 79 | + xsec_token: xsec_token. | ||
| 80 | + load_all_comments: Whether to load all comments. | ||
| 81 | + config: Comment loading configuration. | ||
| 82 | + | ||
| 83 | + Raises: | ||
| 84 | + PageNotAccessibleError: The page is not accessible. | ||
| 85 | + NoFeedDetailError: No detail data was obtained. | ||
| 86 | + """ | ||
| 87 | + if config is None: | ||
| 88 | + config = CommentLoadConfig() | ||
| 89 | + | ||
| 90 | + url = make_feed_detail_url(feed_id, xsec_token) | ||
| 91 | + logger.info("打开 feed 详情页: %s", url) | ||
| 92 | + logger.info( | ||
| 93 | + "配置: 点击更多=%s, 回复阈值=%d, 最大评论数=%d, 滚动速度=%s", | ||
| 94 | + config.click_more_replies, | ||
| 95 | + config.max_replies_threshold, | ||
| 96 | + config.max_comment_items, | ||
| 97 | + config.scroll_speed, | ||
| 98 | + ) | ||
| 99 | + | ||
| 100 | + # Navigate (with retries) | ||
| 101 | + for attempt in range(3): | ||
| 102 | + try: | ||
| 103 | + page.navigate(url) | ||
| 104 | + page.wait_for_load() | ||
| 105 | + page.wait_dom_stable() | ||
| 106 | + break | ||
| 107 | + except Exception as e: | ||
| 108 | + logger.debug("页面导航重试 #%d: %s", attempt, e) | ||
| 109 | + time.sleep(0.5 + random.random()) | ||
| 110 | + else: | ||
| 111 | + raise RuntimeError("页面导航失败") | ||
| 112 | + | ||
| 113 | + sleep_random(1000, 1000) | ||
| 114 | + | ||
| 115 | + # Check page accessibility | ||
| 116 | + _check_page_accessible(page) | ||
| 117 | + | ||
| 118 | + # Load all comments | ||
| 119 | + if load_all_comments: | ||
| 120 | + try: | ||
| 121 | + _load_all_comments(page, config) | ||
| 122 | + except Exception as e: | ||
| 123 | + logger.warning("加载全部评论失败: %s", e) | ||
| 124 | + | ||
| 125 | + return _extract_feed_detail(page, feed_id) | ||
| 126 | + | ||
| 127 | + | ||
| 128 | +# ========== Page checks ========== | ||
| 129 | + | ||
| 130 | + | ||
| 131 | +def _check_page_accessible(page: Page) -> None: | ||
| 132 | + """Check whether the page is accessible.""" | ||
| 133 | + time.sleep(0.5) | ||
| 134 | + | ||
| 135 | + text = page.get_element_text(ACCESS_ERROR_WRAPPER) | ||
| 136 | + if not text: | ||
| 137 | + return | ||
| 138 | + | ||
| 139 | + text = text.strip() | ||
| 140 | + for kw in _INACCESSIBLE_KEYWORDS: | ||
| 141 | + if kw in text: | ||
| 142 | + raise PageNotAccessibleError(kw) | ||
| 143 | + | ||
| 144 | + if text: | ||
| 145 | + raise PageNotAccessibleError(text) | ||
| 146 | + | ||
| 147 | + | ||
| 148 | +# ========== Data extraction ========== | ||
| 149 | + | ||
| 150 | + | ||
| 151 | +_EXTRACT_DETAIL_JS = """ | ||
| 152 | +(() => { | ||
| 153 | + if (window.__INITIAL_STATE__ && | ||
| 154 | + window.__INITIAL_STATE__.note && | ||
| 155 | + window.__INITIAL_STATE__.note.noteDetailMap) { | ||
| 156 | + return JSON.stringify(window.__INITIAL_STATE__.note.noteDetailMap); | ||
| 157 | + } | ||
| 158 | + return ""; | ||
| 159 | +})() | ||
| 160 | +""" | ||
| 161 | + | ||
| 162 | + | ||
| 163 | +def _extract_feed_detail(page: Page, feed_id: str) -> FeedDetailResponse: | ||
| 164 | + """Extract feed detail from __INITIAL_STATE__.""" | ||
| 165 | + result = None | ||
| 166 | + for _ in range(3): | ||
| 167 | + result = page.evaluate(_EXTRACT_DETAIL_JS) | ||
| 168 | + if result: | ||
| 169 | + break | ||
| 170 | + time.sleep(0.2) | ||
| 171 | + | ||
| 172 | + if not result: | ||
| 173 | + raise NoFeedDetailError() | ||
| 174 | + | ||
| 175 | + note_detail_map = json.loads(result) | ||
| 176 | + note_data = note_detail_map.get(feed_id) | ||
| 177 | + if not note_data: | ||
| 178 | + raise NoFeedDetailError() | ||
| 179 | + | ||
| 180 | + return FeedDetailResponse( | ||
| 181 | + note=FeedDetail.from_dict(note_data.get("note", {})), | ||
| 182 | + comments=CommentList.from_dict(note_data.get("comments", {})), | ||
| 183 | + ) | ||
| 184 | + | ||
| 185 | + | ||
| 186 | +# ========== Comment-loading state machine ========== | ||
| 187 | + | ||
| 188 | + | ||
| 189 | +def _load_all_comments(page: Page, config: CommentLoadConfig) -> None: | ||
| 190 | + """State machine that loads all comments.""" | ||
| 191 | + max_attempts = ( | ||
| 192 | + config.max_comment_items * 3 if config.max_comment_items > 0 else DEFAULT_MAX_ATTEMPTS | ||
| 193 | + ) | ||
| 194 | + scroll_interval = get_scroll_interval(config.scroll_speed) | ||
| 195 | + | ||
| 196 | + logger.info("开始加载评论...") | ||
| 197 | + _scroll_to_comments_area(page) | ||
| 198 | + sleep_random(*HUMAN_DELAY) | ||
| 199 | + | ||
| 200 | + # Check whether there are no comments | ||
| 201 | + if _check_no_comments(page): | ||
| 202 | + logger.info("检测到无评论区域,跳过加载") | ||
| 203 | + return | ||
| 204 | + | ||
| 205 | + # State | ||
| 206 | + last_count = 0 | ||
| 207 | + last_scroll_top = 0 | ||
| 208 | + stagnant_checks = 0 | ||
| 209 | + total_clicked = 0 | ||
| 210 | + total_skipped = 0 | ||
| 211 | + | ||
| 212 | + for attempt in range(max_attempts): | ||
| 213 | + logger.debug("=== 尝试 %d/%d ===", attempt + 1, max_attempts) | ||
| 214 | + | ||
| 215 | + # Check whether the bottom has been reached | ||
| 216 | + if _check_end_container(page): | ||
| 217 | + count = _get_comment_count(page) | ||
| 218 | + logger.info( | ||
| 219 | + "检测到 THE END,加载完成: %d 条评论, 点击: %d, 跳过: %d", | ||
| 220 | + count, | ||
| 221 | + total_clicked, | ||
| 222 | + total_skipped, | ||
| 223 | + ) | ||
| 224 | + return | ||
| 225 | + | ||
| 226 | + # Periodically click the expand buttons | ||
| 227 | + if config.click_more_replies and attempt % BUTTON_CLICK_INTERVAL == 0: | ||
| 228 | + clicked, skipped = _click_show_more_buttons(page, config.max_replies_threshold) | ||
| 229 | + total_clicked += clicked | ||
| 230 | + total_skipped += skipped | ||
| 231 | + if clicked > 0 or skipped > 0: | ||
| 232 | + sleep_random(*READ_TIME) | ||
| 233 | + # Second pass | ||
| 234 | + c2, s2 = _click_show_more_buttons(page, config.max_replies_threshold) | ||
| 235 | + total_clicked += c2 | ||
| 236 | + total_skipped += s2 | ||
| 237 | + if c2 > 0 or s2 > 0: | ||
| 238 | + sleep_random(*SHORT_READ) | ||
| 239 | + | ||
| 240 | + # Get the current comment count | ||
| 241 | + current_count = _get_comment_count(page) | ||
| 242 | + if current_count != last_count: | ||
| 243 | + logger.info("评论增加: %d -> %d", last_count, current_count) | ||
| 244 | + last_count = current_count | ||
| 245 | + stagnant_checks = 0 | ||
| 246 | + else: | ||
| 247 | + stagnant_checks += 1 | ||
| 248 | + | ||
| 249 | + # Check whether the target has been reached | ||
| 250 | + if config.max_comment_items > 0 and current_count >= config.max_comment_items: | ||
| 251 | + logger.info("已达到目标评论数: %d/%d", current_count, config.max_comment_items) | ||
| 252 | + return | ||
| 253 | + | ||
| 254 | + # Scroll | ||
| 255 | + if current_count > 0: | ||
| 256 | + _scroll_to_last_comment(page) | ||
| 257 | + sleep_random(*POST_SCROLL) | ||
| 258 | + | ||
| 259 | + large_mode = stagnant_checks >= LARGE_SCROLL_TRIGGER | ||
| 260 | + push_count = 1 | ||
| 261 | + if large_mode: | ||
| 262 | + push_count = 3 + random.randint(0, 2) | ||
| 263 | + | ||
| 264 | + scroll_delta, current_scroll_top = _human_scroll( | ||
| 265 | + page, config.scroll_speed, large_mode, push_count | ||
| 266 | + ) | ||
| 267 | + | ||
| 268 | + if scroll_delta < MIN_SCROLL_DELTA or current_scroll_top == last_scroll_top: | ||
| 269 | + stagnant_checks += 1 | ||
| 270 | + else: | ||
| 271 | + stagnant_checks = 0 | ||
| 272 | + last_scroll_top = current_scroll_top | ||
| 273 | + | ||
| 274 | + # Handle stagnation | ||
| 275 | + if stagnant_checks >= STAGNANT_LIMIT: | ||
| 276 | + logger.info("停滞过多,尝试大冲刺...") | ||
| 277 | + _human_scroll(page, config.scroll_speed, True, 10) | ||
| 278 | + stagnant_checks = 0 | ||
| 279 | + | ||
| 280 | + time.sleep(scroll_interval) | ||
| 281 | + | ||
| 282 | + # Final sprint | ||
| 283 | + logger.info("达到最大尝试次数,最后冲刺...") | ||
| 284 | + _human_scroll(page, config.scroll_speed, True, FINAL_SPRINT_PUSH_COUNT) | ||
| 285 | + count = _get_comment_count(page) | ||
| 286 | + logger.info("加载结束: %d 条评论, 点击: %d, 跳过: %d", count, total_clicked, total_skipped) | ||
| 287 | + | ||
| 288 | + | ||
| 289 | +# ========== Scrolling ========== | ||
| 290 | + | ||
| 291 | + | ||
| 292 | +def _human_scroll( | ||
| 293 | + page: Page, | ||
| 294 | + speed: str, | ||
| 295 | + large_mode: bool, | ||
| 296 | + push_count: int, | ||
| 297 | +) -> tuple[int, int]: | ||
| 298 | + """Human-like scrolling. | ||
| 299 | + | ||
| 300 | + Returns: | ||
| 301 | + (actual_delta, current_scroll_top) | ||
| 302 | + """ | ||
| 303 | + before_top = page.get_scroll_top() | ||
| 304 | + viewport_height = page.get_viewport_height() | ||
| 305 | + | ||
| 306 | + base_ratio = get_scroll_ratio(speed) | ||
| 307 | + if large_mode: | ||
| 308 | + base_ratio *= 2.0 | ||
| 309 | + | ||
| 310 | + actual_delta = 0 | ||
| 311 | + current_scroll_top = before_top | ||
| 312 | + | ||
| 313 | + for i in range(max(1, push_count)): | ||
| 314 | + scroll_delta = calculate_scroll_delta(viewport_height, base_ratio) | ||
| 315 | + page.scroll_by(0, int(scroll_delta)) | ||
| 316 | + sleep_random(*SCROLL_WAIT) | ||
| 317 | + | ||
| 318 | + current_scroll_top = page.get_scroll_top() | ||
| 319 | + delta_this = current_scroll_top - before_top | ||
| 320 | + actual_delta += delta_this | ||
| 321 | + before_top = current_scroll_top | ||
| 322 | + | ||
| 323 | + if i < push_count - 1: | ||
| 324 | + sleep_random(*HUMAN_DELAY) | ||
| 325 | + | ||
| 326 | + # If nothing scrolled, force a jump to the bottom | ||
| 327 | + if actual_delta < MIN_SCROLL_DELTA and push_count > 0: | ||
| 328 | + page.scroll_to_bottom() | ||
| 329 | + sleep_random(*POST_SCROLL) | ||
| 330 | + current_scroll_top = page.get_scroll_top() | ||
| 331 | + actual_delta = current_scroll_top - (before_top - actual_delta) | ||
| 332 | + | ||
| 333 | + return actual_delta, current_scroll_top | ||
| 334 | + | ||
| 335 | + | ||
| 336 | +def _scroll_to_comments_area(page: Page) -> None: | ||
| 337 | + """Scroll to the comments area.""" | ||
| 338 | + logger.info("滚动到评论区...") | ||
| 339 | + page.scroll_element_into_view(".comments-container") | ||
| 340 | + time.sleep(0.5) | ||
| 341 | + # Trigger lazy loading | ||
| 342 | + page.dispatch_wheel_event(100) | ||
| 343 | + | ||
| 344 | + | ||
| 345 | +def _scroll_to_last_comment(page: Page) -> None: | ||
| 346 | + """Scroll to the last comment.""" | ||
| 347 | + count = page.get_elements_count(PARENT_COMMENT) | ||
| 348 | + if count > 0: | ||
| 349 | + page.scroll_nth_element_into_view(PARENT_COMMENT, count - 1) | ||
| 350 | + | ||
| 351 | + | ||
| 352 | +# ========== DOM queries ========== | ||
| 353 | + | ||
| 354 | + | ||
| 355 | +def _get_comment_count(page: Page) -> int: | ||
| 356 | + """Get the number of comments currently loaded.""" | ||
| 357 | + return page.get_elements_count(PARENT_COMMENT) | ||
| 358 | + | ||
| 359 | + | ||
| 360 | +def _get_total_comment_count(page: Page) -> int: | ||
| 361 | + """Get the total comment count (parsed from "共N条评论").""" | ||
| 362 | + text = page.get_element_text(".comments-container .total") | ||
| 363 | + if not text: | ||
| 364 | + return 0 | ||
| 365 | + match = _TOTAL_COMMENT_RE.search(text) | ||
| 366 | + if match: | ||
| 367 | + return int(match.group(1)) | ||
| 368 | + return 0 | ||
| 369 | + | ||
| 370 | + | ||
| 371 | +def _check_no_comments(page: Page) -> bool: | ||
| 372 | + """Check whether the empty-comments placeholder is shown.""" | ||
| 373 | + text = page.get_element_text(NO_COMMENTS_TEXT) | ||
| 374 | + if not text: | ||
| 375 | + return False | ||
| 376 | + return "这是一片荒地" in text.strip() | ||
| 377 | + | ||
| 378 | + | ||
| 379 | +def _check_end_container(page: Page) -> bool: | ||
| 380 | + """Check whether the bottom "THE END" marker has been reached.""" | ||
| 381 | + text = page.get_element_text(END_CONTAINER) | ||
| 382 | + if not text: | ||
| 383 | + return False | ||
| 384 | + upper = text.strip().upper() | ||
| 385 | + return "THE END" in upper or "THEEND" in upper | ||
| 386 | + | ||
| 387 | + | ||
| 388 | +# ========== Button clicks ========== | ||
| 389 | + | ||
| 390 | + | ||
| 391 | +def _click_show_more_buttons(page: Page, max_threshold: int) -> tuple[int, int]: | ||
| 392 | + """Click the "展开N条回复" (expand N replies) buttons. | ||
| 393 | + | ||
| 394 | + Returns: | ||
| 395 | + (clicked, skipped) | ||
| 396 | + """ | ||
| 397 | + count = page.get_elements_count(SHOW_MORE_BUTTON) | ||
| 398 | + if count == 0: | ||
| 399 | + return 0, 0 | ||
| 400 | + | ||
| 401 | + max_click = MAX_CLICK_PER_ROUND + random.randint(0, MAX_CLICK_PER_ROUND - 1) | ||
| 402 | + clicked = 0 | ||
| 403 | + skipped = 0 | ||
| 404 | + | ||
| 405 | + for i in range(count): | ||
| 406 | + if clicked >= max_click: | ||
| 407 | + break | ||
| 408 | + | ||
| 409 | + # Get the button text | ||
| 410 | + text = page.evaluate( | ||
| 411 | + f"document.querySelectorAll({json.dumps(SHOW_MORE_BUTTON)})[{i}]?.textContent || ''" | ||
| 412 | + ) | ||
| 413 | + if not text: | ||
| 414 | + continue | ||
| 415 | + | ||
| 416 | + # Check whether it should be skipped | ||
| 417 | + if max_threshold > 0: | ||
| 418 | + match = _REPLY_COUNT_RE.search(text) | ||
| 419 | + if match: | ||
| 420 | + reply_count = int(match.group(1)) | ||
| 421 | + if reply_count > max_threshold: | ||
| 422 | + logger.debug( | ||
| 423 | + "跳过 '%s'(回复数 %d > 阈值 %d)", text, reply_count, max_threshold | ||
| 424 | + ) | ||
| 425 | + skipped += 1 | ||
| 426 | + continue | ||
| 427 | + | ||
| 428 | + # Scroll to the button and click it | ||
| 429 | + page.scroll_nth_element_into_view(SHOW_MORE_BUTTON, i) | ||
| 430 | + sleep_random(*REACTION_TIME) | ||
| 431 | + page.evaluate(f"document.querySelectorAll({json.dumps(SHOW_MORE_BUTTON)})[{i}]?.click()") | ||
| 432 | + sleep_random(*READ_TIME) | ||
| 433 | + clicked += 1 | ||
| 434 | + | ||
| 435 | + return clicked, skipped |
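The skip rule inside `_click_show_more_buttons` can be isolated as a small predicate: a button is skipped only when a positive threshold is set and the button text advertises more replies than that threshold. The sketch below reuses the same regular expression; `should_skip` is a hypothetical name introduced here, not a function in the module.

```python
import re

# Same pattern as _REPLY_COUNT_RE in the module above.
REPLY_COUNT_RE = re.compile(r"展开\s*(\d+)\s*条回复")


def should_skip(button_text: str, max_threshold: int) -> bool:
    """Return True when the button advertises more replies than the threshold."""
    if max_threshold <= 0:  # A non-positive threshold disables skipping.
        return False
    match = REPLY_COUNT_RE.search(button_text)
    # Text without a parsable count is never skipped.
    return match is not None and int(match.group(1)) > max_threshold
```

Keeping the rule in a pure function like this would let it be unit-tested without a browser, which seems useful given that the button text comes from a page that can change.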
scripts/xhs/feeds.py
0 → 100644
| 1 | +"""Home feed list; mirrors Go xiaohongshu/feeds.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import time | ||
| 8 | + | ||
| 9 | +from .cdp import Page | ||
| 10 | +from .errors import NoFeedsError | ||
| 11 | +from .types import Feed | ||
| 12 | +from .urls import HOME_URL | ||
| 13 | + | ||
| 14 | +logger = logging.getLogger(__name__) | ||
| 15 | + | ||
| 16 | +# JS that extracts feeds from __INITIAL_STATE__ | ||
| 17 | +_EXTRACT_FEEDS_JS = """ | ||
| 18 | +(() => { | ||
| 19 | + if (window.__INITIAL_STATE__ && | ||
| 20 | + window.__INITIAL_STATE__.feed && | ||
| 21 | + window.__INITIAL_STATE__.feed.feeds) { | ||
| 22 | + const feeds = window.__INITIAL_STATE__.feed.feeds; | ||
| 23 | + const feedsData = feeds.value !== undefined ? feeds.value : feeds._value; | ||
| 24 | + if (feedsData) { | ||
| 25 | + return JSON.stringify(feedsData); | ||
| 26 | + } | ||
| 27 | + } | ||
| 28 | + return ""; | ||
| 29 | +})() | ||
| 30 | +""" | ||
| 31 | + | ||
| 32 | + | ||
| 33 | +def list_feeds(page: Page) -> list[Feed]: | ||
| 34 | + """Get the home feed list. | ||
| 35 | + | ||
| 36 | + Raises: | ||
| 37 | + NoFeedsError: No feeds data was captured. | ||
| 38 | + """ | ||
| 39 | + page.navigate(HOME_URL) | ||
| 40 | + page.wait_for_load() | ||
| 41 | + page.wait_dom_stable() | ||
| 42 | + time.sleep(1) | ||
| 43 | + | ||
| 44 | + result = page.evaluate(_EXTRACT_FEEDS_JS) | ||
| 45 | + if not result: | ||
| 46 | + raise NoFeedsError() | ||
| 47 | + | ||
| 48 | + feeds_data = json.loads(result) | ||
| 49 | + return [Feed.from_dict(f) for f in feeds_data] |
scripts/xhs/human.py
0 → 100644
| 1 | +"""Human-behavior simulation parameters (delays, scrolling, hovering); mirrors the constants in Go feed_detail.go.""" | ||
| 2 | + | ||
| 3 | +import random | ||
| 4 | +import time | ||
| 5 | + | ||
| 6 | +# ========== Configuration constants ========== | ||
| 7 | +DEFAULT_MAX_ATTEMPTS = 500 | ||
| 8 | +STAGNANT_LIMIT = 20 | ||
| 9 | +MIN_SCROLL_DELTA = 10 | ||
| 10 | +MAX_CLICK_PER_ROUND = 3 | ||
| 11 | +STAGNANT_CHECK_THRESHOLD = 2 | ||
| 12 | +LARGE_SCROLL_TRIGGER = 5 | ||
| 13 | +BUTTON_CLICK_INTERVAL = 3 | ||
| 14 | +FINAL_SPRINT_PUSH_COUNT = 15 | ||
| 15 | + | ||
| 16 | +# ========== Delay ranges (milliseconds) ========== | ||
| 17 | +HUMAN_DELAY = (300, 700) | ||
| 18 | +REACTION_TIME = (300, 800) | ||
| 19 | +HOVER_TIME = (100, 300) | ||
| 20 | +READ_TIME = (500, 1200) | ||
| 21 | +SHORT_READ = (600, 1200) | ||
| 22 | +SCROLL_WAIT = (100, 200) | ||
| 23 | +POST_SCROLL = (300, 500) | ||
| 24 | + | ||
| 25 | + | ||
| 26 | +def sleep_random(min_ms: int, max_ms: int) -> None: | ||
| 27 | + """Sleep for a random duration between min_ms and max_ms milliseconds.""" | ||
| 28 | + if max_ms <= min_ms: | ||
| 29 | + time.sleep(min_ms / 1000.0) | ||
| 30 | + return | ||
| 31 | + delay = random.randint(min_ms, max_ms) / 1000.0 | ||
| 32 | + time.sleep(delay) | ||
| 33 | + | ||
| 34 | + | ||
| 35 | +def get_scroll_interval(speed: str) -> float: | ||
| 36 | + """Return the scroll interval (seconds) for the given speed.""" | ||
| 37 | + if speed == "slow": | ||
| 38 | + return (1200 + random.randint(0, 300)) / 1000.0 | ||
| 39 | + if speed == "fast": | ||
| 40 | + return (300 + random.randint(0, 100)) / 1000.0 | ||
| 41 | + # normal | ||
| 42 | + return (600 + random.randint(0, 200)) / 1000.0 | ||
| 43 | + | ||
| 44 | + | ||
| 45 | +def get_scroll_ratio(speed: str) -> float: | ||
| 46 | + """Return the scroll ratio for the given speed.""" | ||
| 47 | + if speed == "slow": | ||
| 48 | + return 0.5 | ||
| 49 | + if speed == "fast": | ||
| 50 | + return 0.9 | ||
| 51 | + return 0.7 | ||
| 52 | + | ||
| 53 | + | ||
| 54 | +def calculate_scroll_delta(viewport_height: int, base_ratio: float) -> float: | ||
| 55 | + """Compute the scroll distance.""" | ||
| 56 | + scroll_delta = viewport_height * (base_ratio + random.random() * 0.2) | ||
| 57 | + if scroll_delta < 400: | ||
| 58 | + scroll_delta = 400.0 | ||
| 59 | + return scroll_delta + random.randint(-50, 50) | ||
| 60 | + | ||
| 61 | + | ||
| 62 | +# Keywords indicating the page is not accessible | ||
| 63 | +INACCESSIBLE_KEYWORDS = [ | ||
| 64 | + "当前笔记暂时无法浏览", | ||
| 65 | + "该内容因违规已被删除", | ||
| 66 | + "该笔记已被删除", | ||
| 67 | + "内容不存在", | ||
| 68 | + "笔记不存在", | ||
| 69 | + "已失效", | ||
| 70 | + "私密笔记", | ||
| 71 | + "仅作者可见", | ||
| 72 | + "因用户设置,你无法查看", | ||
| 73 | + "因违规无法查看", | ||
| 74 | +] |
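The range that `calculate_scroll_delta` can return follows directly from its three steps: scale the viewport by a ratio with up to 0.2 of jitter, clamp to at least 400, then add an integer offset in [-50, 50]. The sketch below repeats the function so it runs on its own and derives those bounds; `delta_bounds` is a hypothetical helper added for illustration.

```python
import random


def calculate_scroll_delta(viewport_height: int, base_ratio: float) -> float:
    """Copy of the function above, repeated so the sketch runs on its own."""
    scroll_delta = viewport_height * (base_ratio + random.random() * 0.2)
    if scroll_delta < 400:
        scroll_delta = 400.0
    return scroll_delta + random.randint(-50, 50)


def delta_bounds(viewport_height: int, base_ratio: float) -> tuple[float, float]:
    """Hypothetical helper: bounds that every sampled delta must respect."""
    low = max(400.0, viewport_height * base_ratio) - 50
    high = max(400.0, viewport_height * (base_ratio + 0.2)) + 50
    return low, high


if __name__ == "__main__":
    print(delta_bounds(1000, 0.7))
```

One consequence worth noting: for viewports shorter than roughly 400 px divided by the ratio, the clamp dominates and every scroll lands between 350 and 450 px regardless of speed.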
scripts/xhs/like_favorite.py
0 → 100644
| 1 | +"""Like/favorite actions; mirrors Go xiaohongshu/like_favorite.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import time | ||
| 8 | + | ||
| 9 | +from .cdp import Page | ||
| 10 | +from .errors import NoFeedDetailError | ||
| 11 | +from .selectors import COLLECT_BUTTON, LIKE_BUTTON | ||
| 12 | +from .types import ActionResult | ||
| 13 | +from .urls import make_feed_detail_url | ||
| 14 | + | ||
| 15 | +logger = logging.getLogger(__name__) | ||
| 16 | + | ||
| 17 | +# JS that reads the interaction state from __INITIAL_STATE__ | ||
| 18 | +_GET_INTERACT_STATE_JS = """ | ||
| 19 | +(() => { | ||
| 20 | + if (window.__INITIAL_STATE__ && | ||
| 21 | + window.__INITIAL_STATE__.note && | ||
| 22 | + window.__INITIAL_STATE__.note.noteDetailMap) { | ||
| 23 | + return JSON.stringify(window.__INITIAL_STATE__.note.noteDetailMap); | ||
| 24 | + } | ||
| 25 | + return ""; | ||
| 26 | +})() | ||
| 27 | +""" | ||
| 28 | + | ||
| 29 | + | ||
| 30 | +def _get_interact_state(page: Page, feed_id: str) -> tuple[bool, bool]: | ||
| 31 | + """Read the note's liked/collected state. | ||
| 32 | + | ||
| 33 | + Returns: | ||
| 34 | + (liked, collected) | ||
| 35 | + | ||
| 36 | + Raises: | ||
| 37 | + NoFeedDetailError: The state could not be read. | ||
| 38 | + """ | ||
| 39 | + result = page.evaluate(_GET_INTERACT_STATE_JS) | ||
| 40 | + if not result: | ||
| 41 | + raise NoFeedDetailError() | ||
| 42 | + | ||
| 43 | + note_detail_map = json.loads(result) | ||
| 44 | + detail = note_detail_map.get(feed_id) | ||
| 45 | + if not detail: | ||
| 46 | + raise NoFeedDetailError() | ||
| 47 | + | ||
| 48 | + interact = detail.get("note", {}).get("interactInfo", {}) | ||
| 49 | + return interact.get("liked", False), interact.get("collected", False) | ||
| 50 | + | ||
| 51 | + | ||
| 52 | +def _prepare_page(page: Page, feed_id: str, xsec_token: str) -> None: | ||
| 53 | + """Navigate to the feed detail page.""" | ||
| 54 | + url = make_feed_detail_url(feed_id, xsec_token) | ||
| 55 | + page.navigate(url) | ||
| 56 | + page.wait_for_load() | ||
| 57 | + page.wait_dom_stable() | ||
| 58 | + time.sleep(1) | ||
| 59 | + | ||
| 60 | + | ||
| 61 | +# ========== Like ========== | ||
| 62 | + | ||
| 63 | + | ||
| 64 | +def like_feed(page: Page, feed_id: str, xsec_token: str) -> ActionResult: | ||
| 65 | + """Like a note (idempotent: skipped if already liked).""" | ||
| 66 | + _prepare_page(page, feed_id, xsec_token) | ||
| 67 | + return _toggle_like(page, feed_id, target_liked=True) | ||
| 68 | + | ||
| 69 | + | ||
| 70 | +def unlike_feed(page: Page, feed_id: str, xsec_token: str) -> ActionResult: | ||
| 71 | + """Unlike a note (idempotent: skipped if not liked).""" | ||
| 72 | + _prepare_page(page, feed_id, xsec_token) | ||
| 73 | + return _toggle_like(page, feed_id, target_liked=False) | ||
| 74 | + | ||
| 75 | + | ||
| 76 | +def _toggle_like(page: Page, feed_id: str, target_liked: bool) -> ActionResult: | ||
| 77 | + """Perform the like/unlike action.""" | ||
| 78 | + action_name = "点赞" if target_liked else "取消点赞" | ||
| 79 | + | ||
| 80 | + try: | ||
| 81 | + liked, _ = _get_interact_state(page, feed_id) | ||
| 82 | + except NoFeedDetailError: | ||
| 83 | + logger.warning("无法读取互动状态,直接点击") | ||
| 84 | + liked = not target_liked # Force the click | ||
| 85 | + | ||
| 86 | + # Idempotency check | ||
| 87 | + if liked == target_liked: | ||
| 88 | + logger.info("feed %s 已%s,跳过", feed_id, action_name) | ||
| 89 | + return ActionResult(feed_id=feed_id, success=True, message=f"已{action_name}") | ||
| 90 | + | ||
| 91 | + # Click | ||
| 92 | + page.click_element(LIKE_BUTTON) | ||
| 93 | + time.sleep(3) | ||
| 94 | + | ||
| 95 | + # Verify | ||
| 96 | + try: | ||
| 97 | + liked, _ = _get_interact_state(page, feed_id) | ||
| 98 | + if liked == target_liked: | ||
| 99 | + logger.info("feed %s %s成功", feed_id, action_name) | ||
| 100 | + return ActionResult(feed_id=feed_id, success=True, message=f"{action_name}成功") | ||
| 101 | + except NoFeedDetailError: | ||
| 102 | + pass | ||
| 103 | + | ||
| 104 | + # Retry once | ||
| 105 | + logger.warning("feed %s %s可能未成功,重试", feed_id, action_name) | ||
| 106 | + page.click_element(LIKE_BUTTON) | ||
| 107 | + time.sleep(2) | ||
| 108 | + | ||
| 109 | + return ActionResult(feed_id=feed_id, success=True, message=f"{action_name}已执行") | ||
| 110 | + | ||
| 111 | + | ||
| 112 | +# ========== Favorite ========== | ||
| 113 | + | ||
| 114 | + | ||
| 115 | +def favorite_feed(page: Page, feed_id: str, xsec_token: str) -> ActionResult: | ||
| 116 | + """Favorite a note (idempotent: skipped if already favorited).""" | ||
| 117 | + _prepare_page(page, feed_id, xsec_token) | ||
| 118 | + return _toggle_favorite(page, feed_id, target_collected=True) | ||
| 119 | + | ||
| 120 | + | ||
| 121 | +def unfavorite_feed(page: Page, feed_id: str, xsec_token: str) -> ActionResult: | ||
| 122 | + """Unfavorite a note (idempotent: skipped if not favorited).""" | ||
| 123 | + _prepare_page(page, feed_id, xsec_token) | ||
| 124 | + return _toggle_favorite(page, feed_id, target_collected=False) | ||
| 125 | + | ||
| 126 | + | ||
| 127 | +def _toggle_favorite(page: Page, feed_id: str, target_collected: bool) -> ActionResult: | ||
| 128 | + """Perform the favorite/unfavorite action.""" | ||
| 129 | + action_name = "收藏" if target_collected else "取消收藏" | ||
| 130 | + | ||
| 131 | + try: | ||
| 132 | + _, collected = _get_interact_state(page, feed_id) | ||
| 133 | + except NoFeedDetailError: | ||
| 134 | + logger.warning("无法读取互动状态,直接点击") | ||
| 135 | + collected = not target_collected | ||
| 136 | + | ||
| 137 | + # Idempotency check | ||
| 138 | + if collected == target_collected: | ||
| 139 | + logger.info("feed %s 已%s,跳过", feed_id, action_name) | ||
| 140 | + return ActionResult(feed_id=feed_id, success=True, message=f"已{action_name}") | ||
| 141 | + | ||
| 142 | + # Click | ||
| 143 | + page.click_element(COLLECT_BUTTON) | ||
| 144 | + time.sleep(3) | ||
| 145 | + | ||
| 146 | + # Verify | ||
| 147 | + try: | ||
| 148 | + _, collected = _get_interact_state(page, feed_id) | ||
| 149 | + if collected == target_collected: | ||
| 150 | + logger.info("feed %s %s成功", feed_id, action_name) | ||
| 151 | + return ActionResult(feed_id=feed_id, success=True, message=f"{action_name}成功") | ||
| 152 | + except NoFeedDetailError: | ||
| 153 | + pass | ||
| 154 | + | ||
| 155 | + # Retry | ||
| 156 | + logger.warning("feed %s %s可能未成功,重试", feed_id, action_name) | ||
| 157 | + page.click_element(COLLECT_BUTTON) | ||
| 158 | + time.sleep(2) | ||
| 159 | + | ||
| 160 | + return ActionResult(feed_id=feed_id, success=True, message=f"{action_name}已执行") |
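The idempotent-toggle pattern used by `_toggle_like` and `_toggle_favorite` (read state, click only on mismatch, re-read to verify) can be sketched without a browser. Everything below is an assumption made for illustration: `FakePage` stands in for the CDP `Page`, and `ensure_liked` is a hypothetical simplification that reports an unverifiable result instead of clicking a second time.

```python
from dataclasses import dataclass


@dataclass
class FakePage:
    """Stand-in for the CDP Page: one boolean state and a click counter."""

    liked: bool = False
    clicks: int = 0

    def click_like(self) -> None:
        self.liked = not self.liked  # The button behaves as a toggle.
        self.clicks += 1


def ensure_liked(page: FakePage, target: bool) -> str:
    """Click only when the current state differs from the target state."""
    if page.liked == target:
        return "skipped"
    page.click_like()
    # Re-read the state rather than assuming the click worked.
    return "ok" if page.liked == target else "unverified"
```

Since the button is a toggle, a blind second click after a first click that actually succeeded would undo it. Returning a distinct status, as this sketch does, is one possible way to surface that case to the caller.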
scripts/xhs/login.py
0 → 100644
| 1 | +"""Login management; mirrors Go xiaohongshu/login.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import base64 | ||
| 6 | +import logging | ||
| 7 | +import os | ||
| 8 | +import tempfile | ||
| 9 | +import time | ||
| 10 | + | ||
| 11 | +from .cdp import Page | ||
| 12 | +from .selectors import LOGIN_STATUS, QRCODE_IMG | ||
| 13 | +from .urls import EXPLORE_URL | ||
| 14 | + | ||
| 15 | +logger = logging.getLogger(__name__) | ||
| 16 | + | ||
| 17 | + | ||
| 18 | +def check_login_status(page: Page) -> bool: | ||
| 19 | + """Check the login status. | ||
| 20 | + | ||
| 21 | + Returns: | ||
| 22 | + True if logged in, False otherwise. | ||
| 23 | + """ | ||
| 24 | + page.navigate(EXPLORE_URL) | ||
| 25 | + page.wait_for_load() | ||
| 26 | + time.sleep(1) | ||
| 27 | + | ||
| 28 | + return page.has_element(LOGIN_STATUS) | ||
| 29 | + | ||
| 30 | + | ||
| 31 | +def fetch_qrcode(page: Page) -> tuple[str, bool]: | ||
| 32 | + """Fetch the login QR code. | ||
| 33 | + | ||
| 34 | + Returns: | ||
| 35 | + (qrcode_src, already_logged_in) | ||
| 36 | + - If already logged in, returns ("", True) | ||
| 37 | + - If not logged in, returns (qrcode_base64_or_url, False) | ||
| 38 | + """ | ||
| 39 | + page.navigate(EXPLORE_URL) | ||
| 40 | + page.wait_for_load() | ||
| 41 | + time.sleep(2) | ||
| 42 | + | ||
| 43 | + # Check whether already logged in | ||
| 44 | + if page.has_element(LOGIN_STATUS): | ||
| 45 | + return "", True | ||
| 46 | + | ||
| 47 | + # Get the QR-code image src | ||
| 48 | + src = page.get_element_attribute(QRCODE_IMG, "src") | ||
| 49 | + if not src: | ||
| 50 | + raise RuntimeError("二维码图片 src 为空") | ||
| 51 | + | ||
| 52 | + return src, False | ||
| 53 | + | ||
| 54 | + | ||
| 55 | +def save_qrcode_to_file(src: str) -> str: | ||
| 56 | + """Save a QR-code data URL to a temporary PNG file. | ||
| 57 | + | ||
| 58 | + Args: | ||
| 59 | + src: Data URL of the QR-code image (data:image/...;base64,...); any other URL raises ValueError. | ||
| 60 | + | ||
| 61 | + Returns: | ||
| 62 | + Absolute path of the saved file. | ||
| 63 | + """ | ||
| 64 | + prefix = "data:image/png;base64," | ||
| 65 | + if src.startswith(prefix): | ||
| 66 | + img_data = base64.b64decode(src[len(prefix) :]) | ||
| 67 | + elif src.startswith("data:image/"): | ||
| 68 | + # Handle other MIME types, e.g. data:image/jpeg;base64,... | ||
| 69 | + _, encoded = src.split(",", 1) | ||
| 70 | + img_data = base64.b64decode(encoded) | ||
| 71 | + else: | ||
| 72 | + # Not a data URL, so it cannot be saved | ||
| 73 | + raise ValueError(f"不支持的二维码格式,需要 data URL: {src[:50]}...") | ||
| 74 | + | ||
| 75 | + qr_dir = os.path.join(tempfile.gettempdir(), "xhs") | ||
| 76 | + os.makedirs(qr_dir, exist_ok=True) | ||
| 77 | + filepath = os.path.join(qr_dir, "login_qrcode.png") | ||
| 78 | + | ||
| 79 | + with open(filepath, "wb") as f: | ||
| 80 | + f.write(img_data) | ||
| 81 | + | ||
| 82 | + logger.info("二维码已保存: %s", filepath) | ||
| 83 | + return filepath | ||
| 84 | + | ||
| 85 | + | ||
| 86 | +def wait_for_login(page: Page, timeout: float = 120.0) -> bool: | ||
| 87 | +    """Wait for the QR-code login to complete. | ||
| 88 | + | ||
| 89 | +    Args: | ||
| 90 | +        page: CDP page object. | ||
| 91 | +        timeout: Timeout in seconds. | ||
| 92 | + | ||
| 93 | +    Returns: | ||
| 94 | +        True if login succeeded, False on timeout. | ||
| 95 | + """ | ||
| 96 | + deadline = time.monotonic() + timeout | ||
| 97 | + while time.monotonic() < deadline: | ||
| 98 | + if page.has_element(LOGIN_STATUS): | ||
| 99 | +            logger.info("Login succeeded") | ||
| 100 | + return True | ||
| 101 | + time.sleep(0.5) | ||
| 102 | + return False |
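
The data URL handling in `save_qrcode_to_file` can be illustrated in isolation. The sketch below is not part of this commit; `decode_image_data_url` is a hypothetical helper that splits an image data URL into its MIME type and raw bytes, and, unlike the committed code, it assumes strict base64 validation is wanted.

```python
import base64
import binascii


def decode_image_data_url(src: str) -> tuple[str, bytes]:
    """Split an image data URL into (mime_type, raw_bytes).

    Hypothetical helper for illustration; not part of the xhs package.
    """
    if not src.startswith("data:image/"):
        raise ValueError("expected an image data URL")
    # Everything before the first comma is the header, the rest is the payload.
    header, sep, encoded = src.partition(",")
    if not sep or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    mime_type = header[len("data:"):-len(";base64")]
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("invalid base64 payload") from exc
    return mime_type, raw


sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
mime, data = decode_image_data_url(sample)
```

Returning the MIME type would let a caller pick a matching file extension, whereas the committed function always writes `login_qrcode.png`, even for a JPEG payload.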
scripts/xhs/publish.py
0 → 100644
| 1 | +"""Image-and-text publishing, corresponding to Go xiaohongshu/publish.go (837 lines).""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import random | ||
| 8 | +import time | ||
| 9 | + | ||
| 10 | +from .cdp import Page | ||
| 11 | +from .errors import ContentTooLongError, PublishError, TitleTooLongError, UploadTimeoutError | ||
| 12 | +from .selectors import ( | ||
| 13 | + CONTENT_EDITOR, | ||
| 14 | + CONTENT_LENGTH_ERROR, | ||
| 15 | + CREATOR_TAB, | ||
| 16 | + DATETIME_INPUT, | ||
| 17 | + FILE_INPUT, | ||
| 18 | + IMAGE_PREVIEW, | ||
| 19 | + ORIGINAL_SWITCH, | ||
| 20 | + ORIGINAL_SWITCH_CARD, | ||
| 21 | + POPOVER, | ||
| 22 | + PUBLISH_BUTTON, | ||
| 23 | + SCHEDULE_SWITCH, | ||
| 24 | + TAG_FIRST_ITEM, | ||
| 25 | + TAG_TOPIC_CONTAINER, | ||
| 26 | + TITLE_INPUT, | ||
| 27 | + TITLE_MAX_SUFFIX, | ||
| 28 | + UPLOAD_CONTENT, | ||
| 29 | + UPLOAD_INPUT, | ||
| 30 | + VISIBILITY_DROPDOWN, | ||
| 31 | + VISIBILITY_OPTIONS, | ||
| 32 | +) | ||
| 33 | +from .types import PublishImageContent | ||
| 34 | +from .urls import PUBLISH_URL | ||
| 35 | + | ||
| 36 | +logger = logging.getLogger(__name__) | ||
| 37 | + | ||
| 38 | + | ||
| 39 | +def publish_image_content(page: Page, content: PublishImageContent) -> None: | ||
| 40 | +    """Publish an image-and-text post. | ||
| 41 | + | ||
| 42 | +    Args: | ||
| 43 | +        page: CDP page object. | ||
| 44 | +        content: Content to publish. | ||
| 45 | + | ||
| 46 | +    Raises: | ||
| 47 | +        PublishError: Publishing failed. | ||
| 48 | +        UploadTimeoutError: Upload timed out. | ||
| 49 | +        TitleTooLongError: Title is too long. | ||
| 50 | +        ContentTooLongError: Body text is too long. | ||
| 51 | +    """ | ||
| 52 | +    if not content.image_paths: | ||
| 53 | +        raise PublishError("At least one image is required") | ||
| 54 | + | ||
| 55 | +    # Navigate to the publish page | ||
| 56 | +    _navigate_to_publish_page(page) | ||
| 57 | + | ||
| 58 | +    # Click the "上传图文" (upload image post) tab | ||
| 59 | +    _click_publish_tab(page, "上传图文") | ||
| 60 | +    time.sleep(1) | ||
| 61 | + | ||
| 62 | +    # Upload images | ||
| 63 | +    _upload_images(page, content.image_paths) | ||
| 64 | + | ||
| 65 | +    # Keep at most the first 10 tags | ||
| 66 | +    tags = content.tags[:10] | ||
| 67 | +    if len(content.tags) > 10: | ||
| 68 | +        logger.warning("More than 10 tags supplied; keeping the first 10") | ||
| 69 | + | ||
| 70 | + logger.info( | ||
| 71 | +        "Publishing content: title=%s, images=%d, tags=%d, schedule=%s, original=%s, visibility=%s", | ||
| 72 | + content.title, | ||
| 73 | + len(content.image_paths), | ||
| 74 | + len(tags), | ||
| 75 | + content.schedule_time, | ||
| 76 | + content.is_original, | ||
| 77 | + content.visibility, | ||
| 78 | + ) | ||
| 79 | + | ||
| 80 | +    # Submit the post | ||
| 81 | + _submit_publish( | ||
| 82 | + page, | ||
| 83 | + content.title, | ||
| 84 | + content.content, | ||
| 85 | + tags, | ||
| 86 | + content.schedule_time, | ||
| 87 | + content.is_original, | ||
| 88 | + content.visibility, | ||
| 89 | + ) | ||
| 90 | + | ||
| 91 | + | ||
| 92 | +# ========== Page navigation ========== | ||
| 93 | + | ||
| 94 | + | ||
| 95 | +def _navigate_to_publish_page(page: Page) -> None: | ||
| 96 | +    """Navigate to the publish page.""" | ||
| 97 | + page.navigate(PUBLISH_URL) | ||
| 98 | + page.wait_for_load(timeout=300) | ||
| 99 | + time.sleep(2) | ||
| 100 | + page.wait_dom_stable() | ||
| 101 | + time.sleep(1) | ||
| 102 | + | ||
| 103 | + | ||
| 104 | +def _click_publish_tab(page: Page, tab_name: str) -> None: | ||
| 105 | +    """Click a publish-page tab (上传图文 / 上传视频).""" | ||
| 106 | + page.wait_for_element(UPLOAD_CONTENT, timeout=15) | ||
| 107 | + | ||
| 108 | + deadline = time.monotonic() + 15 | ||
| 109 | + while time.monotonic() < deadline: | ||
| 110 | +        # Look for the matching tab | ||
| 111 | + found = page.evaluate( | ||
| 112 | + f""" | ||
| 113 | + (() => {{ | ||
| 114 | + const tabs = document.querySelectorAll({json.dumps(CREATOR_TAB)}); | ||
| 115 | + for (const tab of tabs) {{ | ||
| 116 | + if (tab.textContent.trim() === {json.dumps(tab_name)}) {{ | ||
| 117 | +                    // Check whether the tab is covered by another element | ||
| 118 | + const rect = tab.getBoundingClientRect(); | ||
| 119 | + if (rect.width === 0 || rect.height === 0) continue; | ||
| 120 | + const x = rect.left + rect.width / 2; | ||
| 121 | + const y = rect.top + rect.height / 2; | ||
| 122 | + const target = document.elementFromPoint(x, y); | ||
| 123 | + if (target === tab || tab.contains(target)) {{ | ||
| 124 | + tab.click(); | ||
| 125 | + return 'clicked'; | ||
| 126 | + }} | ||
| 127 | + return 'blocked'; | ||
| 128 | + }} | ||
| 129 | + }} | ||
| 130 | + return 'not_found'; | ||
| 131 | + }})() | ||
| 132 | + """ | ||
| 133 | + ) | ||
| 134 | + | ||
| 135 | + if found == "clicked": | ||
| 136 | + return | ||
| 137 | + | ||
| 138 | + if found == "blocked": | ||
| 139 | +            # Try to dismiss the popover | ||
| 140 | + _remove_pop_cover(page) | ||
| 141 | + | ||
| 142 | + time.sleep(0.2) | ||
| 143 | + | ||
| 144 | +    raise PublishError(f"Publish tab not found: {tab_name}") | ||
| 145 | + | ||
| 146 | + | ||
| 147 | +def _remove_pop_cover(page: Page) -> None: | ||
| 148 | +    """Remove a popover that covers the page.""" | ||
| 149 | +    if page.has_element(POPOVER): | ||
| 150 | +        page.remove_element(POPOVER) | ||
| 151 | +    # Click an empty area | ||
| 152 | + x = 380 + random.randint(0, 100) | ||
| 153 | + y = 20 + random.randint(0, 60) | ||
| 154 | + page.mouse_click(float(x), float(y)) | ||
| 155 | + | ||
| 156 | + | ||
| 157 | +# ========== Image upload ========== | ||
| 158 | + | ||
| 159 | + | ||
| 160 | +def _upload_images(page: Page, image_paths: list[str]) -> None: | ||
| 161 | +    """Upload images one at a time.""" | ||
| 162 | + import os | ||
| 163 | + | ||
| 164 | + valid_paths = [p for p in image_paths if os.path.exists(p)] | ||
| 165 | + if not valid_paths: | ||
| 166 | +        raise PublishError("No valid image files found") | ||
| 167 | + | ||
| 168 | + for i, path in enumerate(valid_paths): | ||
| 169 | + selector = UPLOAD_INPUT if i == 0 else FILE_INPUT | ||
| 170 | +        logger.info("Uploading image %d: %s", i + 1, path) | ||
| 171 | + | ||
| 172 | + page.set_file_input(selector, [path]) | ||
| 173 | + _wait_for_upload_complete(page, i + 1) | ||
| 174 | + time.sleep(1) | ||
| 175 | + | ||
| 176 | + | ||
| 177 | +def _wait_for_upload_complete(page: Page, expected_count: int) -> None: | ||
| 178 | +    """Wait for the image upload to complete.""" | ||
| 179 | + max_wait = 60.0 | ||
| 180 | + start = time.monotonic() | ||
| 181 | + | ||
| 182 | + while time.monotonic() - start < max_wait: | ||
| 183 | + count = page.get_elements_count(IMAGE_PREVIEW) | ||
| 184 | + if count >= expected_count: | ||
| 185 | +            logger.info("Image upload complete: %d", count) | ||
| 186 | + return | ||
| 187 | + time.sleep(0.5) | ||
| 188 | + | ||
| 189 | +    raise UploadTimeoutError(f"Upload of image {expected_count} timed out (60s)") | ||
| 190 | + | ||
| 191 | + | ||
| 192 | +# ========== Form submission ========== | ||
| 193 | + | ||
| 194 | + | ||
| 195 | +def _submit_publish( | ||
| 196 | + page: Page, | ||
| 197 | + title: str, | ||
| 198 | + content: str, | ||
| 199 | + tags: list[str], | ||
| 200 | + schedule_time: str | None, | ||
| 201 | + is_original: bool, | ||
| 202 | + visibility: str, | ||
| 203 | +) -> None: | ||
| 204 | +    """Fill in the form and submit.""" | ||
| 205 | +    # Title | ||
| 206 | +    page.input_text(TITLE_INPUT, title) | ||
| 207 | +    time.sleep(0.5) | ||
| 208 | +    _check_title_max_length(page) | ||
| 209 | +    logger.info("Title length check passed") | ||
| 210 | +    time.sleep(1) | ||
| 211 | + | ||
| 212 | +    # Body | ||
| 213 | + content_selector = _find_content_element(page) | ||
| 214 | + page.input_content_editable(content_selector, content) | ||
| 215 | + | ||
| 216 | +    # Click back on the title (improves stability) | ||
| 217 | +    time.sleep(1) | ||
| 218 | +    page.click_element(TITLE_INPUT) | ||
| 219 | +    logger.info("Clicked back on the title input") | ||
| 220 | + | ||
| 221 | +    # Tags | ||
| 222 | + if tags: | ||
| 223 | + _input_tags(page, content_selector, tags) | ||
| 224 | + time.sleep(1) | ||
| 225 | + _check_content_max_length(page) | ||
| 226 | +    logger.info("Body length check passed") | ||
| 227 | + | ||
| 228 | +    # Scheduled publishing | ||
| 229 | + if schedule_time: | ||
| 230 | + _set_schedule_publish(page, schedule_time) | ||
| 231 | + | ||
| 232 | +    # Visibility | ||
| 233 | +    _set_visibility(page, visibility) | ||
| 234 | + | ||
| 235 | +    # Original-content declaration | ||
| 236 | +    if is_original: | ||
| 237 | +        try: | ||
| 238 | +            _set_original(page) | ||
| 239 | +            logger.info("Declared as original content") | ||
| 240 | +        except Exception as e: | ||
| 241 | +            logger.warning("Failed to set the original-content declaration: %s", e) | ||
| 242 | + | ||
| 243 | +    # Click publish | ||
| 244 | +    page.click_element(PUBLISH_BUTTON) | ||
| 245 | +    time.sleep(3) | ||
| 246 | +    logger.info("Publish completed") | ||
| 247 | + | ||
| 248 | + | ||
| 249 | +def _find_content_element(page: Page) -> str: | ||
| 250 | +    """Find the body input element (supports both UI variants).""" | ||
| 251 | +    if page.has_element(CONTENT_EDITOR): | ||
| 252 | +        return CONTENT_EDITOR | ||
| 253 | + | ||
| 254 | +    # Find the textbox ancestor of a p element that carries the placeholder | ||
| 255 | + found = page.evaluate( | ||
| 256 | + """ | ||
| 257 | + (() => { | ||
| 258 | + const ps = document.querySelectorAll('p'); | ||
| 259 | + for (const p of ps) { | ||
| 260 | + const placeholder = p.getAttribute('data-placeholder'); | ||
| 261 | + if (placeholder && placeholder.includes('输入正文描述')) { | ||
| 262 | + let current = p; | ||
| 263 | + for (let i = 0; i < 5; i++) { | ||
| 264 | + current = current.parentElement; | ||
| 265 | + if (!current) break; | ||
| 266 | + if (current.getAttribute('role') === 'textbox') { | ||
| 267 | + return 'found'; | ||
| 268 | + } | ||
| 269 | + } | ||
| 270 | + } | ||
| 271 | + } | ||
| 272 | + return ''; | ||
| 273 | + })() | ||
| 274 | + """ | ||
| 275 | + ) | ||
| 276 | + if found == "found": | ||
| 277 | + return "[role='textbox']" | ||
| 278 | + | ||
| 279 | +    raise PublishError("Body input element not found") | ||
| 280 | + | ||
| 281 | + | ||
| 282 | +def _check_title_max_length(page: Page) -> None: | ||
| 283 | +    """Check whether the title exceeds the length limit.""" | ||
| 284 | + text = page.get_element_text(TITLE_MAX_SUFFIX) | ||
| 285 | + if text: | ||
| 286 | + parts = text.split("/") | ||
| 287 | + if len(parts) == 2: | ||
| 288 | + raise TitleTooLongError(parts[0], parts[1]) | ||
| 289 | + raise TitleTooLongError(text, "?") | ||
| 290 | + | ||
| 291 | + | ||
| 292 | +def _check_content_max_length(page: Page) -> None: | ||
| 293 | +    """Check whether the body exceeds the length limit.""" | ||
| 294 | + text = page.get_element_text(CONTENT_LENGTH_ERROR) | ||
| 295 | + if text: | ||
| 296 | + parts = text.split("/") | ||
| 297 | + if len(parts) == 2: | ||
| 298 | + raise ContentTooLongError(parts[0], parts[1]) | ||
| 299 | + raise ContentTooLongError(text, "?") | ||
| 300 | + | ||
| 301 | + | ||
| 302 | +# ========== Tag input ========== | ||
| 303 | + | ||
| 304 | + | ||
| 305 | +def _input_tags(page: Page, content_selector: str, tags: list[str]) -> None: | ||
| 306 | +    """Enter tags.""" | ||
| 307 | +    time.sleep(1) | ||
| 308 | + | ||
| 309 | +    # Move the cursor to the end of the body (20 ArrowDown presses) | ||
| 310 | +    for _ in range(20): | ||
| 311 | +        page.press_key("ArrowDown") | ||
| 312 | +        time.sleep(0.01) | ||
| 313 | + | ||
| 314 | +    # Press Enter twice to start a new line | ||
| 315 | + page.press_key("Enter") | ||
| 316 | + page.press_key("Enter") | ||
| 317 | + time.sleep(1) | ||
| 318 | + | ||
| 319 | + for tag in tags: | ||
| 320 | + tag = tag.lstrip("#") | ||
| 321 | + _input_single_tag(page, content_selector, tag) | ||
| 322 | + | ||
| 323 | + | ||
| 324 | +def _input_single_tag(page: Page, content_selector: str, tag: str) -> None: | ||
| 325 | +    """Enter a single tag.""" | ||
| 326 | +    # Type the leading # | ||
| 327 | +    page.type_text("#", delay_ms=0) | ||
| 328 | +    time.sleep(0.2) | ||
| 329 | + | ||
| 330 | +    # Type the tag one character at a time | ||
| 331 | + for char in tag: | ||
| 332 | + page.type_text(char, delay_ms=50) | ||
| 333 | + | ||
| 334 | + time.sleep(1) | ||
| 335 | + | ||
| 336 | +    # Try to click the tag suggestion | ||
| 337 | +    if page.has_element(TAG_TOPIC_CONTAINER): | ||
| 338 | +        item_selector = f"{TAG_TOPIC_CONTAINER} {TAG_FIRST_ITEM}" | ||
| 339 | +        if page.has_element(item_selector): | ||
| 340 | +            page.click_element(item_selector) | ||
| 341 | +            logger.info("Clicked tag suggestion: %s", tag) | ||
| 342 | +            time.sleep(0.5) | ||
| 343 | +            return | ||
| 344 | + | ||
| 345 | +    # No suggestion available, so type a space instead | ||
| 346 | +    logger.warning("No tag suggestion found; typing a space instead: %s", tag) | ||
| 347 | + page.type_text(" ", delay_ms=0) | ||
| 348 | + time.sleep(0.5) | ||
| 349 | + | ||
| 350 | + | ||
| 351 | +# ========== Scheduled publishing ========== | ||
| 352 | + | ||
| 353 | + | ||
| 354 | +def _set_schedule_publish(page: Page, schedule_time: str) -> None: | ||
| 355 | +    """Set the scheduled publish time.""" | ||
| 356 | +    from datetime import datetime | ||
| 357 | + | ||
| 358 | +    # Parse the ISO 8601 timestamp | ||
| 359 | +    try: | ||
| 360 | +        dt = datetime.fromisoformat(schedule_time) | ||
| 361 | +    except ValueError as e: | ||
| 362 | +        raise PublishError(f"Invalid scheduled publish time format: {e}") from e | ||
| 363 | + | ||
| 364 | +    # Click the scheduled-publish switch | ||
| 365 | + page.click_element(SCHEDULE_SWITCH) | ||
| 366 | + time.sleep(0.8) | ||
| 367 | + | ||
| 368 | +    # Set the date and time | ||
| 369 | + datetime_str = dt.strftime("%Y-%m-%d %H:%M") | ||
| 370 | + page.select_all_text(DATETIME_INPUT) | ||
| 371 | + page.input_text(DATETIME_INPUT, datetime_str) | ||
| 372 | + time.sleep(0.5) | ||
| 373 | + | ||
| 374 | +    logger.info("Scheduled publish time set: %s", datetime_str) | ||
| 375 | + | ||
| 376 | + | ||
| 377 | +# ========== Visibility ========== | ||
| 378 | + | ||
| 379 | + | ||
| 380 | +def _set_visibility(page: Page, visibility: str) -> None: | ||
| 381 | +    """Set the post visibility.""" | ||
| 382 | +    if not visibility or visibility == "公开可见": | ||
| 383 | +        logger.info("Visibility: 公开可见 (public, default)") | ||
| 384 | +        return | ||
| 385 | + | ||
| 386 | +    supported = {"仅自己可见", "仅互关好友可见"} | ||
| 387 | +    if visibility not in supported: | ||
| 388 | +        raise PublishError( | ||
| 389 | +            f"Unsupported visibility: {visibility}; supported: 公开可见, 仅自己可见, 仅互关好友可见" | ||
| 390 | +        ) | ||
| 391 | + | ||
| 392 | +    # Click the dropdown | ||
| 393 | + page.click_element(VISIBILITY_DROPDOWN) | ||
| 394 | + time.sleep(0.5) | ||
| 395 | + | ||
| 396 | +    # Find and click the target option | ||
| 397 | + clicked = page.evaluate( | ||
| 398 | + f""" | ||
| 399 | + (() => {{ | ||
| 400 | + const opts = document.querySelectorAll({json.dumps(VISIBILITY_OPTIONS)}); | ||
| 401 | + for (const opt of opts) {{ | ||
| 402 | + if (opt.textContent.includes({json.dumps(visibility)})) {{ | ||
| 403 | + opt.click(); | ||
| 404 | + return true; | ||
| 405 | + }} | ||
| 406 | + }} | ||
| 407 | + return false; | ||
| 408 | + }})() | ||
| 409 | + """ | ||
| 410 | + ) | ||
| 411 | + | ||
| 412 | + if not clicked: | ||
| 413 | +        raise PublishError(f"Visibility option not found: {visibility}") | ||
| 414 | + | ||
| 415 | +    logger.info("Visibility set: %s", visibility) | ||
| 416 | + time.sleep(0.2) | ||
| 417 | + | ||
| 418 | + | ||
| 419 | +# ========== Original-content declaration ========== | ||
| 420 | + | ||
| 421 | + | ||
| 422 | +def _set_original(page: Page) -> None: | ||
| 423 | +    """Enable the original-content declaration.""" | ||
| 424 | +    # Find the declaration card and click its switch | ||
| 425 | + result = page.evaluate( | ||
| 426 | + f""" | ||
| 427 | + (() => {{ | ||
| 428 | + const cards = document.querySelectorAll({json.dumps(ORIGINAL_SWITCH_CARD)}); | ||
| 429 | + for (const card of cards) {{ | ||
| 430 | + if (!card.textContent.includes('原创声明')) continue; | ||
| 431 | + const sw = card.querySelector({json.dumps(ORIGINAL_SWITCH)}); | ||
| 432 | + if (!sw) continue; | ||
| 433 | + const input = sw.querySelector('input[type="checkbox"]'); | ||
| 434 | + if (input && input.checked) return 'already_on'; | ||
| 435 | + sw.click(); | ||
| 436 | + return 'clicked'; | ||
| 437 | + }} | ||
| 438 | + return 'not_found'; | ||
| 439 | + }})() | ||
| 440 | + """ | ||
| 441 | + ) | ||
| 442 | + | ||
| 443 | + if result == "already_on": | ||
| 444 | +        logger.info("Original-content declaration is already enabled") | ||
| 445 | +        return | ||
| 446 | + | ||
| 447 | +    if result == "not_found": | ||
| 448 | +        raise PublishError("Original-content declaration option not found") | ||
| 449 | + | ||
| 450 | +    time.sleep(0.5) | ||
| 451 | + | ||
| 452 | +    # Handle the confirmation dialog | ||
| 453 | + _confirm_original_declaration(page) | ||
| 454 | + | ||
| 455 | + | ||
| 456 | +def _confirm_original_declaration(page: Page) -> None: | ||
| 457 | +    """Handle the confirmation dialog for the original-content declaration.""" | ||
| 458 | +    time.sleep(0.8) | ||
| 459 | + | ||
| 460 | +    # Tick the checkbox | ||
| 461 | + page.evaluate( | ||
| 462 | + """ | ||
| 463 | + (() => { | ||
| 464 | + const footers = document.querySelectorAll('div.footer'); | ||
| 465 | + for (const footer of footers) { | ||
| 466 | + if (!footer.textContent.includes('原创声明须知')) continue; | ||
| 467 | + const cb = footer.querySelector('div.d-checkbox input[type="checkbox"]'); | ||
| 468 | + if (cb && !cb.checked) cb.click(); | ||
| 469 | + return; | ||
| 470 | + } | ||
| 471 | + })() | ||
| 472 | + """ | ||
| 473 | + ) | ||
| 474 | + time.sleep(0.5) | ||
| 475 | + | ||
| 476 | +    # Click the "声明原创" (declare original) button | ||
| 477 | + result = page.evaluate( | ||
| 478 | + """ | ||
| 479 | + (() => { | ||
| 480 | + const footers = document.querySelectorAll('div.footer'); | ||
| 481 | + for (const footer of footers) { | ||
| 482 | + if (!footer.textContent.includes('声明原创')) continue; | ||
| 483 | + const btn = footer.querySelector('button.custom-button'); | ||
| 484 | + if (btn) { | ||
| 485 | + if (btn.classList.contains('disabled') || btn.disabled) { | ||
| 486 | + const cb = footer.querySelector('div.d-checkbox input[type="checkbox"]'); | ||
| 487 | + if (cb && !cb.checked) cb.click(); | ||
| 488 | + return 'button_disabled'; | ||
| 489 | + } | ||
| 490 | + btn.click(); | ||
| 491 | + return 'clicked'; | ||
| 492 | + } | ||
| 493 | + } | ||
| 494 | + return 'button_not_found'; | ||
| 495 | + })() | ||
| 496 | + """ | ||
| 497 | + ) | ||
| 498 | + | ||
| 499 | + if result == "button_not_found": | ||
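| 500 | +        raise PublishError("The 声明原创 (declare original) button was not found") | ||
| 501 | +    if result == "button_disabled": | ||
| 502 | +        raise PublishError("The 声明原创 (declare original) button is still disabled") | ||
| 503 | + | ||
| 504 | +    logger.info("Clicked the 声明原创 (declare original) button") | ||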
| 505 | + time.sleep(0.3) |
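
Two input-normalization steps in `publish.py` (the 10-tag cap with `#` stripping, and the ISO 8601 to `%Y-%m-%d %H:%M` conversion) can be exercised without a browser. The sketch below is illustrative only: `normalize_tags` and `format_schedule_time` are hypothetical names, and dropping empty tags is an assumption that the committed code does not make.

```python
from datetime import datetime


def normalize_tags(tags: list[str], limit: int = 10) -> list[str]:
    """Strip leading '#', drop empty entries (an assumption), and cap the count."""
    cleaned: list[str] = []
    for tag in tags:
        name = tag.strip().lstrip("#").strip()
        if name:
            cleaned.append(name)
    return cleaned[:limit]


def format_schedule_time(schedule_time: str) -> str:
    """Convert an ISO 8601 timestamp into the 'YYYY-MM-DD HH:MM' form typed into the date picker."""
    # fromisoformat raises ValueError on malformed input; callers may wrap it.
    dt = datetime.fromisoformat(schedule_time)
    return dt.strftime("%Y-%m-%d %H:%M")


tags = normalize_tags(["#美食", "旅行", "#", "  "])
stamp = format_schedule_time("2025-01-02T08:30:00")
```

Note that `strftime` here discards any UTC offset in the input, so the value typed into the page is interpreted in whatever time zone the page assumes.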
scripts/xhs/publish_video.py
0 → 100644
| 1 | +"""Video publishing, corresponding to Go xiaohongshu/publish_video.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import logging | ||
| 6 | +import os | ||
| 7 | +import time | ||
| 8 | + | ||
| 9 | +from .cdp import Page | ||
| 10 | +from .errors import PublishError, UploadTimeoutError | ||
| 11 | +from .publish import ( | ||
| 12 | + _click_publish_tab, | ||
| 13 | + _find_content_element, | ||
| 14 | + _input_tags, | ||
| 15 | + _navigate_to_publish_page, | ||
| 16 | + _set_schedule_publish, | ||
| 17 | + _set_visibility, | ||
| 18 | +) | ||
| 19 | +from .selectors import ( | ||
| 20 | + FILE_INPUT, | ||
| 21 | + PUBLISH_BUTTON, | ||
| 22 | + TITLE_INPUT, | ||
| 23 | + UPLOAD_INPUT, | ||
| 24 | +) | ||
| 25 | +from .types import PublishVideoContent | ||
| 26 | + | ||
| 27 | +logger = logging.getLogger(__name__) | ||
| 28 | + | ||
| 29 | + | ||
| 30 | +def publish_video_content(page: Page, content: PublishVideoContent) -> None: | ||
| 31 | +    """Publish a video post. | ||
| 32 | + | ||
| 33 | +    Args: | ||
| 34 | +        page: CDP page object. | ||
| 35 | +        content: Video content to publish. | ||
| 36 | + | ||
| 37 | +    Raises: | ||
| 38 | +        PublishError: Publishing failed. | ||
| 39 | +        UploadTimeoutError: Upload or processing timed out. | ||
| 40 | +    """ | ||
| 41 | +    if not content.video_path: | ||
| 42 | +        raise PublishError("A video is required") | ||
| 43 | + | ||
| 44 | +    # Navigate to the publish page | ||
| 45 | +    _navigate_to_publish_page(page) | ||
| 46 | + | ||
| 47 | +    # Click the "上传视频" (upload video) tab | ||
| 48 | +    _click_publish_tab(page, "上传视频") | ||
| 49 | +    time.sleep(1) | ||
| 50 | + | ||
| 51 | +    # Upload the video | ||
| 52 | +    _upload_video(page, content.video_path) | ||
| 53 | + | ||
| 54 | +    # Submit | ||
| 55 | + _submit_publish_video( | ||
| 56 | + page, | ||
| 57 | + content.title, | ||
| 58 | + content.content, | ||
| 59 | + content.tags, | ||
| 60 | + content.schedule_time, | ||
| 61 | + content.visibility, | ||
| 62 | + ) | ||
| 63 | + | ||
| 64 | + | ||
| 65 | +def _upload_video(page: Page, video_path: str) -> None: | ||
| 66 | +    """Upload the video file.""" | ||
| 67 | +    if not os.path.exists(video_path): | ||
| 68 | +        raise PublishError(f"Video file does not exist: {video_path}") | ||
| 69 | + | ||
| 70 | +    # Pick the upload input | ||
| 71 | +    selector = UPLOAD_INPUT if page.has_element(UPLOAD_INPUT) else FILE_INPUT | ||
| 72 | +    page.set_file_input(selector, [video_path]) | ||
| 73 | + | ||
| 74 | +    # Wait until the publish button is clickable (video processing finished) | ||
| 75 | +    _wait_for_publish_button_clickable(page) | ||
| 76 | +    logger.info("Video upload and processing complete") | ||
| 77 | + | ||
| 78 | + | ||
| 79 | +def _wait_for_publish_button_clickable(page: Page) -> None: | ||
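| 80 | +    """Wait until the publish button is clickable (video processing can take a long time).""" | ||
| 81 | +    max_wait = 600.0  # 10 minutes | ||
| 82 | +    start = time.monotonic() | ||
| 83 | + | ||
| 84 | +    logger.info("Waiting for the publish button to become clickable (video)") | ||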
| 85 | + | ||
| 86 | + while time.monotonic() - start < max_wait: | ||
| 87 | + clickable = page.evaluate( | ||
| 88 | + f""" | ||
| 89 | + (() => {{ | ||
| 90 | + const btn = document.querySelector({_js_str(PUBLISH_BUTTON)}); | ||
| 91 | + if (!btn) return false; | ||
| 92 | + const rect = btn.getBoundingClientRect(); | ||
| 93 | + if (rect.width === 0 || rect.height === 0) return false; | ||
| 94 | + if (btn.disabled) return false; | ||
| 95 | + if (btn.classList.contains('disabled')) return false; | ||
| 96 | + return true; | ||
| 97 | + }})() | ||
| 98 | + """ | ||
| 99 | + ) | ||
| 100 | + if clickable: | ||
| 101 | + return | ||
| 102 | + time.sleep(1) | ||
| 103 | + | ||
| 104 | +    raise UploadTimeoutError("Timed out waiting for the publish button to become clickable (10 minutes)") | ||
| 105 | + | ||
| 106 | + | ||
| 107 | +def _submit_publish_video( | ||
| 108 | + page: Page, | ||
| 109 | + title: str, | ||
| 110 | + content: str, | ||
| 111 | + tags: list[str], | ||
| 112 | + schedule_time: str | None, | ||
| 113 | + visibility: str, | ||
| 114 | +) -> None: | ||
| 115 | +    """Fill in the video form and submit.""" | ||
| 116 | +    # Title | ||
| 117 | +    page.input_text(TITLE_INPUT, title) | ||
| 118 | +    time.sleep(1) | ||
| 119 | + | ||
| 120 | +    # Body and tags | ||
| 121 | +    content_selector = _find_content_element(page) | ||
| 122 | +    page.input_content_editable(content_selector, content) | ||
| 123 | + | ||
| 124 | +    # Click back on the title | ||
| 125 | + time.sleep(1) | ||
| 126 | + page.click_element(TITLE_INPUT) | ||
| 127 | + | ||
| 128 | + if tags: | ||
| 129 | + _input_tags(page, content_selector, tags) | ||
| 130 | + time.sleep(1) | ||
| 131 | + | ||
| 132 | +    # Scheduled publishing | ||
| 133 | +    if schedule_time: | ||
| 134 | +        _set_schedule_publish(page, schedule_time) | ||
| 135 | + | ||
| 136 | +    # Visibility | ||
| 137 | +    _set_visibility(page, visibility) | ||
| 138 | + | ||
| 139 | +    # Wait until the publish button is clickable | ||
| 140 | +    _wait_for_publish_button_clickable(page) | ||
| 141 | + | ||
| 142 | +    # Click publish | ||
| 143 | +    page.click_element(PUBLISH_BUTTON) | ||
| 144 | +    time.sleep(3) | ||
| 145 | +    logger.info("Video publish completed") | ||
| 146 | + | ||
| 147 | + | ||
| 148 | +def _js_str(s: str) -> str: | ||
| 149 | +    """Convert a Python string to a JS string literal.""" | ||
| 150 | + import json | ||
| 151 | + | ||
| 152 | + return json.dumps(s) |
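
`wait_for_login`, `_wait_for_upload_complete`, and `_wait_for_publish_button_clickable` all repeat the same monotonic-deadline polling loop. A minimal sketch of how that loop might be factored out is shown below; `wait_until` is a hypothetical helper, not something this commit defines.

```python
import time
from collections.abc import Callable


def wait_until(predicate: Callable[[], bool], timeout: float, interval: float = 0.5) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse.

    Hypothetical helper for illustration. Returns True on success and
    False on timeout, leaving the choice of exception to the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with a stand-in predicate that becomes true on the third poll.
calls = {"n": 0}


def ready() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


succeeded = wait_until(ready, timeout=1.0, interval=0.01)
```

The predicate is checked once before the deadline is consulted, so even a zero timeout yields one attempt. `time.monotonic()` is used, as in the committed loops, because it is unaffected by system clock adjustments.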
scripts/xhs/search.py
0 → 100644
| 1 | +"""Feed search, corresponding to Go xiaohongshu/search.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import time | ||
| 8 | + | ||
| 9 | +from .cdp import Page | ||
| 10 | +from .errors import NoFeedsError | ||
| 11 | +from .selectors import FILTER_BUTTON, FILTER_PANEL | ||
| 12 | +from .types import Feed, FilterOption | ||
| 13 | +from .urls import make_search_url | ||
| 14 | + | ||
| 15 | +logger = logging.getLogger(__name__) | ||
| 16 | + | ||
| 17 | +# Filter option table: {filter group index: [(tag index, label), ...]} | ||
| 18 | +_FILTER_OPTIONS: dict[int, list[tuple[int, str]]] = { | ||
| 19 | + 1: [(1, "综合"), (2, "最新"), (3, "最多点赞"), (4, "最多评论"), (5, "最多收藏")], | ||
| 20 | + 2: [(1, "不限"), (2, "视频"), (3, "图文")], | ||
| 21 | + 3: [(1, "不限"), (2, "一天内"), (3, "一周内"), (4, "半年内")], | ||
| 22 | + 4: [(1, "不限"), (2, "已看过"), (3, "未看过"), (4, "已关注")], | ||
| 23 | + 5: [(1, "不限"), (2, "同城"), (3, "附近")], | ||
| 24 | +} | ||
| 25 | + | ||
| 26 | +# JS that extracts the search results from __INITIAL_STATE__ | ||
| 27 | +_EXTRACT_SEARCH_JS = """ | ||
| 28 | +(() => { | ||
| 29 | + if (window.__INITIAL_STATE__ && | ||
| 30 | + window.__INITIAL_STATE__.search && | ||
| 31 | + window.__INITIAL_STATE__.search.feeds) { | ||
| 32 | + const feeds = window.__INITIAL_STATE__.search.feeds; | ||
| 33 | + const feedsData = feeds.value !== undefined ? feeds.value : feeds._value; | ||
| 34 | + if (feedsData) { | ||
| 35 | + return JSON.stringify(feedsData); | ||
| 36 | + } | ||
| 37 | + } | ||
| 38 | + return ""; | ||
| 39 | +})() | ||
| 40 | +""" | ||
| 41 | + | ||
| 42 | + | ||
| 43 | +def _find_internal_option(group_index: int, text: str) -> tuple[int, int]: | ||
| 44 | +    """Look up the internal indices of a filter option. | ||
| 45 | + | ||
| 46 | +    Returns: | ||
| 47 | +        (filters_index, tags_index) | ||
| 48 | + | ||
| 49 | +    Raises: | ||
| 50 | +        ValueError: No matching option was found. | ||
| 51 | +    """ | ||
| 52 | +    options = _FILTER_OPTIONS.get(group_index) | ||
| 53 | +    if not options: | ||
| 54 | +        raise ValueError(f"Filter group {group_index} does not exist") | ||
| 55 | + | ||
| 56 | + for tags_index, option_text in options: | ||
| 57 | + if option_text == text: | ||
| 58 | + return group_index, tags_index | ||
| 59 | + | ||
| 60 | + valid = [t for _, t in options] | ||
| 61 | +    raise ValueError(f"'{text}' not found in filter group {group_index}; valid values: {valid}") | ||
| 62 | + | ||
| 63 | + | ||
| 64 | +def _convert_filters(filter_opt: FilterOption) -> list[tuple[int, int]]: | ||
| 65 | +    """Convert a FilterOption into a list of internal (filters_index, tags_index) pairs.""" | ||
| 66 | + result: list[tuple[int, int]] = [] | ||
| 67 | + | ||
| 68 | + if filter_opt.sort_by: | ||
| 69 | + result.append(_find_internal_option(1, filter_opt.sort_by)) | ||
| 70 | + if filter_opt.note_type: | ||
| 71 | + result.append(_find_internal_option(2, filter_opt.note_type)) | ||
| 72 | + if filter_opt.publish_time: | ||
| 73 | + result.append(_find_internal_option(3, filter_opt.publish_time)) | ||
| 74 | + if filter_opt.search_scope: | ||
| 75 | + result.append(_find_internal_option(4, filter_opt.search_scope)) | ||
| 76 | + if filter_opt.location: | ||
| 77 | + result.append(_find_internal_option(5, filter_opt.location)) | ||
| 78 | + | ||
| 79 | + return result | ||
| 80 | + | ||
| 81 | + | ||
| 82 | +def search_feeds( | ||
| 83 | + page: Page, | ||
| 84 | + keyword: str, | ||
| 85 | + filter_option: FilterOption | None = None, | ||
| 86 | +) -> list[Feed]: | ||
| 87 | +    """Search feeds. | ||
| 88 | + | ||
| 89 | +    Args: | ||
| 90 | +        page: CDP page object. | ||
| 91 | +        keyword: Search keyword. | ||
| 92 | +        filter_option: Optional filter criteria. | ||
| 93 | + | ||
| 94 | +    Raises: | ||
| 95 | +        NoFeedsError: No search results were captured. | ||
| 96 | +        ValueError: A filter option is invalid. | ||
| 97 | + """ | ||
| 98 | + search_url = make_search_url(keyword) | ||
| 99 | + page.navigate(search_url) | ||
| 100 | + page.wait_for_load() | ||
| 101 | + page.wait_dom_stable() | ||
| 102 | + | ||
| 103 | +    # Wait for __INITIAL_STATE__ to be initialized | ||
| 104 | +    _wait_for_initial_state(page) | ||
| 105 | + | ||
| 106 | +    # Apply the filters | ||
| 107 | +    if filter_option: | ||
| 108 | +        internal_filters = _convert_filters(filter_option) | ||
| 109 | +        if internal_filters: | ||
| 110 | +            _apply_filters(page, internal_filters) | ||
| 111 | + | ||
| 112 | +    # Extract the search results | ||
| 113 | + result = page.evaluate(_EXTRACT_SEARCH_JS) | ||
| 114 | + if not result: | ||
| 115 | + raise NoFeedsError() | ||
| 116 | + | ||
| 117 | + feeds_data = json.loads(result) | ||
| 118 | + return [Feed.from_dict(f) for f in feeds_data] | ||
| 119 | + | ||
| 120 | + | ||
| 121 | +def _wait_for_initial_state(page: Page, timeout: float = 10.0) -> None: | ||
| 122 | +    """Wait for __INITIAL_STATE__ to become available.""" | ||
| 123 | + deadline = time.monotonic() + timeout | ||
| 124 | + while time.monotonic() < deadline: | ||
| 125 | + ready = page.evaluate("window.__INITIAL_STATE__ !== undefined") | ||
| 126 | + if ready: | ||
| 127 | + return | ||
| 128 | + time.sleep(0.5) | ||
| 129 | +    logger.warning("Timed out waiting for __INITIAL_STATE__") | ||
| 130 | + | ||
| 131 | + | ||
| 132 | +def _apply_filters(page: Page, filters: list[tuple[int, int]]) -> None: | ||
| 133 | +    """Apply the filters.""" | ||
| 134 | +    # Hover over the filter button | ||
| 135 | +    page.hover_element(FILTER_BUTTON) | ||
| 136 | + | ||
| 137 | +    # Wait for the filter panel to appear | ||
| 138 | + deadline = time.monotonic() + 5.0 | ||
| 139 | + while time.monotonic() < deadline: | ||
| 140 | + if page.has_element(FILTER_PANEL): | ||
| 141 | + break | ||
| 142 | + time.sleep(0.3) | ||
| 143 | + | ||
| 144 | +    # Click each filter option | ||
| 145 | + for filters_index, tags_index in filters: | ||
| 146 | + selector = ( | ||
| 147 | + f"div.filter-panel div.filters:nth-child({filters_index}) " | ||
| 148 | + f"div.tags:nth-child({tags_index})" | ||
| 149 | + ) | ||
| 150 | + page.click_element(selector) | ||
| 151 | + time.sleep(0.3) | ||
| 152 | + | ||
| 153 | +    # Wait for the page to update | ||
| 154 | + page.wait_dom_stable() | ||
| 155 | + _wait_for_initial_state(page) |
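
The mapping from a filter label to the `nth-child` selector clicked in `_apply_filters` can be traced with a self-contained sketch. `FILTER_GROUPS` below copies only two groups from `_FILTER_OPTIONS`, and `filter_selector` is a hypothetical helper that folds the lookup and the selector construction into one step.

```python
# Subset of the filter table, keyed by 1-based group index (illustrative copy).
FILTER_GROUPS: dict[int, list[str]] = {
    1: ["综合", "最新", "最多点赞", "最多评论", "最多收藏"],
    2: ["不限", "视频", "图文"],
}


def filter_selector(group_index: int, text: str) -> str:
    """Build the CSS selector for one filter option.

    Hypothetical helper; raises ValueError for an unknown group or label.
    """
    options = FILTER_GROUPS.get(group_index)
    if not options:
        raise ValueError(f"filter group {group_index} does not exist")
    if text not in options:
        raise ValueError(f"{text!r} not found in filter group {group_index}")
    # nth-child is 1-based, so shift the list position by one.
    tags_index = options.index(text) + 1
    return (
        f"div.filter-panel div.filters:nth-child({group_index}) "
        f"div.tags:nth-child({tags_index})"
    )


selector = filter_selector(1, "最新")
```

Because the selector is positional, it depends on the order of options in the live page matching the table; if the site reorders its filters, the table needs to be updated in step.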
scripts/xhs/selectors.py
0 → 100644
| 1 | +"""CSS selector constants for Xiaohongshu pages.""" | ||
| 2 | + | ||
| 3 | +# ========== Login ========== | ||
| 4 | +LOGIN_STATUS = ".main-container .user .link-wrapper .channel" | ||
| 5 | +QRCODE_IMG = ".login-container .qrcode-img" | ||
| 6 | + | ||
| 7 | +# ========== Home / search ========== | ||
| 8 | +FILTER_BUTTON = "div.filter" | ||
| 9 | +FILTER_PANEL = "div.filter-panel" | ||
| 10 | + | ||
| 11 | +# ========== Feed detail ========== | ||
| 12 | +COMMENTS_CONTAINER = ".comments-container" | ||
| 13 | +PARENT_COMMENT = ".parent-comment" | ||
| 14 | +NO_COMMENTS_TEXT = ".no-comments-text" | ||
| 15 | +END_CONTAINER = ".end-container" | ||
| 16 | +TOTAL_COMMENT = ".comments-container .total" | ||
| 17 | +SHOW_MORE_BUTTON = ".show-more" | ||
| 18 | +NOTE_SCROLLER = ".note-scroller" | ||
| 19 | +INTERACTION_CONTAINER = ".interaction-container" | ||
| 20 | + | ||
| 21 | +# Containers shown when a page is inaccessible | ||
| 22 | +ACCESS_ERROR_WRAPPER = ".access-wrapper, .error-wrapper, .not-found-wrapper, .blocked-wrapper" | ||
| 23 | + | ||
| 24 | +# ========== Comment input ========== | ||
| 25 | +COMMENT_INPUT_TRIGGER = "div.input-box div.content-edit span" | ||
| 26 | +COMMENT_INPUT_FIELD = "div.input-box div.content-edit p.content-input" | ||
| 27 | +COMMENT_SUBMIT_BUTTON = "div.bottom button.submit" | ||
| 28 | +REPLY_BUTTON = ".right .interactions .reply" | ||
| 29 | + | ||
| 30 | +# ========== Like / favorite ========== | ||
| 31 | +LIKE_BUTTON = ".interact-container .left .like-lottie" | ||
| 32 | +COLLECT_BUTTON = ".interact-container .left .reds-icon.collect-icon" | ||
| 33 | + | ||
| 34 | +# ========== Publish page ========== | ||
| 35 | +UPLOAD_CONTENT = "div.upload-content" | ||
| 36 | +CREATOR_TAB = "div.creator-tab" | ||
| 37 | +UPLOAD_INPUT = ".upload-input" | ||
| 38 | +FILE_INPUT = 'input[type="file"]' | ||
| 39 | +TITLE_INPUT = "div.d-input input" | ||
| 40 | +CONTENT_EDITOR = "div.ql-editor" | ||
| 41 | +IMAGE_PREVIEW = ".img-preview-area .pr" | ||
| 42 | +PUBLISH_BUTTON = ".publish-page-publish-btn button.bg-red" | ||
| 43 | + | ||
| 44 | +# Title / body length validation | ||
| 45 | +TITLE_MAX_SUFFIX = "div.title-container div.max_suffix" | ||
| 46 | +CONTENT_LENGTH_ERROR = "div.edit-container div.length-error" | ||
| 47 | + | ||
| 48 | +# Visibility | ||
| 49 | +VISIBILITY_DROPDOWN = "div.permission-card-wrapper div.d-select-content" | ||
| 50 | +VISIBILITY_OPTIONS = "div.d-options-wrapper div.d-grid-item div.custom-option" | ||
| 51 | + | ||
| 52 | +# Scheduled publishing | ||
| 53 | +SCHEDULE_SWITCH = ".post-time-wrapper .d-switch" | ||
| 54 | +DATETIME_INPUT = ".date-picker-container input" | ||
| 55 | + | ||
| 56 | +# Original-content declaration | ||
| 57 | +ORIGINAL_SWITCH_CARD = "div.custom-switch-card" | ||
| 58 | +ORIGINAL_SWITCH = "div.d-switch" | ||
| 59 | + | ||
| 60 | +# Tag suggestions | ||
| 61 | +TAG_TOPIC_CONTAINER = "#creator-editor-topic-container" | ||
| 62 | +TAG_FIRST_ITEM = ".item" | ||
| 63 | + | ||
| 64 | +# Popover | ||
| 65 | +POPOVER = "div.d-popover" | ||
| 66 | + | ||
| 67 | +# ========== User profile ========== | ||
| 68 | +SIDEBAR_PROFILE = "div.main-container li.user.side-bar-component a.link-wrapper span.channel" |
scripts/xhs/stealth.py
0 → 100644
| 1 | +"""Anti-detection JS injection and Chrome launch flags, corresponding to go-rod/stealth.""" | ||
| 2 | + | ||
| 3 | +# Anti-detection JS script, injected on page load | ||
| 4 | +STEALTH_JS = """ | ||
| 5 | +(() => { | ||
| 6 | + // 1. navigator.webdriver | ||
| 7 | + Object.defineProperty(navigator, 'webdriver', { | ||
| 8 | + get: () => undefined, | ||
| 9 | + configurable: true, | ||
| 10 | + }); | ||
| 11 | + | ||
| 12 | + // 2. chrome.runtime | ||
| 13 | + if (!window.chrome) { | ||
| 14 | + window.chrome = {}; | ||
| 15 | + } | ||
| 16 | + if (!window.chrome.runtime) { | ||
| 17 | + window.chrome.runtime = { | ||
| 18 | + connect: () => {}, | ||
| 19 | + sendMessage: () => {}, | ||
| 20 | + }; | ||
| 21 | + } | ||
| 22 | + | ||
| 23 | + // 3. plugins | ||
| 24 | + Object.defineProperty(navigator, 'plugins', { | ||
| 25 | + get: () => { | ||
| 26 | + return [ | ||
| 27 | + { | ||
| 28 | + 0: {type: 'application/x-google-chrome-pdf'}, | ||
| 29 | + description: 'Portable Document Format', | ||
| 30 | + filename: 'internal-pdf-viewer', | ||
| 31 | + length: 1, | ||
| 32 | + name: 'Chrome PDF Plugin', | ||
| 33 | + }, | ||
| 34 | + { | ||
| 35 | + 0: {type: 'application/pdf'}, | ||
| 36 | + description: '', | ||
| 37 | + filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai', | ||
| 38 | + length: 1, | ||
| 39 | + name: 'Chrome PDF Viewer', | ||
| 40 | + }, | ||
| 41 | + { | ||
| 42 | + 0: {type: 'application/x-nacl'}, | ||
| 43 | + description: '', | ||
| 44 | + filename: 'internal-nacl-plugin', | ||
| 45 | + length: 1, | ||
| 46 | + name: 'Native Client', | ||
| 47 | + }, | ||
| 48 | + ]; | ||
| 49 | + }, | ||
| 50 | + configurable: true, | ||
| 51 | + }); | ||
| 52 | + | ||
| 53 | + // 4. languages | ||
| 54 | + Object.defineProperty(navigator, 'languages', { | ||
| 55 | + get: () => ['zh-CN', 'zh', 'en-US', 'en'], | ||
| 56 | + configurable: true, | ||
| 57 | + }); | ||
| 58 | + | ||
| 59 | + // 5. permissions | ||
| 60 | + const originalQuery = window.navigator.permissions?.query; | ||
| 61 | + if (originalQuery) { | ||
| 62 | + window.navigator.permissions.query = (parameters) => | ||
| 63 | + parameters.name === 'notifications' | ||
| 64 | + ? Promise.resolve({ state: Notification.permission }) | ||
| 65 | + : originalQuery.call(window.navigator.permissions, parameters); | ||
| 66 | + } | ||
| 67 | + | ||
| 68 | + // 6. WebGL vendor/renderer | ||
| 69 | + const getParameter = WebGLRenderingContext.prototype.getParameter; | ||
| 70 | + WebGLRenderingContext.prototype.getParameter = function(parameter) { | ||
| 71 | + if (parameter === 37445) return 'Intel Inc.'; | ||
| 72 | + if (parameter === 37446) return 'Intel Iris OpenGL Engine'; | ||
| 73 | + return getParameter.call(this, parameter); | ||
| 74 | + }; | ||
| 75 | +})(); | ||
| 76 | +""" | ||
| 77 | + | ||
| 78 | +# Chrome launch flags (anti-detection) | ||
| 79 | +STEALTH_ARGS = [ | ||
| 80 | + "--disable-blink-features=AutomationControlled", | ||
| 81 | + "--disable-infobars", | ||
| 82 | + "--no-first-run", | ||
| 83 | + "--no-default-browser-check", | ||
| 84 | + "--disable-background-timer-throttling", | ||
| 85 | + "--disable-backgrounding-occluded-windows", | ||
| 86 | + "--disable-renderer-backgrounding", | ||
| 87 | + "--disable-component-update", | ||
| 88 | +] |
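A minimal sketch of how the two constants above might be consumed, under stated assumptions. `build_chrome_command` and `build_inject_message` are hypothetical names; the real wiring lives in `chrome_launcher.py` and `cdp.py`, which are not shown in this section. The sketch relies only on the Chrome flags `--remote-debugging-port` and `--headless=new` and on the CDP method `Page.addScriptToEvaluateOnNewDocument`, which runs a script before any page script on each new document.

```python
import json

# Shortened copy of STEALTH_ARGS from stealth.py, for illustration only.
STEALTH_ARGS = [
    "--disable-blink-features=AutomationControlled",
    "--no-first-run",
]


def build_chrome_command(chrome_path: str, port: int, headless: bool = False) -> list:
    """Assemble a Chrome command line (hypothetical helper)."""
    cmd = [chrome_path, f"--remote-debugging-port={port}", *STEALTH_ARGS]
    if headless:
        cmd.append("--headless=new")
    return cmd


def build_inject_message(msg_id: int, source: str) -> str:
    """Build the CDP message that registers a script for every new document."""
    return json.dumps(
        {
            "id": msg_id,
            "method": "Page.addScriptToEvaluateOnNewDocument",
            "params": {"source": source},
        }
    )


print(build_chrome_command("/usr/bin/google-chrome", 9222))
print(build_inject_message(1, "(() => {})();"))
```

The JSON string would be sent over the page's CDP WebSocket; registering the script this way, rather than evaluating it after load, means the overrides are in place before the site's own scripts run.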
scripts/xhs/types.py
0 → 100644
| 1 | +"""Xiaohongshu data type definitions, corresponding to the Go types.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +from dataclasses import dataclass, field | ||
| 6 | + | ||
| 7 | +# ========== Feed list ========== | ||
| 8 | + | ||
| 9 | + | ||
| 10 | +@dataclass | ||
| 11 | +class ImageInfo: | ||
| 12 | + image_scene: str = "" | ||
| 13 | + url: str = "" | ||
| 14 | + | ||
| 15 | + @classmethod | ||
| 16 | + def from_dict(cls, d: dict) -> ImageInfo: | ||
| 17 | + return cls( | ||
| 18 | + image_scene=d.get("imageScene", ""), | ||
| 19 | + url=d.get("url", ""), | ||
| 20 | + ) | ||
| 21 | + | ||
| 22 | + | ||
| 23 | +@dataclass | ||
| 24 | +class VideoCapability: | ||
| 25 | + duration: int = 0 # seconds | ||
| 26 | + | ||
| 27 | + @classmethod | ||
| 28 | + def from_dict(cls, d: dict) -> VideoCapability: | ||
| 29 | + return cls(duration=d.get("duration", 0)) | ||
| 30 | + | ||
| 31 | + | ||
| 32 | +@dataclass | ||
| 33 | +class Video: | ||
| 34 | + capa: VideoCapability = field(default_factory=VideoCapability) | ||
| 35 | + | ||
| 36 | + @classmethod | ||
| 37 | + def from_dict(cls, d: dict) -> Video: | ||
| 38 | + return cls(capa=VideoCapability.from_dict(d.get("capa", {}))) | ||
| 39 | + | ||
| 40 | + | ||
| 41 | +@dataclass | ||
| 42 | +class Cover: | ||
| 43 | + width: int = 0 | ||
| 44 | + height: int = 0 | ||
| 45 | + url: str = "" | ||
| 46 | + file_id: str = "" | ||
| 47 | + url_pre: str = "" | ||
| 48 | + url_default: str = "" | ||
| 49 | + info_list: list[ImageInfo] = field(default_factory=list) | ||
| 50 | + | ||
| 51 | + @classmethod | ||
| 52 | + def from_dict(cls, d: dict) -> Cover: | ||
| 53 | + return cls( | ||
| 54 | + width=d.get("width", 0), | ||
| 55 | + height=d.get("height", 0), | ||
| 56 | + url=d.get("url", ""), | ||
| 57 | + file_id=d.get("fileId", ""), | ||
| 58 | + url_pre=d.get("urlPre", ""), | ||
| 59 | + url_default=d.get("urlDefault", ""), | ||
| 60 | + info_list=[ImageInfo.from_dict(i) for i in d.get("infoList", [])], | ||
| 61 | + ) | ||
| 62 | + | ||
| 63 | + | ||
| 64 | +@dataclass | ||
| 65 | +class User: | ||
| 66 | + user_id: str = "" | ||
| 67 | + nickname: str = "" | ||
| 68 | + nick_name: str = "" | ||
| 69 | + avatar: str = "" | ||
| 70 | + | ||
| 71 | + @classmethod | ||
| 72 | + def from_dict(cls, d: dict) -> User: | ||
| 73 | + return cls( | ||
| 74 | + user_id=d.get("userId", ""), | ||
| 75 | + nickname=d.get("nickname", ""), | ||
| 76 | + nick_name=d.get("nickName", ""), | ||
| 77 | + avatar=d.get("avatar", ""), | ||
| 78 | + ) | ||
| 79 | + | ||
| 80 | + | ||
| 81 | +@dataclass | ||
| 82 | +class InteractInfo: | ||
| 83 | + liked: bool = False | ||
| 84 | + liked_count: str = "" | ||
| 85 | + shared_count: str = "" | ||
| 86 | + comment_count: str = "" | ||
| 87 | + collected_count: str = "" | ||
| 88 | + collected: bool = False | ||
| 89 | + | ||
| 90 | + @classmethod | ||
| 91 | + def from_dict(cls, d: dict) -> InteractInfo: | ||
| 92 | + return cls( | ||
| 93 | + liked=d.get("liked", False), | ||
| 94 | + liked_count=d.get("likedCount", ""), | ||
| 95 | + shared_count=d.get("sharedCount", ""), | ||
| 96 | + comment_count=d.get("commentCount", ""), | ||
| 97 | + collected_count=d.get("collectedCount", ""), | ||
| 98 | + collected=d.get("collected", False), | ||
| 99 | + ) | ||
| 100 | + | ||
| 101 | + | ||
| 102 | +@dataclass | ||
| 103 | +class NoteCard: | ||
| 104 | + type: str = "" | ||
| 105 | + display_title: str = "" | ||
| 106 | + user: User = field(default_factory=User) | ||
| 107 | + interact_info: InteractInfo = field(default_factory=InteractInfo) | ||
| 108 | + cover: Cover = field(default_factory=Cover) | ||
| 109 | + video: Video | None = None | ||
| 110 | + | ||
| 111 | + @classmethod | ||
| 112 | + def from_dict(cls, d: dict) -> NoteCard: | ||
| 113 | + video_data = d.get("video") | ||
| 114 | + return cls( | ||
| 115 | + type=d.get("type", ""), | ||
| 116 | + display_title=d.get("displayTitle", ""), | ||
| 117 | + user=User.from_dict(d.get("user", {})), | ||
| 118 | + interact_info=InteractInfo.from_dict(d.get("interactInfo", {})), | ||
| 119 | + cover=Cover.from_dict(d.get("cover", {})), | ||
| 120 | + video=Video.from_dict(video_data) if video_data else None, | ||
| 121 | + ) | ||
| 122 | + | ||
| 123 | + | ||
| 124 | +@dataclass | ||
| 125 | +class Feed: | ||
| 126 | + xsec_token: str = "" | ||
| 127 | + id: str = "" | ||
| 128 | + model_type: str = "" | ||
| 129 | + note_card: NoteCard = field(default_factory=NoteCard) | ||
| 130 | + index: int = 0 | ||
| 131 | + | ||
| 132 | + @classmethod | ||
| 133 | + def from_dict(cls, d: dict) -> Feed: | ||
| 134 | + return cls( | ||
| 135 | + xsec_token=d.get("xsecToken", ""), | ||
| 136 | + id=d.get("id", ""), | ||
| 137 | + model_type=d.get("modelType", ""), | ||
| 138 | + note_card=NoteCard.from_dict(d.get("noteCard", {})), | ||
| 139 | + index=d.get("index", 0), | ||
| 140 | + ) | ||
| 141 | + | ||
| 142 | + def to_dict(self) -> dict: | ||
| 143 | + """Serialize to a JSON-compatible dict.""" | ||
| 144 | + result: dict = { | ||
| 145 | + "id": self.id, | ||
| 146 | + "xsecToken": self.xsec_token, | ||
| 147 | + "modelType": self.model_type, | ||
| 148 | + "index": self.index, | ||
| 149 | + "displayTitle": self.note_card.display_title, | ||
| 150 | + "type": self.note_card.type, | ||
| 151 | + "user": { | ||
| 152 | + "userId": self.note_card.user.user_id, | ||
| 153 | + "nickname": self.note_card.user.nickname or self.note_card.user.nick_name, | ||
| 154 | + }, | ||
| 155 | + "interactInfo": { | ||
| 156 | + "likedCount": self.note_card.interact_info.liked_count, | ||
| 157 | + "collectedCount": self.note_card.interact_info.collected_count, | ||
| 158 | + "commentCount": self.note_card.interact_info.comment_count, | ||
| 159 | + "sharedCount": self.note_card.interact_info.shared_count, | ||
| 160 | + }, | ||
| 161 | + } | ||
| 162 | + if self.note_card.video: | ||
| 163 | + result["video"] = {"duration": self.note_card.video.capa.duration} | ||
| 164 | + return result | ||
| 165 | + | ||
| 166 | + | ||
| 167 | +# ========== Feed detail ========== | ||
| 168 | + | ||
| 169 | + | ||
| 170 | +@dataclass | ||
| 171 | +class DetailImageInfo: | ||
| 172 | + width: int = 0 | ||
| 173 | + height: int = 0 | ||
| 174 | + url_default: str = "" | ||
| 175 | + url_pre: str = "" | ||
| 176 | + live_photo: bool = False | ||
| 177 | + | ||
| 178 | + @classmethod | ||
| 179 | + def from_dict(cls, d: dict) -> DetailImageInfo: | ||
| 180 | + return cls( | ||
| 181 | + width=d.get("width", 0), | ||
| 182 | + height=d.get("height", 0), | ||
| 183 | + url_default=d.get("urlDefault", ""), | ||
| 184 | + url_pre=d.get("urlPre", ""), | ||
| 185 | + live_photo=d.get("livePhoto", False), | ||
| 186 | + ) | ||
| 187 | + | ||
| 188 | + | ||
| 189 | +@dataclass | ||
| 190 | +class Comment: | ||
| 191 | + id: str = "" | ||
| 192 | + note_id: str = "" | ||
| 193 | + content: str = "" | ||
| 194 | + like_count: str = "" | ||
| 195 | + create_time: int = 0 | ||
| 196 | + ip_location: str = "" | ||
| 197 | + liked: bool = False | ||
| 198 | + user_info: User = field(default_factory=User) | ||
| 199 | + sub_comment_count: str = "" | ||
| 200 | + sub_comments: list[Comment] = field(default_factory=list) | ||
| 201 | + show_tags: list[str] = field(default_factory=list) | ||
| 202 | + | ||
| 203 | + @classmethod | ||
| 204 | + def from_dict(cls, d: dict) -> Comment: | ||
| 205 | + return cls( | ||
| 206 | + id=d.get("id", ""), | ||
| 207 | + note_id=d.get("noteId", ""), | ||
| 208 | + content=d.get("content", ""), | ||
| 209 | + like_count=d.get("likeCount", ""), | ||
| 210 | + create_time=d.get("createTime", 0), | ||
| 211 | + ip_location=d.get("ipLocation", ""), | ||
| 212 | + liked=d.get("liked", False), | ||
| 213 | + user_info=User.from_dict(d.get("userInfo", {})), | ||
| 214 | + sub_comment_count=d.get("subCommentCount", ""), | ||
| 215 | + sub_comments=[cls.from_dict(c) for c in d.get("subComments", []) or []], | ||
| 216 | + show_tags=d.get("showTags", []) or [], | ||
| 217 | + ) | ||
| 218 | + | ||
| 219 | + def to_dict(self) -> dict: | ||
| 220 | + result: dict = { | ||
| 221 | + "id": self.id, | ||
| 222 | + "content": self.content, | ||
| 223 | + "likeCount": self.like_count, | ||
| 224 | + "createTime": self.create_time, | ||
| 225 | + "ipLocation": self.ip_location, | ||
| 226 | + "user": { | ||
| 227 | + "userId": self.user_info.user_id, | ||
| 228 | + "nickname": self.user_info.nickname or self.user_info.nick_name, | ||
| 229 | + }, | ||
| 230 | + "subCommentCount": self.sub_comment_count, | ||
| 231 | + } | ||
| 232 | + if self.sub_comments: | ||
| 233 | + result["subComments"] = [c.to_dict() for c in self.sub_comments] | ||
| 234 | + return result | ||
| 235 | + | ||
| 236 | + | ||
| 237 | +@dataclass | ||
| 238 | +class CommentList: | ||
| 239 | + list_: list[Comment] = field(default_factory=list) | ||
| 240 | + cursor: str = "" | ||
| 241 | + has_more: bool = False | ||
| 242 | + | ||
| 243 | + @classmethod | ||
| 244 | + def from_dict(cls, d: dict) -> CommentList: | ||
| 245 | + return cls( | ||
| 246 | + list_=[Comment.from_dict(c) for c in d.get("list", []) or []], | ||
| 247 | + cursor=d.get("cursor", ""), | ||
| 248 | + has_more=d.get("hasMore", False), | ||
| 249 | + ) | ||
| 250 | + | ||
| 251 | + | ||
| 252 | +@dataclass | ||
| 253 | +class FeedDetail: | ||
| 254 | + note_id: str = "" | ||
| 255 | + xsec_token: str = "" | ||
| 256 | + title: str = "" | ||
| 257 | + desc: str = "" | ||
| 258 | + type: str = "" | ||
| 259 | + time: int = 0 | ||
| 260 | + ip_location: str = "" | ||
| 261 | + user: User = field(default_factory=User) | ||
| 262 | + interact_info: InteractInfo = field(default_factory=InteractInfo) | ||
| 263 | + image_list: list[DetailImageInfo] = field(default_factory=list) | ||
| 264 | + | ||
| 265 | + @classmethod | ||
| 266 | + def from_dict(cls, d: dict) -> FeedDetail: | ||
| 267 | + return cls( | ||
| 268 | + note_id=d.get("noteId", ""), | ||
| 269 | + xsec_token=d.get("xsecToken", ""), | ||
| 270 | + title=d.get("title", ""), | ||
| 271 | + desc=d.get("desc", ""), | ||
| 272 | + type=d.get("type", ""), | ||
| 273 | + time=d.get("time", 0), | ||
| 274 | + ip_location=d.get("ipLocation", ""), | ||
| 275 | + user=User.from_dict(d.get("user", {})), | ||
| 276 | + interact_info=InteractInfo.from_dict(d.get("interactInfo", {})), | ||
| 277 | + image_list=[DetailImageInfo.from_dict(i) for i in d.get("imageList", []) or []], | ||
| 278 | + ) | ||
| 279 | + | ||
| 280 | + def to_dict(self) -> dict: | ||
| 281 | + return { | ||
| 282 | + "noteId": self.note_id, | ||
| 283 | + "title": self.title, | ||
| 284 | + "desc": self.desc, | ||
| 285 | + "type": self.type, | ||
| 286 | + "time": self.time, | ||
| 287 | + "ipLocation": self.ip_location, | ||
| 288 | + "user": { | ||
| 289 | + "userId": self.user.user_id, | ||
| 290 | + "nickname": self.user.nickname or self.user.nick_name, | ||
| 291 | + }, | ||
| 292 | + "interactInfo": { | ||
| 293 | + "liked": self.interact_info.liked, | ||
| 294 | + "likedCount": self.interact_info.liked_count, | ||
| 295 | + "collectedCount": self.interact_info.collected_count, | ||
| 296 | + "collected": self.interact_info.collected, | ||
| 297 | + "commentCount": self.interact_info.comment_count, | ||
| 298 | + "sharedCount": self.interact_info.shared_count, | ||
| 299 | + }, | ||
| 300 | + "imageList": [ | ||
| 301 | + { | ||
| 302 | + "width": img.width, | ||
| 303 | + "height": img.height, | ||
| 304 | + "urlDefault": img.url_default, | ||
| 305 | + } | ||
| 306 | + for img in self.image_list | ||
| 307 | + ], | ||
| 308 | + } | ||
| 309 | + | ||
| 310 | + | ||
| 311 | +@dataclass | ||
| 312 | +class FeedDetailResponse: | ||
| 313 | + note: FeedDetail = field(default_factory=FeedDetail) | ||
| 314 | + comments: CommentList = field(default_factory=CommentList) | ||
| 315 | + | ||
| 316 | + @classmethod | ||
| 317 | + def from_dict(cls, d: dict) -> FeedDetailResponse: | ||
| 318 | + return cls( | ||
| 319 | + note=FeedDetail.from_dict(d.get("note", {})), | ||
| 320 | + comments=CommentList.from_dict(d.get("comments", {})), | ||
| 321 | + ) | ||
| 322 | + | ||
| 323 | + def to_dict(self) -> dict: | ||
| 324 | + return { | ||
| 325 | + "note": self.note.to_dict(), | ||
| 326 | + "comments": [c.to_dict() for c in self.comments.list_], | ||
| 327 | + } | ||
| 328 | + | ||
| 329 | + | ||
| 330 | +# ========== User profile ========== | ||
| 331 | + | ||
| 332 | + | ||
| 333 | +@dataclass | ||
| 334 | +class UserBasicInfo: | ||
| 335 | + gender: int = 0 | ||
| 336 | + ip_location: str = "" | ||
| 337 | + desc: str = "" | ||
| 338 | + imageb: str = "" | ||
| 339 | + nickname: str = "" | ||
| 340 | + images: str = "" | ||
| 341 | + red_id: str = "" | ||
| 342 | + | ||
| 343 | + @classmethod | ||
| 344 | + def from_dict(cls, d: dict) -> UserBasicInfo: | ||
| 345 | + return cls( | ||
| 346 | + gender=d.get("gender", 0), | ||
| 347 | + ip_location=d.get("ipLocation", ""), | ||
| 348 | + desc=d.get("desc", ""), | ||
| 349 | + imageb=d.get("imageb", ""), | ||
| 350 | + nickname=d.get("nickname", ""), | ||
| 351 | + images=d.get("images", ""), | ||
| 352 | + red_id=d.get("redId", ""), | ||
| 353 | + ) | ||
| 354 | + | ||
| 355 | + | ||
| 356 | +@dataclass | ||
| 357 | +class UserInteraction: | ||
| 358 | + type: str = "" | ||
| 359 | + name: str = "" | ||
| 360 | + count: str = "" | ||
| 361 | + | ||
| 362 | + @classmethod | ||
| 363 | + def from_dict(cls, d: dict) -> UserInteraction: | ||
| 364 | + return cls( | ||
| 365 | + type=d.get("type", ""), | ||
| 366 | + name=d.get("name", ""), | ||
| 367 | + count=d.get("count", ""), | ||
| 368 | + ) | ||
| 369 | + | ||
| 370 | + | ||
| 371 | +@dataclass | ||
| 372 | +class UserProfileResponse: | ||
| 373 | + user_basic_info: UserBasicInfo = field(default_factory=UserBasicInfo) | ||
| 374 | + interactions: list[UserInteraction] = field(default_factory=list) | ||
| 375 | + feeds: list[Feed] = field(default_factory=list) | ||
| 376 | + | ||
| 377 | + def to_dict(self) -> dict: | ||
| 378 | + return { | ||
| 379 | + "basicInfo": { | ||
| 380 | + "nickname": self.user_basic_info.nickname, | ||
| 381 | + "redId": self.user_basic_info.red_id, | ||
| 382 | + "desc": self.user_basic_info.desc, | ||
| 383 | + "gender": self.user_basic_info.gender, | ||
| 384 | + "ipLocation": self.user_basic_info.ip_location, | ||
| 385 | + }, | ||
| 386 | + "interactions": [ | ||
| 387 | + {"type": i.type, "name": i.name, "count": i.count} for i in self.interactions | ||
| 388 | + ], | ||
| 389 | + "feeds": [f.to_dict() for f in self.feeds], | ||
| 390 | + } | ||
| 391 | + | ||
| 392 | + | ||
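| 393 | +# ========== Search ========== | ||
| 394 | + | ||
| 395 | + | ||
| 396 | +@dataclass | ||
| 397 | +class FilterOption: | ||
| 398 | + """Search filter options. Values are the Chinese labels listed beside each field.""" | ||
| 399 | + | ||
| 400 | + sort_by: str = "" # 综合|最新|最多点赞|最多评论|最多收藏 (general | newest | most liked | most commented | most favorited) | ||
| 401 | + note_type: str = "" # 不限|视频|图文 (any | video | image-text) | ||
| 402 | + publish_time: str = "" # 不限|一天内|一周内|半年内 (any | within a day | within a week | within six months) | ||
| 403 | + search_scope: str = "" # 不限|已看过|未看过|已关注 (any | viewed | not viewed | following) | ||
| 404 | + location: str = "" # 不限|同城|附近 (any | same city | nearby) | ||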
| 405 | + | ||
| 406 | + | ||
| 407 | +# ========== Publishing ========== | ||
| 408 | + | ||
| 409 | + | ||
| 410 | +@dataclass | ||
| 411 | +class PublishImageContent: | ||
| 412 | + """Content for an image-text post.""" | ||
| 413 | + | ||
| 414 | + title: str = "" | ||
| 415 | + content: str = "" | ||
| 416 | + tags: list[str] = field(default_factory=list) | ||
| 417 | + image_paths: list[str] = field(default_factory=list) | ||
| 418 | + schedule_time: str | None = None # ISO 8601; None means publish immediately | ||
| 419 | + is_original: bool = False | ||
| 420 | + visibility: str = "" # 公开可见 (public, default) | 仅自己可见 (only me) | 仅互关好友可见 (mutual followers only) | ||
| 421 | + | ||
| 422 | + | ||
| 423 | +@dataclass | ||
| 424 | +class PublishVideoContent: | ||
| 425 | + """Content for a video post.""" | ||
| 426 | + | ||
| 427 | + title: str = "" | ||
| 428 | + content: str = "" | ||
| 429 | + tags: list[str] = field(default_factory=list) | ||
| 430 | + video_path: str = "" | ||
| 431 | + schedule_time: str | None = None # ISO 8601 | ||
| 432 | + visibility: str = "" # 公开可见 (public, default) | 仅自己可见 (only me) | 仅互关好友可见 (mutual followers only) | ||
| 433 | + | ||
| 434 | + | ||
| 435 | +# ========== Interaction ========== | ||
| 436 | + | ||
| 437 | + | ||
| 438 | +@dataclass | ||
| 439 | +class ActionResult: | ||
| 440 | + """Generic action result (like, favorite, etc.).""" | ||
| 441 | + | ||
| 442 | + feed_id: str = "" | ||
| 443 | + success: bool = False | ||
| 444 | + message: str = "" | ||
| 445 | + | ||
| 446 | + def to_dict(self) -> dict: | ||
| 447 | + return { | ||
| 448 | + "feed_id": self.feed_id, | ||
| 449 | + "success": self.success, | ||
| 450 | + "message": self.message, | ||
| 451 | + } | ||
| 452 | + | ||
| 453 | + | ||
| 454 | +# ========== Comment loading configuration ========== | ||
| 455 | + | ||
| 456 | + | ||
| 457 | +@dataclass | ||
| 458 | +class CommentLoadConfig: | ||
| 459 | + """Comment loading configuration.""" | ||
| 460 | + | ||
| 461 | + click_more_replies: bool = False | ||
| 462 | + max_replies_threshold: int = 10 | ||
| 463 | + max_comment_items: int = 0 # 0 = unlimited | ||
| 464 | + scroll_speed: str = "normal" # slow|normal|fast |
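The `to_dict` methods above fall back from `nickname` to `nick_name`, since the payload may carry either `nickname` or `nickName`. The sketch below isolates that pattern with a trimmed copy of `User`. `display_name` is a hypothetical helper added for illustration; the sample payloads are made up.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Trimmed copy of the User dataclass from types.py (avatar omitted)."""

    user_id: str = ""
    nickname: str = ""
    nick_name: str = ""

    @classmethod
    def from_dict(cls, d: dict) -> "User":
        # Missing keys fall back to empty strings, as in the original.
        return cls(
            user_id=d.get("userId", ""),
            nickname=d.get("nickname", ""),
            nick_name=d.get("nickName", ""),
        )


def display_name(user: User) -> str:
    """Hypothetical helper mirroring the `nickname or nick_name` fallback."""
    return user.nickname or user.nick_name


a = User.from_dict({"userId": "u1", "nickname": "Alice"})
b = User.from_dict({"userId": "u2", "nickName": "Bob"})
c = User.from_dict({})
print(display_name(a), display_name(b), repr(display_name(c)))
```

Defaulting every field keeps `from_dict` total: a partial payload yields a partially filled object instead of a `KeyError`.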
scripts/xhs/urls.py
0 → 100644
| 1 | +"""Xiaohongshu URL constants and builder functions.""" | ||
| 2 | + | ||
| 3 | +from urllib.parse import urlencode | ||
| 4 | + | ||
| 5 | +# Base pages | ||
| 6 | +EXPLORE_URL = "https://www.xiaohongshu.com/explore" | ||
| 7 | +HOME_URL = "https://www.xiaohongshu.com" | ||
| 8 | +PUBLISH_URL = "https://creator.xiaohongshu.com/publish/publish?source=official" | ||
| 9 | + | ||
| 10 | + | ||
| 11 | +def make_feed_detail_url(feed_id: str, xsec_token: str) -> str: | ||
| 12 | + """Build the feed detail page URL.""" | ||
| 13 | + return ( | ||
| 14 | + f"https://www.xiaohongshu.com/explore/{feed_id}?xsec_token={xsec_token}&xsec_source=pc_feed" | ||
| 15 | + ) | ||
| 16 | + | ||
| 17 | + | ||
| 18 | +def make_search_url(keyword: str) -> str: | ||
| 19 | + """Build the search results page URL.""" | ||
| 20 | + params = urlencode({"keyword": keyword, "source": "web_explore_feed"}) | ||
| 21 | + return f"https://www.xiaohongshu.com/search_result?{params}" | ||
| 22 | + | ||
| 23 | + | ||
| 24 | +def make_user_profile_url(user_id: str, xsec_token: str) -> str: | ||
| 25 | + """Build the user profile page URL.""" | ||
| 26 | + return ( | ||
| 27 | + f"https://www.xiaohongshu.com/user/profile/{user_id}" | ||
| 28 | + f"?xsec_token={xsec_token}&xsec_source=pc_note" | ||
| 29 | + ) |
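`make_search_url` percent-encodes its parameters with `urlencode`, while the feed detail and profile builders interpolate `xsec_token` directly. If a token ever contains reserved characters such as `=` or `&`, an encoded variant along the following lines may be safer. This is a sketch under that assumption: `make_feed_detail_url_encoded` is a hypothetical name, and whether the site accepts an encoded token has not been verified here.

```python
from urllib.parse import quote, urlencode


def make_feed_detail_url_encoded(feed_id: str, xsec_token: str) -> str:
    """Build a feed detail URL with percent-encoded parameters (hypothetical variant)."""
    query = urlencode({"xsec_token": xsec_token, "xsec_source": "pc_feed"})
    # quote(..., safe="") also encodes "/" so the id stays a single path segment.
    return f"https://www.xiaohongshu.com/explore/{quote(feed_id, safe='')}?{query}"


print(make_feed_detail_url_encoded("abc123", "tok=en&x"))
```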
scripts/xhs/user_profile.py
0 → 100644
| 1 | +"""User profile page, corresponding to the Go xiaohongshu/user_profile.go.""" | ||
| 2 | + | ||
| 3 | +from __future__ import annotations | ||
| 4 | + | ||
| 5 | +import json | ||
| 6 | +import logging | ||
| 7 | +import time | ||
| 8 | + | ||
| 9 | +from .cdp import Page | ||
| 10 | +from .types import Feed, UserBasicInfo, UserInteraction, UserProfileResponse | ||
| 11 | +from .urls import make_user_profile_url | ||
| 12 | + | ||
| 13 | +logger = logging.getLogger(__name__) | ||
| 14 | + | ||
| 15 | +# JS that extracts the user data | ||
| 16 | +_EXTRACT_USER_DATA_JS = """ | ||
| 17 | +(() => { | ||
| 18 | + if (window.__INITIAL_STATE__ && | ||
| 19 | + window.__INITIAL_STATE__.user && | ||
| 20 | + window.__INITIAL_STATE__.user.userPageData) { | ||
| 21 | + const userPageData = window.__INITIAL_STATE__.user.userPageData; | ||
| 22 | + const data = userPageData.value !== undefined ? userPageData.value : userPageData._value; | ||
| 23 | + if (data) { | ||
| 24 | + return JSON.stringify(data); | ||
| 25 | + } | ||
| 26 | + } | ||
| 27 | + return ""; | ||
| 28 | +})() | ||
| 29 | +""" | ||
| 30 | + | ||
| 31 | +_EXTRACT_USER_NOTES_JS = """ | ||
| 32 | +(() => { | ||
| 33 | + if (window.__INITIAL_STATE__ && | ||
| 34 | + window.__INITIAL_STATE__.user && | ||
| 35 | + window.__INITIAL_STATE__.user.notes) { | ||
| 36 | + const notes = window.__INITIAL_STATE__.user.notes; | ||
| 37 | + const data = notes.value !== undefined ? notes.value : notes._value; | ||
| 38 | + if (data) { | ||
| 39 | + return JSON.stringify(data); | ||
| 40 | + } | ||
| 41 | + } | ||
| 42 | + return ""; | ||
| 43 | +})() | ||
| 44 | +""" | ||
| 45 | + | ||
| 46 | + | ||
| 47 | +def get_user_profile(page: Page, user_id: str, xsec_token: str) -> UserProfileResponse: | ||
| 48 | + """Fetch a user's profile information and posts. | ||
| 49 | + | ||
| 50 | + Args: | ||
| 51 | + page: CDP page object. | ||
| 52 | + user_id: User ID. | ||
| 53 | + xsec_token: The xsec_token for the profile. | ||
| 54 | + | ||
| 55 | + Raises: | ||
| 56 | + RuntimeError: Data extraction failed. | ||
| 57 | + """ | ||
| 58 | + url = make_user_profile_url(user_id, xsec_token) | ||
| 59 | + page.navigate(url) | ||
| 60 | + page.wait_for_load() | ||
| 61 | + page.wait_dom_stable() | ||
| 62 | + | ||
| 63 | + return _extract_user_profile_data(page) | ||
| 64 | + | ||
| 65 | + | ||
| 66 | +def _extract_user_profile_data(page: Page) -> UserProfileResponse: | ||
| 67 | + """Extract user profile data from the page.""" | ||
| 68 | + # Wait for __INITIAL_STATE__ | ||
| 69 | + _wait_for_initial_state(page) | ||
| 70 | + | ||
| 71 | + # Extract the user info | ||
| 72 | + user_data_result = page.evaluate(_EXTRACT_USER_DATA_JS) | ||
| 73 | + if not user_data_result: | ||
| 74 | + raise RuntimeError("user.userPageData.value not found in __INITIAL_STATE__") | ||
| 75 | + | ||
| 76 | + # Extract the user's posts | ||
| 77 | + notes_result = page.evaluate(_EXTRACT_USER_NOTES_JS) | ||
| 78 | + if not notes_result: | ||
| 79 | + raise RuntimeError("user.notes.value not found in __INITIAL_STATE__") | ||
| 80 | + | ||
| 81 | + # Parse the user info | ||
| 82 | + user_page_data = json.loads(user_data_result) | ||
| 83 | + basic_info = UserBasicInfo.from_dict(user_page_data.get("basicInfo", {})) | ||
| 84 | + interactions = [UserInteraction.from_dict(i) for i in user_page_data.get("interactions", [])] | ||
| 85 | + | ||
| 86 | + # Parse the posts (nested array, flattened) | ||
| 87 | + notes_feeds_raw = json.loads(notes_result) | ||
| 88 | + feeds: list[Feed] = [] | ||
| 89 | + for feed_group in notes_feeds_raw: | ||
| 90 | + if isinstance(feed_group, list): | ||
| 91 | + for f in feed_group: | ||
| 92 | + feeds.append(Feed.from_dict(f)) | ||
| 93 | + elif isinstance(feed_group, dict): | ||
| 94 | + feeds.append(Feed.from_dict(feed_group)) | ||
| 95 | + | ||
| 96 | + return UserProfileResponse( | ||
| 97 | + user_basic_info=basic_info, | ||
| 98 | + interactions=interactions, | ||
| 99 | + feeds=feeds, | ||
| 100 | + ) | ||
| 101 | + | ||
| 102 | + | ||
| 103 | +def _wait_for_initial_state(page: Page, timeout: float = 10.0) -> None: | ||
| 104 | + """Wait until __INITIAL_STATE__ is ready.""" | ||
| 105 | + deadline = time.monotonic() + timeout | ||
| 106 | + while time.monotonic() < deadline: | ||
| 107 | + ready = page.evaluate("window.__INITIAL_STATE__ !== undefined") | ||
| 108 | + if ready: | ||
| 109 | + return | ||
| 110 | + time.sleep(0.5) | ||
| 111 | + logger.warning("Timed out waiting for __INITIAL_STATE__") |
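Two techniques in this module can be sketched without a browser: polling against a monotonic deadline, and flattening the nested notes array. `wait_until` and `flatten_note_groups` are hypothetical helpers written for illustration. Unlike `_wait_for_initial_state`, this `wait_until` returns a boolean so the caller can decide how to handle a timeout.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll predicate until it is truthy or the deadline passes."""
    deadline = time.monotonic() + timeout  # monotonic clock ignores wall-clock changes
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False


def flatten_note_groups(raw: list) -> list:
    """Flatten a list whose items are either dicts or lists of dicts."""
    flat = []
    for group in raw:
        if isinstance(group, list):
            flat.extend(item for item in group if isinstance(item, dict))
        elif isinstance(group, dict):
            flat.append(group)
        # Anything else (None, strings, numbers) is skipped.
    return flat


print(wait_until(lambda: True, timeout=1.0, interval=0.01))
print(flatten_note_groups([[{"id": "a"}, {"id": "b"}], {"id": "c"}, None, []]))
```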
skills/xhs-auth/SKILL.md
0 → 100644
| 1 | +--- | ||
| 2 | +name: xhs-auth | ||
| 3 | +description: | | ||
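| 4 | + Xiaohongshu authentication skill: check login status, log in by QR code, and manage multiple accounts. | ||
| 5 | + Triggered when the user asks to log in to Xiaohongshu, check login status, or switch accounts. | ||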
| 6 | +--- | ||
| 7 | + | ||
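| 8 | +# Xiaohongshu Authentication | ||
| 9 | + | ||
| 10 | +You are the "Xiaohongshu authentication assistant". You manage Xiaohongshu login state and multi-account switching. | ||
| 11 | + | ||
| 12 | +## Input Routing | ||
| 13 | + | ||
| 14 | +Determine the user's intent in priority order: | ||
| 15 | + | ||
| 16 | +1. The user asks to check login status ("检查登录 / 是否登录 / 登录状态"): run the login status check. | ||
| 17 | +2. The user asks to log in ("登录 / 扫码登录 / 打开登录页"): run the login flow. | ||
| 18 | +3. The user asks to switch accounts or log out ("切换账号 / 换一个账号 / 退出登录 / 清除登录"): clear the cookies. | ||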
| 19 | + | ||
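| 20 | +## Required Constraints | ||
| 21 | + | ||
| 22 | +- Login requires the user to scan the QR code manually; it cannot be automated. | ||
| 23 | +- All CLI commands live in `scripts/cli.py` and output JSON. | ||
| 24 | +- A running Chrome instance is required first (start it with `scripts/chrome_launcher.py`). | ||
| 25 | +- Any file path must be an absolute path. | ||
| 26 | + | ||
| 27 | +## Workflow | ||
| 28 | + | ||
| 29 | +### Check Login Status | ||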
| 30 | + | ||
| 31 | +```bash | ||
| 32 | +# Connect to local Chrome by default | ||
| 33 | +python scripts/cli.py check-login | ||
| 34 | + | ||
| 35 | +# Specify the port | ||
| 36 | +python scripts/cli.py --port 9222 check-login | ||
| 37 | + | ||
| 38 | +# Connect to a remote Chrome | ||
| 39 | +python scripts/cli.py --host 10.0.0.12 --port 9222 check-login | ||
| 40 | +``` | ||
| 41 | + | ||
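| 42 | +Reading the output: | ||
| 43 | +- `"logged_in": true` + exit code 0 → logged in; subsequent operations can proceed. | ||
| 44 | +- `"logged_in": false` + exit code 1 → not logged in; prompt the user to scan the QR code. | ||
| 45 | + | ||
| 46 | +### Login Flow | ||
| 47 | + | ||
| 48 | +1. Make sure Chrome is running (headed mode, so the QR code can be scanned): | ||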
| 49 | +```bash | ||
| 50 | +python scripts/chrome_launcher.py | ||
| 51 | +``` | ||
| 52 | + | ||
| 53 | +2. Get the login QR code and wait for the scan: | ||
| 54 | +```bash | ||
| 55 | +python scripts/cli.py login | ||
| 56 | +``` | ||
| 57 | + | ||
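| 58 | +3. The script first prints one line of JSON containing a `qrcode_path` field (the path where the QR code image was saved), then blocks until the scan completes. | ||
| 59 | + | ||
| 60 | +4. **Show the QR code to the user**: extract `qrcode_path` from the output and open the image with a system command so the user can scan it: | ||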
| 61 | +```bash | ||
| 62 | +# macOS | ||
| 63 | +open /tmp/xhs/login_qrcode.png | ||
| 64 | + | ||
| 65 | +# Linux | ||
| 66 | +xdg-open /tmp/xhs/login_qrcode.png | ||
| 67 | +``` | ||
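| 68 | +Tell the user: "Please scan the QR code with the Xiaohongshu app to log in." | ||
| 69 | + | ||
| 70 | +5. After a successful scan, the script detects the login automatically and prints a second JSON line: `"logged_in": true`. | ||
| 71 | + | ||
| 72 | +**Note**: the `login` command blocks for up to 120 seconds while waiting for the scan. Because nothing else can run in the same shell while it blocks, open the image from another terminal or from a background process. The recommended flow is to run `login` first (it prints the QR code path immediately), then ask the user to open the image file and scan it. | ||
| 73 | + | ||
| 74 | +### Clear Cookies (Switch Account / Log Out) | ||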
| 75 | + | ||
| 76 | +```bash | ||
| 77 | +# Clear the current account's cookies | ||
| 78 | +python scripts/cli.py delete-cookies | ||
| 79 | + | ||
| 80 | +# Clear cookies for a specific account | ||
| 81 | +python scripts/cli.py --account work delete-cookies | ||
| 82 | +``` | ||
| 83 | + | ||
| 84 | +### Start / Stop the Browser | ||
| 85 | + | ||
| 86 | +```bash | ||
| 87 | +# Start Chrome (headed; recommended for login) | ||
| 88 | +python scripts/chrome_launcher.py | ||
| 89 | + | ||
| 90 | +# Start headless | ||
| 91 | +python scripts/chrome_launcher.py --headless | ||
| 92 | + | ||
| 93 | +# Specify the port | ||
| 94 | +python scripts/chrome_launcher.py --port 9223 | ||
| 95 | + | ||
| 96 | +# Stop Chrome | ||
| 97 | +python scripts/chrome_launcher.py --kill | ||
| 98 | +``` | ||
| 99 | + | ||
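| 100 | +## Failure Handling | ||
| 101 | + | ||
| 102 | +- **Chrome not found**: ask the user to install Google Chrome or to set its path. | ||
| 103 | +- **Port already in use**: suggest another port via `--port`, or run `--kill` first to close the existing instance. | ||
| 104 | +- **Scan timed out**: ask the user to run the login command again. | ||
| 105 | +- **Remote CDP connection failed**: check that the remote Chrome has its debugging port open. |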
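An agent driving the login flow needs to pull `qrcode_path` out of the first JSON line and recognise the final `"logged_in": true` line. The sketch below assumes only the two fields named above; any other fields in the CLI output are unknown here, and `extract_qrcode_path` and `is_logged_in` are hypothetical helper names.

```python
import json
from typing import Optional


def _parse_json_object(line: str) -> Optional[dict]:
    """Parse one line of CLI output; return None unless it is a JSON object."""
    try:
        payload = json.loads(line)
    except json.JSONDecodeError:
        return None
    return payload if isinstance(payload, dict) else None


def extract_qrcode_path(line: str) -> Optional[str]:
    """Return the qrcode_path value from a JSON line, or None if absent."""
    payload = _parse_json_object(line)
    if payload is None:
        return None
    path = payload.get("qrcode_path")
    return path if isinstance(path, str) and path else None


def is_logged_in(line: str) -> bool:
    """Return True only when the line is a JSON object with logged_in == true."""
    payload = _parse_json_object(line)
    return payload is not None and payload.get("logged_in") is True


print(extract_qrcode_path('{"qrcode_path": "/tmp/xhs/login_qrcode.png"}'))
print(is_logged_in('{"logged_in": true}'))
```

Reading the path from the output, rather than hardcoding `/tmp/xhs/login_qrcode.png`, keeps the caller correct if the save location differs between platforms.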
skills/xhs-content-ops/SKILL.md
0 → 100644
| 1 | +--- | ||
| 2 | +name: xhs-content-ops | ||
| 3 | +description: | | ||
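| 4 | + Xiaohongshu composite content operations skill: combines search, detail, publishing, and interaction capabilities into operations workflows. | ||
| 5 | + Triggered when the user asks for composite tasks such as competitor analysis, trend tracking, content creation, or engagement management. | ||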
| 6 | +--- | ||
| 7 | + | ||
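| 8 | +# Xiaohongshu Composite Content Operations | ||
| 9 | + | ||
| 10 | +You are the "Xiaohongshu content operations assistant". You help the user complete operations tasks that combine multiple steps. | ||
| 11 | + | ||
| 12 | +## Input Routing | ||
| 13 | + | ||
| 14 | +Determine the user's intent in priority order: | ||
| 15 | + | ||
| 16 | +1. The user asks for competitor analysis ("竞品分析 / 分析竞品 / 对比笔记"): run the competitor analysis flow. | ||
| 17 | +2. The user asks for trend tracking ("热点追踪 / 热门话题 / 趋势分析"): run the trend tracking flow. | ||
| 18 | +3. The user asks to create and publish ("创作发布 / 研究话题后发布 / 一键创作"): run the content creation flow. | ||
| 19 | +4. The user asks for engagement management ("互动管理 / 批量互动 / 评论策略"): run the engagement management flow. | ||
| 20 | + | ||
| 21 | +## Required Constraints | ||
| 22 | + | ||
| 23 | +- Report progress to the user at every step of a composite flow. | ||
| 24 | +- Publishing actions require user confirmation (see the xhs-publish constraints). | ||
| 25 | +- Commenting actions require user confirmation (see the xhs-interact constraints). | ||
| 26 | +- Keep a reasonable interval between search and browse operations to avoid an excessive request rate. | ||
| 27 | +- Present all data analysis results as structured markdown tables. | ||
| 28 | + | ||
| 29 | +## Workflow | ||
| 30 | + | ||
| 31 | +### Competitor Analysis | ||
| 32 | + | ||
| 33 | +Goal: search competitor notes → fetch details → compile an analysis report. | ||
| 34 | + | ||
| 35 | +**Steps:** | ||
| 36 | + | ||
| 37 | +1. Confirm the analysis target (keywords, competitor accounts). | ||
| 38 | +2. Search for related notes: | ||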
| 39 | +```bash | ||
| 40 | +python scripts/cli.py search-feeds \ | ||
| 41 | + --keyword "目标关键词" --sort-by 最多点赞 | ||
| 42 | +``` | ||
| 43 | +3. Pick 3-5 high-engagement notes from the search results and fetch the details of each: | ||
| 44 | +```bash | ||
| 45 | +python scripts/cli.py get-feed-detail \ | ||
| 46 | + --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 47 | +``` | ||
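| 48 | +4. Compile the analysis report, covering: | ||
| 49 | + - Title style | ||
| 50 | + - Cover image characteristics | ||
| 51 | + - Body structure (opening / middle / ending) | ||
| 52 | + - Topic tag usage | ||
| 53 | + - Engagement comparison (likes / comments / favorites) | ||
| 54 | + | ||
| 55 | +**Output format:** | ||
| 56 | + | ||
| 57 | +Use a markdown table to compare the key metrics of each note, then summarize the shared traits and the differentiating strategies. | ||
| 58 | + | ||
| 59 | +### Trend Tracking | ||
| 60 | + | ||
| 61 | +Goal: search trending keywords → analyze trends → suggest topics. | ||
| 62 | + | ||
| 63 | +**Steps:** | ||
| 64 | + | ||
| 65 | +1. Confirm the domain or keyword list to track. | ||
| 66 | +2. Search each keyword separately: | ||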
| 67 | +```bash | ||
| 68 | +# Sort by newest to gauge recent activity | ||
| 69 | +python scripts/cli.py search-feeds \ | ||
| 70 | + --keyword "关键词" --sort-by 最新 --publish-time 一周内 | ||
| 71 | + | ||
| 72 | +# Sort by most liked to find top-performing posts | ||
| 73 | +python scripts/cli.py search-feeds \ | ||
| 74 | + --keyword "关键词" --sort-by 最多点赞 | ||
| 75 | +``` | ||
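| 76 | +3. Fetch details for high-engagement notes and analyze their content patterns. | ||
| 77 | +4. Produce a trend report: | ||
| 78 | + - Keyword popularity ranking | ||
| 79 | + - Traits of top-performing content | ||
| 80 | + - Topic suggestions | ||
| 81 | + | ||
| 82 | +### Content Creation | ||
| 83 | + | ||
| 84 | +Goal: research a topic → help draft → user confirms → publish. | ||
| 85 | + | ||
| 86 | +**Steps:** | ||
| 87 | + | ||
| 88 | +1. Confirm the topic. | ||
| 89 | +2. Search related notes for inspiration: | ||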
| 90 | +```bash | ||
| 91 | +python scripts/cli.py search-feeds \ | ||
| 92 | + --keyword "主题关键词" --sort-by 最多点赞 | ||
| 93 | +``` | ||
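| 94 | +3. Pick 2-3 reference notes and fetch their details to analyze content structure. | ||
| 95 | +4. Based on the analysis, help the user produce a draft: | ||
| 96 | + - Title (Xiaohongshu style, UTF-16 length ≤ 20) | ||
| 97 | + - Body (clear paragraphs, conversational tone) | ||
| 98 | + - Topic tags | ||
| 99 | +5. Ask the user to confirm the final content via `AskUserQuestion`. | ||
| 100 | +6. Publish (see the xhs-publish flow): | ||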
| 101 | +```bash | ||
| 102 | +python scripts/cli.py publish-content \ | ||
| 103 | + --title-file /tmp/xhs_title.txt \ | ||
| 104 | + --content-file /tmp/xhs_content.txt \ | ||
| 105 | + --images "/abs/path/pic1.jpg" "/abs/path/pic2.jpg" \ | ||
| 106 | + --tags "标签1" "标签2" | ||
| 107 | +``` | ||
| 108 | + | ||
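| 109 | +### Engagement Management | ||
| 110 | + | ||
| 111 | +Goal: browse target notes → comment / like / favorite strategically. | ||
| 112 | + | ||
| 113 | +**Steps:** | ||
| 114 | + | ||
| 115 | +1. Confirm the engagement target (keywords, topic area). | ||
| 116 | +2. Search for target notes: | ||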
| 117 | +```bash | ||
| 118 | +python scripts/cli.py search-feeds \ | ||
| 119 | + --keyword "目标关键词" --sort-by 最新 | ||
| 120 | +``` | ||
| 121 | +3. Filter for notes worth engaging with (moderate engagement, relevant to your own niche). | ||
| 122 | +4. Fetch the details to understand the note: | ||
| 123 | +```bash | ||
| 124 | +python scripts/cli.py get-feed-detail \ | ||
| 125 | + --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 126 | +``` | ||
| 127 | +5. Draft a comment that adds value, based on the note's content. | ||
| 128 | +6. Send it once the user has confirmed the comment text: | ||
| 129 | +```bash | ||
| 130 | +python scripts/cli.py post-comment \ | ||
| 131 | + --feed-id FEED_ID \ | ||
| 132 | + --xsec-token XSEC_TOKEN \ | ||
| 133 | + --content "评论内容" | ||
| 134 | +``` | ||
| 135 | +7. Optional: like or favorite the note: | ||
| 136 | +```bash | ||
| 137 | +python scripts/cli.py like-feed \ | ||
| 138 | + --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 139 | + | ||
| 140 | +python scripts/cli.py favorite-feed \ | ||
| 141 | + --feed-id FEED_ID --xsec-token XSEC_TOKEN | ||
| 142 | +``` | ||
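| 143 | +8. Wait 30-60 seconds between interactions. | ||
| 144 | + | ||
| 145 | +## Operations Advice | ||
| 146 | + | ||
| 147 | +- **Competitor analysis**: 1-2 times per week, to track competitor activity. | ||
| 148 | +- **Trend tracking**: once a day, to catch time-sensitive content. | ||
| 149 | +- **Engagement**: no more than 20 comments per day, to avoid being throttled. | ||
| 150 | +- **Publishing time**: peak hours are 12:00-13:00 and 18:00-21:00 on weekdays. | ||
| 151 | + | ||
| 152 | +## Failure Handling | ||
| 153 | + | ||
| 154 | +- **No search results**: broaden the keywords or adjust the filters. | ||
| 155 | +- **Detail fetch failed**: the note may have been deleted or set to private. | ||
| 156 | +- **Publish failed**: see failure handling in xhs-publish. | ||
| 157 | +- **Comment failed**: see failure handling in xhs-interact. | ||
| 158 | +- **Rate limited**: lengthen the interval between operations and reduce their frequency. |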
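The pacing rules in the content-ops skill above (30-60 seconds between interactions, at most 20 comments per day) could be enforced by a small wrapper around whatever performs each interaction. The following is a minimal sketch and not part of this commit: `paced_run`, its parameters, and the injectable `sleep` are hypothetical names, and the cap is applied per call rather than tracked across a calendar day.

```python
import random
import time
from typing import Callable, Iterable


def paced_run(
    actions: Iterable[Callable[[], None]],
    min_gap: float = 30.0,
    max_gap: float = 60.0,
    cap: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run actions one at a time with a random pause between them.

    Stops once `cap` actions have run and returns how many were executed.
    """
    done = 0
    for action in actions:
        if done >= cap:
            break
        if done > 0:
            # Pause only between actions, never before the first one.
            sleep(random.uniform(min_gap, max_gap))
        action()
        done += 1
    return done
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; each action would typically be a closure that invokes one `post-comment`, `like-feed`, or `favorite-feed` command.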
skills/xhs-explore/SKILL.md
0 → 100644
| 1 | +--- | ||
| 2 | +name: xhs-explore | ||
| 3 | +description: | | ||
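| 4 | + Xiaohongshu content discovery and analysis skill. Search notes, browse the home feed, view note details, and fetch user profiles. | ||
| 5 | + Triggers when the user asks to search Xiaohongshu, view a note's details, browse the home feed, or view a user's profile page. | ||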
| 6 | +--- | ||
| 7 | + | ||
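| 8 | +# Xiaohongshu Content Discovery | ||
| 9 | + | ||
| 10 | +You are the "Xiaohongshu content discovery assistant". Help the user search, browse, and analyze Xiaohongshu content. | ||
| 11 | + | ||
| 12 | +## Input Routing | ||
| 13 | + | ||
| 14 | +Evaluate in priority order: | ||
| 15 | + | ||
| 16 | +1. The user asks to "search notes / find content / search a keyword": run the search workflow. | ||
| 17 | +2. The user asks to "view note details / look at this post": run the detail workflow. | ||
| 18 | +3. The user asks for "home recommendations / browse the home feed": fetch the home feed. | ||
| 19 | +4. The user asks to "view a user's profile / check out this creator": fetch the user profile. | ||
| 20 | + | ||
| 21 | +## Hard Constraints | ||
| 22 | + | ||
| 23 | +- Every operation requires a Chrome browser that is already logged in. | ||
| 24 | +- `feed_id` and `xsec_token` must be used as a pair, taken from search results or the home feed. | ||
| 25 | +- Present results in a structured form that highlights the key fields. | ||
| 26 | +- CLI output is JSON. | ||
| 27 | + | ||
| 28 | +## Workflows | ||
| 29 | + | ||
| 30 | +### Home Feed List | ||
| 31 | + | ||
| 32 | +Fetch the Xiaohongshu home-page recommendations: | ||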
| 33 | + | ||
| 34 | +```bash | ||
| 35 | +python scripts/cli.py list-feeds | ||
| 36 | +``` | ||
| 37 | + | ||
| 38 | +The output JSON contains a `feeds` array and a `count`; each feed carries `id`, `xsec_token`, and `note_card` (title, cover, engagement data, and so on). | ||
| 39 | + | ||
| 40 | +### Search Notes | ||
| 41 | + | ||
| 42 | +```bash | ||
| 43 | +# Basic search | ||
| 44 | +python scripts/cli.py search-feeds --keyword "春招" | ||
| 45 | + | ||
| 46 | +# Search with filters | ||
| 47 | +python scripts/cli.py search-feeds \ | ||
| 48 | + --keyword "春招" \ | ||
| 49 | + --sort-by 最新 \ | ||
| 50 | + --note-type 图文 | ||
| 51 | + | ||
| 52 | +# All filters | ||
| 53 | +python scripts/cli.py search-feeds \ | ||
| 54 | + --keyword "春招" \ | ||
| 55 | + --sort-by 最多点赞 \ | ||
| 56 | + --note-type 图文 \ | ||
| 57 | + --publish-time 一周内 \ | ||
| 58 | + --search-scope 未看过 | ||
| 59 | +``` | ||
| 60 | + | ||
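| 61 | +#### Search Filter Parameters | ||
| 62 | + | ||
| 63 | +| Parameter | Allowed values (pass the Chinese literal) | | ||
| 64 | +|------|--------| | ||
| 65 | +| `--sort-by` | 综合 (general), 最新 (latest), 最多点赞 (most likes), 最多评论 (most comments), 最多收藏 (most favorites) | | ||
| 66 | +| `--note-type` | 不限 (any), 视频 (video), 图文 (image-text) | | ||
| 67 | +| `--publish-time` | 不限 (any), 一天内 (within a day), 一周内 (within a week), 半年内 (within half a year) | | ||
| 68 | +| `--search-scope` | 不限 (any), 已看过 (viewed), 未看过 (not viewed), 已关注 (following) | | ||
| 69 | +| `--location` | 不限 (any), 同城 (same city), 附近 (nearby) | | ||
| 70 | + | ||
| 71 | +#### Search Result Fields | ||
| 72 | + | ||
| 73 | +The output JSON contains: | ||
| 74 | +- `feeds`: the list of notes; each item carries `id`, `xsec_token`, and `note_card` (title, cover, author info, engagement data) | ||
| 75 | +- `count`: the number of results | ||
| 76 | + | ||
| 77 | +### Get Note Details | ||
| 78 | + | ||
| 79 | +Take `id` and `xsec_token` from search results or the home feed to fetch the full content: | ||
| 80 | + | ||
| 81 | +```bash | ||
| 82 | +# Basic details | ||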
| 83 | +python scripts/cli.py get-feed-detail \ | ||
| 84 | + --feed-id 67abc1234def567890123456 \ | ||
| 85 | + --xsec-token XSEC_TOKEN | ||
| 86 | + | ||
| 87 | +# Load all comments | ||
| 88 | +python scripts/cli.py get-feed-detail \ | ||
| 89 | + --feed-id 67abc1234def567890123456 \ | ||
| 90 | + --xsec-token XSEC_TOKEN \ | ||
| 91 | + --load-all-comments | ||
| 92 | + | ||
| 93 | +# Load all comments (expanding nested replies) | ||
| 94 | +python scripts/cli.py get-feed-detail \ | ||
| 95 | + --feed-id 67abc1234def567890123456 \ | ||
| 96 | + --xsec-token XSEC_TOKEN \ | ||
| 97 | + --load-all-comments \ | ||
| 98 | + --click-more-replies \ | ||
| 99 | + --max-replies-threshold 10 | ||
| 100 | + | ||
| 101 | +# Cap the number of comments | ||
| 102 | +python scripts/cli.py get-feed-detail \ | ||
| 103 | + --feed-id 67abc1234def567890123456 \ | ||
| 104 | + --xsec-token XSEC_TOKEN \ | ||
| 105 | + --load-all-comments \ | ||
| 106 | + --max-comment-items 50 | ||
| 107 | +``` | ||
| 108 | + | ||
| 109 | +The output includes the full note content, the image list, engagement data, and the comment list. | ||
| 110 | + | ||
| 111 | +### Get User Profile | ||
| 112 | + | ||
| 113 | +```bash | ||
| 114 | +python scripts/cli.py user-profile \ | ||
| 115 | + --user-id USER_ID \ | ||
| 116 | + --xsec-token XSEC_TOKEN | ||
| 117 | +``` | ||
| 118 | + | ||
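| 119 | +The output includes basic user info, follower / following counts, and the user's note list. | ||
| 120 | + | ||
| 121 | +## Presenting Results | ||
| 122 | + | ||
| 123 | +Present results to the user in the following form: | ||
| 124 | + | ||
| 125 | +1. **Note list**: show the title, author, and engagement data for each note. | ||
| 126 | +2. **Details**: the full note body, images, and comments. | ||
| 127 | +3. **User profile**: basic info plus a list of representative notes. | ||
| 128 | +4. **Data tables**: use markdown tables for key metrics. | ||
| 129 | + | ||
| 130 | +## Failure Handling | ||
| 131 | + | ||
| 132 | +- **Not logged in**: ask the user to log in first (see xhs-auth). | ||
| 133 | +- **No search results**: suggest different keywords or adjusted filters. | ||
| 134 | +- **Note not accessible**: it may be private or deleted; tell the user. | ||
| 135 | +- **User profile not accessible**: the user may have closed the account or restricted visibility. |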
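Because `feed_id` and `xsec_token` must always travel together, a caller will usually want to pull both out of the `search-feeds` or `list-feeds` output in one pass. The following is a minimal sketch under stated assumptions: `extract_feed_refs` is a hypothetical helper, and it assumes the CLI prints a single JSON object with `feeds` at the top level, as the skill text describes. The layout inside `note_card` is not specified here, so the sketch does not touch it.

```python
import json


def extract_feed_refs(cli_output: str) -> list[tuple[str, str]]:
    """Return (feed_id, xsec_token) pairs from search-feeds / list-feeds JSON."""
    data = json.loads(cli_output)
    refs: list[tuple[str, str]] = []
    for feed in data.get("feeds", []):
        feed_id = feed.get("id")
        token = feed.get("xsec_token")
        # Skip entries missing either half: the two values are only useful as a pair.
        if feed_id and token:
            refs.append((feed_id, token))
    return refs
```

Each returned pair can be passed straight to `get-feed-detail` as `--feed-id` and `--xsec-token`.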
skills/xhs-interact/SKILL.md
0 → 100644
| 1 | +--- | ||
| 2 | +name: xhs-interact | ||
| 3 | +description: | | ||
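| 4 | + Xiaohongshu social interaction skill. Post comments, reply to comments, like, and favorite. | ||
| 5 | + Triggers when the user asks to comment on, reply to, like, or favorite a Xiaohongshu post. | ||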
| 6 | +--- | ||
| 7 | + | ||
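| 8 | +# Xiaohongshu Social Interaction | ||
| 9 | + | ||
| 10 | +You are the "Xiaohongshu interaction assistant". Help the user interact socially on Xiaohongshu. | ||
| 11 | + | ||
| 12 | +## Input Routing | ||
| 13 | + | ||
| 14 | +Evaluate in priority order: | ||
| 15 | + | ||
| 16 | +1. The user asks to "post a comment / comment on this / write a comment": run the comment workflow. | ||
| 17 | +2. The user asks to "reply to a comment / reply to them": run the reply workflow. | ||
| 18 | +3. The user asks to "like / unlike": run the like workflow. | ||
| 19 | +4. The user asks to "favorite / unfavorite": run the favorite workflow. | ||
| 20 | + | ||
| 21 | +## Hard Constraints | ||
| 22 | + | ||
| 23 | +- **Comment and reply text must be confirmed by the user before it is sent**. | ||
| 24 | +- Every interaction needs `feed_id` and `xsec_token` (taken from search results or note details). | ||
| 25 | +- Comment text must not be empty. | ||
| 26 | +- Like and favorite operations are idempotent (repeating them does not raise an error). | ||
| 27 | +- CLI output is JSON. | ||
| 28 | + | ||
| 29 | +## Workflows | ||
| 30 | + | ||
| 31 | +### Post a Comment | ||
| 32 | + | ||
| 33 | +1. Make sure you have `feed_id` and `xsec_token` (if not, search or fetch details first). | ||
| 34 | +2. Confirm the comment text with the user. | ||
| 35 | +3. Send it. | ||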
| 36 | + | ||
| 37 | +```bash | ||
| 38 | +python scripts/cli.py post-comment \ | ||
| 39 | + --feed-id 67abc1234def567890123456 \ | ||
| 40 | + --xsec-token XSEC_TOKEN \ | ||
| 41 | + --content "写得很实用,感谢分享" | ||
| 42 | +``` | ||
| 43 | + | ||
| 44 | +### Reply to a Comment | ||
| 45 | + | ||
| 46 | +Reply to a specific comment or user: | ||
| 47 | + | ||
| 48 | +```bash | ||
| 49 | +# Reply to a specific comment (by comment ID) | ||
| 50 | +python scripts/cli.py reply-comment \ | ||
| 51 | + --feed-id 67abc1234def567890123456 \ | ||
| 52 | + --xsec-token XSEC_TOKEN \ | ||
| 53 | + --content "谢谢你的分享" \ | ||
| 54 | + --comment-id COMMENT_ID | ||
| 55 | + | ||
| 56 | +# Reply to a specific user (by user ID) | ||
| 57 | +python scripts/cli.py reply-comment \ | ||
| 58 | + --feed-id 67abc1234def567890123456 \ | ||
| 59 | + --xsec-token XSEC_TOKEN \ | ||
| 60 | + --content "谢谢你的分享" \ | ||
| 61 | + --user-id USER_ID | ||
| 62 | +``` | ||
| 63 | + | ||
| 64 | +### Like / Unlike | ||
| 65 | + | ||
| 66 | +```bash | ||
| 67 | +# Like | ||
| 68 | +python scripts/cli.py like-feed \ | ||
| 69 | + --feed-id 67abc1234def567890123456 \ | ||
| 70 | + --xsec-token XSEC_TOKEN | ||
| 71 | + | ||
| 72 | +# Unlike | ||
| 73 | +python scripts/cli.py like-feed \ | ||
| 74 | + --feed-id 67abc1234def567890123456 \ | ||
| 75 | + --xsec-token XSEC_TOKEN \ | ||
| 76 | + --unlike | ||
| 77 | +``` | ||
| 78 | + | ||
| 79 | +### Favorite / Unfavorite | ||
| 80 | + | ||
| 81 | +```bash | ||
| 82 | +# Favorite | ||
| 83 | +python scripts/cli.py favorite-feed \ | ||
| 84 | + --feed-id 67abc1234def567890123456 \ | ||
| 85 | + --xsec-token XSEC_TOKEN | ||
| 86 | + | ||
| 87 | +# Unfavorite | ||
| 88 | +python scripts/cli.py favorite-feed \ | ||
| 89 | + --feed-id 67abc1234def567890123456 \ | ||
| 90 | + --xsec-token XSEC_TOKEN \ | ||
| 91 | + --unfavorite | ||
| 92 | +``` | ||
| 93 | + | ||
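| 94 | +## Interaction Strategy | ||
| 95 | + | ||
| 96 | +When the user wants to interact in bulk, suggest: | ||
| 97 | + | ||
| 98 | +1. Search for target content first (xhs-explore). | ||
| 99 | +2. Browse the results and choose the notes to interact with. | ||
| 100 | +3. Fetch the details to confirm the content. | ||
| 101 | +4. Comment / like / favorite in a targeted way. | ||
| 102 | +5. Keep a reasonable gap between interactions so the frequency does not get too high. | ||
| 103 | + | ||
| 104 | +## Failure Handling | ||
| 105 | + | ||
| 106 | +- **Not logged in**: ask the user to log in first (see xhs-auth). | ||
| 107 | +- **Note not accessible**: the note may be private or deleted. | ||
| 108 | +- **Comment input box not found**: the page structure may have changed; suggest checking the selectors. | ||
| 109 | +- **Comment failed to send**: check whether the text contains sensitive words. | ||
| 110 | +- **Like / favorite failed**: retry once; if it still fails, report the error. |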
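The "retry once, then report" rule for like and favorite failures above is simple to express as a wrapper around the command invocation. The following is a minimal sketch, not code from this commit: `run_with_one_retry` is a hypothetical helper, and it assumes a non-zero exit code signals failure for `like-feed` and `favorite-feed`, which the skill text does not state explicitly. The retry is safe only because the skill describes these two operations as idempotent.

```python
import subprocess
from typing import Callable, Sequence


def run_with_one_retry(
    argv: Sequence[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Run argv; if the exit code is non-zero, run it exactly once more."""
    result = run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a non-zero exit code means the operation failed.
        result = run(argv, capture_output=True, text=True)
    return result
```

The caller inspects `returncode` and `stdout` of the returned result and reports the error to the user if the second attempt also failed. Comments are deliberately left out of this pattern, since resending a comment could post it twice.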
skills/xhs-publish/SKILL.md
0 → 100644
| 1 | +--- | ||
| 2 | +name: xhs-publish | ||
| 3 | +description: | | ||
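| 4 | + Xiaohongshu publishing skill. Supports image-text posts, video posts, scheduled publishing, tags, and visibility settings. | ||
| 5 | + Triggers when the user asks to publish content to Xiaohongshu or to upload an image-text post or a video. | ||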
| 6 | +--- | ||
| 7 | + | ||
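| 8 | +# Xiaohongshu Publishing | ||
| 9 | + | ||
| 10 | +You are the "Xiaohongshu publishing assistant". The goal is to call the scripts and publish the content once the user has confirmed it. | ||
| 11 | + | ||
| 12 | +## Input Routing | ||
| 13 | + | ||
| 14 | +Evaluate in priority order: | ||
| 15 | + | ||
| 16 | +1. The user has provided `title + body + video (local path)`: go straight to the video workflow. | ||
| 17 | +2. The user has provided `title + body + images (local paths or URLs)`: go straight to the image-text workflow. | ||
| 18 | +3. The user has provided only a web page URL: extract the content and images with WebFetch first, then present a publishable draft and wait for confirmation. | ||
| 19 | +4. Information is incomplete: collect what is missing first; do not publish directly. | ||
| 20 | + | ||
| 21 | +## Hard Constraints | ||
| 22 | + | ||
| 23 | +- **The user must confirm the final title, body, and images / video before publishing**. | ||
| 24 | +- An image-text post must not be published without images. | ||
| 25 | +- A video post must not be published without a video. Images and video cannot be mixed (choose one). | ||
| 26 | +- Title length must not exceed 20, counted in UTF-16 code units (a Chinese character counts as 1; an English letter, digit, or space counts as 1). | ||
| 27 | +- File paths must be absolute; relative paths are not allowed. | ||
| 28 | +- A running Chrome instance that is already logged in is required. | ||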
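| 30 | +## Workflow | ||
| 31 | + | ||
| 32 | +### Step 1: Prepare the Content | ||
| 33 | + | ||
| 34 | +#### Full-content mode | ||
| 35 | +Use the title and body supplied by the user as-is. | ||
| 36 | + | ||
| 37 | +#### URL extraction mode | ||
| 38 | +1. Fetch the page content with WebFetch. | ||
| 39 | +2. Extract the key information: title, body, image URLs. | ||
| 40 | +3. Summarize where appropriate, keeping the language natural and suited to how Xiaohongshu is read. | ||
| 41 | +4. If no images can be extracted, tell the user to obtain them manually. | ||
| 42 | + | ||
| 43 | +### Step 2: Content Checks | ||
| 44 | + | ||
| 45 | +#### Title check | ||
| 46 | +The title length must be ≤ 20 (UTF-16 length). If it is too long, automatically generate a new title that fits. | ||
| 47 | + | ||
| 48 | +#### Body format | ||
| 49 | +- Separate paragraphs with a blank line (two newlines). | ||
| 50 | +- Simplified Chinese, natural language. | ||
| 51 | +- Put topic tags on the last line of the body, in the form `#标签1 #标签2 #标签3` | ||
| 52 | + | ||
| 53 | +### Step 3: User Confirmation | ||
| 54 | + | ||
| 55 | +Use `AskUserQuestion` to show the content about to be published (title, body, images / video), and continue only after explicit confirmation. | ||
| 56 | + | ||
| 57 | +### Step 4: Write Temporary Files | ||
| 58 | + | ||
| 59 | +Write the title and body to UTF-8 text files. Do not inline Chinese text in command-line arguments. | ||
| 60 | + | ||
| 61 | +### Step 5: Publish | ||
| 62 | + | ||
| 63 | +#### Image-text post | ||
| 64 | + | ||
| 65 | +```bash | ||
| 66 | +# Publish directly with the CLI | ||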
| 67 | +python scripts/cli.py publish \ | ||
| 68 | + --title-file /tmp/xhs_title.txt \ | ||
| 69 | + --content-file /tmp/xhs_content.txt \ | ||
| 70 | + --images "/abs/path/pic1.jpg" "/abs/path/pic2.jpg" | ||
| 71 | + | ||
| 72 | +# With tags and scheduled publishing | ||
| 73 | +python scripts/cli.py publish \ | ||
| 74 | + --title-file /tmp/xhs_title.txt \ | ||
| 75 | + --content-file /tmp/xhs_content.txt \ | ||
| 76 | + --images "/abs/path/pic1.jpg" \ | ||
| 77 | + --tags "标签1" "标签2" \ | ||
| 78 | + --schedule-at "2026-03-10T12:00:00" \ | ||
| 79 | + --original | ||
| 80 | + | ||
| 81 | +# Use the publish pipeline (includes image download and login check) | ||
| 82 | +python scripts/publish_pipeline.py \ | ||
| 83 | + --title-file /tmp/xhs_title.txt \ | ||
| 84 | + --content-file /tmp/xhs_content.txt \ | ||
| 85 | + --images "https://example.com/pic1.jpg" "/abs/path/pic2.jpg" | ||
| 86 | +``` | ||
| 87 | + | ||
| 88 | +#### Video post | ||
| 89 | + | ||
| 90 | +```bash | ||
| 91 | +python scripts/cli.py publish-video \ | ||
| 92 | + --title-file /tmp/xhs_title.txt \ | ||
| 93 | + --content-file /tmp/xhs_content.txt \ | ||
| 94 | + --video "/abs/path/video.mp4" | ||
| 95 | + | ||
| 96 | +# With tags and visibility | ||
| 97 | +python scripts/cli.py publish-video \ | ||
| 98 | + --title-file /tmp/xhs_title.txt \ | ||
| 99 | + --content-file /tmp/xhs_content.txt \ | ||
| 100 | + --video "/abs/path/video.mp4" \ | ||
| 101 | + --tags "标签1" "标签2" \ | ||
| 102 | + --visibility "公开" | ||
| 103 | +``` | ||
| 104 | + | ||
| 105 | +#### Specific account / remote Chrome | ||
| 106 | + | ||
| 107 | +```bash | ||
| 108 | +# Specific account | ||
| 109 | +python scripts/cli.py --account work publish \ | ||
| 110 | + --title-file /tmp/xhs_title.txt \ | ||
| 111 | + --content-file /tmp/xhs_content.txt \ | ||
| 112 | + --images "/abs/path/pic1.jpg" | ||
| 113 | + | ||
| 114 | +# Remote Chrome | ||
| 115 | +python scripts/cli.py --host 10.0.0.12 --port 9222 publish \ | ||
| 116 | + --title-file /tmp/xhs_title.txt \ | ||
| 117 | + --content-file /tmp/xhs_content.txt \ | ||
| 118 | + --images "/abs/path/pic1.jpg" | ||
| 119 | +``` | ||
| 120 | + | ||
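| 121 | +### Step 6: Handle the Output | ||
| 122 | + | ||
| 123 | +- **Exit code 0**: published successfully. The output JSON contains `success`, `title`, `images`/`video`, `status`. | ||
| 124 | +- **Exit code 1**: not logged in; ask the user to log in first (see xhs-auth). | ||
| 125 | +- **Exit code 2**: error; report the `error` field of the JSON. | ||
| 126 | + | ||
| 127 | +### Step 7: Report the Result | ||
| 128 | + | ||
| 129 | +Tell the user whether publishing succeeded, based on the output. | ||
| 130 | + | ||
| 131 | +## Common Parameters | ||
| 132 | + | ||
| 133 | +| Parameter | Description | | ||
| 134 | +|------|------| | ||
| 135 | +| `--title-file path` | Title file path (required) | | ||
| 136 | +| `--content-file path` | Body file path (required) | | ||
| 137 | +| `--images path1 path2` | List of image paths / URLs (required for image-text posts) | | ||
| 138 | +| `--video path` | Video file path (required for video posts) | | ||
| 139 | +| `--tags tag1 tag2` | List of topic tags | | ||
| 140 | +| `--schedule-at ISO8601` | Scheduled publish time | | ||
| 141 | +| `--original` | Declare the post as original | | ||
| 142 | +| `--visibility` | Visibility scope | | ||
| 143 | +| `--host HOST` | Remote CDP host | | ||
| 144 | +| `--port PORT` | CDP port (default 9222) | | ||
| 145 | +| `--account name` | Account to use | | ||
| 146 | + | ||
| 147 | +## Failure Handling | ||
| 148 | + | ||
| 149 | +- **Login failed**: ask the user to scan the QR code again and retry. | ||
| 150 | +- **Image download failed**: suggest a different image URL or a local image. | ||
| 151 | +- **Video processing timed out**: after upload the video needs processing time (up to 10 minutes); on timeout, suggest retrying. | ||
| 152 | +- **Title too long**: shorten the title automatically while preserving its meaning. | ||
| 153 | +- **Page selectors stopped working**: suggest checking the selector definitions in the scripts. |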
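The title limit in the publish skill above is stated in UTF-16 length, which differs from Python's `len()` only for characters outside the Basic Multilingual Plane (most emoji): those take two UTF-16 code units each. The following is a minimal sketch of that calculation. It is an illustration, not the repository's `title_utils.py`; `utf16_length`, `title_fits`, and `MAX_TITLE_UNITS` are hypothetical names.

```python
MAX_TITLE_UNITS = 20  # limit stated in the skill text


def utf16_length(text: str) -> int:
    """Count UTF-16 code units: UTF-16-LE uses 2 bytes per unit and adds no BOM."""
    return len(text.encode("utf-16-le")) // 2


def title_fits(title: str) -> bool:
    """True when the title is within the 20-unit limit."""
    return utf16_length(title) <= MAX_TITLE_UNITS
```

Under this counting, a Chinese character and an ASCII letter each cost 1, so a title of 20 Chinese characters fits, while a title of 19 characters plus one emoji does not.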
uv.lock
0 → 100644
| 1 | +version = 1 | ||
| 2 | +revision = 1 | ||
| 3 | +requires-python = ">=3.11" | ||
| 4 | + | ||
| 5 | +[[package]] | ||
| 6 | +name = "certifi" | ||
| 7 | +version = "2026.2.25" | ||
| 8 | +source = { registry = "https://pypi.org/simple" } | ||
| 9 | +sdist = { url = "https://files.pythonhosted.org/packages/af/2d/7bf41579a8986e348fa033a31cdd0e4121114f6bce2457e8876010b092dd/certifi-2026.2.25.tar.gz", hash = "sha256:e887ab5cee78ea814d3472169153c2d12cd43b14bd03329a39a9c6e2e80bfba7", size = 155029 } | ||
| 10 | +wheels = [ | ||
| 11 | + { url = "https://files.pythonhosted.org/packages/9a/3c/c17fb3ca2d9c3acff52e30b309f538586f9f5b9c9cf454f3845fc9af4881/certifi-2026.2.25-py3-none-any.whl", hash = "sha256:027692e4402ad994f1c42e52a4997a9763c646b73e4096e4d5d6db8af1d6f0fa", size = 153684 }, | ||
| 12 | +] | ||
| 13 | + | ||
| 14 | +[[package]] | ||
| 15 | +name = "charset-normalizer" | ||
| 16 | +version = "3.4.4" | ||
| 17 | +source = { registry = "https://pypi.org/simple" } | ||
| 18 | +sdist = { url = "https://files.pythonhosted.org/packages/13/69/33ddede1939fdd074bce5434295f38fae7136463422fe4fd3e0e89b98062/charset_normalizer-3.4.4.tar.gz", hash = "sha256:94537985111c35f28720e43603b8e7b43a6ecfb2ce1d3058bbe955b73404e21a", size = 129418 } | ||
| 19 | +wheels = [ | ||
| 20 | + { url = "https://files.pythonhosted.org/packages/ed/27/c6491ff4954e58a10f69ad90aca8a1b6fe9c5d3c6f380907af3c37435b59/charset_normalizer-3.4.4-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:6e1fcf0720908f200cd21aa4e6750a48ff6ce4afe7ff5a79a90d5ed8a08296f8", size = 206988 }, | ||
| 21 | + { url = "https://files.pythonhosted.org/packages/94/59/2e87300fe67ab820b5428580a53cad894272dbb97f38a7a814a2a1ac1011/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5f819d5fe9234f9f82d75bdfa9aef3a3d72c4d24a6e57aeaebba32a704553aa0", size = 147324 }, | ||
| 22 | + { url = "https://files.pythonhosted.org/packages/07/fb/0cf61dc84b2b088391830f6274cb57c82e4da8bbc2efeac8c025edb88772/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:a59cb51917aa591b1c4e6a43c132f0cdc3c76dbad6155df4e28ee626cc77a0a3", size = 142742 }, | ||
| 23 | + { url = "https://files.pythonhosted.org/packages/62/8b/171935adf2312cd745d290ed93cf16cf0dfe320863ab7cbeeae1dcd6535f/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:8ef3c867360f88ac904fd3f5e1f902f13307af9052646963ee08ff4f131adafc", size = 160863 }, | ||
| 24 | + { url = "https://files.pythonhosted.org/packages/09/73/ad875b192bda14f2173bfc1bc9a55e009808484a4b256748d931b6948442/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:d9e45d7faa48ee908174d8fe84854479ef838fc6a705c9315372eacbc2f02897", size = 157837 }, | ||
| 25 | + { url = "https://files.pythonhosted.org/packages/6d/fc/de9cce525b2c5b94b47c70a4b4fb19f871b24995c728e957ee68ab1671ea/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:840c25fb618a231545cbab0564a799f101b63b9901f2569faecd6b222ac72381", size = 151550 }, | ||
| 26 | + { url = "https://files.pythonhosted.org/packages/55/c2/43edd615fdfba8c6f2dfbd459b25a6b3b551f24ea21981e23fb768503ce1/charset_normalizer-3.4.4-cp311-cp311-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:ca5862d5b3928c4940729dacc329aa9102900382fea192fc5e52eb69d6093815", size = 149162 }, | ||
| 27 | + { url = "https://files.pythonhosted.org/packages/03/86/bde4ad8b4d0e9429a4e82c1e8f5c659993a9a863ad62c7df05cf7b678d75/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d9c7f57c3d666a53421049053eaacdd14bbd0a528e2186fcb2e672effd053bb0", size = 150019 }, | ||
| 28 | + { url = "https://files.pythonhosted.org/packages/1f/86/a151eb2af293a7e7bac3a739b81072585ce36ccfb4493039f49f1d3cae8c/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:277e970e750505ed74c832b4bf75dac7476262ee2a013f5574dd49075879e161", size = 143310 }, | ||
| 29 | + { url = "https://files.pythonhosted.org/packages/b5/fe/43dae6144a7e07b87478fdfc4dbe9efd5defb0e7ec29f5f58a55aeef7bf7/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:31fd66405eaf47bb62e8cd575dc621c56c668f27d46a61d975a249930dd5e2a4", size = 162022 }, | ||
| 30 | + { url = "https://files.pythonhosted.org/packages/80/e6/7aab83774f5d2bca81f42ac58d04caf44f0cc2b65fc6db2b3b2e8a05f3b3/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:0d3d8f15c07f86e9ff82319b3d9ef6f4bf907608f53fe9d92b28ea9ae3d1fd89", size = 149383 }, | ||
| 31 | + { url = "https://files.pythonhosted.org/packages/4f/e8/b289173b4edae05c0dde07f69f8db476a0b511eac556dfe0d6bda3c43384/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:9f7fcd74d410a36883701fafa2482a6af2ff5ba96b9a620e9e0721e28ead5569", size = 159098 }, | ||
| 32 | + { url = "https://files.pythonhosted.org/packages/d8/df/fe699727754cae3f8478493c7f45f777b17c3ef0600e28abfec8619eb49c/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ebf3e58c7ec8a8bed6d66a75d7fb37b55e5015b03ceae72a8e7c74495551e224", size = 152991 }, | ||
| 33 | + { url = "https://files.pythonhosted.org/packages/1a/86/584869fe4ddb6ffa3bd9f491b87a01568797fb9bd8933f557dba9771beaf/charset_normalizer-3.4.4-cp311-cp311-win32.whl", hash = "sha256:eecbc200c7fd5ddb9a7f16c7decb07b566c29fa2161a16cf67b8d068bd21690a", size = 99456 }, | ||
| 34 | + { url = "https://files.pythonhosted.org/packages/65/f6/62fdd5feb60530f50f7e38b4f6a1d5203f4d16ff4f9f0952962c044e919a/charset_normalizer-3.4.4-cp311-cp311-win_amd64.whl", hash = "sha256:5ae497466c7901d54b639cf42d5b8c1b6a4fead55215500d2f486d34db48d016", size = 106978 }, | ||
| 35 | + { url = "https://files.pythonhosted.org/packages/7a/9d/0710916e6c82948b3be62d9d398cb4fcf4e97b56d6a6aeccd66c4b2f2bd5/charset_normalizer-3.4.4-cp311-cp311-win_arm64.whl", hash = "sha256:65e2befcd84bc6f37095f5961e68a6f077bf44946771354a28ad434c2cce0ae1", size = 99969 }, | ||
| 36 | + { url = "https://files.pythonhosted.org/packages/f3/85/1637cd4af66fa687396e757dec650f28025f2a2f5a5531a3208dc0ec43f2/charset_normalizer-3.4.4-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:0a98e6759f854bd25a58a73fa88833fba3b7c491169f86ce1180c948ab3fd394", size = 208425 }, | ||
| 37 | + { url = "https://files.pythonhosted.org/packages/9d/6a/04130023fef2a0d9c62d0bae2649b69f7b7d8d24ea5536feef50551029df/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b5b290ccc2a263e8d185130284f8501e3e36c5e02750fc6b6bdeb2e9e96f1e25", size = 148162 }, | ||
| 38 | + { url = "https://files.pythonhosted.org/packages/78/29/62328d79aa60da22c9e0b9a66539feae06ca0f5a4171ac4f7dc285b83688/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:74bb723680f9f7a6234dcf67aea57e708ec1fbdf5699fb91dfd6f511b0a320ef", size = 144558 }, | ||
| 39 | + { url = "https://files.pythonhosted.org/packages/86/bb/b32194a4bf15b88403537c2e120b817c61cd4ecffa9b6876e941c3ee38fe/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:f1e34719c6ed0b92f418c7c780480b26b5d9c50349e9a9af7d76bf757530350d", size = 161497 }, | ||
| 40 | + { url = "https://files.pythonhosted.org/packages/19/89/a54c82b253d5b9b111dc74aca196ba5ccfcca8242d0fb64146d4d3183ff1/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:2437418e20515acec67d86e12bf70056a33abdacb5cb1655042f6538d6b085a8", size = 159240 }, | ||
| 41 | + { url = "https://files.pythonhosted.org/packages/c0/10/d20b513afe03acc89ec33948320a5544d31f21b05368436d580dec4e234d/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:11d694519d7f29d6cd09f6ac70028dba10f92f6cdd059096db198c283794ac86", size = 153471 }, | ||
| 42 | + { url = "https://files.pythonhosted.org/packages/61/fa/fbf177b55bdd727010f9c0a3c49eefa1d10f960e5f09d1d887bf93c2e698/charset_normalizer-3.4.4-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:ac1c4a689edcc530fc9d9aa11f5774b9e2f33f9a0c6a57864e90908f5208d30a", size = 150864 }, | ||
| 43 | + { url = "https://files.pythonhosted.org/packages/05/12/9fbc6a4d39c0198adeebbde20b619790e9236557ca59fc40e0e3cebe6f40/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:21d142cc6c0ec30d2efee5068ca36c128a30b0f2c53c1c07bd78cb6bc1d3be5f", size = 150647 }, | ||
| 44 | + { url = "https://files.pythonhosted.org/packages/ad/1f/6a9a593d52e3e8c5d2b167daf8c6b968808efb57ef4c210acb907c365bc4/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:5dbe56a36425d26d6cfb40ce79c314a2e4dd6211d51d6d2191c00bed34f354cc", size = 145110 }, | ||
| 45 | + { url = "https://files.pythonhosted.org/packages/30/42/9a52c609e72471b0fc54386dc63c3781a387bb4fe61c20231a4ebcd58bdd/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:5bfbb1b9acf3334612667b61bd3002196fe2a1eb4dd74d247e0f2a4d50ec9bbf", size = 162839 }, | ||
| 46 | + { url = "https://files.pythonhosted.org/packages/c4/5b/c0682bbf9f11597073052628ddd38344a3d673fda35a36773f7d19344b23/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:d055ec1e26e441f6187acf818b73564e6e6282709e9bcb5b63f5b23068356a15", size = 150667 }, | ||
| 47 | + { url = "https://files.pythonhosted.org/packages/e4/24/a41afeab6f990cf2daf6cb8c67419b63b48cf518e4f56022230840c9bfb2/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:af2d8c67d8e573d6de5bc30cdb27e9b95e49115cd9baad5ddbd1a6207aaa82a9", size = 160535 }, | ||
| 48 | + { url = "https://files.pythonhosted.org/packages/2a/e5/6a4ce77ed243c4a50a1fecca6aaaab419628c818a49434be428fe24c9957/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:780236ac706e66881f3b7f2f32dfe90507a09e67d1d454c762cf642e6e1586e0", size = 154816 }, | ||
| 49 | + { url = "https://files.pythonhosted.org/packages/a8/ef/89297262b8092b312d29cdb2517cb1237e51db8ecef2e9af5edbe7b683b1/charset_normalizer-3.4.4-cp312-cp312-win32.whl", hash = "sha256:5833d2c39d8896e4e19b689ffc198f08ea58116bee26dea51e362ecc7cd3ed26", size = 99694 }, | ||
| 50 | + { url = "https://files.pythonhosted.org/packages/3d/2d/1e5ed9dd3b3803994c155cd9aacb60c82c331bad84daf75bcb9c91b3295e/charset_normalizer-3.4.4-cp312-cp312-win_amd64.whl", hash = "sha256:a79cfe37875f822425b89a82333404539ae63dbdddf97f84dcbc3d339aae9525", size = 107131 }, | ||
| 51 | + { url = "https://files.pythonhosted.org/packages/d0/d9/0ed4c7098a861482a7b6a95603edce4c0d9db2311af23da1fb2b75ec26fc/charset_normalizer-3.4.4-cp312-cp312-win_arm64.whl", hash = "sha256:376bec83a63b8021bb5c8ea75e21c4ccb86e7e45ca4eb81146091b56599b80c3", size = 100390 }, | ||
| 52 | + { url = "https://files.pythonhosted.org/packages/97/45/4b3a1239bbacd321068ea6e7ac28875b03ab8bc0aa0966452db17cd36714/charset_normalizer-3.4.4-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:e1f185f86a6f3403aa2420e815904c67b2f9ebc443f045edd0de921108345794", size = 208091 }, | ||
| 53 | + { url = "https://files.pythonhosted.org/packages/7d/62/73a6d7450829655a35bb88a88fca7d736f9882a27eacdca2c6d505b57e2e/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6b39f987ae8ccdf0d2642338faf2abb1862340facc796048b604ef14919e55ed", size = 147936 }, | ||
| 54 | + { url = "https://files.pythonhosted.org/packages/89/c5/adb8c8b3d6625bef6d88b251bbb0d95f8205831b987631ab0c8bb5d937c2/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:3162d5d8ce1bb98dd51af660f2121c55d0fa541b46dff7bb9b9f86ea1d87de72", size = 144180 }, | ||
| 55 | + { url = "https://files.pythonhosted.org/packages/91/ed/9706e4070682d1cc219050b6048bfd293ccf67b3d4f5a4f39207453d4b99/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:81d5eb2a312700f4ecaa977a8235b634ce853200e828fbadf3a9c50bab278328", size = 161346 }, | ||
| 56 | + { url = "https://files.pythonhosted.org/packages/d5/0d/031f0d95e4972901a2f6f09ef055751805ff541511dc1252ba3ca1f80cf5/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:5bd2293095d766545ec1a8f612559f6b40abc0eb18bb2f5d1171872d34036ede", size = 158874 }, | ||
| 57 | + { url = "https://files.pythonhosted.org/packages/f5/83/6ab5883f57c9c801ce5e5677242328aa45592be8a00644310a008d04f922/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a8a8b89589086a25749f471e6a900d3f662d1d3b6e2e59dcecf787b1cc3a1894", size = 153076 }, | ||
| 58 | + { url = "https://files.pythonhosted.org/packages/75/1e/5ff781ddf5260e387d6419959ee89ef13878229732732ee73cdae01800f2/charset_normalizer-3.4.4-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:bc7637e2f80d8530ee4a78e878bce464f70087ce73cf7c1caf142416923b98f1", size = 150601 }, | ||
| 59 | + { url = "https://files.pythonhosted.org/packages/d7/57/71be810965493d3510a6ca79b90c19e48696fb1ff964da319334b12677f0/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:f8bf04158c6b607d747e93949aa60618b61312fe647a6369f88ce2ff16043490", size = 150376 }, | ||
| 60 | + { url = "https://files.pythonhosted.org/packages/e5/d5/c3d057a78c181d007014feb7e9f2e65905a6c4ef182c0ddf0de2924edd65/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:554af85e960429cf30784dd47447d5125aaa3b99a6f0683589dbd27e2f45da44", size = 144825 }, | ||
| 61 | + { url = "https://files.pythonhosted.org/packages/e6/8c/d0406294828d4976f275ffbe66f00266c4b3136b7506941d87c00cab5272/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:74018750915ee7ad843a774364e13a3db91682f26142baddf775342c3f5b1133", size = 162583 }, | ||
| 62 | + { url = "https://files.pythonhosted.org/packages/d7/24/e2aa1f18c8f15c4c0e932d9287b8609dd30ad56dbe41d926bd846e22fb8d/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:c0463276121fdee9c49b98908b3a89c39be45d86d1dbaa22957e38f6321d4ce3", size = 150366 }, | ||
| 63 | + { url = "https://files.pythonhosted.org/packages/e4/5b/1e6160c7739aad1e2df054300cc618b06bf784a7a164b0f238360721ab86/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:362d61fd13843997c1c446760ef36f240cf81d3ebf74ac62652aebaf7838561e", size = 160300 }, | ||
| 64 | + { url = "https://files.pythonhosted.org/packages/7a/10/f882167cd207fbdd743e55534d5d9620e095089d176d55cb22d5322f2afd/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:9a26f18905b8dd5d685d6d07b0cdf98a79f3c7a918906af7cc143ea2e164c8bc", size = 154465 }, | ||
| 65 | + { url = "https://files.pythonhosted.org/packages/89/66/c7a9e1b7429be72123441bfdbaf2bc13faab3f90b933f664db506dea5915/charset_normalizer-3.4.4-cp313-cp313-win32.whl", hash = "sha256:9b35f4c90079ff2e2edc5b26c0c77925e5d2d255c42c74fdb70fb49b172726ac", size = 99404 }, | ||
| 66 | + { url = "https://files.pythonhosted.org/packages/c4/26/b9924fa27db384bdcd97ab83b4f0a8058d96ad9626ead570674d5e737d90/charset_normalizer-3.4.4-cp313-cp313-win_amd64.whl", hash = "sha256:b435cba5f4f750aa6c0a0d92c541fb79f69a387c91e61f1795227e4ed9cece14", size = 107092 }, | ||
| 67 | + { url = "https://files.pythonhosted.org/packages/af/8f/3ed4bfa0c0c72a7ca17f0380cd9e4dd842b09f664e780c13cff1dcf2ef1b/charset_normalizer-3.4.4-cp313-cp313-win_arm64.whl", hash = "sha256:542d2cee80be6f80247095cc36c418f7bddd14f4a6de45af91dfad36d817bba2", size = 100408 }, | ||
| 68 | + { url = "https://files.pythonhosted.org/packages/2a/35/7051599bd493e62411d6ede36fd5af83a38f37c4767b92884df7301db25d/charset_normalizer-3.4.4-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:da3326d9e65ef63a817ecbcc0df6e94463713b754fe293eaa03da99befb9a5bd", size = 207746 }, | ||
| 69 | + { url = "https://files.pythonhosted.org/packages/10/9a/97c8d48ef10d6cd4fcead2415523221624bf58bcf68a802721a6bc807c8f/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8af65f14dc14a79b924524b1e7fffe304517b2bff5a58bf64f30b98bbc5079eb", size = 147889 }, | ||
| 70 | + { url = "https://files.pythonhosted.org/packages/10/bf/979224a919a1b606c82bd2c5fa49b5c6d5727aa47b4312bb27b1734f53cd/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:74664978bb272435107de04e36db5a9735e78232b85b77d45cfb38f758efd33e", size = 143641 }, | ||
| 71 | + { url = "https://files.pythonhosted.org/packages/ba/33/0ad65587441fc730dc7bd90e9716b30b4702dc7b617e6ba4997dc8651495/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:752944c7ffbfdd10c074dc58ec2d5a8a4cd9493b314d367c14d24c17684ddd14", size = 160779 }, | ||
| 72 | + { url = "https://files.pythonhosted.org/packages/67/ed/331d6b249259ee71ddea93f6f2f0a56cfebd46938bde6fcc6f7b9a3d0e09/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:d1f13550535ad8cff21b8d757a3257963e951d96e20ec82ab44bc64aeb62a191", size = 159035 }, | ||
| 73 | + { url = "https://files.pythonhosted.org/packages/67/ff/f6b948ca32e4f2a4576aa129d8bed61f2e0543bf9f5f2b7fc3758ed005c9/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ecaae4149d99b1c9e7b88bb03e3221956f68fd6d50be2ef061b2381b61d20838", size = 152542 }, | ||
| 74 | + { url = "https://files.pythonhosted.org/packages/16/85/276033dcbcc369eb176594de22728541a925b2632f9716428c851b149e83/charset_normalizer-3.4.4-cp314-cp314-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:cb6254dc36b47a990e59e1068afacdcd02958bdcce30bb50cc1700a8b9d624a6", size = 149524 }, | ||
| 75 | + { url = "https://files.pythonhosted.org/packages/9e/f2/6a2a1f722b6aba37050e626530a46a68f74e63683947a8acff92569f979a/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:c8ae8a0f02f57a6e61203a31428fa1d677cbe50c93622b4149d5c0f319c1d19e", size = 150395 }, | ||
| 76 | + { url = "https://files.pythonhosted.org/packages/60/bb/2186cb2f2bbaea6338cad15ce23a67f9b0672929744381e28b0592676824/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:47cc91b2f4dd2833fddaedd2893006b0106129d4b94fdb6af1f4ce5a9965577c", size = 143680 }, | ||
| 77 | + { url = "https://files.pythonhosted.org/packages/7d/a5/bf6f13b772fbb2a90360eb620d52ed8f796f3c5caee8398c3b2eb7b1c60d/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:82004af6c302b5d3ab2cfc4cc5f29db16123b1a8417f2e25f9066f91d4411090", size = 162045 }, | ||
| 78 | + { url = "https://files.pythonhosted.org/packages/df/c5/d1be898bf0dc3ef9030c3825e5d3b83f2c528d207d246cbabe245966808d/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:2b7d8f6c26245217bd2ad053761201e9f9680f8ce52f0fcd8d0755aeae5b2152", size = 149687 }, | ||
| 79 | + { url = "https://files.pythonhosted.org/packages/a5/42/90c1f7b9341eef50c8a1cb3f098ac43b0508413f33affd762855f67a410e/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:799a7a5e4fb2d5898c60b640fd4981d6a25f1c11790935a44ce38c54e985f828", size = 160014 }, | ||
| 80 | + { url = "https://files.pythonhosted.org/packages/76/be/4d3ee471e8145d12795ab655ece37baed0929462a86e72372fd25859047c/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:99ae2cffebb06e6c22bdc25801d7b30f503cc87dbd283479e7b606f70aff57ec", size = 154044 }, | ||
| 81 | + { url = "https://files.pythonhosted.org/packages/b0/6f/8f7af07237c34a1defe7defc565a9bc1807762f672c0fde711a4b22bf9c0/charset_normalizer-3.4.4-cp314-cp314-win32.whl", hash = "sha256:f9d332f8c2a2fcbffe1378594431458ddbef721c1769d78e2cbc06280d8155f9", size = 99940 }, | ||
| 82 | + { url = "https://files.pythonhosted.org/packages/4b/51/8ade005e5ca5b0d80fb4aff72a3775b325bdc3d27408c8113811a7cbe640/charset_normalizer-3.4.4-cp314-cp314-win_amd64.whl", hash = "sha256:8a6562c3700cce886c5be75ade4a5db4214fda19fede41d9792d100288d8f94c", size = 107104 }, | ||
| 83 | + { url = "https://files.pythonhosted.org/packages/da/5f/6b8f83a55bb8278772c5ae54a577f3099025f9ade59d0136ac24a0df4bde/charset_normalizer-3.4.4-cp314-cp314-win_arm64.whl", hash = "sha256:de00632ca48df9daf77a2c65a484531649261ec9f25489917f09e455cb09ddb2", size = 100743 }, | ||
| 84 | + { url = "https://files.pythonhosted.org/packages/0a/4c/925909008ed5a988ccbb72dcc897407e5d6d3bd72410d69e051fc0c14647/charset_normalizer-3.4.4-py3-none-any.whl", hash = "sha256:7a32c560861a02ff789ad905a2fe94e3f840803362c84fecf1851cb4cf3dc37f", size = 53402 }, | ||
| 85 | +] | ||
| 86 | + | ||
| 87 | +[[package]] | ||
| 88 | +name = "colorama" | ||
| 89 | +version = "0.4.6" | ||
| 90 | +source = { registry = "https://pypi.org/simple" } | ||
| 91 | +sdist = { url = "https://files.pythonhosted.org/packages/d8/53/6f443c9a4a8358a93a6792e2acffb9d9d5cb0a5cfd8802644b7b1c9a02e4/colorama-0.4.6.tar.gz", hash = "sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44", size = 27697 } | ||
| 92 | +wheels = [ | ||
| 93 | + { url = "https://files.pythonhosted.org/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6", size = 25335 }, | ||
| 94 | +] | ||
| 95 | + | ||
| 96 | +[[package]] | ||
| 97 | +name = "idna" | ||
| 98 | +version = "3.11" | ||
| 99 | +source = { registry = "https://pypi.org/simple" } | ||
| 100 | +sdist = { url = "https://files.pythonhosted.org/packages/6f/6d/0703ccc57f3a7233505399edb88de3cbd678da106337b9fcde432b65ed60/idna-3.11.tar.gz", hash = "sha256:795dafcc9c04ed0c1fb032c2aa73654d8e8c5023a7df64a53f39190ada629902", size = 194582 } | ||
| 101 | +wheels = [ | ||
| 102 | + { url = "https://files.pythonhosted.org/packages/0e/61/66938bbb5fc52dbdf84594873d5b51fb1f7c7794e9c0f5bd885f30bc507b/idna-3.11-py3-none-any.whl", hash = "sha256:771a87f49d9defaf64091e6e6fe9c18d4833f140bd19464795bc32d966ca37ea", size = 71008 }, | ||
| 103 | +] | ||
| 104 | + | ||
| 105 | +[[package]] | ||
| 106 | +name = "iniconfig" | ||
| 107 | +version = "2.3.0" | ||
| 108 | +source = { registry = "https://pypi.org/simple" } | ||
| 109 | +sdist = { url = "https://files.pythonhosted.org/packages/72/34/14ca021ce8e5dfedc35312d08ba8bf51fdd999c576889fc2c24cb97f4f10/iniconfig-2.3.0.tar.gz", hash = "sha256:c76315c77db068650d49c5b56314774a7804df16fee4402c1f19d6d15d8c4730", size = 20503 } | ||
| 110 | +wheels = [ | ||
| 111 | + { url = "https://files.pythonhosted.org/packages/cb/b1/3846dd7f199d53cb17f49cba7e651e9ce294d8497c8c150530ed11865bb8/iniconfig-2.3.0-py3-none-any.whl", hash = "sha256:f631c04d2c48c52b84d0d0549c99ff3859c98df65b3101406327ecc7d53fbf12", size = 7484 }, | ||
| 112 | +] | ||
| 113 | + | ||
| 114 | +[[package]] | ||
| 115 | +name = "packaging" | ||
| 116 | +version = "26.0" | ||
| 117 | +source = { registry = "https://pypi.org/simple" } | ||
| 118 | +sdist = { url = "https://files.pythonhosted.org/packages/65/ee/299d360cdc32edc7d2cf530f3accf79c4fca01e96ffc950d8a52213bd8e4/packaging-26.0.tar.gz", hash = "sha256:00243ae351a257117b6a241061796684b084ed1c516a08c48a3f7e147a9d80b4", size = 143416 } | ||
| 119 | +wheels = [ | ||
| 120 | + { url = "https://files.pythonhosted.org/packages/b7/b9/c538f279a4e237a006a2c98387d081e9eb060d203d8ed34467cc0f0b9b53/packaging-26.0-py3-none-any.whl", hash = "sha256:b36f1fef9334a5588b4166f8bcd26a14e521f2b55e6b9de3aaa80d3ff7a37529", size = 74366 }, | ||
| 121 | +] | ||
| 122 | + | ||
| 123 | +[[package]] | ||
| 124 | +name = "pluggy" | ||
| 125 | +version = "1.6.0" | ||
| 126 | +source = { registry = "https://pypi.org/simple" } | ||
| 127 | +sdist = { url = "https://files.pythonhosted.org/packages/f9/e2/3e91f31a7d2b083fe6ef3fa267035b518369d9511ffab804f839851d2779/pluggy-1.6.0.tar.gz", hash = "sha256:7dcc130b76258d33b90f61b658791dede3486c3e6bfb003ee5c9bfb396dd22f3", size = 69412 } | ||
| 128 | +wheels = [ | ||
| 129 | + { url = "https://files.pythonhosted.org/packages/54/20/4d324d65cc6d9205fabedc306948156824eb9f0ee1633355a8f7ec5c66bf/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538 }, | ||
| 130 | +] | ||
| 131 | + | ||
| 132 | +[[package]] | ||
| 133 | +name = "pygments" | ||
| 134 | +version = "2.19.2" | ||
| 135 | +source = { registry = "https://pypi.org/simple" } | ||
| 136 | +sdist = { url = "https://files.pythonhosted.org/packages/b0/77/a5b8c569bf593b0140bde72ea885a803b82086995367bf2037de0159d924/pygments-2.19.2.tar.gz", hash = "sha256:636cb2477cec7f8952536970bc533bc43743542f70392ae026374600add5b887", size = 4968631 } | ||
| 137 | +wheels = [ | ||
| 138 | + { url = "https://files.pythonhosted.org/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = "sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217 }, | ||
| 139 | +] | ||
| 140 | + | ||
| 141 | +[[package]] | ||
| 142 | +name = "pytest" | ||
| 143 | +version = "9.0.2" | ||
| 144 | +source = { registry = "https://pypi.org/simple" } | ||
| 145 | +dependencies = [ | ||
| 146 | + { name = "colorama", marker = "sys_platform == 'win32'" }, | ||
| 147 | + { name = "iniconfig" }, | ||
| 148 | + { name = "packaging" }, | ||
| 149 | + { name = "pluggy" }, | ||
| 150 | + { name = "pygments" }, | ||
| 151 | +] | ||
| 152 | +sdist = { url = "https://files.pythonhosted.org/packages/d1/db/7ef3487e0fb0049ddb5ce41d3a49c235bf9ad299b6a25d5780a89f19230f/pytest-9.0.2.tar.gz", hash = "sha256:75186651a92bd89611d1d9fc20f0b4345fd827c41ccd5c299a868a05d70edf11", size = 1568901 } | ||
| 153 | +wheels = [ | ||
| 154 | + { url = "https://files.pythonhosted.org/packages/3b/ab/b3226f0bd7cdcf710fbede2b3548584366da3b19b5021e74f5bde2a8fa3f/pytest-9.0.2-py3-none-any.whl", hash = "sha256:711ffd45bf766d5264d487b917733b453d917afd2b0ad65223959f59089f875b", size = 374801 }, | ||
| 155 | +] | ||
| 156 | + | ||
| 157 | +[[package]] | ||
| 158 | +name = "requests" | ||
| 159 | +version = "2.32.5" | ||
| 160 | +source = { registry = "https://pypi.org/simple" } | ||
| 161 | +dependencies = [ | ||
| 162 | + { name = "certifi" }, | ||
| 163 | + { name = "charset-normalizer" }, | ||
| 164 | + { name = "idna" }, | ||
| 165 | + { name = "urllib3" }, | ||
| 166 | +] | ||
| 167 | +sdist = { url = "https://files.pythonhosted.org/packages/c9/74/b3ff8e6c8446842c3f5c837e9c3dfcfe2018ea6ecef224c710c85ef728f4/requests-2.32.5.tar.gz", hash = "sha256:dbba0bac56e100853db0ea71b82b4dfd5fe2bf6d3754a8893c3af500cec7d7cf", size = 134517 } | ||
| 168 | +wheels = [ | ||
| 169 | + { url = "https://files.pythonhosted.org/packages/1e/db/4254e3eabe8020b458f1a747140d32277ec7a271daf1d235b70dc0b4e6e3/requests-2.32.5-py3-none-any.whl", hash = "sha256:2462f94637a34fd532264295e186976db0f5d453d1cdd31473c85a6a161affb6", size = 64738 }, | ||
| 170 | +] | ||
| 171 | + | ||
| 172 | +[[package]] | ||
| 173 | +name = "ruff" | ||
| 174 | +version = "0.15.4" | ||
| 175 | +source = { registry = "https://pypi.org/simple" } | ||
| 176 | +sdist = { url = "https://files.pythonhosted.org/packages/da/31/d6e536cdebb6568ae75a7f00e4b4819ae0ad2640c3604c305a0428680b0c/ruff-0.15.4.tar.gz", hash = "sha256:3412195319e42d634470cc97aa9803d07e9d5c9223b99bcb1518f0c725f26ae1", size = 4569550 } | ||
| 177 | +wheels = [ | ||
| 178 | + { url = "https://files.pythonhosted.org/packages/f2/82/c11a03cfec3a4d26a0ea1e571f0f44be5993b923f905eeddfc397c13d360/ruff-0.15.4-py3-none-linux_armv6l.whl", hash = "sha256:a1810931c41606c686bae8b5b9a8072adac2f611bb433c0ba476acba17a332e0", size = 10453333 }, | ||
| 179 | + { url = "https://files.pythonhosted.org/packages/ce/5d/6a1f271f6e31dffb31855996493641edc3eef8077b883eaf007a2f1c2976/ruff-0.15.4-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:5a1632c66672b8b4d3e1d1782859e98d6e0b4e70829530666644286600a33992", size = 10853356 }, | ||
| 180 | + { url = "https://files.pythonhosted.org/packages/b1/d8/0fab9f8842b83b1a9c2bf81b85063f65e93fb512e60effa95b0be49bfc54/ruff-0.15.4-py3-none-macosx_11_0_arm64.whl", hash = "sha256:a4386ba2cd6c0f4ff75252845906acc7c7c8e1ac567b7bc3d373686ac8c222ba", size = 10187434 }, | ||
| 181 | + { url = "https://files.pythonhosted.org/packages/85/cc/cc220fd9394eff5db8d94dec199eec56dd6c9f3651d8869d024867a91030/ruff-0.15.4-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b2496488bdfd3732747558b6f95ae427ff066d1fcd054daf75f5a50674411e75", size = 10535456 }, | ||
| 182 | + { url = "https://files.pythonhosted.org/packages/fa/0f/bced38fa5cf24373ec767713c8e4cadc90247f3863605fb030e597878661/ruff-0.15.4-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:3f1c4893841ff2d54cbda1b2860fa3260173df5ddd7b95d370186f8a5e66a4ac", size = 10287772 }, | ||
| 183 | + { url = "https://files.pythonhosted.org/packages/2b/90/58a1802d84fed15f8f281925b21ab3cecd813bde52a8ca033a4de8ab0e7a/ruff-0.15.4-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:820b8766bd65503b6c30aaa6331e8ef3a6e564f7999c844e9a547c40179e440a", size = 11049051 }, | ||
| 184 | + { url = "https://files.pythonhosted.org/packages/d2/ac/b7ad36703c35f3866584564dc15f12f91cb1a26a897dc2fd13d7cb3ae1af/ruff-0.15.4-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c9fb74bab47139c1751f900f857fa503987253c3ef89129b24ed375e72873e85", size = 11890494 }, | ||
| 185 | + { url = "https://files.pythonhosted.org/packages/93/3d/3eb2f47a39a8b0da99faf9c54d3eb24720add1e886a5309d4d1be73a6380/ruff-0.15.4-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f80c98765949c518142b3a50a5db89343aa90f2c2bf7799de9986498ae6176db", size = 11326221 }, | ||
| 186 | + { url = "https://files.pythonhosted.org/packages/ff/90/bf134f4c1e5243e62690e09d63c55df948a74084c8ac3e48a88468314da6/ruff-0.15.4-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:451a2e224151729b3b6c9ffb36aed9091b2996fe4bdbd11f47e27d8f2e8888ec", size = 11168459 }, | ||
| 187 | + { url = "https://files.pythonhosted.org/packages/b5/e5/a64d27688789b06b5d55162aafc32059bb8c989c61a5139a36e1368285eb/ruff-0.15.4-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:a8f157f2e583c513c4f5f896163a93198297371f34c04220daf40d133fdd4f7f", size = 11104366 }, | ||
| 188 | + { url = "https://files.pythonhosted.org/packages/f1/f6/32d1dcb66a2559763fc3027bdd65836cad9eb09d90f2ed6a63d8e9252b02/ruff-0.15.4-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:917cc68503357021f541e69b35361c99387cdbbf99bd0ea4aa6f28ca99ff5338", size = 10510887 }, | ||
| 189 | + { url = "https://files.pythonhosted.org/packages/ff/92/22d1ced50971c5b6433aed166fcef8c9343f567a94cf2b9d9089f6aa80fe/ruff-0.15.4-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:e9737c8161da79fd7cfec19f1e35620375bd8b2a50c3e77fa3d2c16f574105cc", size = 10285939 }, | ||
| 190 | + { url = "https://files.pythonhosted.org/packages/e6/f4/7c20aec3143837641a02509a4668fb146a642fd1211846634edc17eb5563/ruff-0.15.4-py3-none-musllinux_1_2_i686.whl", hash = "sha256:291258c917539e18f6ba40482fe31d6f5ac023994ee11d7bdafd716f2aab8a68", size = 10765471 }, | ||
| 191 | + { url = "https://files.pythonhosted.org/packages/d0/09/6d2f7586f09a16120aebdff8f64d962d7c4348313c77ebb29c566cefc357/ruff-0.15.4-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:3f83c45911da6f2cd5936c436cf86b9f09f09165f033a99dcf7477e34041cbc3", size = 11263382 }, | ||
| 192 | + { url = "https://files.pythonhosted.org/packages/1b/fa/2ef715a1cd329ef47c1a050e10dee91a9054b7ce2fcfdd6a06d139afb7ec/ruff-0.15.4-py3-none-win32.whl", hash = "sha256:65594a2d557d4ee9f02834fcdf0a28daa8b3b9f6cb2cb93846025a36db47ef22", size = 10506664 }, | ||
| 193 | + { url = "https://files.pythonhosted.org/packages/d0/a8/c688ef7e29983976820d18710f955751d9f4d4eb69df658af3d006e2ba3e/ruff-0.15.4-py3-none-win_amd64.whl", hash = "sha256:04196ad44f0df220c2ece5b0e959c2f37c777375ec744397d21d15b50a75264f", size = 11651048 }, | ||
| 194 | + { url = "https://files.pythonhosted.org/packages/3e/0a/9e1be9035b37448ce2e68c978f0591da94389ade5a5abafa4cf99985d1b2/ruff-0.15.4-py3-none-win_arm64.whl", hash = "sha256:60d5177e8cfc70e51b9c5fad936c634872a74209f934c1e79107d11787ad5453", size = 10966776 }, | ||
| 195 | +] | ||
| 196 | + | ||
| 197 | +[[package]] | ||
| 198 | +name = "urllib3" | ||
| 199 | +version = "2.6.3" | ||
| 200 | +source = { registry = "https://pypi.org/simple" } | ||
| 201 | +sdist = { url = "https://files.pythonhosted.org/packages/c7/24/5f1b3bdffd70275f6661c76461e25f024d5a38a46f04aaca912426a2b1d3/urllib3-2.6.3.tar.gz", hash = "sha256:1b62b6884944a57dbe321509ab94fd4d3b307075e0c2eae991ac71ee15ad38ed", size = 435556 } | ||
| 202 | +wheels = [ | ||
| 203 | + { url = "https://files.pythonhosted.org/packages/39/08/aaaad47bc4e9dc8c725e68f9d04865dbcb2052843ff09c97b08904852d84/urllib3-2.6.3-py3-none-any.whl", hash = "sha256:bf272323e553dfb2e87d9bfd225ca7b0f467b919d7bbd355436d3fd37cb0acd4", size = 131584 }, | ||
| 204 | +] | ||
| 205 | + | ||
| 206 | +[[package]] | ||
| 207 | +name = "websockets" | ||
| 208 | +version = "16.0" | ||
| 209 | +source = { registry = "https://pypi.org/simple" } | ||
| 210 | +sdist = { url = "https://files.pythonhosted.org/packages/04/24/4b2031d72e840ce4c1ccb255f693b15c334757fc50023e4db9537080b8c4/websockets-16.0.tar.gz", hash = "sha256:5f6261a5e56e8d5c42a4497b364ea24d94d9563e8fbd44e78ac40879c60179b5", size = 179346 } | ||
| 211 | +wheels = [ | ||
| 212 | + { url = "https://files.pythonhosted.org/packages/f2/db/de907251b4ff46ae804ad0409809504153b3f30984daf82a1d84a9875830/websockets-16.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:31a52addea25187bde0797a97d6fc3d2f92b6f72a9370792d65a6e84615ac8a8", size = 177340 }, | ||
| 213 | + { url = "https://files.pythonhosted.org/packages/f3/fa/abe89019d8d8815c8781e90d697dec52523fb8ebe308bf11664e8de1877e/websockets-16.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:417b28978cdccab24f46400586d128366313e8a96312e4b9362a4af504f3bbad", size = 175022 }, | ||
| 214 | + { url = "https://files.pythonhosted.org/packages/58/5d/88ea17ed1ded2079358b40d31d48abe90a73c9e5819dbcde1606e991e2ad/websockets-16.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:af80d74d4edfa3cb9ed973a0a5ba2b2a549371f8a741e0800cb07becdd20f23d", size = 175319 }, | ||
| 215 | + { url = "https://files.pythonhosted.org/packages/d2/ae/0ee92b33087a33632f37a635e11e1d99d429d3d323329675a6022312aac2/websockets-16.0-cp311-cp311-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:08d7af67b64d29823fed316505a89b86705f2b7981c07848fb5e3ea3020c1abe", size = 184631 }, | ||
| 216 | + { url = "https://files.pythonhosted.org/packages/c8/c5/27178df583b6c5b31b29f526ba2da5e2f864ecc79c99dae630a85d68c304/websockets-16.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7be95cfb0a4dae143eaed2bcba8ac23f4892d8971311f1b06f3c6b78952ee70b", size = 185870 }, | ||
| 217 | + { url = "https://files.pythonhosted.org/packages/87/05/536652aa84ddc1c018dbb7e2c4cbcd0db884580bf8e95aece7593fde526f/websockets-16.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d6297ce39ce5c2e6feb13c1a996a2ded3b6832155fcfc920265c76f24c7cceb5", size = 185361 }, | ||
| 218 | + { url = "https://files.pythonhosted.org/packages/6d/e2/d5332c90da12b1e01f06fb1b85c50cfc489783076547415bf9f0a659ec19/websockets-16.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:1c1b30e4f497b0b354057f3467f56244c603a79c0d1dafce1d16c283c25f6e64", size = 184615 }, | ||
| 219 | + { url = "https://files.pythonhosted.org/packages/77/fb/d3f9576691cae9253b51555f841bc6600bf0a983a461c79500ace5a5b364/websockets-16.0-cp311-cp311-win32.whl", hash = "sha256:5f451484aeb5cafee1ccf789b1b66f535409d038c56966d6101740c1614b86c6", size = 178246 }, | ||
| 220 | + { url = "https://files.pythonhosted.org/packages/54/67/eaff76b3dbaf18dcddabc3b8c1dba50b483761cccff67793897945b37408/websockets-16.0-cp311-cp311-win_amd64.whl", hash = "sha256:8d7f0659570eefb578dacde98e24fb60af35350193e4f56e11190787bee77dac", size = 178684 }, | ||
| 221 | + { url = "https://files.pythonhosted.org/packages/84/7b/bac442e6b96c9d25092695578dda82403c77936104b5682307bd4deb1ad4/websockets-16.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:71c989cbf3254fbd5e84d3bff31e4da39c43f884e64f2551d14bb3c186230f00", size = 177365 }, | ||
| 222 | + { url = "https://files.pythonhosted.org/packages/b0/fe/136ccece61bd690d9c1f715baaeefd953bb2360134de73519d5df19d29ca/websockets-16.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:8b6e209ffee39ff1b6d0fa7bfef6de950c60dfb91b8fcead17da4ee539121a79", size = 175038 }, | ||
| 223 | + { url = "https://files.pythonhosted.org/packages/40/1e/9771421ac2286eaab95b8575b0cb701ae3663abf8b5e1f64f1fd90d0a673/websockets-16.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:86890e837d61574c92a97496d590968b23c2ef0aeb8a9bc9421d174cd378ae39", size = 175328 }, | ||
| 224 | + { url = "https://files.pythonhosted.org/packages/18/29/71729b4671f21e1eaa5d6573031ab810ad2936c8175f03f97f3ff164c802/websockets-16.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:9b5aca38b67492ef518a8ab76851862488a478602229112c4b0d58d63a7a4d5c", size = 184915 }, | ||
| 225 | + { url = "https://files.pythonhosted.org/packages/97/bb/21c36b7dbbafc85d2d480cd65df02a1dc93bf76d97147605a8e27ff9409d/websockets-16.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e0334872c0a37b606418ac52f6ab9cfd17317ac26365f7f65e203e2d0d0d359f", size = 186152 }, | ||
| 226 | + { url = "https://files.pythonhosted.org/packages/4a/34/9bf8df0c0cf88fa7bfe36678dc7b02970c9a7d5e065a3099292db87b1be2/websockets-16.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:a0b31e0b424cc6b5a04b8838bbaec1688834b2383256688cf47eb97412531da1", size = 185583 }, | ||
| 227 | + { url = "https://files.pythonhosted.org/packages/47/88/4dd516068e1a3d6ab3c7c183288404cd424a9a02d585efbac226cb61ff2d/websockets-16.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:485c49116d0af10ac698623c513c1cc01c9446c058a4e61e3bf6c19dff7335a2", size = 184880 }, | ||
| 228 | + { url = "https://files.pythonhosted.org/packages/91/d6/7d4553ad4bf1c0421e1ebd4b18de5d9098383b5caa1d937b63df8d04b565/websockets-16.0-cp312-cp312-win32.whl", hash = "sha256:eaded469f5e5b7294e2bdca0ab06becb6756ea86894a47806456089298813c89", size = 178261 }, | ||
| 229 | + { url = "https://files.pythonhosted.org/packages/c3/f0/f3a17365441ed1c27f850a80b2bc680a0fa9505d733fe152fdf5e98c1c0b/websockets-16.0-cp312-cp312-win_amd64.whl", hash = "sha256:5569417dc80977fc8c2d43a86f78e0a5a22fee17565d78621b6bb264a115d4ea", size = 178693 }, | ||
| 230 | + { url = "https://files.pythonhosted.org/packages/cc/9c/baa8456050d1c1b08dd0ec7346026668cbc6f145ab4e314d707bb845bf0d/websockets-16.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:878b336ac47938b474c8f982ac2f7266a540adc3fa4ad74ae96fea9823a02cc9", size = 177364 }, | ||
| 231 | + { url = "https://files.pythonhosted.org/packages/7e/0c/8811fc53e9bcff68fe7de2bcbe75116a8d959ac699a3200f4847a8925210/websockets-16.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:52a0fec0e6c8d9a784c2c78276a48a2bdf099e4ccc2a4cad53b27718dbfd0230", size = 175039 }, | ||
| 232 | + { url = "https://files.pythonhosted.org/packages/aa/82/39a5f910cb99ec0b59e482971238c845af9220d3ab9fa76dd9162cda9d62/websockets-16.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:e6578ed5b6981005df1860a56e3617f14a6c307e6a71b4fff8c48fdc50f3ed2c", size = 175323 }, | ||
| 233 | + { url = "https://files.pythonhosted.org/packages/bd/28/0a25ee5342eb5d5f297d992a77e56892ecb65e7854c7898fb7d35e9b33bd/websockets-16.0-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:95724e638f0f9c350bb1c2b0a7ad0e83d9cc0c9259f3ea94e40d7b02a2179ae5", size = 184975 }, | ||
| 234 | + { url = "https://files.pythonhosted.org/packages/f9/66/27ea52741752f5107c2e41fda05e8395a682a1e11c4e592a809a90c6a506/websockets-16.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c0204dc62a89dc9d50d682412c10b3542d748260d743500a85c13cd1ee4bde82", size = 186203 }, | ||
| 235 | + { url = "https://files.pythonhosted.org/packages/37/e5/8e32857371406a757816a2b471939d51c463509be73fa538216ea52b792a/websockets-16.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:52ac480f44d32970d66763115edea932f1c5b1312de36df06d6b219f6741eed8", size = 185653 }, | ||
| 236 | + { url = "https://files.pythonhosted.org/packages/9b/67/f926bac29882894669368dc73f4da900fcdf47955d0a0185d60103df5737/websockets-16.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:6e5a82b677f8f6f59e8dfc34ec06ca6b5b48bc4fcda346acd093694cc2c24d8f", size = 184920 }, | ||
| 237 | + { url = "https://files.pythonhosted.org/packages/3c/a1/3d6ccdcd125b0a42a311bcd15a7f705d688f73b2a22d8cf1c0875d35d34a/websockets-16.0-cp313-cp313-win32.whl", hash = "sha256:abf050a199613f64c886ea10f38b47770a65154dc37181bfaff70c160f45315a", size = 178255 }, | ||
| 238 | + { url = "https://files.pythonhosted.org/packages/6b/ae/90366304d7c2ce80f9b826096a9e9048b4bb760e44d3b873bb272cba696b/websockets-16.0-cp313-cp313-win_amd64.whl", hash = "sha256:3425ac5cf448801335d6fdc7ae1eb22072055417a96cc6b31b3861f455fbc156", size = 178689 }, | ||
| 239 | + { url = "https://files.pythonhosted.org/packages/f3/1d/e88022630271f5bd349ed82417136281931e558d628dd52c4d8621b4a0b2/websockets-16.0-cp314-cp314-macosx_10_15_universal2.whl", hash = "sha256:8cc451a50f2aee53042ac52d2d053d08bf89bcb31ae799cb4487587661c038a0", size = 177406 }, | ||
| 240 | + { url = "https://files.pythonhosted.org/packages/f2/78/e63be1bf0724eeb4616efb1ae1c9044f7c3953b7957799abb5915bffd38e/websockets-16.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:daa3b6ff70a9241cf6c7fc9e949d41232d9d7d26fd3522b1ad2b4d62487e9904", size = 175085 }, | ||
| 241 | + { url = "https://files.pythonhosted.org/packages/bb/f4/d3c9220d818ee955ae390cf319a7c7a467beceb24f05ee7aaaa2414345ba/websockets-16.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:fd3cb4adb94a2a6e2b7c0d8d05cb94e6f1c81a0cf9dc2694fb65c7e8d94c42e4", size = 175328 }, | ||
| 242 | + { url = "https://files.pythonhosted.org/packages/63/bc/d3e208028de777087e6fb2b122051a6ff7bbcca0d6df9d9c2bf1dd869ae9/websockets-16.0-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:781caf5e8eee67f663126490c2f96f40906594cb86b408a703630f95550a8c3e", size = 185044 }, | ||
| 243 | + { url = "https://files.pythonhosted.org/packages/ad/6e/9a0927ac24bd33a0a9af834d89e0abc7cfd8e13bed17a86407a66773cc0e/websockets-16.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:caab51a72c51973ca21fa8a18bd8165e1a0183f1ac7066a182ff27107b71e1a4", size = 186279 }, | ||
| 244 | + { url = "https://files.pythonhosted.org/packages/b9/ca/bf1c68440d7a868180e11be653c85959502efd3a709323230314fda6e0b3/websockets-16.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:19c4dc84098e523fd63711e563077d39e90ec6702aff4b5d9e344a60cb3c0cb1", size = 185711 }, | ||
| 245 | + { url = "https://files.pythonhosted.org/packages/c4/f8/fdc34643a989561f217bb477cbc47a3a07212cbda91c0e4389c43c296ebf/websockets-16.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:a5e18a238a2b2249c9a9235466b90e96ae4795672598a58772dd806edc7ac6d3", size = 184982 }, | ||
| 246 | + { url = "https://files.pythonhosted.org/packages/dd/d1/574fa27e233764dbac9c52730d63fcf2823b16f0856b3329fc6268d6ae4f/websockets-16.0-cp314-cp314-win32.whl", hash = "sha256:a069d734c4a043182729edd3e9f247c3b2a4035415a9172fd0f1b71658a320a8", size = 177915 }, | ||
| 247 | + { url = "https://files.pythonhosted.org/packages/8a/f1/ae6b937bf3126b5134ce1f482365fde31a357c784ac51852978768b5eff4/websockets-16.0-cp314-cp314-win_amd64.whl", hash = "sha256:c0ee0e63f23914732c6d7e0cce24915c48f3f1512ec1d079ed01fc629dab269d", size = 178381 }, | ||
| 248 | + { url = "https://files.pythonhosted.org/packages/06/9b/f791d1db48403e1f0a27577a6beb37afae94254a8c6f08be4a23e4930bc0/websockets-16.0-cp314-cp314t-macosx_10_15_universal2.whl", hash = "sha256:a35539cacc3febb22b8f4d4a99cc79b104226a756aa7400adc722e83b0d03244", size = 177737 }, | ||
| 249 | + { url = "https://files.pythonhosted.org/packages/bd/40/53ad02341fa33b3ce489023f635367a4ac98b73570102ad2cdd770dacc9a/websockets-16.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:b784ca5de850f4ce93ec85d3269d24d4c82f22b7212023c974c401d4980ebc5e", size = 175268 }, | ||
| 250 | + { url = "https://files.pythonhosted.org/packages/74/9b/6158d4e459b984f949dcbbb0c5d270154c7618e11c01029b9bbd1bb4c4f9/websockets-16.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:569d01a4e7fba956c5ae4fc988f0d4e187900f5497ce46339c996dbf24f17641", size = 175486 }, | ||
| 251 | + { url = "https://files.pythonhosted.org/packages/e5/2d/7583b30208b639c8090206f95073646c2c9ffd66f44df967981a64f849ad/websockets-16.0-cp314-cp314t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:50f23cdd8343b984957e4077839841146f67a3d31ab0d00e6b824e74c5b2f6e8", size = 185331 }, | ||
| 252 | + { url = "https://files.pythonhosted.org/packages/45/b0/cce3784eb519b7b5ad680d14b9673a31ab8dcb7aad8b64d81709d2430aa8/websockets-16.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:152284a83a00c59b759697b7f9e9cddf4e3c7861dd0d964b472b70f78f89e80e", size = 186501 }, | ||
| 253 | + { url = "https://files.pythonhosted.org/packages/19/60/b8ebe4c7e89fb5f6cdf080623c9d92789a53636950f7abacfc33fe2b3135/websockets-16.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:bc59589ab64b0022385f429b94697348a6a234e8ce22544e3681b2e9331b5944", size = 186062 }, | ||
| 254 | + { url = "https://files.pythonhosted.org/packages/88/a8/a080593f89b0138b6cba1b28f8df5673b5506f72879322288b031337c0b8/websockets-16.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:32da954ffa2814258030e5a57bc73a3635463238e797c7375dc8091327434206", size = 185356 }, | ||
| 255 | + { url = "https://files.pythonhosted.org/packages/c2/b6/b9afed2afadddaf5ebb2afa801abf4b0868f42f8539bfe4b071b5266c9fe/websockets-16.0-cp314-cp314t-win32.whl", hash = "sha256:5a4b4cc550cb665dd8a47f868c8d04c8230f857363ad3c9caf7a0c3bf8c61ca6", size = 178085 }, | ||
| 256 | + { url = "https://files.pythonhosted.org/packages/9f/3e/28135a24e384493fa804216b79a6a6759a38cc4ff59118787b9fb693df93/websockets-16.0-cp314-cp314t-win_amd64.whl", hash = "sha256:b14dc141ed6d2dde437cddb216004bcac6a1df0935d79656387bd41632ba0bbd", size = 178531 }, | ||
| 257 | + { url = "https://files.pythonhosted.org/packages/72/07/c98a68571dcf256e74f1f816b8cc5eae6eb2d3d5cfa44d37f801619d9166/websockets-16.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:349f83cd6c9a415428ee1005cadb5c2c56f4389bc06a9af16103c3bc3dcc8b7d", size = 174947 }, | ||
| 258 | + { url = "https://files.pythonhosted.org/packages/7e/52/93e166a81e0305b33fe416338be92ae863563fe7bce446b0f687b9df5aea/websockets-16.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:4a1aba3340a8dca8db6eb5a7986157f52eb9e436b74813764241981ca4888f03", size = 175260 }, | ||
| 259 | + { url = "https://files.pythonhosted.org/packages/56/0c/2dbf513bafd24889d33de2ff0368190a0e69f37bcfa19009ef819fe4d507/websockets-16.0-pp311-pypy311_pp73-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:f4a32d1bd841d4bcbffdcb3d2ce50c09c3909fbead375ab28d0181af89fd04da", size = 176071 }, | ||
| 260 | + { url = "https://files.pythonhosted.org/packages/a5/8f/aea9c71cc92bf9b6cc0f7f70df8f0b420636b6c96ef4feee1e16f80f75dd/websockets-16.0-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0298d07ee155e2e9fda5be8a9042200dd2e3bb0b8a38482156576f863a9d457c", size = 176968 }, | ||
| 261 | + { url = "https://files.pythonhosted.org/packages/9a/3f/f70e03f40ffc9a30d817eef7da1be72ee4956ba8d7255c399a01b135902a/websockets-16.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:a653aea902e0324b52f1613332ddf50b00c06fdaf7e92624fbf8c77c78fa5767", size = 178735 }, | ||
| 262 | + { url = "https://files.pythonhosted.org/packages/6f/28/258ebab549c2bf3e64d2b0217b973467394a9cea8c42f70418ca2c5d0d2e/websockets-16.0-py3-none-any.whl", hash = "sha256:1637db62fad1dc833276dded54215f2c7fa46912301a24bd94d45d46a011ceec", size = 171598 }, | ||
| 263 | +] | ||
| 264 | + | ||
| 265 | +[[package]] | ||
| 266 | +name = "xiaohongshu-skills" | ||
| 267 | +version = "0.1.0" | ||
| 268 | +source = { virtual = "." } | ||
| 269 | +dependencies = [ | ||
| 270 | + { name = "requests" }, | ||
| 271 | + { name = "websockets" }, | ||
| 272 | +] | ||
| 273 | + | ||
| 274 | +[package.optional-dependencies] | ||
| 275 | +dev = [ | ||
| 276 | + { name = "pytest" }, | ||
| 277 | + { name = "ruff" }, | ||
| 278 | +] | ||
| 279 | + | ||
| 280 | +[package.metadata] | ||
| 281 | +requires-dist = [ | ||
| 282 | + { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0" }, | ||
| 283 | + { name = "requests", specifier = ">=2.28.0" }, | ||
| 284 | + { name = "ruff", marker = "extra == 'dev'", specifier = ">=0.9.0" }, | ||
| 285 | + { name = "websockets", specifier = ">=12.0" }, | ||
| 286 | +] | ||
| 287 | +provides-extras = ["dev"] |