# Catalog Sync CLI

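`catalogsync` is a collect, sync, and download pipeline that runs independently of the GUI. Its goal is to pull the "Playlist Square" and "Toplist" sources out of the GUI's "Discover" page and turn them into a command-line tool that can run batch jobs unattended.

Supported platforms fall into two tiers:

- Playlist collection sources:
  - `netease`
  - `qq`
  - `kuwo`
- Download resolution sources:
  - `qq`
  - `kuwo`
  - `migu`
  - `qianqian`
  - `kugou`
  - `netease`

Design priorities:

- Persist the "playlist pool -> playlist -> song" hierarchy to SQLite
- While syncing playlist songs, derive and update "artist pool -> artist -> song"
- Deduplicate downloads by song primary key and active file location
- Keep one file-location abstraction that covers local disk, cloud drives, and object storage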

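## Document Guide

This file covers four kinds of information:

- Project purpose and run pipeline (`collect -> sync -> download -> upload`)
- Code architecture (CLI, collect/sync, download/upload, Ops Console)
- Database design (business entities, file mapping, job orchestration)
- Server deployment and operations (NAS/Linux directory conventions, scripts, logs, restarts)

If you are taking over the project for the first time, read in this order:

1. Start with "Code Architecture" and "Database Design Overview"
2. Then read "Commands" and "NAS / Linux Deployment Conventions"
3. Finish with the Ops Console update notes at the end of the file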

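## Code Architecture

The system has four layers: command entry points, domain services, a repository layer, and a background job console. The core goal is to split collect, sync, download, and upload into a pipeline whose steps are composable, recoverable, and observable.

### Directory Layout and Responsibility Boundaries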

```text
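musicdl/catalogsync/
  cli.py                 # command entry point and argument parsing; assembles the Application
  runtime.py             # runtime path/port/directory conventions (env -> config)
  db.py                  # SQLite schema, indexes, add-column migrations, connection parameters
  models.py              # domain models and metadata extraction
  repository.py          # catalog-side reads/writes (playlists/songs/files/stats)
  services.py            # collect + sync orchestration (playlist -> songs -> artists)
  downloader.py          # download planning + multi-source candidate selection + write to disk + dedupe on insert
  resolver.py            # cross-platform candidate search, scoring, fallback strategy
  uploader.py            # object-storage backfill uploads, upload queue consumption, presence refresh
  collectors/            # playlist source collectors (NetEase/QQ/Kuwo)
  ops/
    web.py               # FastAPI pages and API (dashboard/playlists/jobs)
    repository.py        # ops-side job repository (job/stage/item/worker)
    runner.py            # background scheduler (lanes, preemption, recovery, convergence)
    executors.py         # stage executors (collect/sync/download/upload)
    maintenance.py       # local duplicate-file scan and dedupe
    config.py            # env config read/write-back/revision snapshots
    models.py            # Job/Stage/Item status enums and data structures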
```

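Boundary constraints:

- `services.py` handles business orchestration only and does no UI work or job scheduling
- `repository.py` handles SQL reads and writes and is unaware of download/upload strategy
- `ops/runner.py` decides how jobs run and does not define collect/download rules
- `ops/executors.py` decides how a single item executes and updates status via CAS (compare-and-swap)

### Two Main Paths

1. Direct CLI path (offline batch processing)
   - `cli.py` -> `CatalogSyncApplication`
   - `collect/sync/download/run/upload` call `services/downloader/uploader` directly
   - Suited to scripted batch jobs or one-off commands
2. Ops job path (visual, pausable, resumable)
   - `ops/web.py` accepts job creation (`/api/jobs`, `/api/playlists/*`)
   - `ops/runner.py` splits a job into stages by `job_type` and schedules them by polling
   - `ops/executors.py` executes item by item and writes results back to the `job_*` tables
   - The frontend reads live status through the dashboard API + SSE

### Key Call Sequence (using a "sync then download" job as the example)

1. The web side creates a `sync_download` job and writes it to `job_runs`
2. The runner creates `job_stages`: `sync -> download`
3. The sync stage generates one `job_items` row per playlist and runs `services.sync_playlist_row`
4. The download stage generates `job_items` for songs and runs `downloader.download_song_row`
5. On a successful download, it writes `file_assets` + `file_locations` and refreshes the playlist status aggregate
6. The runner rolls up stage/item counts and moves `job_runs` to `completed/completed_with_errors`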

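### Job Concurrency and Recovery Model

- Two-lane scheduling:
  - `download` lane: exclusive, with limited concurrency to avoid disk and network contention
  - `general` lane: used for collect/sync/upload, allows higher concurrency
- Concurrency within a stage:
  - Controlled by the worker count (download defaults to 10, configurable)
  - Worker heartbeat, speed, and current item are written to `job_workers`
- Resume after interruption:
  - On startup the runner scans for recoverable jobs
  - Items that were running are marked `interrupted`
  - Recoverable items are re-queued, and the job moves to `paused` or stays `running`
- Command control:
  - pause/resume/cancel/retry are written to `job_commands`
  - The runner consumes commands centrally to avoid concurrent write conflicts

### Extension Points (look here when adding a platform or storage backend)

- New playlist source: implement it under `collectors/*` and register it in `services.py`
- New download source: extend the candidate search and scoring strategy in `resolver.py`
- New storage backend: extend the backend adapter and locator semantics in `uploader.py`
- New job type: add the stage sequence and display name in `ops/jobdefs.py`
- New ops capability: add the API in `ops/web.py` and persist the state model in `ops/repository.py`

### Job Status Transition Diagram (JobStatus)

The diagram below corresponds to `JobStatus` in `ops/models.py`: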

```mermaid
stateDiagram-v2
    [*] --> queued
    queued --> running: runner claim
    queued --> canceled: cancel
    running --> pause_requested: pause command
    pause_requested --> paused: all running items drained
    paused --> running: resume command
    running --> completed: all items success/skipped
    running --> completed_with_errors: some items failed
    running --> failed: unrecoverable error
    running --> canceled: cancel
    pause_requested --> canceled: cancel
    completed --> [*]
    completed_with_errors --> [*]
    failed --> [*]
    canceled --> [*]
```
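
The edges in the diagram above can also be written as a lookup table, which is handy for validating a status change before it is persisted. The sketch below is only an illustration built from the diagram: the enum member names, `ALLOWED_TRANSITIONS`, and `can_transition` are hypothetical, and the real `JobStatus` in `ops/models.py` may be defined differently.

```python
from enum import Enum


class JobStatus(str, Enum):
    # Values mirror the state names used in the diagram.
    QUEUED = "queued"
    RUNNING = "running"
    PAUSE_REQUESTED = "pause_requested"
    PAUSED = "paused"
    COMPLETED = "completed"
    COMPLETED_WITH_ERRORS = "completed_with_errors"
    FAILED = "failed"
    CANCELED = "canceled"


# One entry per source state, copied edge by edge from the diagram.
# Terminal states have no outgoing edges and are therefore absent.
ALLOWED_TRANSITIONS = {
    JobStatus.QUEUED: {JobStatus.RUNNING, JobStatus.CANCELED},
    JobStatus.RUNNING: {
        JobStatus.PAUSE_REQUESTED,
        JobStatus.COMPLETED,
        JobStatus.COMPLETED_WITH_ERRORS,
        JobStatus.FAILED,
        JobStatus.CANCELED,
    },
    JobStatus.PAUSE_REQUESTED: {JobStatus.PAUSED, JobStatus.CANCELED},
    JobStatus.PAUSED: {JobStatus.RUNNING},
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """Return True if the diagram contains an edge current -> target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

Note that the diagram has no `paused -> canceled` edge, so this table rejects that change; whether the runner accepts it in practice is not stated here.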

## Commands

Initialize the database:

```bash
musicdl-catalogsync init-db --db D:\catalogsync\catalogsync.db --library-root E:\MusicLibrary
```

Collect the "Playlist Square" and "Toplist" sources:

```bash
musicdl-catalogsync collect --db D:\catalogsync\catalogsync.db --sources netease,qq,kuwo
```

Sync playlists already in the database:

```bash
musicdl-catalogsync sync --db D:\catalogsync\catalogsync.db --sources netease,qq,kuwo --limit 20
```

Download songs that are pending download:

```bash
musicdl-catalogsync download --db D:\catalogsync\catalogsync.db --library-root E:\MusicLibrary --sources netease,qq,kuwo --download-sources qq,kuwo,migu,qianqian,kugou,netease --limit 20 --workers 10
```

Run the whole default pipeline in one go:

```bash
musicdl-catalogsync run --db D:\catalogsync\catalogsync.db --library-root E:\MusicLibrary --sources netease,qq,kuwo --download-sources qq,kuwo,migu,qianqian,kugou,netease --limit 20 --workers 10
```

Run directly from a playlist file:

```bash
musicdl-catalogsync run --db D:\catalogsync\catalogsync.db --library-root E:\MusicLibrary --playlist-file D:\catalogsync\playlists.txt --download-sources qq,kuwo,migu,qianqian,kugou,netease --workers 10
```

Register an object storage backend (the `^` line continuations below are Windows `cmd` syntax):

```bat
musicdl-catalogsync register-object-backend ^
  --db D:\catalogsync\catalogsync.db ^
  --backend main-s3 ^
  --bucket music-bucket ^
  --endpoint https://s3.example.com ^
  --region auto ^
  --base-prefix music ^
  --credential-env-prefix CATALOGSYNC_MAIN_S3
```

Upload already-downloaded local files to object storage:

```bash
musicdl-catalogsync upload --db D:\catalogsync\catalogsync.db --backend main-s3 --workers 4
musicdl-catalogsync upload --db D:\catalogsync\catalogsync.db --backend main-s3 --sources netease,qq --limit 200
musicdl-catalogsync upload --db D:\catalogsync\catalogsync.db --backend main-s3 --playlist-ids 12,15 --workers 4
```

Start the ops web console (FastAPI + uvicorn):

```bash
musicdl-catalogsync serve --db D:\catalogsync\catalogsync.db --env-file D:\catalogsync\catalogsync.env --host 127.0.0.1 --port 18080
```

The CLI can also be invoked as a module:

```bash
python -m musicdl.catalogsync.cli --help
```

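## `--playlist-file` Behavior

When `--playlist-file` is passed, `run` takes a narrow branch:

1. Skip `collect`
2. Read the playlist URLs from the file
3. Parse and deduplicate them
4. Write them to the database as a `manual_file` pool
5. Sync only these playlists
6. Download only the songs linked to these playlists

Without `--playlist-file`, `run` keeps the original default `collect -> sync -> download` behavior.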

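## `--sources` and `--download-sources`

- `--sources`
  - Controls which canonical platforms' songs are collected, synced, and filtered
  - Currently used mainly for the three playlist sources `netease`, `qq`, and `kuwo`
- `--download-sources`
  - Controls which platforms are searched again before download to resolve a direct link
  - Defaults to the same six platforms as the GUI: `qq,kuwo,migu,qianqian,kugou,netease`

What the download stage actually does:

1. Read the song name, singers, and original snapshot from the canonical song in the database
2. Search again for downloadable candidates within the `--download-sources` allowlist
3. Sort candidates by "match confidence -> audio quality / file size -> your configured source order"
4. Pick the best candidate, and only then download it

This means:

- A song from a NetEase Cloud Music playlist is not necessarily downloaded from NetEase
- When the original platform's official direct link has expired or is unavailable, the downloader automatically looks for a candidate with the same title and singer on the other download sources
- As long as the match is trustworthy, the higher-quality candidate is preferred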
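
The ordering rule above (trusted match first, then quality and size, then source order) can be sketched as a filter plus a sort key. Everything in this example is an assumption made for illustration: the `Candidate` fields, the `min_match` threshold of `0.8`, and the numeric `quality_rank` are hypothetical and are not taken from `resolver.py`.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Candidate:
    source: str           # download platform, e.g. "qq"
    match_score: float    # 0.0 .. 1.0, higher means a closer title/singer match
    quality_rank: int     # higher means better audio quality
    file_size_bytes: int


def pick_best(
    candidates: List[Candidate],
    source_order: Sequence[str],
    min_match: float = 0.8,  # hypothetical trust threshold
) -> Optional[Candidate]:
    """Pick one candidate: trusted match, then quality/size, then source order."""
    trusted = [
        c for c in candidates
        if c.source in source_order and c.match_score >= min_match
    ]
    if not trusted:
        return None

    def sort_key(c: Candidate):
        # Negate quality and size so that min() prefers larger values;
        # the position in source_order breaks remaining ties.
        return (-c.quality_rank, -c.file_size_bytes, source_order.index(c.source))

    return min(trusted, key=sort_key)
```

With `source_order = ["qq", "kuwo", "migu", "netease"]`, two equally good FLAC candidates from `kuwo` and `qq` resolve to `qq`, while a higher-quality candidate whose match score falls below the threshold is never considered.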

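Starting with this version, the `sync` stage also no longer requires the original platform to return a downloadable direct link on the spot:

- As long as the playlist API still returns song metadata, `sync` writes the full song snapshot to the database
- These songs are stored as "deferred resolution" snapshots, and a usable direct link is resolved at download time according to `--download-sources`
- This prevents songs from being dropped at ingest time when NetEase / QQ / Kuwo withhold a link for copyright reasons or because a direct link has temporarily expired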

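### `--playlist-file` File Format

One entry per line. Three kinds of line are supported (comment, bare URL, and `platform,URL`):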

```text
# comment line
https://music.163.com/#/playlist?id=17745989905
qq,https://y.qq.com/n/ryqq/playlist/7707261125
https://y.qq.com/n/ryqq/toplist/26
https://www.kuwo.cn/rankList?bangId=16
```

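Rules:

- Blank lines are ignored
- Lines starting with `#` are ignored
- `platform,URL` is supported
- A bare URL is also supported, in which case the platform is detected automatically
- Duplicate playlists within the same file are deduplicated automatically
- URL auto-detection currently recognizes `netease`, `qq`, and `kuwo`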
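
As a rough illustration of these rules, the sketch below parses the lines of a playlist file into `(platform, url)` pairs. It is not the project's parser: `HOST_TO_PLATFORM`, `detect_platform`, and `parse_playlist_lines` are hypothetical names, platform detection here is by host name only, and duplicates are detected by exact `(platform, url)` equality, whereas the real tool may deduplicate on the parsed playlist ID.

```python
from typing import Iterable, List, Optional, Tuple
from urllib.parse import urlparse

# Host names taken from the example URLs in this document.
HOST_TO_PLATFORM = {
    "music.163.com": "netease",
    "y.qq.com": "qq",
    "www.kuwo.cn": "kuwo",
}
SUPPORTED = set(HOST_TO_PLATFORM.values())


def detect_platform(url: str) -> Optional[str]:
    """Guess the platform from the URL host; None if the host is unknown."""
    return HOST_TO_PLATFORM.get(urlparse(url).netloc.lower())


def parse_playlist_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    seen = set()
    entries: List[Tuple[str, str]] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        head, sep, tail = line.partition(",")
        if sep and not head.lower().startswith("http"):
            platform, url = head.strip().lower(), tail.strip()  # "platform,URL"
        else:
            platform, url = detect_platform(line), line          # bare URL
        if platform not in SUPPORTED:
            continue  # unknown platform; a real CLI would probably report it
        if (platform, url) in seen:
            continue  # duplicate entry within the same file
        seen.add((platform, url))
        entries.append((platform, url))
    return entries
```

Calling `parse_playlist_lines(open("playlists.txt", encoding="utf-8"))` would yield the entries in file order, with the explicit `qq,https://...` form and the bare form of the same URL collapsing into one entry.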

### Supported URL Types

- NetEase regular playlist: `https://music.163.com/#/playlist?id=...`
- QQ regular playlist: `https://y.qq.com/n/ryqq/playlist/...`
- QQ toplist: `https://y.qq.com/n/ryqq/toplist/...`
- Kuwo regular playlist: `https://www.kuwo.cn/playlist_detail/...`
- Kuwo toplist: `https://www.kuwo.cn/rankList?bangId=...`

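## Database Design Overview

The database is SQLite, with the following connection policy:

- `PRAGMA journal_mode=WAL`
- `PRAGMA busy_timeout=30000`
- `PRAGMA synchronous=NORMAL`
- All tables are defined centrally in `db.py`, and add-column migrations run at initialization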
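
For reference, the three PRAGMAs listed above can be applied with the standard-library `sqlite3` module as in the sketch below. The function name `open_connection` is hypothetical, and how `db.py` actually opens its connections is not shown in this document.

```python
import sqlite3


def open_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite connection with the connection policy listed above."""
    conn = sqlite3.connect(db_path, timeout=30.0)
    # WAL lets readers proceed while one writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to 30 s for a lock instead of failing with "database is locked".
    conn.execute("PRAGMA busy_timeout=30000")
    # NORMAL syncs less often than FULL; a common pairing with WAL.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn
```

`journal_mode=WAL` is stored in the database file, whereas `busy_timeout` and `synchronous` are per-connection settings, so they need to be set on every new connection.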

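Design goals:

1. Strong deduplication: one entity per platform and remote ID
2. Loose coupling: a song's logical asset is separate from its physical storage location
3. Recoverable: the job state machine is persisted and resumes after a restart
4. Observable: jobs, workers, logs, and events are all stored in tables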

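### Table Domains (four domains)

1. Catalog Core
   - `playlist_pools`: playlist source pools (square/toplist/manual_file)
   - `playlists`: playlist records (platform, remote ID, policy, play count)
   - `songs`: song records (platform, remote ID, name, singers, format, snapshot)
   - `artists`: artist records (normalized name + platform dimension)
2. Association
   - `pool_playlists`: many-to-many between pools and playlists
   - `playlist_songs`: many-to-many between playlists and songs (with position)
   - `pool_artists`: many-to-many between pools and artists
   - `artist_songs`: many-to-many between artists and songs
3. Storage
   - `storage_backends`: storage backend definitions (local_fs/object_storage/cloud_drive)
   - `file_assets`: logical file versions of a song (quality/format/size/checksum)
   - `file_locations`: physical locations (backend + locator + status + primary/replica flag)
   - `song_backend_presence`: aggregated presence of a song on a backend (speeds up queries)
   - `download_tasks` / `upload_tasks`: download and upload queues
4. Ops (job orchestration)
   - `job_runs`: job overview (type, status, scope, config snapshot)
   - `job_stages`: per-stage counters (collect/sync/download/upload)
   - `job_items`: smallest unit of execution (playlist item/song item/file item)
   - `job_workers`: live worker status, throughput, speed
   - `job_commands`: pause/resume/cancel/retry command queue
   - `job_events` / `job_logs`: audit events and execution logs
   - `config_revisions`: env config revision snapshots and rollback records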

### Deduplication and Consistency Constraints (core)

Unique keys (hard constraints):

- `playlists(platform, remote_playlist_id)`
- `songs(platform, remote_song_id)`
- `file_locations(file_asset_id, backend_id, locator)`
- `upload_tasks(file_asset_id, target_backend_id, target_locator)`
- `job_items(job_stage_id, item_key)`

Consistency rules (business layer):

- One `song_id` can map to multiple `file_asset` rows (different quality/format)
- One `file_asset` can have multiple `file_location` rows (local + cloud)
- `song_backend_presence` is derived from `file_locations` and is not a source of truth
- A playlist's "downloaded / not downloaded / partial" status is computed by aggregating `playlist_songs + active local file_locations`
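
To show how a unique key such as `songs(platform, remote_song_id)` turns repeated syncs into updates rather than duplicates, here is a minimal upsert sketch. The reduced `songs` schema and the helper names `create_schema` and `upsert_song` are assumptions for illustration; the real table has more columns, and `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Reduced, hypothetical version of the songs table.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS songs (
            id INTEGER PRIMARY KEY,
            platform TEXT NOT NULL,
            remote_song_id TEXT NOT NULL,
            name TEXT,
            UNIQUE (platform, remote_song_id)
        )
        """
    )


def upsert_song(conn: sqlite3.Connection, platform: str,
                remote_song_id: str, name: str) -> int:
    """Insert the song, or refresh its name if the unique key already exists."""
    conn.execute(
        """
        INSERT INTO songs (platform, remote_song_id, name)
        VALUES (?, ?, ?)
        ON CONFLICT(platform, remote_song_id) DO UPDATE SET name = excluded.name
        """,
        (platform, remote_song_id, name),
    )
    row = conn.execute(
        "SELECT id FROM songs WHERE platform = ? AND remote_song_id = ?",
        (platform, remote_song_id),
    ).fetchone()
    return row[0]
```

Syncing the same remote song twice returns the same `id` both times, so association rows such as `playlist_songs` keep pointing at one stable song record.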

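### Hot Read/Write Paths (troubleshooting focus)

1. Collect stage
   - Writes: `playlist_pools`, `playlists`, `pool_playlists`
   - Typical problem: the pool contains playlists but `playlists.collected_song_count` was never backfilled
2. Sync stage
   - Writes: `songs`, `playlist_songs`, `artists`, `pool_artists`, `artist_songs`
   - Typical problem: the playlist is synced but has 0 songs (distinguish "source returned empty" from "parse failure")
3. Download stage
   - Writes: `file_assets`, `file_locations`, `download_tasks`
   - Reads: `songs` snapshot + download source candidates
   - Typical problem: files written to disk more than once, with `(1)/(2)` filename suffixes piling up
4. Upload stage
   - Writes: `upload_tasks`, `file_locations`, `song_backend_presence`
   - Typical problem: the upload succeeded but presence was not refreshed, so the UI still shows "not uploaded"
5. Task Center
   - Writes: `job_runs/stages/items/workers/commands/events/logs`
   - Reads: dashboard summary, doing/done tree, worker speed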

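### Migration and Backward Compatibility

- `initialize_database()` does the following on every startup:
  - Runs `CREATE TABLE IF NOT EXISTS`
  - Runs the necessary `ALTER TABLE ADD COLUMN` statements (for example `play_count` and the worker throughput fields)
- This lets an old database upgrade in place without running SQL migration scripts by hand
- Back up `catalogsync.db` before upgrading, especially before changing the dedupe strategy or running bulk maintenance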
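
SQLite's `ALTER TABLE ADD COLUMN` fails if the column already exists, so an idempotent add-column step usually checks `PRAGMA table_info` first. The sketch below shows that pattern under stated assumptions: `ensure_column` is a hypothetical helper, not the body of `initialize_database()`, and the table and column names must come from trusted code because they are interpolated into the SQL.

```python
import sqlite3


def ensure_column(conn: sqlite3.Connection, table: str,
                  column: str, ddl_type: str) -> bool:
    """Add `column` to `table` if it is missing. Return True if it was added."""
    # Row layout of table_info: (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True
```

Running `ensure_column(conn, "playlists", "play_count", "INTEGER")` on every startup is then safe: the first call alters the table and later calls are no-ops.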

### Core ER Diagram

```mermaid
erDiagram
    PLAYLIST_POOLS ||--o{ POOL_PLAYLISTS : links
    PLAYLISTS ||--o{ POOL_PLAYLISTS : belongs_to
    PLAYLISTS ||--o{ PLAYLIST_SONGS : contains
    SONGS ||--o{ PLAYLIST_SONGS : appears_in

    ARTIST_POOLS ||--o{ POOL_ARTISTS : links
    ARTISTS ||--o{ POOL_ARTISTS : belongs_to
    ARTISTS ||--o{ ARTIST_SONGS : sings
    SONGS ||--o{ ARTIST_SONGS : performed_by

    SONGS ||--o{ FILE_ASSETS : has_versions
    FILE_ASSETS ||--o{ FILE_LOCATIONS : stored_at
    STORAGE_BACKENDS ||--o{ FILE_LOCATIONS : hosts
    SONGS ||--o{ SONG_BACKEND_PRESENCE : has_presence
    STORAGE_BACKENDS ||--o{ SONG_BACKEND_PRESENCE : summarized_on

    JOB_RUNS ||--o{ JOB_STAGES : has
    JOB_STAGES ||--o{ JOB_ITEMS : has
    JOB_RUNS ||--o{ JOB_WORKERS : owns
    JOB_RUNS ||--o{ JOB_COMMANDS : receives
    JOB_RUNS ||--o{ JOB_EVENTS : emits
    JOB_RUNS ||--o{ JOB_LOGS : writes
```

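## Data Tables

### Playlist Pool -> Playlist -> Song

- `playlist_pools`
  - Platform source pools, such as `playlist_square`, `toplist`, and `manual_file`
- `playlists`
  - A specific playlist or toplist
- `pool_playlists`
  - Mapping between playlist pools and playlists
- `songs`
  - Song master table, with unique key `(platform, remote_song_id)`
- `playlist_songs`
  - Mapping between playlists and songs

The song master table stores these core fields:

- `remote_song_id`
- `name`
- `singers`
- `ext`
- `file_size_bytes`
- `quality_label`
- `metadata_json`
  - Contains the `SongInfo` snapshot, which can later be restored and handed to the original downloader to continue the download

### Derived Artist Pools + Lazy Backfill

- `artist_pools`
  - Artist pools derived from playlist pools
- `artists`
  - Artist master table
- `pool_artists`
  - Mapping between artist pools and artists
- `artist_songs`
  - Mapping between artists and songs

Syncing playlist songs also updates the artist pools, which satisfies the requirement that artist pools are updated whenever a playlist pool is updated.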

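## Download Deduplication and File Mapping

### Logical Asset Layer

- `file_assets`
  - Represents "one file version of one song"
  - The usual dimensions are `song_id + quality_label + ext + file_size_bytes`
  - `ext / quality_label / file_size_bytes` reflect the file actually downloaded from the matched source and are not tied to the canonical platform

### Physical Location Layer

- `storage_backends`
  - Describes a storage backend
  - `local_fs` is implemented, along with a first version of S3-compatible object storage upload
  - Cloud drive backends are reserved for later extension
- `file_locations`
  - Records where a file asset currently physically lives

Think of it this way:

- `file_assets` answers "what is this file"
- `file_locations` answers "where is this file stored now"

If a song is first downloaded locally and later uploaded to a cloud drive or object storage, the same `file_asset` is reused; only the corresponding `file_location` is added or updated.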

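### Upload Queue and Backend Presence

- `song_backend_presence`
  - Derived summary table that indicates whether a song already has an active file on a given backend
  - Commonly used to check quickly whether a song has already been uploaded to main-s3
- `upload_tasks`
  - Upload task queue table
  - One task = one local `file_asset` uploaded to one target backend/key
  - Statuses include `pending`, `uploading`, `succeeded`, `failed`, and `skipped`

Two points to keep distinct:

- `file_locations` remains the source of truth
- `song_backend_presence` exists only for fast lookups and does not replace `file_locations`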

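## Behavior When Disk Space Runs Low

The downloader checks the free space of the target directory first.

If space is insufficient, it prompts for a new download directory (the prompt text below means "Insufficient disk space, enter a new download directory to continue:"):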

```text
磁盘空间不足，请输入新的下载目录继续:
```

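The new directory may be on a different drive. The program will:

- Download the song to the new directory
- Automatically create or reuse a `storage_backend` for the new directory
- Write the new file location back to `file_locations`

With `--workers > 1`, the prompt still appears only once globally. After a successful switch, all download tasks that have not yet started use the new directory.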

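## Object Storage Upload

A first version of object storage upload is implemented, and backends are treated as S3-compatible.

### Key Conventions

1. After a local download completes, a local `file_location` is written first
2. After a successful upload, a remote `file_location` is added for the same `file_asset`
3. The local file is kept, with local `file_location.is_primary = 1`
4. The remote object storage record has `is_primary = 0`
5. By default the database state is trusted, and no extra `HEAD` check is made against the remote object
6. If a song has multiple active local file versions, all of them are queued for upload

### Key / Locator Rules

The object storage key mirrors the local relative path.

For example:

- Local locator: `qq/Singer A/song-a.flac`
- Backend `base_prefix`: `music`
- Remote locator: `music/qq/Singer A/song-a.flac`

Benefits:

- The directory structure matches the local one
- Later migration or re-mapping is simpler
- The same locator semantics are easier to reuse when uploading to a CDN or cloud drive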
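
The mirroring rule amounts to joining `base_prefix` and the local relative locator with forward slashes. A minimal sketch, assuming a hypothetical helper name `build_remote_key` and assuming that Windows-style backslashes in a local locator should be normalized (the document does not say how `uploader.py` handles them):

```python
import posixpath


def build_remote_key(base_prefix: str, local_locator: str) -> str:
    """Mirror a local relative locator under the backend's base_prefix."""
    # Object keys use "/" regardless of the local OS path separator.
    relative = local_locator.replace("\\", "/").lstrip("/")
    prefix = base_prefix.strip("/")
    return posixpath.join(prefix, relative) if prefix else relative
```

`build_remote_key("music", "qq/Singer A/song-a.flac")` gives `music/qq/Singer A/song-a.flac`, matching the example above, and an empty `base_prefix` leaves the locator unchanged.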

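### Backend Configuration and Secret Model

Non-sensitive configuration is stored in `storage_backends.config_json`, for example:

- `endpoint`
- `region`
- `base_prefix`
- `addressing_style`
- `public_base_url`
- `credential_env_prefix`

Secrets are never stored in the database; they come from environment variables only.

For example, with `credential_env_prefix = CATALOGSYNC_MAIN_S3`: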

```dotenv
CATALOGSYNC_MAIN_S3_ACCESS_KEY_ID=your-access-key
CATALOGSYNC_MAIN_S3_SECRET_ACCESS_KEY=your-secret-key
CATALOGSYNC_MAIN_S3_SESSION_TOKEN=optional-session-token
```

If `public_base_url` is configured, a successful upload also writes the derivable `public_url` back to the remote `file_location`.
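
The prefix convention and the `public_url` derivation can be sketched as two small helpers. Both function names are hypothetical, and two behaviors are assumptions rather than documented facts: that a missing access key or secret key should raise an error, and that the key is percent-encoded when the public URL is built.

```python
import os
from typing import Dict, Mapping, Optional
from urllib.parse import quote


def load_credentials(prefix: str,
                     environ: Mapping[str, str] = os.environ) -> Dict[str, Optional[str]]:
    """Read S3-style credentials from <prefix>_* environment variables."""
    access_key = environ.get(f"{prefix}_ACCESS_KEY_ID", "")
    secret_key = environ.get(f"{prefix}_SECRET_ACCESS_KEY", "")
    if not access_key or not secret_key:
        raise RuntimeError(f"missing credentials for env prefix {prefix}")
    return {
        "access_key_id": access_key,
        "secret_access_key": secret_key,
        # The session token is optional; an empty value is treated as absent.
        "session_token": environ.get(f"{prefix}_SESSION_TOKEN") or None,
    }


def derive_public_url(public_base_url: str, key: str) -> Optional[str]:
    """Join the public base URL and the object key; None if no base is set."""
    if not public_base_url:
        return None
    # quote() keeps "/" by default and encodes spaces as %20.
    return public_base_url.rstrip("/") + "/" + quote(key.lstrip("/"))
```

Passing a plain dictionary as `environ` keeps the helper easy to test without touching the real process environment.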

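### Default Behavior of the `upload` Command

By default, `upload` does three things:

1. Finds active local files that are still missing on the target backend
2. Deduplicates them and writes or reuses `upload_tasks`
3. Runs the uploads with a bounded pool of workers and writes results back to the database

The run can be narrowed or tuned with:

- `--sources`
- `--playlist-ids`
- `--limit`
- `--workers`

Recommended defaults:

- Download: `--workers 10`
- Upload: `--workers 4`

### What the Database Updates After an Upload

- `file_locations`
  - Adds or updates the remote object location
- `song_backend_presence`
  - Refreshes the song's active summary on the target backend
- `upload_tasks`
  - Records the queued, running, succeeded, or failed state of the task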

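## Cloud Drive Compatibility (reserved)

Recommended conventions:

- Local files:
  - `backend_type=local_fs`
  - `locator` stores the relative path
- Object storage:
  - `backend_type=object_storage`
  - `container_name` stores the bucket
  - `locator` stores the key
- Cloud drive backends:
  - `backend_type=cloud_drive`
  - `remote_file_id` stores the platform file ID
  - `locator` stores the remote directory path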

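## Current Implementation Notes

- The collection layer covers the "Playlist Square" and "Toplist" sources from the GUI "Discover" page
- Toplist-specific parsing is supported for:
  - `netease_toplist`
  - `qq_toplist`
  - `kuwo_toplist`
- The download pipeline decouples "playlist source" from "download source"
- At download time, songs are searched again on the platforms given by `--download-sources`
- Candidate selection strategy:
  - High-confidence matches come first
  - Among high-confidence candidates, higher audio quality / larger files are preferred
  - When quality is similar, the order of `--download-sources` decides priority
- The default download sources are the same six platforms as the GUI: `qq,kuwo,migu,qianqian,kugou,netease`
- Object storage upload is implemented as two commands: `register-object-backend` and `upload`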

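## Usage Recommendations

- For a first batch run, start with a single platform, for example `--sources netease`
- Run `sync` and `download` with `--limit` first as a smoke test
- To process only a few specific playlists, prefer `run --playlist-file`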

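## NAS / Linux Deployment Conventions

### Directory Responsibilities

- `/volume4/Music_Cloud/library`
  - Holds only the final music files (download output)
- `/volume4/Music_Cloud/catalogsync`
  - Holds only the catalogsync application and its runtime data (code, script copies, config, database, inputs, logs)

Recommended fixed structure: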

```text
/volume4/Music_Cloud/
  library/
  catalogsync/
    app/
    bin/
    config/
    data/
    inputs/
    logs/
```

### Download Layout

The default download layout is:

```text
<LIBRARY_DIR>/<platform>/<first_artist>/<filename>
```

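`DOWNLOAD_LAYOUT=platform_first_artist` corresponds to the directory structure above.

Here `<platform>` means the download source platform that actually served the file, not the playlist source platform.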
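
A minimal sketch of the `platform_first_artist` layout, assuming a hypothetical helper `build_download_path`. The fallback folder name `Unknown Artist` is an assumption, and any sanitizing of characters that are invalid in file names is left out; the document does not describe how the downloader handles either case.

```python
from pathlib import PurePosixPath
from typing import Sequence


def build_download_path(library_dir: str, platform: str,
                        singers: Sequence[str], filename: str) -> PurePosixPath:
    """Build <LIBRARY_DIR>/<platform>/<first_artist>/<filename>."""
    # `platform` is the download source that served the file,
    # not the platform the playlist came from.
    first = singers[0].strip() if singers else ""
    first_artist = first or "Unknown Artist"  # hypothetical fallback name
    return PurePosixPath(library_dir) / platform / first_artist / filename
```

For a NetEase playlist song that was actually fetched from QQ, the file therefore lands under `.../library/qq/<first_artist>/`, not under `netease/`.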

### Key `catalogsync.env` Entries (example)

```dotenv
ROOT_DIR=/volume4/Music_Cloud
APP_HOME=/volume4/Music_Cloud/catalogsync
LIBRARY_DIR=/volume4/Music_Cloud/library
DB_PATH=/volume4/Music_Cloud/catalogsync/data/catalogsync.db
INPUT_DIR=/volume4/Music_Cloud/catalogsync/inputs
LOG_DIR=/volume4/Music_Cloud/catalogsync/logs
ENV_FILE=/volume4/Music_Cloud/catalogsync/config/catalogsync.env
WEB_HOST=127.0.0.1
WEB_PORT=18080
PYTHON_BIN=python3
VENV_DIR=/volume4/Music_Cloud/catalogsync/app/.venv
DOWNLOAD_LAYOUT=platform_first_artist
DOWNLOAD_SOURCES=qq,kuwo,migu,qianqian,kugou,netease
CATALOG_EXPORT_COMMAND=bash /volume4/Music_Cloud/Music_Server/scripts/catalog-export.sh
CATALOG_EXPORT_WORKDIR=/volume4/Music_Cloud/Music_Server
OBJECT_BACKEND_NAME=main-s3
OBJECT_BUCKET=music-bucket
OBJECT_ENDPOINT=https://s3.example.com
OBJECT_REGION=auto
OBJECT_BASE_PREFIX=music
OBJECT_ADDRESSING_STYLE=
OBJECT_PUBLIC_BASE_URL=
OBJECT_CREDENTIAL_ENV_PREFIX=CATALOGSYNC_MAIN_S3
UPLOAD_WORKERS=4
UPLOAD_SOURCES=
UPLOAD_PLAYLIST_IDS=
UPLOAD_LIMIT=
CATALOGSYNC_MAIN_S3_ACCESS_KEY_ID=
CATALOGSYNC_MAIN_S3_SECRET_ACCESS_KEY=
CATALOGSYNC_MAIN_S3_SESSION_TOKEN=
```

### One-Step Deployment from Windows to NAS (recommended)

If you develop locally on Windows and deploy to a fixed NAS, use a single command:

```powershell
.\deploy-catalogsync.ps1
```

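The command chains these steps:

1. Upload `musicdl/catalogsync` from the local machine to the NAS staging directory
2. Overwrite `serve_console.sh` and `deploy_and_restart.sh` on the NAS with the latest versions
3. Run the atomic deployment script on the NAS (backup -> sync -> stop old -> start new -> health check)
4. If the health check or the single-instance check fails, roll back automatically to the previous version and return a non-zero exit code

Optional parameter: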

```powershell
.\deploy-catalogsync.ps1 -SkipHealthCheck
```

Script locations:

- Repository shortcut: `deploy-catalogsync.ps1`
- NAS deployment trigger: `scripts/catalogsync/deploy_to_nas.ps1`
- NAS deployment executor: `scripts/catalogsync/templates/deploy_and_restart.sh`

### NAS-Side Deployment Script Behavior (`deploy_and_restart.sh`)

Default target paths:

- Code target: `/volume4/Music_Cloud/catalogsync/app/musicdl/catalogsync`
- staging: `/volume4/Music_Cloud/catalogsync/deploy/staging/catalogsync`
- Backups: `/volume4/Music_Cloud/catalogsync/deploy/backups/catalogsync_YYYYMMDD_HHMMSS`

Stability mechanisms:

- Deployment lock: `/volume4/Music_Cloud/catalogsync/run/deploy.lock`
- Service PID: `/volume4/Music_Cloud/catalogsync/run/serve.pid`
- Health check: defaults to `http://127.0.0.1:${WEB_PORT}/dashboard`
- Rollback on failure: automatically restores the most recent backup, restarts, and verifies
- Backup retention: keeps the 5 most recent versions by default (adjustable with `--keep-backups`)

### Using `scripts/catalogsync/bootstrap_to_linux.ps1`

Run on the Windows side (it initializes the target machine's directories over `ssh/scp` and distributes the code and script templates):

```powershell
powershell -ExecutionPolicy Bypass -File .\scripts\catalogsync\bootstrap_to_linux.ps1 `
  -RemoteHost 192.168.1.10 `
  -Port 22 `
  -User xiaoming `
  -RootDir /volume4/Music_Cloud
```

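Afterwards, on the target machine, copy `catalogsync.env.example` to `catalogsync.env` and adjust it to the machine's actual paths.

### Run `install_runtime.sh` First on the Target Machine

After the first deployment to the target machine, run this once: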

```bash
bash /volume4/Music_Cloud/catalogsync/bin/install_runtime.sh
```

The script automatically:

- Creates `VENV_DIR` using `PYTHON_BIN`
- Upgrades `pip/setuptools/wheel`
- Generates `/volume4/Music_Cloud/catalogsync/app/requirements.nas.txt` from `/volume4/Music_Cloud/catalogsync/app/requirements.txt`
- Filters out `nodejs-wheel`
- Installs the dependencies that the current `catalogsync` download/upload pipeline needs
- Performs an editable install of `/volume4/Music_Cloud/catalogsync/app`, so that `python -m musicdl.catalogsync.cli ...` runs directly

Logs are written to:

```text
/volume4/Music_Cloud/catalogsync/logs/install_runtime_YYYYMMDD_HHMMSS.log
```

### Using `download_all.sh` / `download_from_file.sh` on the Target Machine

Before running on the target machine, prepare:

```bash
cp /volume4/Music_Cloud/catalogsync/config/catalogsync.env.example \
   /volume4/Music_Cloud/catalogsync/config/catalogsync.env
```

Full pipeline (equivalent to `musicdl.catalogsync.cli run`):

```bash
bash /volume4/Music_Cloud/catalogsync/bin/download_all.sh --sources netease,qq,kuwo --limit 20
```

Run from a playlist file (skips collect):

```bash
bash /volume4/Music_Cloud/catalogsync/bin/download_from_file.sh \
  /volume4/Music_Cloud/catalogsync/inputs/playlists.txt
```

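This script corresponds to the `run --playlist-file` branch (which skips `collect`), so the example carries no `--sources`.

Both download scripts automatically read `DOWNLOAD_SOURCES` from `catalogsync.env` and pass it to the CLI as `--download-sources ...`.

Both download scripts prefer `VENV_DIR/bin/python` and fall back to `PYTHON_BIN` only if the virtual environment is not ready yet.

### Catalog Export After Download (recommended for NAS integration)

To have `Music_Server`'s read-only database `catalog_read.db` refresh automatically after downloads, configure the following in `catalogsync.env`: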

- `CATALOG_EXPORT_COMMAND=bash /volume4/Music_Cloud/Music_Server/scripts/catalog-export.sh`
- `CATALOG_EXPORT_WORKDIR=/volume4/Music_Cloud/Music_Server`

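Behavior:

- Triggered once each time a `download` stage reaches a terminal state (only once per stage)
- When `CATALOG_EXPORT_COMMAND` is not configured, the export is marked `skipped`
- `job_events` records the following events:
  - `catalog_export_started`
  - `catalog_export_skipped`
  - `catalog_export_succeeded`
  - `catalog_export_failed`

### Using `upload_all.sh` on the Target Machine

The object storage upload script is located at: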

```text
/volume4/Music_Cloud/catalogsync/bin/upload_all.sh
```

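It first runs `register-object-backend` automatically using the configuration in `catalogsync.env`, then runs `upload`. After changing the bucket, endpoint, or CDN base URL, there is therefore no need to register again by hand.

Simplest invocation: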

```bash
bash /volume4/Music_Cloud/catalogsync/bin/upload_all.sh
```

To upload only specific sources or playlists, append CLI arguments directly after the script:

```bash
bash /volume4/Music_Cloud/catalogsync/bin/upload_all.sh --sources netease,qq --limit 200
bash /volume4/Music_Cloud/catalogsync/bin/upload_all.sh --playlist-ids 12,15 --workers 6
```

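This script likewise prefers `VENV_DIR/bin/python` and falls back to `PYTHON_BIN` only if the virtual environment does not exist.

The script depends on the following env variables: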

- `OBJECT_BACKEND_NAME`
- `OBJECT_BUCKET`
- `OBJECT_ENDPOINT`
- `OBJECT_REGION`
- `OBJECT_BASE_PREFIX`
- `OBJECT_ADDRESSING_STYLE`
- `OBJECT_PUBLIC_BASE_URL`
- `OBJECT_CREDENTIAL_ENV_PREFIX`
- `${OBJECT_CREDENTIAL_ENV_PREFIX}_ACCESS_KEY_ID`
- `${OBJECT_CREDENTIAL_ENV_PREFIX}_SECRET_ACCESS_KEY`
- `${OBJECT_CREDENTIAL_ENV_PREFIX}_SESSION_TOKEN`
- `UPLOAD_WORKERS`
- `UPLOAD_SOURCES`
- `UPLOAD_PLAYLIST_IDS`
- `UPLOAD_LIMIT`

Logs are written to:

```text
/volume4/Music_Cloud/catalogsync/logs/upload_all_YYYYMMDD_HHMMSS.log
```

### Using `serve_console.sh` on the Target Machine

The ops console script is located at:

```text
/volume4/Music_Cloud/catalogsync/bin/serve_console.sh
```

Example:

```bash
bash /volume4/Music_Cloud/catalogsync/bin/serve_console.sh
```

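The script automatically reads `DB_PATH`, `ENV_FILE`, `WEB_HOST`, and `WEB_PORT` from `catalogsync.env` and passes them through to `musicdl.catalogsync.cli serve`.

Single-instance protection:

- Lock directory: `/volume4/Music_Cloud/catalogsync/run/serve.lock`
- PID file: `/volume4/Music_Cloud/catalogsync/run/serve.pid`
- If an active instance already exists, the script fails and exits immediately to avoid a duplicate start

Logs are written to: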

```text
/volume4/Music_Cloud/catalogsync/logs/serve_console_YYYYMMDD_HHMMSS.log
```

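### NAS Dependency Installation Notes

The system Python on this NAS is `Python 3.8`, and the machine lacks the native build toolchain that `nodejs-wheel-binaries` requires.

The current `catalogsync` download, object storage upload, and `netease/qq/kuwo` pipeline does not depend on `nodejs-wheel`, so use the `install_runtime.sh` script described above. It generates and installs the filtered `requirements.nas.txt` automatically, so there is no need to run `grep` by hand.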

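## `/playlists` Playlist Pool Management Page (selective download)

`/playlists` now serves as the playlist pool management page, built for the ops workflow "filter playlists -> select targets -> run a bulk action".

Supported filter parameters:

- `platform`
- `pool_kind`
- `status`
- `keyword`
- `wanted_only`
- `page_size`

The list supports selecting rows on the current page, with select-all and clear for the whole page.

Four bulk actions are currently supported:

- Download selected synced playlists
- Sync then download selected playlists
- Add to the to-download (wanted) list
- Remove from the to-download (wanted) list

Playlist status semantics:

- Unsynced: the playlist has not finished syncing
- Not downloaded: synced, but songs are still pending download
- Downloading: a download task is in progress
- Partially downloaded: some songs are on disk, the rest are not finished
- Downloaded: every song in the playlist meets the "downloaded" criterion

"Downloaded" criterion: for a given `song_id`, the song counts as downloaded as soon as an `active` `local_fs` file exists locally.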
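
One way to picture how the five states follow from a few counters is the sketch below. It is an illustration, not the project's aggregation query: the function name, its parameters, and the precedence between states (for example, that a running download outranks "partial") are assumptions, and the state strings are the ones used by the export routing described later in this file.

```python
def playlist_status(synced: bool, total_songs: int,
                    downloaded_songs: int, running_downloads: int) -> str:
    """Classify a playlist; a song is 'downloaded' once an active local file exists."""
    if not synced:
        return "unsynced"
    if running_downloads > 0:
        return "downloading"
    if total_songs > 0 and downloaded_songs >= total_songs:
        return "downloaded"
    if downloaded_songs > 0:
        return "partial"
    # Synced but nothing on disk yet (an empty playlist also lands here).
    return "not_downloaded"
```

Because the check counts songs with an active local file rather than songs downloaded by a particular job, files that were already on disk before a job started still move a playlist toward `downloaded`.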

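Page actions still reuse the existing job system:

- Download selected synced playlists -> `download_only`
- Sync then download selected playlists -> `sync_download`
- What distinguishes these two job types is `playlist_scope.playlist_ids`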

## Operations Console Update

As of `2026-04-16`, the operations console behavior has changed in three important ways:

1. `musicdl-catalogsync serve` now starts the web console together with an embedded ops runner.
2. `/dashboard` now exposes a create-job form plus live job/download summary, active workers, and running items.
3. `/jobs/{id}` now exposes a command form for `pause`, `resume`, `cancel`, `retry_item`, and `force_retry_item`, together with worker and running-item detail.

Current job type to stage mapping:

- `catalog_sync`: `collect -> sync -> download`
- `collect_only`: `collect`
- `sync_only`: `sync`
- `sync_download`: `sync -> download`
- `download_only`: `download`
- `upload_only`: `upload`
- `download_upload`: `download -> upload`
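
The mapping above is small enough to express as a lookup table. The sketch below copies it verbatim; the names `JOB_STAGES` and `stages_for` are hypothetical and do not claim to match how the runner stores the mapping.

```python
from typing import Dict, Tuple

# Job type -> ordered stage sequence, copied from the list above.
JOB_STAGES: Dict[str, Tuple[str, ...]] = {
    "catalog_sync": ("collect", "sync", "download"),
    "collect_only": ("collect",),
    "sync_only": ("sync",),
    "sync_download": ("sync", "download"),
    "download_only": ("download",),
    "upload_only": ("upload",),
    "download_upload": ("download", "upload"),
}


def stages_for(job_type: str) -> Tuple[str, ...]:
    """Return the ordered stages for a job type, or raise on an unknown type."""
    try:
        return JOB_STAGES[job_type]
    except KeyError:
        raise ValueError(f"unknown job type: {job_type}") from None
```

Keeping the sequence as a tuple preserves stage order, which matters because a `download` stage must not start before its preceding `sync` stage has produced songs.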

Collector behavior update:

- playlist square collection now paginates for `netease` and `kuwo`
- `qq` playlist-square failures are isolated so other sources continue

This means the console is no longer read-only: creating a job from the dashboard should enqueue work that the embedded runner can execute without starting a second process.

As of `2026-04-17`, the deployed NAS console was verified again and the following operational fixes are also part of the live behavior:

1. `/dashboard` now exposes `Quick Launch`, `Active Job`, `Running Songs`, and `Playlist Coverage`, and the `Active Job` / `Recent Jobs` blocks now provide direct `pause` / `resume` / `cancel` buttons, so the operator can both observe progress and control the current queue from one page.
2. `/jobs/{id}` now exposes direct action buttons for `pause`, `resume`, `cancel`, `retry_item`, and `force_retry_item` instead of only relying on a generic command dropdown.
3. Collect-stage workers now emit page-level progress text such as `page N: +X, total Y`, which makes it clear whether collection is advancing or stuck.

Collector and runtime hardening in this round:

- `QQCollector` playlist-square requests now send the required `Referer` and `Origin` headers, which restored non-zero QQ playlist-square collection on NAS.
- `netease` and `kuwo` playlist-square pagination now stops when the upstream explicitly reports `has_more = false` or when a page is entirely duplicate playlists, preventing long-running repeated-page loops.
- NAS runtime compatibility was extended for Python `3.8` by removing runtime-evaluated built-in generic aliases from the serve import path.
- SQLite connections now enable `busy_timeout` and `journal_mode=WAL`, which prevents the operations console from intermittently failing with `database is locked` while the embedded runner is writing progress.

Observed NAS verification snapshot after redeploying these fixes:

- `GET http://192.168.5.43:18080/dashboard` returned `200 OK` with the new controls visible.
- Ten consecutive requests to `/api/dashboard` returned `200 OK` while `collect_only` job `3` was running.
- Total playlists on NAS grew from the earlier `811` baseline to `1441` during live verification.
- QQ playlists on NAS grew from `25` to `629+` during the same verification window, confirming that QQ playlist-square collection was no longer stuck at zero.

## 2026-04-17 NAS Restart Note

During the `2026-04-17` restart verification on NAS, the web console and the embedded runner did not recover equally:

- the web process restarted and continued serving `/dashboard`, `/jobs/{id}`, and `/api/dashboard`
- a stale duplicate `serve` process had to be removed manually before the NAS converged back to a single web instance
- after duplicate cleanup, the embedded runner still failed to advance queued work even though manual `OpsRepository` / `OpsRunner` recovery calls succeeded against the same database

Operational workaround used on NAS:

- web console kept running as `/volume4/Music_Cloud/catalogsync/app/.venv/bin/python -m musicdl.catalogsync.cli serve ...`
- a separate emergency runner process was started to execute `OpsRunner.run_forever()` against the same SQLite database
- verification after the workaround showed `job 5` resume correctly and `downloaded_songs` increase from `82` to `85`

Temporary NAS-only emergency runner details:

- PID: `17516`
- log: `/volume4/Music_Cloud/catalogsync/logs/ops_runner_20260417_101958.log`

Resolution on `2026-04-17 10:29`:

- `musicdl/catalogsync/ops/web.py` now supervises the embedded runner thread and automatically restarts it after transient exceptions instead of letting the web process continue without background execution
- local regression coverage now includes an embedded-runner recovery test that forces one loop failure and verifies that queued work is still completed after automatic restart
- NAS was redeployed with this fix and the temporary emergency runner was removed
- after restart, NAS converged back to a single live `serve` process on port `18080`
- the restarted web process recovered the interrupted download job back to `paused`, accepted a `resume` command, and then continued downloading without any standalone runner
- live verification on NAS showed `downloaded_songs` increase from `100` to `102` under the single embedded-runner setup

## 2026-04-17 Progress Visibility Update

- the playlists page now renders a `Progress` column with `downloaded / total`, a percentage bar, and the current running-song count
- the job detail page now renders a `Playlist Progress` table for playlist-scoped jobs
- job playlist progress is derived from playlist-song links, active local files, and download-stage job items of the current job
- songs that were already present locally before the job started still count as completed progress for that playlist
- empty boolean-like filters such as `/playlists?wanted_only=` and `/api/playlists?wanted_only=` are accepted and treated as `false`

## 2026-04-17 Non-Music Skip + Task Center Tree

- download stage now classifies QQ toplist fallback entries (`remote_song_id` starts with `qqtop_` or metadata marks `qq_toplist_fallback`) as `skipped` instead of `failed`
- skipped toplist entries are annotated with `非音乐资源（有声榜条目）`
- new API: `GET /api/jobs/{job_id}/playlists/{playlist_id}/songs` returns per-song progress rows for one playlist inside one job
- dashboard Task Center removed the old `Open` jump link and keeps operations inline
- task detail now supports hierarchical expansion:
  - task -> playlist progress rows
  - playlist row -> lazy-loaded song progress rows
  - song rows explicitly show `非音乐资源` tag when matched

## 2026-04-17 Stable Task Tree Refresh

- dashboard `Task Center` no longer renders the embedded `Summary / Stages / Workers / Running Items` detail tables
- the dashboard now presents one stable tree:
  - task
  - playlist
  - song
- task lifecycle transitions such as `paused`, `completed`, `completed_with_errors`, and `canceled` keep the same task node visible in Task Center instead of making the row disappear immediately
- live refresh updates task nodes in place so expanded tasks and expanded playlists can remain open across refresh cycles

## 2026-04-18 Dashboard Maintenance: Local Duplicate Scan / Dedupe

- `Dashboard` now includes a `Maintenance` card for local duplicate inspection.
- `Scan Duplicate Local Copies` calls `GET /api/maintenance/local-duplicates`.
- `Run Local Dedupe` calls `POST /api/maintenance/local-duplicates/dedupe`.
- The scan groups active local duplicate rows by `(file_asset_id, backend_id)`.
- Keep rule priority:
  1. existing file wins
  2. non-`(1)` / non-`(2)` canonical locator wins
  3. shorter locator wins
  4. smaller `file_locations.id` wins
- Dedupe execution updates references before inactivation:
  - repoint `upload_tasks.source_location_id`
  - repoint `job_items.file_location_id`
  - mark duplicate `file_locations.status = 'inactive'`
  - delete duplicate local files when they still exist on disk
  - refresh `song_backend_presence`
- Safety guard:
  - dedupe is rejected with `409` while any `job_runs.status = 'running'` or `job_items.status = 'running'`
  - this avoids colliding with active download / upload execution
- The dashboard renders results inline and does not jump away from the page.
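
The four-step keep rule can be expressed as a single sort key, evaluated per duplicate group. This is a sketch under assumptions: `LocalCopy`, `keep_sort_key`, and `choose_keeper` are hypothetical names, and the regular expression treats any trailing `(N)` before the file extension as a copy suffix, which is broader than the `(1)` / `(2)` cases the rule names.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches "(1)", "(2)", ... at the end of the file name, before the extension.
_COPY_SUFFIX = re.compile(r"\(\d+\)(?=\.[^./]+$|$)")


@dataclass
class LocalCopy:
    location_id: int      # file_locations.id
    locator: str          # relative path inside the backend
    exists_on_disk: bool


def keep_sort_key(copy: LocalCopy):
    has_copy_suffix = bool(_COPY_SUFFIX.search(copy.locator))
    # False sorts before True, so each element encodes one priority level:
    # 1) existing file, 2) canonical name, 3) shorter locator, 4) smaller id.
    return (not copy.exists_on_disk, has_copy_suffix,
            len(copy.locator), copy.location_id)


def choose_keeper(copies: List[LocalCopy]) -> LocalCopy:
    """Pick the one row to keep from a group of duplicate local copies."""
    return min(copies, key=keep_sort_key)
```

Every other row in the group would then be repointed and marked inactive, as listed in the dedupe execution steps above.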

## 2026-04-18 Playlist Export Pipeline Update

- `playlists/` directory generation is no longer triggered by `sync`.
- `CatalogSyncService.sync_playlist_row()` now only handles playlist-song linking and play-count backfill.
- Playlist export artifacts are refreshed from the download side for scoped playlist jobs:
  - `download_only`
  - `sync_download`
- The runner refreshes export folders when an individual scoped playlist finishes downloading, instead of waiting for the whole download job to finish.
- On runner restart / recovery, scoped download stages also backfill export folders for playlists whose items were already completed before the restart.
- Stage-final export refresh is still kept as the last safety net, including the `0`-pending-items case where all files already existed locally.
- Existing single-playlist export remains available:
  - `GET /api/playlists/{playlist_id}/export-folder`
  - it refreshes the folder from current database state only
  - it does not auto-download missing songs
- New bulk export API:
  - `POST /api/playlists/export`
  - routes selected playlists by current state
  - `downloaded` -> export immediately
  - `unsynced` -> create `sync_download` job
  - `not_downloaded` / `partial` / `downloading` -> create `download_only` job
- Playlists page adds `Export Selected Playlists`:
  - already-downloaded playlists can be exported without re-downloading songs
  - not-yet-synced or not-yet-downloaded playlists are queued into the appropriate job automatically
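
The state-based routing of the bulk export API can be sketched as a simple partition of the selected playlists. The function name, the input shape (a dictionary from playlist ID to state), and the output keys are hypothetical; only the state-to-action mapping comes from the list above.

```python
from typing import Dict, List


def route_playlists_for_export(playlist_states: Dict[int, str]) -> Dict[str, List[int]]:
    """Split selected playlists into export-now, sync_download, and download_only."""
    routes: Dict[str, List[int]] = {
        "export_now": [],
        "sync_download": [],
        "download_only": [],
    }
    for playlist_id, state in playlist_states.items():
        if state == "downloaded":
            routes["export_now"].append(playlist_id)
        elif state == "unsynced":
            routes["sync_download"].append(playlist_id)
        elif state in ("not_downloaded", "partial", "downloading"):
            routes["download_only"].append(playlist_id)
        else:
            raise ValueError(f"unknown playlist state: {state}")
    return routes
```

A mixed selection can therefore produce an immediate export plus up to two queued jobs, one per non-empty job bucket.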

## 2026-04-19 Local ZIP Export + Adaptive Download

- Playlists page no longer shows a standalone `Sync Then Download` button.
- `Download Selected Playlists` is now adaptive:
  - `unsynced` playlists are routed to `sync_download`
  - already-synced but incomplete playlists are routed to `download_only`
  - mixed selections may create both a `download_job` and a `sync_download_job`
  - already-downloaded playlists can be skipped without forcing a re-download
- Export semantics now mean browser download to the operator's local machine:
  - modal `Export` downloads `GET /api/playlists/{playlist_id}/export.zip`
  - list `Export Selected` calls `POST /api/playlists/export-zip`
  - when every selected playlist is ready, the API returns `status=ready` plus `download_url`
  - when any selected playlist is not ready, the API returns `status=queued` plus job details instead of a partial ZIP
- Prepared bundle downloads are served by:
  - `GET /api/exports/bundles/{bundle_name}.zip`
- `GET /api/playlists/{playlist_id}/export-folder` remains available as an internal server-side folder refresh / inspection endpoint, but it is no longer the user-facing export action.