Initial import: Music_Server, MusicFree, catalog-sync

2026-05-23 16:51:14 +08:00
commit 069af30dba
847 changed files with 179878 additions and 0 deletions
@@ -0,0 +1,120 @@
# Catalog Sync Design
## Goal
Build an independent catalog sync and download workflow that:
- extracts playlist-square and toplist sources from NetEase, QQ Music, and Kuwo
- stores `playlist pool -> playlist -> song` and derived `artist pool -> artist -> song`
- skips duplicate downloads by `(platform, remote_song_id)`
- prefers highest available quality and falls back when needed
- supports pausing on low disk space and continuing in a new local directory
- keeps storage metadata compatible with local paths, cloud-drive paths, and bucket/key style object storage
## Scope
### In Scope
- Independent Python CLI entrypoint
- SQLite schema for catalog, file, and task state
- Source collectors for:
  - NetEase playlist square + toplists
  - QQ playlist square + toplists
  - Kuwo playlist square + toplists
- Reuse existing platform `parseplaylist()` and download logic where practical
- Derived artist pool updates during playlist sync
- Lazy artist enrichment metadata and hooks
- Local download dedupe and disk-space prompts
- Storage schema compatible with future uploads
### Out of Scope
- Full cross-platform song canonicalization
- GUI integration
- Production-ready 123 cloud upload implementation
- Streaming upload while downloading
## Constraints
- Prefer reuse of existing source clients under `musicdl.modules.sources`
- Avoid new mandatory dependencies where stdlib is sufficient
- Keep first version recoverable and inspectable from local files and SQLite
- Preserve compatibility with the existing `musicdl` package and console script
## Architecture
The new workflow lives in a dedicated package under `musicdl.catalogsync`. Collectors fetch playlist candidates per source and pool kind, then a sync layer normalizes and persists them. Playlist parsing reuses the existing per-platform clients to resolve tracks into `SongInfo` objects, which are then stored into catalog tables and used to derive artist pool membership. A download planner reads undispatched songs from the database, skips anything already represented by an active local file asset, and otherwise delegates the actual media fetch to existing source download logic.
Storage metadata is modeled with a logical file layer plus a location layer. `file_assets` describes the downloaded media version for a song, while `file_locations` records where that file lives. The first implementation only writes local locations, but the schema supports cloud-drive or bucket/key locations later without changing the song-level model.
## Data Model
### Catalog
- `playlist_pools`
- `playlists`
- `pool_playlists`
- `artist_pools`
- `artists`
- `pool_artists`
- `songs`
- `playlist_songs`
- `artist_songs`
### File and Storage
- `storage_backends`
- `file_assets`
- `file_locations`
- `download_tasks`
## Key Behaviors
### Playlist Sync
1. Fetch playlist-square and toplist candidates for selected sources.
2. Upsert pool rows and playlist rows.
3. Link pools to playlists.
4. For selected playlists, call platform `parseplaylist()` to resolve songs.
5. Upsert song rows and `playlist_songs`.
6. Extract artists from raw platform metadata when possible, otherwise from normalized singer strings.
7. Upsert artists and attach them to derived artist pools and `artist_songs`.
### Download Dedupe
- A song is considered already owned when it has an active local `file_location`.
- Dedupe key at song level is `(platform, remote_song_id)`.
- The first implementation keeps one preferred file asset per song. Future uploads add locations, not duplicate song rows.
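As a rough illustration of the dedupe rule above, the following sketch checks ownership with a join across a reduced schema. The design only names the tables, so the columns (`kind`, `is_active`, and so on) and the helpers `open_catalog` and `is_song_owned` are assumptions made for this example.

```python
import sqlite3

# Hypothetical, reduced schema: the design names these tables but not their columns.
SCHEMA = """
CREATE TABLE songs (
    id INTEGER PRIMARY KEY,
    platform TEXT NOT NULL,
    remote_song_id TEXT NOT NULL,
    UNIQUE (platform, remote_song_id)      -- song-level dedupe key
);
CREATE TABLE storage_backends (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL                     -- e.g. 'local', 'cloud_drive', 'object_storage'
);
CREATE TABLE file_assets (
    id INTEGER PRIMARY KEY,
    song_id INTEGER NOT NULL REFERENCES songs(id)
);
CREATE TABLE file_locations (
    id INTEGER PRIMARY KEY,
    file_asset_id INTEGER NOT NULL REFERENCES file_assets(id),
    backend_id INTEGER NOT NULL REFERENCES storage_backends(id),
    locator TEXT NOT NULL,
    is_active INTEGER NOT NULL DEFAULT 1
);
"""


def open_catalog(path=":memory:"):
    """Open a database and create the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def is_song_owned(conn, platform, remote_song_id):
    """Return True when the song has at least one active location on a local backend."""
    row = conn.execute(
        """
        SELECT 1
        FROM songs s
        JOIN file_assets a ON a.song_id = s.id
        JOIN file_locations l ON l.file_asset_id = a.id
        JOIN storage_backends b ON b.id = l.backend_id
        WHERE s.platform = ? AND s.remote_song_id = ?
          AND l.is_active = 1 AND b.kind = 'local'
        LIMIT 1
        """,
        (platform, remote_song_id),
    ).fetchone()
    return row is not None
```

A download planner could call `is_song_owned` before dispatching each song. A song that only has a non-local location is still treated as not owned locally.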
### Quality Selection
- Existing platform clients already attempt higher qualities first.
- The workflow treats the returned file as the chosen asset and persists:
  - quality label
  - extension
  - file size
  - hash when available or computable
### Low Disk Space
- Before each download, check free space for the active local backend.
- If insufficient, pause and prompt for a new local directory.
- Upsert a new local backend row and continue subsequent downloads there.
- Already downloaded files remain linked to their original backend.
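The low-space behavior above could look roughly like the sketch below. It uses the standard library's `shutil.disk_usage`. The function names, the prompt text, and the injectable `ask` and `free` parameters are assumptions added for illustration and testability.

```python
import os
import shutil


def free_bytes(directory):
    """Free space, in bytes, on the filesystem that holds `directory`."""
    return shutil.disk_usage(directory).free


def choose_download_dir(current_dir, required_bytes, ask=input, free=free_bytes):
    """Return a directory with enough free space, prompting until one is found.

    `ask` and `free` are injectable so the loop can be exercised without a
    terminal or a genuinely full disk.
    """
    directory = current_dir
    while free(directory) < required_bytes:
        answer = ask(f"Not enough space in {directory}. New download directory: ").strip()
        if not answer:
            continue  # empty answer: ask again
        os.makedirs(answer, exist_ok=True)
        directory = answer
    return directory
```

The caller would then upsert a backend row for the returned directory when it differs from the current one. Earlier files keep pointing at their original backend.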
### Future Upload Compatibility
- `storage_backends` represents local FS, cloud-drive roots, or object-storage containers.
- `file_locations.container_name + locator` can represent:
  - local root + relative path
  - cloud root + remote path
  - bucket + key
- Future upload jobs can attach new non-local locations to an existing `file_asset`.
## Acceptance Criteria
- Selected source collectors can persist playlist-square and toplist rows into SQLite.
- Playlist sync can populate songs and derived artists for, at minimum, every source in the supported set.
- Download command skips songs already backed by active local file locations.
- Low-space prompt can switch to a new local directory and continue.
- Tests cover schema creation, normalization, derived artist sync, dedupe checks, and collector parsing helpers.
@@ -0,0 +1,289 @@
# Download Layout And NAS Deployment Design
## Goal
Refine the current `musicdl.catalogsync` download flow so it can be deployed cleanly onto a NAS or any other Linux machine with:
- a portable script layout
- a machine-local `.env` configuration file
- a dedicated music library root separate from scripts and runtime state
- a download directory structure of `platform/first_artist/filename`
- path semantics that can be reused later by the upload workflow
This design intentionally focuses on download and deployment only. Upload automation is deferred to the next sub-project.
## Scope
### In Scope
- Introduce a portable deployment layout for NAS and other Linux targets
- Separate application/runtime files from downloaded music files
- Standardize local download paths as:
  - `<LIBRARY_DIR>/<platform>/<first_artist>/<filename>`
- Preserve relative path semantics in `file_locations.locator`
- Add machine-local configuration through `config/catalogsync.env`
- Add bootstrap and runtime script conventions suitable for copying to other machines
- Keep database and runtime files under the application home instead of the music library root
- Ensure required directories are auto-created when bootstrapping or running
### Out of Scope
- 123 cloud upload implementation
- Object storage upload implementation
- Concurrent download
- Concurrent upload
- Cross-platform song canonicalization
- GUI integration
- Deletion or migration of existing remote file locations
## Constraints
- Reuse the existing `musicdl.catalogsync` package and CLI as much as possible
- Keep the deployment scripts portable so they can be copied to another Linux machine
- Do not hardcode NAS-only paths inside the application logic
- Store machine-specific paths in configuration, not in source code
- Keep `file_locations.locator` stable so the future upload phase can reuse the same relative paths
## Deployment Model
### Local Repo Versus Target Machine
There are two kinds of scripts:
1. Bootstrap/deployment scripts that live in the repository and are run from the operator machine
2. Runtime scripts that are copied onto the target machine and used there repeatedly
This avoids the circular problem of requiring a target-side script before the target-side directories exist.
### Target Directory Layout
Recommended target layout:
```text
/volume4/Music_Cloud/
├─ library/
└─ catalogsync/
   ├─ app/
   ├─ bin/
   ├─ config/
   ├─ data/
   ├─ inputs/
   └─ logs/
```
Responsibilities:
- `library`
  - downloaded music files only
- `catalogsync/app`
  - synced code, virtual environment, and application files
- `catalogsync/bin`
  - target-side runtime scripts
- `catalogsync/config`
  - machine-local configuration such as `catalogsync.env`
- `catalogsync/data`
  - SQLite database
- `catalogsync/inputs`
  - playlist files and other operator-provided inputs
- `catalogsync/logs`
  - runtime logs
## Configuration Model
### Machine-Local Environment File
Each deployed machine should use a local config file:
```bash
ROOT_DIR=/volume4/Music_Cloud
APP_HOME=/volume4/Music_Cloud/catalogsync
LIBRARY_DIR=/volume4/Music_Cloud/library
DB_PATH=/volume4/Music_Cloud/catalogsync/data/catalogsync.db
INPUT_DIR=/volume4/Music_Cloud/catalogsync/inputs
LOG_DIR=/volume4/Music_Cloud/catalogsync/logs
PYTHON_BIN=python3
VENV_DIR=/volume4/Music_Cloud/catalogsync/app/.venv
DOWNLOAD_LAYOUT=platform_first_artist
```
### Configuration Rules
- `ROOT_DIR`
  - optional convenience root for deployment layout
- `APP_HOME`
  - runtime home for scripts, DB, logs, and inputs
- `LIBRARY_DIR`
  - physical location of downloaded music files
  - may be different from `ROOT_DIR`
- `DB_PATH`
  - defaults to `<APP_HOME>/data/catalogsync.db`
- `INPUT_DIR`
  - defaults to `<APP_HOME>/inputs`
- `LOG_DIR`
  - defaults to `<APP_HOME>/logs`
- `PYTHON_BIN`
  - interpreter used by runtime scripts
- `VENV_DIR`
  - target-side virtualenv path
- `DOWNLOAD_LAYOUT`
  - first supported value: `platform_first_artist`
This keeps deployment portable:
- copying to a new machine mainly requires updating `catalogsync.env`
- moving the music library only requires updating `LIBRARY_DIR`
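A loader for these rules might look like the following sketch. It assumes a plain `KEY=VALUE` file with no shell expansion, and it treats `APP_HOME` and `LIBRARY_DIR` as the required keys. The design does not fix either point, and the `python3` default for `PYTHON_BIN` is likewise an assumption taken from the sample file.

```python
import posixpath

# Assumption: these two keys are required; every other key has a default.
REQUIRED_KEYS = ("APP_HOME", "LIBRARY_DIR")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(text):
    """Validate required keys and fill in the documented defaults."""
    cfg = parse_env_text(text)
    missing = [key for key in REQUIRED_KEYS if not cfg.get(key)]
    if missing:
        raise ValueError("missing required setting(s): " + ", ".join(missing))
    home = cfg["APP_HOME"]
    cfg.setdefault("DB_PATH", posixpath.join(home, "data", "catalogsync.db"))
    cfg.setdefault("INPUT_DIR", posixpath.join(home, "inputs"))
    cfg.setdefault("LOG_DIR", posixpath.join(home, "logs"))
    cfg.setdefault("PYTHON_BIN", "python3")  # assumed default
    cfg.setdefault("DOWNLOAD_LAYOUT", "platform_first_artist")
    return cfg
```

With this shape, a file containing only `APP_HOME` and `LIBRARY_DIR` is enough to run on a new machine, and the error message names the variable that is missing.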
## Download Path Design
### Layout Rule
The first supported layout is:
```text
<LIBRARY_DIR>/<platform>/<first_artist>/<filename>
```
Examples:
```text
/volume4/Music_Cloud/library/netease/周杰伦/七里香.flac
/volume4/Music_Cloud/library/qq/林俊杰/江南.mp3
```
### Artist Directory Rule
- Use the first artist only
- Do not create multi-artist directory names in the first version
- If no artist is available, use a stable fallback such as `Unknown Artist`
This keeps paths shorter, more stable, and easier to reuse for upload.
### Locator Rule
`file_locations.locator` must store a path relative to `LIBRARY_DIR`.
Examples:
```text
netease/周杰伦/七里香.flac
qq/林俊杰/江南.mp3
```
This is important because the future upload phase will reuse the same relative path for:
- cloud-drive locators
- object-storage keys beneath a backend root prefix
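The layout, artist, and locator rules can be combined into one small path helper. The sketch below is illustrative only: the separator set used to find the first artist, the sanitization pattern, and the fallback names `unknown` and `untitled` are assumptions, not behavior taken from the existing code.

```python
import re
from pathlib import PurePosixPath

FALLBACK_ARTIST = "Unknown Artist"
# Assumption: characters treated as unsafe in a single path segment.
_UNSAFE = re.compile(r'[\\/:*?"<>|\x00-\x1f]')
# Assumption: separators that may join several artists in one singer string.
_ARTIST_SEPARATORS = re.compile(r"[,，、/&;]")


def sanitize(name, fallback):
    """Replace unsafe characters and trim surrounding spaces and dots."""
    cleaned = _UNSAFE.sub("_", name or "").strip(" .")
    return cleaned or fallback


def first_artist(singers):
    """Return the first non-empty artist name, or the fallback."""
    for part in _ARTIST_SEPARATORS.split(singers or ""):
        if part.strip():
            return part.strip()
    return FALLBACK_ARTIST


def build_locator(platform, singers, filename):
    """Relative path `platform/first_artist/filename` for `file_locations.locator`."""
    return str(
        PurePosixPath(
            sanitize(platform, "unknown"),
            sanitize(first_artist(singers), FALLBACK_ARTIST),
            sanitize(filename, "untitled"),
        )
    )


def absolute_path(library_dir, locator):
    """Join the machine-specific library root with the stored relative locator."""
    return str(PurePosixPath(library_dir) / locator)
```

Only `build_locator` output would be persisted; `absolute_path` is computed at run time from `LIBRARY_DIR`, which is what keeps the library movable. Splitting on `/` would mishandle artist names that legitimately contain a slash, so a real implementation may prefer structured artist metadata when it is available.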
## Directory Creation Behavior
When bootstrapping or first running on a machine, the system should auto-create any missing directories with `mkdir -p` semantics.
Required directories:
- `<ROOT_DIR>`
- `<LIBRARY_DIR>`
- `<APP_HOME>`
- `<APP_HOME>/app`
- `<APP_HOME>/bin`
- `<APP_HOME>/config`
- `<APP_HOME>/data`
- `<APP_HOME>/inputs`
- `<APP_HOME>/logs`
Rules:
- existing directories are reused without error
- missing directories are created automatically
- permission failures should produce a clear fatal error
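On the Python side, `mkdir -p` semantics correspond to `os.makedirs(..., exist_ok=True)`. The helper below is a minimal sketch; the name `ensure_layout` and the choice of `SystemExit` for the fatal error are assumptions.

```python
import os

# Subdirectories of APP_HOME listed in the design.
APP_SUBDIRS = ("app", "bin", "config", "data", "inputs", "logs")


def ensure_layout(app_home, library_dir, root_dir=None):
    """Create every required directory, reusing those that already exist."""
    targets = [path for path in (root_dir, library_dir, app_home) if path]
    targets += [os.path.join(app_home, name) for name in APP_SUBDIRS]
    for path in targets:
        try:
            os.makedirs(path, exist_ok=True)
        except OSError as exc:
            # Permission problems and similar failures stop the run with a clear message.
            raise SystemExit(f"fatal: cannot create directory {path}: {exc}")
    return targets
```

Calling the helper twice is harmless, so both the bootstrap script and each runtime script can invoke it unconditionally.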
## Script Model
### Repository-Side Bootstrap Scripts
The repository should contain deployment/bootstrap scripts that:
- connect to the target machine
- create the target directory layout
- copy application files
- create or refresh the runtime scripts
- create a config template if missing
These scripts must not hardcode a target path. Any path they embed may only be a default that the operator can override.
### Target-Side Runtime Scripts
After bootstrap, the target machine should contain reusable runtime scripts under:
```text
<APP_HOME>/bin
```
Initial examples:
- `download_all.sh`
- `download_from_file.sh`
Each runtime script should:
- load `config/catalogsync.env`
- ensure the required directories exist
- use `DB_PATH`, `INPUT_DIR`, `LOG_DIR`, and `LIBRARY_DIR`
- write logs to the configured log directory
## CLI And Application Semantics
The current code uses `--library-root` as the download root. This design prefers moving toward a configuration-driven deployment model where:
- runtime scripts supply the configured paths
- the application writes downloads into `LIBRARY_DIR`
- the DB lives under `APP_HOME/data`
The implementation may either:
- keep `--library-root` internally for compatibility while runtime scripts pass `LIBRARY_DIR`
- or introduce a cleaner root/app configuration layer as long as behavior stays aligned with this design
The important requirement is behavioral, not the exact CLI spelling:
- scripts and runtime state must stay separated from music files
- downloaded file locations must follow `platform/first_artist/filename`
## Error Handling
- Missing config file:
  - fail fast with a clear message pointing to `catalogsync.env`
- Missing required env values:
  - fail fast with a clear message naming the missing variable
- Missing artist data:
  - use fallback artist directory and continue
- Invalid filename/path characters:
  - sanitize to a filesystem-safe name
- Existing file in the destination path:
  - preserve current dedupe behavior through DB state and active local file records
- Directory creation failure:
  - fail fast with an actionable error
## Testing
Add or update coverage for:
- path-building helper for `platform/first_artist/filename`
- first-artist extraction behavior
- artist fallback behavior
- locator values remaining relative to `LIBRARY_DIR`
- directory auto-creation for deployment/runtime helpers
- runtime config loading from `catalogsync.env`
- download flow recording the new relative locator format in `file_locations`
## Acceptance Criteria
- Downloads are stored under `<LIBRARY_DIR>/<platform>/<first_artist>/<filename>`
- `file_locations.locator` stores the path relative to `LIBRARY_DIR`
- Application/runtime files are separate from music files
- A deployment can be copied to another Linux machine by adjusting `catalogsync.env`
- Bootstrap/runtime behavior auto-creates the expected directory structure
- Existing download logic still records local files into the catalog database
- The resulting local relative paths are suitable for reuse by the later upload implementation
@@ -0,0 +1,168 @@
# Playlist File Run Design
## Goal
Add a file-driven playlist execution path to `musicdl.catalogsync` so a user can provide a text file of playlist URLs and run the existing catalog sync and download pipeline against only those playlists.
The default behavior must remain unchanged when the new option is not used.
## Scope
### In Scope
- Add `--playlist-file` to the existing `run` command
- Support two input line formats:
  - raw playlist URL
  - `platform,playlist_url`
- Ignore blank lines and comment lines beginning with `#`
- Auto-detect `netease`, `qq`, or `kuwo` from URL when platform is omitted
- Deduplicate repeated playlist URLs within the same input file
- Import file playlists into the existing catalog tables
- Run sync and download only for playlists referenced by the file
- Keep song and file dedupe behavior exactly as it works today
### Out of Scope
- Incremental skip mode
- New collect-mode behavior
- New database tables for file imports
- GUI integration
- Upload automation
## Constraints
- Reuse the existing `playlists`, `playlist_pools`, and `pool_playlists` tables
- Preserve current `run` behavior when `--playlist-file` is absent
- Do not create duplicate playlist rows for the same `(platform, remote_playlist_id)`
- Do not widen download scope to the full database when a playlist file is used
- Keep implementation small and aligned with the current `catalogsync` package layout
## User-Facing Behavior
### Default Run Path
When `--playlist-file` is not provided:
1. `run` collects playlist pools from configured sources
2. `run` syncs playlists from the database
3. `run` downloads pending songs from the database
This matches the current behavior exactly.
### File-Driven Run Path
When `--playlist-file <path>` is provided:
1. Skip `collect`
2. Read and parse the file
3. Normalize and deduplicate the playlist entries from the file
4. Upsert those playlists into the existing catalog database
5. Attach them to a dedicated pool row representing the source file import
6. Sync only those playlist IDs
7. Download only songs belonging to those playlist IDs
## Input File Rules
Each non-empty, non-comment line must be one of:
```text
https://music.163.com/#/playlist?id=17745989905
qq,https://y.qq.com/n/ryqq/playlist/7707261125
```
Parsing rules:
- Leading and trailing whitespace is trimmed
- Blank lines are ignored
- `# ...` lines are ignored
- If a comma is present, split once into `platform` and `url`
- If no platform is provided, infer it from the URL
- Unsupported or unrecognized lines are reported and skipped
- Repeated URLs in the same file are processed only once
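The parsing rules above can be sketched as a small pure function. The host-to-platform table is partly an assumption: `music.163.com` and `y.qq.com` come from the sample lines, while `kuwo.cn` is a guess at the Kuwo host. The sketch also only splits on a comma when the line does not already start with `http`, a defensive deviation so that URLs containing commas are not cut in two.

```python
from urllib.parse import urlparse

SUPPORTED_PLATFORMS = {"netease", "qq", "kuwo"}
# Assumption: host suffixes used for platform auto-detection.
HOST_SUFFIXES = {"music.163.com": "netease", "y.qq.com": "qq", "kuwo.cn": "kuwo"}
_URL_PREFIXES = ("http://", "https://")


def infer_platform(url):
    """Guess the platform from the URL host, or return None."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, platform in HOST_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return platform
    return None


def parse_playlist_lines(lines):
    """Return (entries, skipped): unique (platform, url) pairs and rejected lines."""
    entries, skipped, seen = [], [], set()
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank and comment lines are ignored silently
        platform, url = "", line
        if "," in line and not line.lower().startswith(_URL_PREFIXES):
            platform, url = (part.strip() for part in line.split(",", 1))
        platform = platform.lower() or infer_platform(url)
        if platform not in SUPPORTED_PLATFORMS or not url.lower().startswith(_URL_PREFIXES):
            skipped.append((number, line))  # reported to the operator, then skipped
            continue
        if url in seen:
            continue  # same URL repeated in the file
        seen.add(url)
        entries.append((platform, url))
    return entries, skipped
```

Returning the skipped lines with their line numbers lets the `run` command print warnings and still fail fast when `entries` ends up empty.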
## Architecture
The feature should be implemented as a narrow branch off the existing `run` workflow.
Recommended units:
- A file parser helper that converts input lines into normalized playlist import entries
- A service method that imports manual playlists into the existing playlist catalog
- A service method that syncs only a provided list of playlist IDs
- A downloader method that queues only songs reachable from a provided list of playlist IDs
This keeps the current full-database path intact while adding a targeted path for file-based execution.
## Data Model
No new tables are required.
The imported playlists should reuse:
- `playlists`
- `playlist_pools`
- `pool_playlists`
Recommended pool representation:
- `pool_kind = manual_file`
- `external_id = manual_file:<resolved file path>`
- `name = Manual File Import: <filename>`
This preserves provenance without changing the main playlist model.
## Dedupe Behavior
### Playlist Rows
Duplicate playlist rows must not be created because `playlists` is already unique on `(platform, remote_playlist_id)`.
### Songs
Repeated sync of the same playlist may re-run parsing, but songs must continue to upsert by `(platform, remote_song_id)` and playlist-song links must remain unique by `(playlist_id, song_id)`.
### Files
Downloads must continue to rely on the existing `file_locations` and local-file checks so already downloaded songs are not fetched again.
## Error Handling
- Missing playlist file path: fail fast with a clear CLI error
- File exists but contains no valid playlist lines: fail fast with a clear CLI error
- Invalid individual lines: warn and skip, continue processing the rest
- Playlist parse failure for one playlist: log the failure, continue with the remaining playlists
- Download failure for one song: preserve the existing downloader behavior
## Output
The file-driven run path should report a compact summary including:
- total lines read
- valid playlist entries
- skipped invalid lines
- deduplicated playlist count
- synchronized song count
- downloaded song count
## Tests
Add coverage for:
- file parsing of URL-only and `platform,url` lines
- blank lines and comment handling
- same-file URL dedupe
- unsupported line handling
- `run --playlist-file ...` taking the file-driven branch instead of `collect`
- manual playlist import into a `manual_file` pool
- sync limited to provided playlist IDs
- download limited to songs linked to provided playlist IDs
- repeated execution not creating duplicate playlist rows or duplicate local file downloads
## Acceptance Criteria
- `run --playlist-file <path>` processes only playlists from the file
- omitting `--playlist-file` preserves current behavior
- duplicate URLs inside one file are processed once
- repeated runs do not create duplicate playlist rows
- repeated runs do not redownload already owned local files
- tests cover the file-driven branch and targeted sync/download behavior
@@ -0,0 +1,724 @@
# Catalogsync Operations Console Design
## Goal
Extend `musicdl.catalogsync` with a NAS-local web operations console that can:
- manage queue-based pipeline jobs for `collect`, `sync`, `download`, and `upload`
- show playlist pool and playlist execution status as `未完成 / 进行中 / 已完成 / 异常` (incomplete / in progress / completed / error)
- show worker-level live processing state, especially which song each worker is handling
- support global soft pause and resume across all active workers
- survive process crashes or NAS restarts without restarting the whole catalog from scratch
- allow retrying a single failed or interrupted song/item instead of rerunning the whole database
- manage `catalogsync.env` as the primary operator configuration source
This design targets an internal NAS console, not a public-facing multi-user product.
## Scope
### In Scope
- Add a NAS-local web console for `catalogsync`
- Add a database-backed job queue with exactly one active job at a time
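- Support these job templates:
  - `全链路` (full pipeline)
  - `仅采集` (collect only)
  - `仅同步` (sync only)
  - `同步+下载` (sync + download)
  - `仅下载` (download only)
  - `仅上传` (upload only)
  - `下载+上传` (download + upload)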
- Track job, stage, item, and worker state in SQLite
- Show dashboard, queue, playlist pool, worker, log, and config views
- Implement soft pause and resume
- Implement crash-safe recovery at job-item granularity
- Implement single-item retry and force-retry
- Version and edit `catalogsync.env` from the web console
- Reuse existing `musicdl.catalogsync` collectors, services, downloader, uploader, and storage model as much as possible
### Out of Scope
- Multi-user login or permissions
- Public internet exposure or hardened auth
- Multiple active jobs running at the same time
- Cross-machine worker distribution
- Arbitrary user-defined stage graphs
- Provider-specific cloud drive management beyond current object storage support
- Automatic deletion of local or remote files
- Editing business data such as songs or playlists directly from the UI
## Constraints
- The console runs on the NAS itself
- `catalogsync.env` remains the configuration source of truth
- A queued job must freeze the required runtime settings into a config snapshot so later env edits do not mutate in-flight work
- Recovery must resume from unfinished work items instead of rerunning all songs or all playlists
- Existing `musicdl.catalogsync` CLI and scripts must remain usable
- The first version should optimize for operational stability, inspectability, and recoverability over architecture purity
## Operator Model
### Deployment Model
The web console runs on the same NAS host that already owns:
- the SQLite database
- the local music library
- the logs directory
- the runtime scripts
- the object storage configuration
This avoids a remote-control architecture for v1 and keeps job control, log access, file state, and recovery local.
### Configuration Model
`catalogsync.env` remains the operator-managed source of truth.
The console may:
- display current env values
- validate and save new env revisions
- apply a previous env revision as the current file
Queued jobs must store a `config_snapshot_json` copy of the relevant settings so:
- existing queued or running jobs stay deterministic
- later env edits only affect newly created jobs
## Recommended Architecture
Use four layers:
1. `Web Console`
   - browser UI for dashboards, queue control, logs, and config management
2. `Management API`
   - serves data and accepts job or config commands
3. `Job Orchestrator / Runner`
   - single-process scheduler that owns queue progression, pause, resume, and recovery
4. `Existing Catalogsync Executors`
   - reuse `collect`, `sync`, `download`, and `upload` behavior from current package modules
### Why Not A Thin Shell Wrapper
Wrapping only `download_all.sh` and `upload_all.sh` would not reliably provide:
- worker-level current song visibility
- item-level retry
- fine-grained recovery after process crashes
- stable soft pause and resume
The console therefore needs first-class job and work-item tables instead of depending only on raw shell output.
## Job Model
### Active Job Policy
- only one job may be `running` at a time
- additional jobs stay `queued`
- a paused job may later resume and reclaim the active slot
This keeps:
- pause and resume semantics simple
- resource ownership clear
- crash recovery easier to reason about
### Job Templates
Supported templates and stage chains:
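- `全链路` (full pipeline)
  - `collect -> sync -> download -> upload`
- `仅采集` (collect only)
  - `collect`
- `仅同步` (sync only)
  - `sync`
- `同步+下载` (sync + download)
  - `sync -> download`
- `仅下载` (download only)
  - `download`
- `仅上传` (upload only)
  - `upload`
- `下载+上传` (download + upload)
  - `download -> upload`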
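One way to hold the template-to-stage mapping is a plain lookup table, as in the sketch below. The helper `plan_stages` and the shape of the rows it returns are assumptions, loosely modeled on the `job_stages` fields `seq_no`, `stage_type`, and `status`.

```python
# Stage chains exactly as listed above, keyed by the template label.
JOB_TEMPLATES = {
    "全链路": ("collect", "sync", "download", "upload"),
    "仅采集": ("collect",),
    "仅同步": ("sync",),
    "同步+下载": ("sync", "download"),
    "仅下载": ("download",),
    "仅上传": ("upload",),
    "下载+上传": ("download", "upload"),
}


def plan_stages(template):
    """Expand a template into ordered, pending stage rows (hypothetical row shape)."""
    try:
        chain = JOB_TEMPLATES[template]
    except KeyError:
        raise ValueError(f"unknown job template: {template!r}") from None
    return [
        {"seq_no": seq_no, "stage_type": stage_type, "status": "pending"}
        for seq_no, stage_type in enumerate(chain, start=1)
    ]
```

Keeping the chains in data rather than in branching code fits the out-of-scope note on arbitrary stage graphs: supporting a new template is a one-line change, but operators cannot compose their own.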
### Job Status
Recommended job statuses:
- `queued`
- `running`
- `pause_requested`
- `paused`
- `completed`
- `completed_with_errors`
- `failed`
- `canceled`
### Stage Status
Recommended stage statuses:
- `pending`
- `running`
- `pause_requested`
- `paused`
- `completed`
- `failed`
- `skipped`
### Work Item Status
Recommended item statuses:
- `pending`
- `running`
- `succeeded`
- `failed`
- `interrupted`
- `skipped`
- `canceled`
The work item is the recovery and retry granularity. This is what prevents a single failure from forcing a whole-catalog restart.
## Data Model
### Existing Table Reuse
Keep current business tables as the catalog truth:
- `playlist_pools`
- `playlists`
- `pool_playlists`
- `songs`
- `playlist_songs`
- `artists`
- `song_artists`
- `file_locations`
- `object_storage_backends`
These continue to answer:
- what playlists exist
- what songs belong to each playlist
- which files exist locally or remotely
The new console layer adds execution truth around them.
### New Table: `job_runs`
Purpose:
- represent one queued or active operator job
Recommended fields:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
job_type TEXT NOT NULL
status TEXT NOT NULL
priority INTEGER NOT NULL DEFAULT 100
requested_by TEXT
config_snapshot_json TEXT NOT NULL
sources TEXT
download_sources TEXT
playlist_scope_json TEXT
created_at TEXT DEFAULT CURRENT_TIMESTAMP
started_at TEXT
ended_at TEXT
last_error TEXT
resume_token TEXT
```
### New Table: `job_stages`
Purpose:
- track the stage-level execution status inside one job
Recommended fields:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
job_run_id INTEGER NOT NULL
stage_type TEXT NOT NULL
status TEXT NOT NULL DEFAULT 'pending'
seq_no INTEGER NOT NULL
total_items INTEGER NOT NULL DEFAULT 0
pending_items INTEGER NOT NULL DEFAULT 0
running_items INTEGER NOT NULL DEFAULT 0
success_items INTEGER NOT NULL DEFAULT 0
failed_items INTEGER NOT NULL DEFAULT 0
skipped_items INTEGER NOT NULL DEFAULT 0
started_at TEXT
ended_at TEXT
last_error TEXT
```
### New Table: `job_items`
Purpose:
- track the real execution unit for recovery and retry
Granularity by stage:
- `collect`
  - one pool/source fetch unit
- `sync`
  - one playlist expansion unit
- `download`
  - one song download unit
- `upload`
  - one file upload unit
Recommended fields:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
job_stage_id INTEGER NOT NULL
item_type TEXT NOT NULL
item_key TEXT NOT NULL
playlist_pool_id INTEGER
playlist_id INTEGER
song_id INTEGER
file_location_id INTEGER
status TEXT NOT NULL DEFAULT 'pending'
attempt_count INTEGER NOT NULL DEFAULT 0
max_attempts INTEGER NOT NULL DEFAULT 3
worker_id INTEGER
started_at TEXT
ended_at TEXT
last_error TEXT
last_error_code TEXT
payload_json TEXT
UNIQUE(job_stage_id, item_key)
```
### New Table: `job_workers`
Purpose:
- surface live worker state to the UI
- show which song each worker is processing
Recommended fields:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
job_run_id INTEGER NOT NULL
job_stage_id INTEGER
worker_name TEXT NOT NULL
status TEXT NOT NULL DEFAULT 'idle'
current_job_item_id INTEGER
current_song_id INTEGER
current_playlist_id INTEGER
current_display_text TEXT
heartbeat_at TEXT
last_progress_text TEXT
processed_count INTEGER NOT NULL DEFAULT 0
error_count INTEGER NOT NULL DEFAULT 0
```
### New Table: `job_commands`
Purpose:
- safely bridge UI actions and runner behavior
Recommended command types:
- `pause`
- `resume`
- `cancel`
- `retry_item`
- `force_retry_item`
Recommended fields:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
job_run_id INTEGER NOT NULL
command_type TEXT NOT NULL
target_item_id INTEGER
status TEXT NOT NULL DEFAULT 'pending'
created_at TEXT DEFAULT CURRENT_TIMESTAMP
applied_at TEXT
payload_json TEXT
```
### New Table: `job_events`
Purpose:
- structured audit trail for major runner events
Recommended event types include:
- `job_started`
- `stage_started`
- `item_started`
- `item_failed`
- `pause_requested`
- `resumed`
- `worker_heartbeat`
- `recovery_requeued`
### New Table: `job_logs`
Purpose:
- queryable log lines for the UI
Recommended fields:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
job_run_id INTEGER NOT NULL
job_stage_id INTEGER
worker_id INTEGER
level TEXT NOT NULL
message TEXT NOT NULL
created_at TEXT DEFAULT CURRENT_TIMESTAMP
```
### New Table: `config_revisions`
Purpose:
- keep revision history of `catalogsync.env`
Recommended fields:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
source_type TEXT NOT NULL DEFAULT 'env_file'
file_path TEXT NOT NULL
content_text TEXT NOT NULL
content_hash TEXT NOT NULL
created_at TEXT DEFAULT CURRENT_TIMESTAMP
applied_at TEXT
note TEXT
```
## UI Design
### Page 1: Dashboard
Show:
- current active job
- queue length
- downloaded song count
- uploaded file count
- failed item count
- per-stage summaries
- recent exceptions
- worker heartbeat overview
### Page 2: Job Center
Show:
- queued jobs
- running or paused job
- job template
- scope
- stage progression
- pause, resume, cancel controls
Allow:
- creating a new job from the supported templates
- changing priority of queued jobs if desired
### Page 3: Playlist Pools
Show:
- all playlist pools and playlists
- source platform
- pool kind
- song count
- downloaded count
- uploaded count
- main status
- current stage
- last processed time
- latest error summary
#### Derived Playlist Status Rules
Recommend deriving the main status as:
- `异常` (error)
  - any recent failed item exists for the playlist
- `进行中` (in progress)
  - any running or pause-requested item exists
- `未完成` (incomplete)
  - unfinished items remain but the playlist is not actively processing
- `已完成` (completed)
  - no unfinished item remains in the relevant pipeline scope
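The derivation rules above can be written as a short priority check. This sketch assumes the caller has already restricted the statuses to the relevant pipeline scope and to a suitable "recent" window for failures. It also assumes the rules are evaluated in the order listed, and that `pending` and `interrupted` are the statuses that count as unfinished.

```python
def derive_playlist_status(item_statuses):
    """Map the work-item statuses of one playlist to its main display status."""
    statuses = set(item_statuses)
    if "failed" in statuses:
        return "异常"      # error
    if statuses & {"running", "pause_requested"}:
        return "进行中"    # in progress
    if statuses & {"pending", "interrupted"}:
        return "未完成"    # incomplete
    return "已完成"        # completed
```

Under these assumptions a playlist with no items at all reports as completed, so a real implementation may want a separate state for playlists that have never been scheduled.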
### Page 4: Song Processing
Show:
- each worker and its current song
- failed songs
- interrupted songs
- retryable items
Allow:
- retry single item
- force-retry single item
- filter by stage, platform, playlist, or error state
### Page 5: Logs And Exceptions
Show:
- structured events
- text logs
- job-level and item-level errors
- stack traces or HTTP error summaries where available
### Page 6: Config Management
Show:
- current `catalogsync.env`
- parsed effective values
- validation errors
- revision history
Allow:
- save a new env revision
- re-apply a previous revision
Rule:
- config edits affect only future jobs unless an explicit resume override is supplied
## API Surface
Recommended management endpoints:
- `GET /api/dashboard`
- `GET /api/jobs`
- `POST /api/jobs`
- `GET /api/jobs/{id}`
- `POST /api/jobs/{id}/pause`
- `POST /api/jobs/{id}/resume`
- `POST /api/jobs/{id}/cancel`
- `GET /api/jobs/{id}/items`
- `POST /api/job-items/{id}/retry`
- `POST /api/job-items/{id}/force-retry`
- `GET /api/workers`
- `GET /api/playlists`
- `GET /api/playlists/{id}`
- `GET /api/logs`
- `GET /api/config/env`
- `PUT /api/config/env`
- `GET /api/config/revisions`
- `POST /api/config/revisions/{id}/apply`
- `GET /api/events/stream`
`/api/events/stream` should use server-sent events so the dashboard and worker pages can refresh without polling every table separately.
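The server-sent events wire format is simple enough to sketch without a web framework: each message is a few `field: value` lines terminated by a blank line. The helper name `sse_frame` and the JSON payload shape are assumptions. In the recommended stack, strings like these would be yielded from a streaming response with the `text/event-stream` media type.

```python
import json


def sse_frame(event, data, event_id=None):
    """Serialize one server-sent event as text for a text/event-stream response."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # json.dumps escapes newlines and non-ASCII characters by default,
    # so the payload always fits on a single data line.
    lines.append("data: " + json.dumps(data))
    return "\n".join(lines) + "\n\n"
```

For example, `sse_frame("worker_heartbeat", {"worker": 1}, event_id=7)` yields an `id` line, an `event` line, and one `data` line, followed by the blank line that ends the message.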
## Pause, Resume, And Recovery Rules
### Soft Pause
The only supported pause mode in v1 is soft pause.
Behavior:
- UI inserts a `pause` command
- the runner marks the job and current stage as `pause_requested`
- workers stop claiming new items
- any in-progress item is allowed to finish naturally
- once all workers are idle, the stage becomes `paused` and then the job becomes `paused`
This avoids half-written file state and keeps item completion boundaries clean.
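The soft-pause behavior can be sketched with a shared flag that workers consult only between items. The example below uses threads and an in-memory queue for brevity; the function name `run_workers` and its return shape are assumptions, and a real runner would claim items from `job_items` instead.

```python
import queue
import threading


def run_workers(items, handle, pause_requested, worker_count=2):
    """Process items until the queue drains or a pause is requested.

    Workers check `pause_requested` only before claiming a new item,
    so an item that is already in progress always runs to completion.
    """
    pending = queue.Queue()
    for item in items:
        pending.put(item)
    done, lock = [], threading.Lock()

    def worker():
        while not pause_requested.is_set():
            try:
                item = pending.get_nowait()
            except queue.Empty:
                return
            handle(item)
            with lock:
                done.append(item)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # all workers idle from here on

    remaining = []
    while True:
        try:
            remaining.append(pending.get_nowait())
        except queue.Empty:
            break
    state = "paused" if remaining else "completed"
    return state, done, remaining
```

Because the flag is never checked inside `handle`, no file is abandoned halfway, and the `remaining` list corresponds to the items that stay `pending` for a later resume.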
### Resume
Resume behavior:
- UI inserts a `resume` command
- the runner validates the job can continue
- the runner resets paused stage and job state back to `running`
- unstarted items stay `pending`
- succeeded items remain untouched
The resume action may optionally carry a limited override payload, such as a new library root after disk exhaustion.
### Crash Recovery
On runner startup:
1. find all jobs with status `running` or `pause_requested`
2. mark those jobs `paused`
3. find all `job_items` left in `running`
4. convert those items to `interrupted`
5. record a recovery event
After that:
- `succeeded` items remain done
- `pending` items remain pending
- `interrupted` items become eligible for retry or auto-requeue depending on stage policy
- `failed` items remain failed until explicit retry
This preserves progress without restarting the whole job or whole database.
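The startup recovery steps map to a handful of SQL statements inside one transaction. The sketch below uses a reduced schema with only the columns it touches. The `job_events` columns are hypothetical because the design lists event types but no fields, and reusing the `recovery_requeued` event type for the recovery record is also an assumption.

```python
import json
import sqlite3

# Reduced, partly hypothetical schema: only the columns this sketch needs.
SCHEMA = """
CREATE TABLE job_runs (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE job_items (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending',
    worker_id INTEGER
);
CREATE TABLE job_events (
    id INTEGER PRIMARY KEY,
    event_type TEXT NOT NULL,
    payload_json TEXT
);
"""


def recover_after_crash(conn):
    """Pause orphaned jobs, mark orphaned items interrupted, and log one event."""
    with conn:  # single transaction: committed on success, rolled back on error
        paused = conn.execute(
            "UPDATE job_runs SET status = 'paused' "
            "WHERE status IN ('running', 'pause_requested')"
        ).rowcount
        interrupted = conn.execute(
            "UPDATE job_items SET status = 'interrupted', worker_id = NULL "
            "WHERE status = 'running'"
        ).rowcount
        payload = json.dumps({"paused_jobs": paused, "interrupted_items": interrupted})
        conn.execute(
            "INSERT INTO job_events (event_type, payload_json) VALUES (?, ?)",
            ("recovery_requeued", payload),
        )
    return paused, interrupted
```

Running this once before the runner starts claiming work leaves `succeeded`, `pending`, and `failed` items untouched, which is what allows recovery without a full restart.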
## Retry Rules
### Single Item Retry
When the operator clicks retry for a failed or interrupted item:
- insert `job_commands.retry_item`
- clear execution fields on the target item
- set status back to `pending`
- increment `attempt_count` on the next worker claim
### Force Retry
Force retry is more aggressive:
- download stage may ignore an existing local mapping if the operator requests a fresh re-download
- upload stage may ignore an existing active remote mapping if the operator explicitly wants a re-upload
Force retry must stay item-scoped, never job-scoped.
## Disk Exhaustion Handling
If the downloader detects insufficient space:
- fail or interrupt the current download item
- pause the active job with a machine-readable reason such as `disk_full`
- surface a UI banner asking for a new library root override
After the operator supplies a new directory and clicks resume:
- the job continues only for unfinished items
- completed downloads are not restarted
- the currently failed song can be retried from scratch
This matches the requirement that one song may restart while the whole database must not restart.
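One way the downloader might detect insufficient space before writing a file is sketched below. The function name and the `reserve_bytes` safety margin are assumptions; only the `disk_full` reason string comes from the rules above.

```python
import shutil
from typing import Optional


def disk_pause_reason(root: str, needed_bytes: int, reserve_bytes: int = 0) -> Optional[str]:
    """Return 'disk_full' when root cannot hold needed_bytes plus a reserve."""
    free = shutil.disk_usage(root).free
    if free < needed_bytes + reserve_bytes:
        return "disk_full"
    return None


if __name__ == "__main__":
    # A 10 MiB download with a 512 MiB reserve under the current directory.
    print(disk_pause_reason(".", 10 * 1024 * 1024, reserve_bytes=512 * 1024 * 1024))
```

A non-`None` result would be stored as the job's pause reason so the UI banner can ask for a new library root.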
## Execution Strategy
### Stage Executors
Implement separate executor paths for:
- `collect`
- `sync`
- `download`
- `upload`
Recommended concurrency:
- `collect`
- low concurrency, v1 may stay serial
- `sync`
- low concurrency, v1 may stay serial
- `download`
- configurable worker pool
- `upload`
- configurable worker pool
### Reuse Strategy
Prefer reusing current catalogsync modules:
- `musicdl.catalogsync.services`
- `musicdl.catalogsync.downloader`
- `musicdl.catalogsync.uploader`
- `musicdl.catalogsync.repository`
The runner should orchestrate these modules rather than rewriting the domain logic from scratch.
## Technology Choice
### Backend
Recommended stack:
- `FastAPI`
- `Jinja2`
- `SQLite`
- `SSE` for live updates
### Frontend
Recommended rendering model:
- server-rendered pages with `Jinja2`
- `HTMX` for partial updates and action forms
- a small amount of vanilla JavaScript for log streaming and live worker refresh
Why this fits:
- NAS-local internal tool
- mainly operational tables and actions
- lower dependency and deployment complexity than a separate SPA
- easier to keep aligned with the existing Python-only project
## Verification Plan
The implementation should be verified at four levels:
1. unit tests
- state transitions
- retry rules
- recovery transforms
2. API integration tests
- job creation
- pause and resume
- item retry
- config revision flow
3. fault injection tests
- kill the runner mid-download and confirm item-level recovery
4. NAS smoke tests
- create jobs
- pause and resume
- crash and restart
- retry a single failed song
- change library directory after disk-full pause
## V1 Delivery Boundary
### Must Ship In V1
- queue-based single-active-job runner
- supported job templates
- dashboard, job center, playlist pools, song processing, logs, and config pages
- soft pause and resume
- crash-safe item-level recovery
- single-item retry and force-retry
- env revision history and apply flow
### Explicitly Deferred
- authentication
- multi-user permissions
- multiple active jobs
- distributed workers
- arbitrary stage composition
- automatic endless retries
- destructive file cleanup actions
## Open Follow-Up Items
Two source-coverage follow-ups remain outside this console design and should stay tracked separately:
- redeploy the local Kuwo toplist fallback fix to the NAS and backfill the missing collection or sync results
- repair QQ playlist square collection after the old endpoint started returning `parameter failed`
These belong to operational backlog work, not to the web console architecture itself.
@@ -0,0 +1,567 @@
# Object Storage Upload Automation Design
## Goal
Extend `musicdl.catalogsync` with a first-class object storage upload workflow that:
- uploads downloaded local files to an S3-compatible object storage backend
- preserves local files after upload
- mirrors the local relative path into the remote object key
- records remote locations in the catalog database
- tracks backend presence per song for fast lookup
- supports queue-based upload execution and limited concurrency
- updates `docs/catalogsync.md` alongside the implementation so operator docs stay current
This sub-project also introduces limited concurrent download so very large catalogs do not have to run fully serially.
## Scope
### In Scope
- Add a queue-based upload workflow for object storage backends
- Reuse `storage_backends`, `file_assets`, and `file_locations` as the primary storage model
- Add a song/backend presence summary table
- Add an upload task queue table
- Add CLI commands to register an object storage backend and upload files to it
- Support S3-compatible object storage as the first upload backend type
- Store non-secret backend configuration in the database
- Read secrets from environment variables at runtime
- Mirror local relative paths into remote object keys
- Keep local files after successful upload
- Mark remote object locations as non-primary while local files remain primary
- Support queue-based concurrent upload workers
- Add limited concurrent download workers
- When download space is exhausted, pause the whole download flow once, prompt for a new directory once, then continue later tasks under the new root
- Update `docs/catalogsync.md` to document the upload workflow, object storage backend configuration, and the new commands
### Out of Scope
- 123 cloud implementation
- Baidu Netdisk implementation
- Remote `HEAD` verification before every upload
- Automatic deletion of local files after upload
- Multi-backend upload in a single command
- GUI integration
- CDN upload orchestration beyond deriving an optional public URL
- Background daemon / scheduler service
## Constraints
- Keep the current `musicdl.catalogsync` data model as the source of truth
- Do not duplicate file location truth into `songs`
- Do not store secret access credentials in SQLite
- First upload backend must be generic S3-compatible object storage
- Default behavior must trust database state rather than querying remote object existence every time
- Upload behavior must preserve existing local download behavior
- Download and upload concurrency must remain limited and operator-controllable
## Recommended Architecture
Use the existing storage model as the base:
- `storage_backends`
- backend definition
- `file_assets`
- file-version identity
- `file_locations`
- concrete physical or remote locations
Add two new layers:
- `song_backend_presence`
- fast summary of whether a song has active files on a given backend
- `upload_tasks`
- queue of upload work items per file asset and target backend/key
Implement one new uploader component:
- `S3CompatibleUploader`
- resolves credentials from environment
- uploads a local file to a configured backend
- writes the resulting remote file location
- refreshes backend presence
Keep the user-facing CLI small:
- `register-object-backend`
- `upload`
Internally, `upload` should still be queue-driven:
1. enumerate missing remote uploads
2. enqueue deduplicated tasks
3. consume tasks with limited workers
## Data Model
### Existing Table Reuse
#### `storage_backends`
Object storage backends should reuse the current table with the following conventions:
- `backend_type = 'object_storage'`
- `name`
- stable operator-facing backend name, for example `main-s3`
- `container_name`
- object storage bucket name
- `base_path`
- unused for object storage, may remain `NULL`
- `config_json`
- non-secret configuration only
Recommended `config_json` keys:
- `endpoint`
- `region`
- `base_prefix`
- `addressing_style`
- `public_base_url`
- `credential_env_prefix`
Secrets must not be stored here.
#### `file_assets`
No semantic changes are required.
The upload unit stays aligned with the current model:
- one `file_asset` represents one concrete file version for a song
- if a song has multiple active local file versions, all of them are eligible for upload
#### `file_locations`
No structural redesign is required.
For object storage locations:
- `backend_id`
- target object storage backend
- `container_name`
- bucket
- `locator`
- object key
- `absolute_path`
- `NULL`
- `remote_file_id`
- optional, reserved for future provider-specific remote IDs
- `public_url`
- derived if backend config provides `public_base_url`
- `download_url`
- optional, first version may keep this `NULL`
- `status`
- `active`, `deleted`, or `failed`
- `is_primary`
- `0` for remote object storage in the first version
The local location remains:
- `status = 'active'`
- `is_primary = 1`
### New Table: `song_backend_presence`
Purpose:
- answer “does this song have active files on backend X?” quickly
- avoid pushing hard-coded backend presence fields into `songs`
Recommended schema:
```text
song_id INTEGER NOT NULL
backend_id INTEGER NOT NULL
has_active_file INTEGER NOT NULL DEFAULT 0
active_file_count INTEGER NOT NULL DEFAULT 0
primary_file_location_id INTEGER
updated_at TEXT DEFAULT CURRENT_TIMESTAMP
PRIMARY KEY(song_id, backend_id)
```
Rules:
- this is a derived summary table, not the source of truth
- truth still comes from `file_locations`
- refresh this row whenever a location on that song/backend becomes active or inactive
### New Table: `upload_tasks`
Purpose:
- queue upload work
- support retries, concurrency, and resumable batch execution
Recommended schema:
```text
id INTEGER PRIMARY KEY AUTOINCREMENT
file_asset_id INTEGER NOT NULL
source_location_id INTEGER NOT NULL
target_backend_id INTEGER NOT NULL
target_container_name TEXT
target_locator TEXT NOT NULL
status TEXT NOT NULL DEFAULT 'pending'
attempts INTEGER NOT NULL DEFAULT 0
last_error TEXT
queued_at TEXT DEFAULT CURRENT_TIMESTAMP
started_at TEXT
finished_at TEXT
updated_at TEXT DEFAULT CURRENT_TIMESTAMP
UNIQUE(file_asset_id, target_backend_id, target_locator)
```
Task granularity:
- one task = one local file asset version uploaded to one target backend/key
This keeps the queue aligned with the “upload all active file versions” requirement.
## Object Storage Key Rules
### Key Shape
The object key should mirror the local relative path beneath the configured backend prefix.
If:
- local relative path is `qq/Singer A/song-c.mp3`
- backend `base_prefix` is `music`
Then:
- remote key becomes `music/qq/Singer A/song-c.mp3`
### Why Mirror The Relative Path
- easiest to reconnect local and remote locations
- preserves the existing local organization
- keeps future CDN and migration mapping simple
- reuses the semantics already established in `file_locations.locator`
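The key rule above can be sketched as a small pure function. The name `build_object_key` is hypothetical, and rejecting absolute paths and `..` segments is an added safety assumption rather than a stated requirement.

```python
from pathlib import PurePosixPath, PureWindowsPath


def build_object_key(relative_path: str, base_prefix: str = "") -> str:
    """Join base_prefix and a local relative path into a '/'-separated key."""
    # PureWindowsPath splits on both "\\" and "/", so either local style works.
    local = PureWindowsPath(relative_path)
    if local.anchor or ".." in local.parts or not local.parts:
        raise ValueError(f"not a safe relative path: {relative_path!r}")
    key = PurePosixPath(*local.parts).as_posix()
    prefix = base_prefix.strip("/")
    return f"{prefix}/{key}" if prefix else key


if __name__ == "__main__":
    print(build_object_key("qq/Singer A/song-c.mp3", "music"))
```

Normalizing separators matters because the local library may live on a Windows path while object keys always use forward slashes.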
## Credential Model
### Database Versus Secrets
Store only non-secret backend config in SQLite.
Resolve secrets from environment variables using the backend's configured prefix.
Example:
- backend name: `main-s3`
- `credential_env_prefix = CATALOGSYNC_MAIN_S3`
Runtime lookup:
- `CATALOGSYNC_MAIN_S3_ACCESS_KEY_ID`
- `CATALOGSYNC_MAIN_S3_SECRET_ACCESS_KEY`
- `CATALOGSYNC_MAIN_S3_SESSION_TOKEN` optional
### Why This Model
- portable for long-running batch jobs
- safer than storing keys in SQLite
- works well across multiple machines and deployment targets
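The runtime lookup described above might look like the following sketch. The function name `resolve_credentials` and the returned dictionary keys are assumptions; the environment variable suffixes follow the example in this section.

```python
import os
from typing import Mapping, Optional


def resolve_credentials(prefix: str, environ: Optional[Mapping[str, str]] = None) -> dict:
    """Read S3-style credentials from <prefix>_* environment variables."""
    env = os.environ if environ is None else environ
    required = {
        "access_key_id": f"{prefix}_ACCESS_KEY_ID",
        "secret_access_key": f"{prefix}_SECRET_ACCESS_KEY",
    }
    missing = [name for name in required.values() if not env.get(name)]
    if missing:
        # Fail fast before any batch work starts.
        raise RuntimeError("missing credential environment variables: " + ", ".join(missing))
    creds = {key: env[name] for key, name in required.items()}
    # The session token is optional.
    creds["session_token"] = env.get(f"{prefix}_SESSION_TOKEN") or None
    return creds
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.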
## CLI Design
### `register-object-backend`
Purpose:
- create or update one object storage backend definition
Example:
```bash
musicdl-catalogsync register-object-backend \
--db 'D:\catalogsync\catalogsync.db' \
--backend main-s3 \
--endpoint https://s3.example.com \
--bucket music \
--base-prefix music \
--region auto \
--addressing-style auto \
--public-base-url https://cdn.example.com/music \
--credential-env-prefix CATALOGSYNC_MAIN_S3
```
Behavior:
- upsert backend by `name`
- set `backend_type='object_storage'`
- validate required non-secret config before writing
### `upload`
Purpose:
- default: upload all local active file versions that are missing on the target backend
- optionally filter by source platform, playlist range, and count
Example:
```bash
musicdl-catalogsync upload --db 'D:\catalogsync\catalogsync.db' --backend main-s3
musicdl-catalogsync upload --db 'D:\catalogsync\catalogsync.db' --backend main-s3 --sources netease,qq --limit 200
musicdl-catalogsync upload --db 'D:\catalogsync\catalogsync.db' --backend main-s3 --playlist-ids 12,15 --workers 4
```
Default semantics:
- trust database state
- do not do remote `HEAD` by default
- enqueue missing uploads
- consume queue with limited workers
### Download CLI Extension
Extend the existing `download` and `run` workflows with:
- `--workers`
First-version default:
- `download --workers 3`
- `upload --workers 4`
These defaults should remain conservative and configurable.
## Upload Execution Flow
### Phase 1: Candidate Selection
For the target backend:
- find all active local `file_locations`
- resolve their `file_asset`
- derive target object key from:
- backend `base_prefix`
- local relative path
- skip assets that already have an active remote location on the same backend/key
Selection must support:
- all local songs
- `--sources`
- `--playlist-ids`
- `--limit`
### Phase 2: Task Enqueue
For each missing remote file:
- insert or reuse a unique `upload_tasks` row
- set status to `pending` unless it is already `uploading` or `succeeded`
### Phase 3: Worker Claim
Each worker should:
- claim one `pending` task in a transaction
- move it to `uploading`
- set `started_at`
This must prevent duplicate worker claims.
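A transactional claim could be sketched as below, assuming each worker uses its own SQLite connection opened with `isolation_level=None` so that transactions are controlled explicitly. The function name `claim_next_task` is hypothetical; `BEGIN IMMEDIATE` takes the write lock before the read, so two workers cannot select the same row.

```python
import sqlite3
from typing import Optional


def claim_next_task(conn: sqlite3.Connection) -> Optional[int]:
    """Atomically move one pending task to 'uploading' and return its id."""
    conn.execute("BEGIN IMMEDIATE")  # acquire the write lock up front
    try:
        row = conn.execute(
            "SELECT id FROM upload_tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            conn.execute("COMMIT")
            return None
        conn.execute(
            "UPDATE upload_tasks SET status = 'uploading', "
            "started_at = CURRENT_TIMESTAMP, updated_at = CURRENT_TIMESTAMP "
            "WHERE id = ?",
            (row[0],),
        )
        conn.execute("COMMIT")
        return row[0]
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Workers would call this in a loop until it returns `None`; a busy timeout on the connection lets competing workers wait for the lock instead of failing immediately.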
### Phase 4: Upload
For each claimed task:
- resolve source local file from `source_location_id`
- validate that the file still exists
- resolve backend config
- resolve credentials from environment
- upload to S3-compatible storage
### Phase 5: Writeback
On success:
- write or upsert the remote `file_location`
- set remote `status='active'`
- keep remote `is_primary=0`
- refresh `song_backend_presence`
- mark task `succeeded`
- set `finished_at`
## Upload Task State Machine
Use these first-version task states:
- `pending`
- `uploading`
- `succeeded`
- `failed`
- `skipped`
State transitions:
- enqueue → `pending`
- worker claim → `uploading`
- success with DB writeback → `succeeded`
- upload error or writeback error → `failed`
- no-op due to already-active remote location → `skipped`
Retry model:
- store `attempts`
- store `last_error`
- later `upload` runs may requeue or retry `failed` tasks under a bounded retry rule
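The transitions above can be captured in a small lookup table. This is a sketch under assumptions: the design does not say from which state `skipped` is reached, so both `pending` and `uploading` are allowed here, and the `max_attempts=3` retry bound is a placeholder value.

```python
ALLOWED_TRANSITIONS = {
    "pending": {"uploading", "skipped"},
    "uploading": {"succeeded", "failed", "skipped"},
    "failed": {"pending"},  # bounded requeue by a later `upload` run
    "succeeded": set(),
    "skipped": set(),
}


def next_status(current: str, new: str, attempts: int = 0, max_attempts: int = 3) -> str:
    """Validate one task state change and return the new status."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    if current == "failed" and attempts >= max_attempts:
        raise ValueError("retry budget exhausted")
    return new
```

Centralizing the table keeps the repository layer from writing states the runner does not expect.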
## Backend Presence Refresh Rules
Whenever a remote location changes on `(song_id, backend_id)`:
- count active locations for that song/backend
- update `has_active_file`
- update `active_file_count`
- set `primary_file_location_id` to a preferred active location on that backend
First version preference rule:
- if any active location exists on that backend, pick one deterministic row, for example the smallest active `file_locations.id`
This table exists for fast lookup and operator queries, not for deciding the actual upload truth.
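A refresh for one `(song_id, backend_id)` pair might be sketched like this. It assumes `file_locations` carries `file_asset_id`, `backend_id`, and `status`, that `file_assets` carries `song_id`, and that the SQLite build supports `ON CONFLICT ... DO UPDATE` (3.24 or newer); the function name is hypothetical.

```python
import sqlite3


def refresh_presence(conn: sqlite3.Connection, song_id: int, backend_id: int) -> None:
    """Recompute the derived presence row from file_locations."""
    count, first_id = conn.execute(
        "SELECT COUNT(*), MIN(fl.id) "
        "FROM file_locations fl "
        "JOIN file_assets fa ON fa.id = fl.file_asset_id "
        "WHERE fa.song_id = ? AND fl.backend_id = ? AND fl.status = 'active'",
        (song_id, backend_id),
    ).fetchone()
    with conn:  # one transaction for the upsert
        conn.execute(
            "INSERT INTO song_backend_presence "
            "(song_id, backend_id, has_active_file, active_file_count, "
            " primary_file_location_id, updated_at) "
            "VALUES (?, ?, ?, ?, ?, CURRENT_TIMESTAMP) "
            "ON CONFLICT(song_id, backend_id) DO UPDATE SET "
            " has_active_file = excluded.has_active_file, "
            " active_file_count = excluded.active_file_count, "
            " primary_file_location_id = excluded.primary_file_location_id, "
            " updated_at = excluded.updated_at",
            (song_id, backend_id, int(count > 0), count, first_id),
        )
```

`MIN(fl.id)` implements the deterministic "smallest active id" preference and naturally yields `NULL` when no active location remains.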
## Limited Concurrency Design
### Download Concurrency
Add limited worker-based download concurrency.
Key rule:
- disk-space exhaustion must trigger one global pause, not one prompt per worker
Behavior:
1. workers process queued download items
2. if a worker detects insufficient space under the current active root:
- raise a shared pause request
- stop dispatching new tasks
3. prompt the operator once for a new download directory
4. switch the shared active root
5. resume remaining not-yet-started tasks under the new root
Non-goals:
- per-worker independent root switching
- automatic multi-root balancing in the first version
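The "one global pause, one prompt" rule could be sketched with a shared lock, as below. The class name `RootSwitcher` and the `prompt` callback are assumptions; in the real workflow the prompt would be the operator interaction. Workers that hit the same exhausted root after the switch simply receive the new root without prompting again.

```python
import threading
from typing import Callable, Optional


class RootSwitcher:
    """Shared active download root with a single directory-switch prompt."""

    def __init__(self, root: str, prompt: Callable[[str], Optional[str]]):
        self._lock = threading.Lock()
        self._root = root
        self._prompt = prompt
        self._aborted = False

    @property
    def root(self) -> str:
        with self._lock:  # blocks while a prompt is in progress
            return self._root

    def report_full(self, failed_root: str) -> str:
        """Called by a worker that ran out of space under failed_root."""
        with self._lock:
            if self._aborted:
                raise RuntimeError("no replacement directory supplied")
            if self._root != failed_root:
                return self._root  # another worker already switched
            new_root = self._prompt(failed_root)
            if not new_root:
                self._aborted = True  # fail the remaining batch clearly
                raise RuntimeError("no replacement directory supplied")
            self._root = new_root
            return new_root
```

Holding the lock during the prompt is what pauses dispatch: any worker asking for the current root waits until the operator has answered.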
### Upload Concurrency
Upload workers should process queue rows concurrently but conservatively.
Requirements:
- claim tasks transactionally
- prevent duplicate uploads of the same `(file_asset_id, target_backend_id, target_locator)`
- keep worker count operator-controlled
## Error Handling
### Upload Errors
- missing source file
- mark task `failed`
- set descriptive `last_error`
- missing backend config
- fail fast before batch execution
- missing environment credentials
- fail fast before batch execution
- upload transport error
- mark task `failed`
- upload succeeded but DB writeback failed
- mark task `failed`
- store explicit `last_error` explaining that remote upload may already exist
### Download Errors
- worker download failure
- record failure for that item and continue with other tasks
- insufficient disk space
- trigger one global directory-switch prompt
- no replacement directory supplied
- fail the remaining batch clearly
## Testing
Add or update coverage for the following areas.
### Schema Tests
- `song_backend_presence` exists
- `upload_tasks` exists
- unique constraint on upload task dedupe works
### Repository Tests
- register or upsert object storage backends
- write remote `file_locations`
- refresh `song_backend_presence`
- enqueue deduplicated upload tasks
- select pending upload candidates by backend, source, playlist, and limit
### Uploader / Service Tests
Using a fake or stub S3-compatible client:
- successful upload creates active remote location
- public URL derivation when configured
- missing source file becomes `failed`
- missing credentials fail fast
- multiple local file versions for one song are all enqueued
### CLI Tests
- `register-object-backend`
- `upload --backend ...`
- `upload --sources ...`
- `upload --playlist-ids ...`
- `upload --limit ...`
- `upload --workers ...`
- `download --workers ...`
### Concurrency Tests
- concurrent upload workers do not claim the same task twice
- concurrent download workers trigger only one directory switch prompt
- after directory switch, later downloads use the new root
### Documentation Tests
- `docs/catalogsync.md` is updated to describe:
- object storage backend registration
- upload command usage
- queue semantics
- environment variable credential model
- download/upload worker options
## Documentation Requirements
Implementation must update `docs/catalogsync.md` to include:
- why object storage uses backend config plus env-based secrets
- how to register an object storage backend
- how remote keys mirror local relative paths
- how `upload` works by default
- what `song_backend_presence` and `upload_tasks` are for
- how `--workers` affects download and upload
- how the global download directory switch behaves under low disk space
## Acceptance Criteria
- An operator can register an S3-compatible object storage backend without storing secrets in SQLite
- `upload` can enqueue and execute uploads for missing remote files on that backend
- Remote object keys mirror local relative paths beneath the configured backend prefix
- Successful uploads create active remote `file_locations`
- Local files remain active and primary after upload
- `song_backend_presence` shows whether a song has active files on a given backend
- `upload_tasks` supports resumable queue execution with bounded retries
- The first version uploads all active local file versions for a song, not just one version
- `upload` supports both full backend fill-in mode and filtered mode
- Download and upload both support limited operator-configurable concurrency
- Low disk space during download triggers one global prompt and one shared root switch for later tasks
- `docs/catalogsync.md` is updated together with the implementation
@@ -0,0 +1,571 @@
# Playlist Selective Download Design
## Goal
Extend the `catalogsync` operations console so operators can download songs by selected playlists instead of relying on uncontrolled full-library download runs.
The new flow must allow the operator to:
- browse playlists through a paginated playlist-pool page
- filter playlists by download state
- select playlists on the current page
- run either `download already-synced songs` or `sync then download` for the selected playlists
- persist a separate `wanted for download` marker for playlists that should remain in a long-term queue
This design keeps the existing job system and downloader intact wherever possible.
## Scope
### In Scope
- Upgrade `/playlists` from a read-only list into a playlist-pool management page
- Add server-side pagination to the playlist page
- Add playlist filtering by:
- platform
- pool kind
- keyword
- download state
- wanted marker
- Add current-page checkbox selection and current-page select-all
- Add bulk actions:
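- `下载已同步所选歌单` (download the already-synced songs of the selected playlists)
- `同步后下载所选歌单` (sync, then download the selected playlists)
- `加入待下载清单` (add to the wanted-for-download list)
- `移出待下载清单` (remove from the wanted-for-download list)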
- Add a persistent playlist-level preference table for the wanted marker
- Reuse existing `download_only` and `sync_download` jobs by passing `playlist_scope.playlist_ids`
- Compute playlist download state from live catalog and runner data
### Out of Scope
- Cross-page remembered temporary selection
- Saved named selection sets
- Manual editing of computed playlist download state
- New downloader semantics outside the existing job system
- Per-playlist download history pages
- Automatic cancellation or reprioritization of in-flight jobs in this design
## User Decisions Captured
This design encodes the following confirmed product decisions:
- Playlist download state uses multiple states instead of a simple downloaded flag
- The operator needs both:
- temporary current-page selection for immediate actions
- persistent playlist-level wanted markers
- If a song appears in multiple playlists, it counts as downloaded for all of them once the same `song_id` has an active local file
- Playlists with no synced `playlist_songs` must show a dedicated `未同步` (not synced) state
- The playlist page must support pagination, page-level select-all, and download-state filtering
- `下载中` (downloading) is shown only when there is active running download work for songs belonging to that playlist
- The state filter set is:
- `全部` (all)
- `未同步` (not synced)
- `未下载` (not downloaded)
- `下载中` (downloading)
- `部分已下载` (partially downloaded)
- `已下载` (downloaded)
## Constraints
- Existing `catalogsync` job queue remains the only execution path
- Only one active job still runs at a time
- Existing `download_only` and `sync_download` job types should remain valid and reusable
- SQLite remains the backing store
- The first version should optimize for operational clarity and low migration risk over advanced UX
- The implementation should avoid full-library recomputation for every playlist page load because the NAS dataset is already large
## Existing System Reuse
The current codebase already provides two critical capabilities that should be reused instead of reinvented:
1. `playlist_scope.playlist_ids` already exists on jobs
2. download planning already supports filtering by `playlist_ids`
Relevant current behavior:
- `download_only` is already a first-class job type
- `sync_download` is already a first-class job type
- `OpsRunner` already resolves `playlist_scope.playlist_ids`
- `DownloadPlanner.build_download_queue()` already accepts `playlist_ids`
- `CatalogRepository.list_pending_download_songs()` already supports `playlist_ids`
Because of this, the main work is playlist management UI, playlist-state aggregation, and lightweight playlist preference persistence.
## Recommended Approach
Use a mixed model:
- compute playlist state live for the current result page
- persist only playlist-level operator intent (`wanted for download`)
- use current bulk selections only as transient request payload
### Why This Approach
This approach avoids two bad extremes:
- **Pure runtime-only UI** would lose long-term operator intent such as a curated wanted list
- **Full cached playlist-state tables** would add a large invalidation burden after every sync, download, retry, or file-state change
The mixed approach gives:
- correct and current state for the visible page
- minimal schema change
- low-risk reuse of the current pipeline
## Data Model
### New Table: `playlist_download_preferences`
Purpose:
- persist operator intent for playlists that should stay in a long-term wanted queue
Recommended fields:
```text
playlist_id INTEGER PRIMARY KEY
is_wanted INTEGER NOT NULL DEFAULT 1
marked_by TEXT
created_at TEXT DEFAULT CURRENT_TIMESTAMP
updated_at TEXT DEFAULT CURRENT_TIMESTAMP
```
Notes:
- use one row per playlist, not an event log
- `playlist_id` should be unique and serve as the primary key
- removing a marker may either delete the row or set `is_wanted = 0`; both are acceptable, and keeping the row is preferable only if it simplifies auditing
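Repository helpers for the marker could look like the sketch below. It takes the "keep the row and flip `is_wanted`" option, which is only one of the two acceptable choices; the function names are hypothetical and the upsert needs SQLite 3.24 or newer.

```python
import sqlite3
from typing import Iterable, Optional


def mark_wanted(conn: sqlite3.Connection, playlist_ids: Iterable[int],
                marked_by: Optional[str] = None) -> None:
    """Insert or re-enable one preference row per playlist."""
    with conn:
        conn.executemany(
            "INSERT INTO playlist_download_preferences (playlist_id, is_wanted, marked_by) "
            "VALUES (?, 1, ?) "
            "ON CONFLICT(playlist_id) DO UPDATE SET "
            " is_wanted = 1, marked_by = excluded.marked_by, "
            " updated_at = CURRENT_TIMESTAMP",
            [(pid, marked_by) for pid in playlist_ids],
        )


def unmark_wanted(conn: sqlite3.Connection, playlist_ids: Iterable[int]) -> None:
    """Disable the marker but keep the row for auditing."""
    with conn:
        conn.executemany(
            "UPDATE playlist_download_preferences "
            "SET is_wanted = 0, updated_at = CURRENT_TIMESTAMP "
            "WHERE playlist_id = ?",
            [(pid,) for pid in playlist_ids],
        )
```

Both helpers are idempotent, so repeating a bulk action from the page does not create duplicate rows.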
### No Cached Playlist State Table
Do not add a second table that stores computed playlist state such as `已下载 / 未下载 / 部分已下载`.
Reason:
- those values depend on current `playlist_songs`
- current local file availability can change
- current running download items can change
The state should therefore be computed from live data for the current page.
## Playlist State Model
For each playlist row, calculate:
- `song_count`
- `downloaded_song_count`
- `running_download_song_count`
- `is_wanted`
### State Rules
- `未同步`
- `song_count = 0`
- `下载中`
- `song_count > 0`
- `running_download_song_count > 0`
- `未下载`
- `song_count > 0`
- `downloaded_song_count = 0`
- `running_download_song_count = 0`
- `部分已下载`
- `song_count > 0`
- `0 < downloaded_song_count < song_count`
- `running_download_song_count = 0`
- `已下载`
- `song_count > 0`
- `downloaded_song_count = song_count`
- `running_download_song_count = 0`
### Downloaded Song Counting Rule
For one playlist song entry, the song is treated as downloaded if:
- the same `song_id` has an active local file location
It does not matter which playlist originally triggered that file download.
### Running Download Counting Rule
For one playlist song entry, the song is treated as currently downloading if:
- there is a `running` job item in stage `download`
- that item points to the same `song_id`
Queued-but-not-running work does not count as `下载中`.
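The state rules reduce to a short precedence check, sketched here. The function name is hypothetical; the returned strings are the state labels defined above, and `下载中` takes priority whenever any download item for the playlist is running.

```python
def playlist_state(song_count: int, downloaded_song_count: int,
                   running_download_song_count: int) -> str:
    """Derive the display state for one playlist row from its counts."""
    if song_count == 0:
        return "未同步"      # not synced
    if running_download_song_count > 0:
        return "下载中"      # downloading
    if downloaded_song_count == 0:
        return "未下载"      # not downloaded
    if downloaded_song_count < song_count:
        return "部分已下载"  # partially downloaded
    return "已下载"          # downloaded
```

Because the function depends only on three counts, the same logic can back both the row rendering and the `status` filter.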
## Playlist Page Design
### URL
Keep the main page at:
```text
/playlists
```
### Query Parameters
Support these server-side filters:
- `page`
- `page_size`
- `platform`
- `pool_kind`
- `status`
- `keyword`
- `wanted_only`
### Default Pagination
Recommended defaults:
- default `page_size = 50`
- allow `20 / 50 / 100`
Reason:
- current NAS data already contains more than ten thousand playlists
- a fixed `LIMIT 500` list will become increasingly unusable
### Table Columns
Recommended visible columns:
- checkbox
- playlist id
- platform
- remote playlist id
- playlist name
- pool names
- song count
- downloaded song count
- computed state
- wanted marker
- updated at
### Toolbar Actions
Recommended top-toolbar controls:
- platform filter
- pool-kind filter
- state filter
- keyword search
- wanted-only filter
- page-size selector
### Bulk Action Groups
#### Temporary Selection Actions
Apply only to the currently selected playlist ids:
- `下载已同步所选歌单`
- `同步后下载所选歌单`
#### Persistent Marker Actions
Apply only to the currently selected playlist ids:
- `加入待下载清单`
- `移出待下载清单`
### Selection Behavior
The first version should support only current-page temporary selection.
Rules:
- `全选本页` (select all on this page) selects all rows visible on the current page
- changing page clears temporary selection
- filters changing the result page clear temporary selection
- persistent wanted markers remain stored independently of temporary selection
This keeps implementation simple and predictable.
## API Design
### `GET /api/playlists`
Purpose:
- return one filtered page of playlist rows with computed state
Request parameters:
- `page`
- `page_size`
- `platform`
- `pool_kind`
- `status`
- `keyword`
- `wanted_only`
Response shape:
```json
{
"items": [
{
"id": 123,
"platform": "qq",
"remote_playlist_id": "456",
"name": "Example Playlist",
"pool_names": "QQ 音乐歌单广场",
"song_count": 120,
"downloaded_song_count": 80,
"state": "部分已下载",
"is_wanted": true,
"updated_at": "2026-04-17 00:00:00"
}
],
"page": 1,
"page_size": 50,
"total": 12345
}
```
### `POST /api/playlists/mark-wanted`
Purpose:
- persist wanted markers for the specified playlists
Request body:
```json
{
"playlist_ids": [1, 2, 3],
"marked_by": "ops-console"
}
```
### `POST /api/playlists/unmark-wanted`
Purpose:
- remove or disable wanted markers for the specified playlists
Request body:
```json
{
"playlist_ids": [1, 2, 3]
}
```
### `POST /api/playlists/download`
Purpose:
- create a `download_only` job scoped to selected playlist ids
Request body:
```json
{
"playlist_ids": [1, 2, 3],
"requested_by": "ops-console"
}
```
Behavior:
- create one `download_only` job
- store `playlist_scope.playlist_ids = [...]`
- do not include playlists that are not selected
### `POST /api/playlists/sync-download`
Purpose:
- create a `sync_download` job scoped to selected playlist ids
Request body:
```json
{
"playlist_ids": [1, 2, 3],
"requested_by": "ops-console"
}
```
Behavior:
- create one `sync_download` job
- store `playlist_scope.playlist_ids = [...]`
## Interaction Rules
### `下载已同步所选歌单`
This action should:
- create a `download_only` job for the selected `playlist_ids`
- operate only on songs already present in `playlist_songs`
Playlists in state `未同步` contribute no songs and therefore effectively produce no download work.
The UI should make this explicit instead of pretending those playlists are downloading.
### `同步后下载所选歌单`
This action should:
- create a `sync_download` job for the selected `playlist_ids`
- sync playlist songs first
- then download only missing songs from those playlists
### Wanted Marker UX
The wanted marker is not itself a download state.
It is a separate operator-intent flag.
A playlist may therefore be:
- `已下载` and still marked wanted
- `未同步` and marked wanted
- `部分已下载` and not marked wanted
This separation avoids overloading one column with two different meanings.
## Query and Aggregation Strategy
### Page-First Aggregation
Do not compute states for the whole library on each request.
Instead:
1. query only the playlist ids for the requested page
2. run aggregation queries only for those playlist ids
3. merge the counts into the returned rows
This keeps response cost proportional to current page size instead of full library size.
### Aggregations Needed Per Page
For the current page playlist ids:
- playlist song totals from `playlist_songs`
- downloaded song totals from:
- `playlist_songs`
- `file_assets`
- `file_locations`
- `storage_backends`
- running download song totals from:
- `playlist_songs`
- `job_items`
- `job_stages`
- wanted markers from `playlist_download_preferences`
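Page-first aggregation might be sketched as below for the song and downloaded totals. Several names are assumptions: a `playlists` table keyed by `id`, a local backend identified by `storage_backends.backend_type = 'local'`, and the helper name `page_counts`. Filters, running-download totals, and wanted markers are omitted to keep the sketch short.

```python
import sqlite3


def page_counts(conn: sqlite3.Connection, page: int, page_size: int) -> dict:
    """Return per-playlist counts for one page only."""
    offset = (page - 1) * page_size
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM playlists ORDER BY id LIMIT ? OFFSET ?", (page_size, offset))]
    if not ids:
        return {}
    marks = ",".join("?" * len(ids))  # one placeholder per playlist id on the page
    counts = {pid: {"song_count": 0, "downloaded_song_count": 0} for pid in ids}

    total_sql = (
        "SELECT playlist_id, COUNT(DISTINCT song_id) FROM playlist_songs "
        f"WHERE playlist_id IN ({marks}) GROUP BY playlist_id"
    )
    for pid, total in conn.execute(total_sql, ids):
        counts[pid]["song_count"] = total

    # A song counts as downloaded once any active local location exists for it.
    downloaded_sql = (
        "SELECT ps.playlist_id, COUNT(DISTINCT ps.song_id) "
        "FROM playlist_songs ps "
        "JOIN file_assets fa ON fa.song_id = ps.song_id "
        "JOIN file_locations fl ON fl.file_asset_id = fa.id AND fl.status = 'active' "
        "JOIN storage_backends sb ON sb.id = fl.backend_id AND sb.backend_type = 'local' "
        f"WHERE ps.playlist_id IN ({marks}) GROUP BY ps.playlist_id"
    )
    for pid, done in conn.execute(downloaded_sql, ids):
        counts[pid]["downloaded_song_count"] = done
    return counts
```

With `page_size` capped at 100, the `IN (...)` list stays far below SQLite's bound-parameter limit.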
### Index Expectations
Add or verify indexes for:
- `pool_playlists(playlist_id)`
- `pool_playlists(pool_id)`
- `playlist_songs(playlist_id)`
- `playlist_songs(song_id)`
- `file_assets(song_id)`
- `file_locations(file_asset_id, status)`
- `job_items(song_id, status)`
- `job_stages(id, stage_type)`
- `playlist_download_preferences(playlist_id)` (already covered if `playlist_id` is the primary key)
- `playlist_download_preferences(is_wanted)`
## Error Handling
### Empty Selection
Bulk actions should reject empty `playlist_ids` with a validation error.
### Unknown Playlist IDs
If unknown ids are passed:
- ignore ids that do not exist
- fail only if the final valid set is empty
### Duplicate Playlist IDs
Normalize to unique ids before processing.
### Large Selection on One Page
The selected ids are page-scoped and therefore bounded by `page_size`.
This makes bulk requests predictable and low risk.
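The selection rules above (reject empty input, drop duplicates, ignore unknown ids, fail only when nothing valid remains) can be sketched as one helper. The name `normalize_playlist_ids` and the use of `ValueError` for validation failures are assumptions.

```python
from typing import Collection, Iterable, List


def normalize_playlist_ids(requested: Iterable[int], existing: Collection[int]) -> List[int]:
    """Deduplicate, drop unknown ids, and reject an empty result."""
    unique = list(dict.fromkeys(requested))  # keeps first-seen order
    if not unique:
        raise ValueError("playlist_ids must not be empty")
    valid = [pid for pid in unique if pid in existing]
    if not valid:
        raise ValueError("no known playlist ids in selection")
    return valid
```

The bulk endpoints would call this before creating a job, so the stored `playlist_scope.playlist_ids` never contains duplicates or dangling ids.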
## Testing Strategy
### Repository and Query Tests
Add tests for:
- listing one playlist page with filters
- correct `total` count under filtering
- wanted marker persistence
- state aggregation across:
- `未同步`
- `未下载`
- `下载中`
- `部分已下载`
- `已下载`
### API Tests
Add tests for:
- `GET /api/playlists`
- `POST /api/playlists/mark-wanted`
- `POST /api/playlists/unmark-wanted`
- `POST /api/playlists/download`
- `POST /api/playlists/sync-download`
Verify that:
- created jobs use the expected job type
- `playlist_scope.playlist_ids` is stored correctly
- invalid or empty selection is rejected
### UI Tests
At minimum, validate rendered page content and form wiring for:
- pagination controls
- state filter controls
- current-page select-all
- bulk action forms
- wanted-only filter
### Regression Coverage
Keep existing `download_only` and `sync_download` behavior valid for callers outside the playlist page.
## Rollout Notes
The recommended rollout order is:
1. add the playlist preference table and repository helpers
2. add page-level playlist listing API with computed state
3. upgrade `/playlists` UI to pagination, filters, selection, and actions
4. add bulk job-creation endpoints for selected playlists
5. verify on NAS with a controlled subset of playlists before using it for wide library download
## Result
After this design lands, the operator workflow becomes:
1. open `/playlists`
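2. filter to `未同步` (unsynced), `未下载` (not downloaded), or `部分已下载` (partially downloaded)
3. select playlists on the current page
4. choose one of:
   - `下载已同步所选歌单` (download the selected, already-synced playlists)
   - `同步后下载所选歌单` (sync, then download the selected playlists)
   - `加入待下载清单` (add to the to-download list)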
5. observe progress through the existing jobs and worker views
This changes playlist download from an uncontrolled whole-library operation into a scoped, inspectable, operator-driven workflow.
@@ -0,0 +1,529 @@
# Catalogsync Task Center And Download Lanes Design
## Goal
Rework the current operations console so the NAS web UI behaves like a real task center:
- `Dashboard` becomes the primary task control page
- task list and task detail are merged into one page with row expansion instead of forced page jumps
- all jobs that contain a `download` stage are serialized into one download lane
- collect and sync jobs can still run without being blocked by the download lane
- operators can run `sync_only` against selected playlists to fix `song_count = 0` or incomplete playlists
- download jobs surface real-time throughput, including per-task aggregate speed and per-worker song speed
This design extends the existing operations-console design rather than replacing the underlying SQLite-backed execution model.
## Confirmed Decisions
The following points were confirmed during design review:
- `Dashboard` should become the main task center instead of relying on `/jobs/{id}` as the normal interaction path
- the preferred layout is a single task table with inline expansion for details
- all job types that include a `download` stage are treated as download-class jobs:
- `catalog_sync`
- `sync_download`
- `download_only`
- `download_upload`
- download-class jobs may be created freely, but only one may run at a time
- non-download jobs such as `collect_only` and `sync_only` are not restricted by the single-download-job rule
- playlists with `song_count = 0` do not get a new dedicated status
- the playlists page must add `sync selected playlists`, implemented as `sync_only + playlist_scope`
- the task center should use compact icon-style controls:
- one toggle control for pause and resume
- one `X`-style control for cancel
- the task center should show download speed if a real value can be captured from the download pipeline
## Scope
### In Scope
- redesign the `Dashboard` page into a task center
- reduce the importance of `/jobs` and `/jobs/{id}` while keeping them available as fallback routes
- add lane-aware scheduling rules so only one download-class job runs at a time
- keep collect and sync jobs runnable without the global single-job bottleneck
- add playlist-bulk sync from the playlists page
- add structured task summary data for dashboard rows
- add real-time download throughput display for download workers and download tasks
- preserve the existing pause, resume, cancel, retry, and recovery model where possible
### Out Of Scope
- changing the core collect, sync, download, and upload business logic for provider behavior
- redefining playlist status taxonomy beyond the current states
- removing `/jobs/{id}` completely
- redesigning the UI as a separate SPA
- cross-machine or distributed workers
- changing upload scheduling policy in this iteration beyond fitting into the task-center UI
## User Experience Design
## Dashboard As Task Center
`/dashboard` becomes the single primary operator page.
The page should be reorganized into three layers:
1. Summary cards
2. Quick actions
3. Main task table
### Summary Cards
The top row should remain compact and operator-focused:
- total jobs
- running jobs
- queued download jobs
- paused jobs
- failed or completed-with-errors jobs
- running download songs
These cards are status indicators, not the main interaction surface.
### Quick Actions
Keep quick-launch actions, but make them secondary to the task table:
- full pipeline
- collect only
- sync only
- download only
- upload only
The existing manual job-creation form may remain, but it should be visually reduced so the page reads as a task center first.
### Main Task Table
Replace the current split between `Active Job`, `Recent Jobs`, and the hard jump into `/jobs/{id}` with one central task table.
Recommended columns:
- `ID`
- `Task`
- `Status`
- `Scope`
- `Primary Progress`
- `Active Workers`
- `Lane`
- `Actions`
Each row should support inline expansion.
### Inline Expanded Task Details
Expanding a row reveals the most useful parts of the current task detail page:
- stage summary
- playlist progress
- running items
- recent commands
- recent errors
The goal is that normal pause, resume, cancel, and progress inspection no longer require navigation away from `Dashboard`.
## Route Roles
### `/dashboard`
Primary operator view and daily control surface.
### `/jobs`
Fallback archive-like list for all jobs. It can be simplified because the main operator path is now `Dashboard`.
### `/jobs/{id}`
Keep as a deep-link route for troubleshooting, bookmarks, and future direct links from logs or notifications. It should no longer be the primary interaction path.
## Playlist Page Behavior
The playlists page should keep its current filtering and progress features and add one new bulk action:
- `sync selected playlists`
This action creates a `sync_only` job with `playlist_scope.playlist_ids`.
It must:
- sync the selected playlists
- update playlist-song links
- update `song_count`
It must not:
- download song files
- implicitly turn into `sync_download`
### `song_count = 0` Handling
Playlists with zero songs remain in the existing status model.
No extra dedicated state is introduced for this iteration.
To help operators understand what to do, the row may show a light hint such as:
- `0 songs, sync recommended`
But the filter model stays unchanged.
## Task Summary Model
The dashboard task table needs a job-summary projection that is richer than the current recent-jobs payload.
Each task row should expose:
- `id`
- `job_type`
- `display_name`
- `status`
- `scope_summary`
- `lane_type`
- `queue_position`
- `primary_progress_text`
- `primary_progress_percent`
- `active_worker_count`
- `can_pause`
- `can_resume`
- `can_cancel`
- `expanded_detail_payload`
### Display Name
Map internal job types to friendlier operator labels. Examples:
- `catalog_sync` -> `Full Pipeline`
- `collect_only` -> `Collect`
- `sync_only` with playlist scope -> `Sync Selected Playlists`
- `download_only` with playlist scope -> `Download Selected Playlists`
- `sync_download` with playlist scope -> `Sync Then Download`
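The label mapping in the examples above could be expressed as a small lookup. This is an illustrative sketch: the function name, the `has_playlist_scope` flag, and the fallback to the raw job type are assumptions, and only the five listed labels come from the design.

```python
def display_name(job_type: str, has_playlist_scope: bool = False) -> str:
    """Return an operator-facing label for a job type."""
    scoped_labels = {
        "sync_only": "Sync Selected Playlists",
        "download_only": "Download Selected Playlists",
        "sync_download": "Sync Then Download",
    }
    unscoped_labels = {
        "catalog_sync": "Full Pipeline",
        "collect_only": "Collect",
    }
    if has_playlist_scope and job_type in scoped_labels:
        return scoped_labels[job_type]
    # Fall back to the raw job type for combinations the design does not name.
    return unscoped_labels.get(job_type, job_type)
```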
### Scope Summary
Examples:
- `All sources`
- `12 playlists`
- `3 sources`
### Primary Progress
The progress shown in the main task row depends on task type:
- collect jobs:
  - collected sources or collected pools summary
- sync jobs:
  - synced playlists / target playlists
- download jobs:
  - downloaded songs / target download songs
- upload jobs:
  - uploaded files / target uploads
## Scheduling Design
## Current Limitation
The current runner effectively behaves like a global single-active-job scheduler.
That is insufficient for the new requirement because it would still block pure collect or sync jobs behind a long-running download-class job.
## Lane Model
Introduce two scheduler lanes:
- `download`
- `general`
### Download Lane
Contains any job whose stage sequence includes `download`.
This includes:
- `catalog_sync`
- `sync_download`
- `download_only`
- `download_upload`
Policy:
- only one download-lane job may be running at a time
- additional download-lane jobs remain queued in lane order
### General Lane
Contains jobs without a `download` stage.
This includes:
- `collect_only`
- `sync_only`
Policy:
- these jobs are not blocked by the single-download-job rule
- multiple general-lane jobs may run concurrently
### Recommended Default Concurrency
Assume:
- `DOWNLOAD_LANE_CONCURRENCY = 1`
- `GENERAL_LANE_CONCURRENCY = 3`
`GENERAL_LANE_CONCURRENCY` should be configurable later through env or runner settings, but default `3` is acceptable for the first implementation.
## Lane Assignment Rule
Lane assignment should be derived from the job stage sequence, not from separate operator flags.
This avoids drift between UI intent and scheduler behavior.
If a job contains `download` in `JOB_STAGE_SEQUENCES`, it belongs to the `download` lane.
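A minimal sketch of that rule follows. The design confirms only which job types contain a `download` stage; the concrete stage lists in this `JOB_STAGE_SEQUENCES` are illustrative assumptions, and `lane_for_job` is a hypothetical helper name.

```python
# Stage lists are illustrative assumptions; only the presence or absence of
# "download" per job type comes from the design.
JOB_STAGE_SEQUENCES: dict[str, tuple[str, ...]] = {
    "catalog_sync": ("collect", "sync", "download", "upload"),
    "sync_download": ("sync", "download"),
    "download_only": ("download",),
    "download_upload": ("download", "upload"),
    "collect_only": ("collect",),
    "sync_only": ("sync",),
}


def lane_for_job(job_type: str) -> str:
    """Derive the scheduler lane from the job's stage sequence."""
    stages = JOB_STAGE_SEQUENCES[job_type]
    return "download" if "download" in stages else "general"
```

Because the lane is computed from the stage sequence, adding a new job type with a `download` stage would place it in the download lane without any extra flag.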
## Queue Position
For download-lane jobs, the dashboard should expose:
- `running`
- `queued #1`
- `queued #2`
For general-lane jobs, queue display can be simpler:
- `general`
- `running`
- `queued`
The exact wording can stay simple as long as the operator can tell which jobs are blocked by the single-download rule.
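One possible way to produce these labels is sketched below. It assumes each job is a dict with `id`, `lane`, and `status` keys and that the list is already in lane order; both the shape and the function name are hypothetical.

```python
def queue_labels(jobs: list[dict]) -> dict[int, str]:
    """Map job id to a dashboard queue label.

    Each job dict is assumed to carry "id", "lane", and "status"; the list is
    assumed to be in lane order already (for example, by creation time).
    """
    labels: dict[int, str] = {}
    download_position = 0
    for job in jobs:
        if job["status"] == "running":
            labels[job["id"]] = "running"
        elif job["lane"] == "download":
            # Only download-lane jobs get a numbered queue position.
            download_position += 1
            labels[job["id"]] = f"queued #{download_position}"
        else:
            labels[job["id"]] = "queued"
    return labels
```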
## Runner Refactor Strategy
Do not rewrite stage executors.
Instead, refactor the scheduler layer so:
- lane eligibility is computed when choosing runnable jobs
- the runner can hold one active download-lane job
- the runner can hold multiple active general-lane jobs
- each running job continues to use the existing stage and worker machinery
This keeps risk concentrated in the orchestration layer rather than in provider-specific logic.
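The job-selection step of such a scheduler might look like the sketch below. It uses the default concurrency values from this design; the dict-based job shape and the name `pick_runnable` are assumptions, and stage execution is deliberately left out because the existing executors are meant to stay untouched.

```python
DOWNLOAD_LANE_CONCURRENCY = 1
GENERAL_LANE_CONCURRENCY = 3
LANE_LIMITS = {
    "download": DOWNLOAD_LANE_CONCURRENCY,
    "general": GENERAL_LANE_CONCURRENCY,
}


def pick_runnable(queued: list[dict], running: list[dict]) -> list[dict]:
    """Return queued jobs that may start now without exceeding lane limits.

    Jobs are dicts with "id" and "lane"; `queued` is assumed to be in lane order.
    """
    active = {lane: 0 for lane in LANE_LIMITS}
    for job in running:
        active[job["lane"]] += 1

    runnable = []
    for job in queued:
        lane = job["lane"]
        if active[lane] < LANE_LIMITS[lane]:
            # Reserve the slot so later jobs in the same lane see it as taken.
            active[lane] += 1
            runnable.append(job)
    return runnable
```

With one download-lane job already running, a queued `sync_only` job is still returned while a second download-lane job stays queued.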
## Download Speed Design
## Requirement
The task center should display real download throughput rather than parsing console text heuristically.
### Why Text Parsing Is Rejected
Many providers already emit `MB/s` in rich terminal progress, but that output is not a stable API:
- formats differ by provider
- text may change without notice
- not all clients expose identical progress lines
Therefore the design must use structured progress reporting inside the download pipeline.
## Structured Throughput Model
During download-stage execution, each download worker should publish structured progress fields such as:
- `downloaded_bytes`
- `total_bytes`
- `speed_bytes_per_sec`
- `progress_percent`
These values should update the worker state and be aggregatable per task.
### Task-Level Speed
The dashboard row for a download-class task should show total live throughput across active download workers.
Example:
- `62 / 300 songs | 18.4 MB/s`
### Worker-Level Speed
The expanded worker section for a running download task should show, per worker:
- current song
- current speed
- downloaded bytes / total bytes when known
Example:
- `download-2 | Moonlight | 6.2 MB/s | 21.4 / 41.0 MB`
### Fallback Behavior
If structured speed is not available for a particular worker or provider:
- show `-`
- do not synthesize or guess a value from text logs
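The structured fields and the fallback rule can be illustrated with a small model. This is a sketch under assumptions: `WorkerProgress`, `format_speed`, and `task_speed` are hypothetical names, and one MB is taken as 1,000,000 bytes for display.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkerProgress:
    worker_name: str
    downloaded_bytes: int = 0
    total_bytes: Optional[int] = None
    # None means no structured speed value was captured for this worker.
    speed_bytes_per_sec: Optional[float] = None


def format_speed(speed_bytes_per_sec: Optional[float]) -> str:
    """Render a speed in MB/s, or "-" when no real value was captured."""
    if speed_bytes_per_sec is None:
        return "-"
    return f"{speed_bytes_per_sec / 1_000_000:.1f} MB/s"


def task_speed(workers: list[WorkerProgress]) -> Optional[float]:
    """Sum live speeds across workers; None if no worker reported a value."""
    speeds = [
        w.speed_bytes_per_sec for w in workers if w.speed_bytes_per_sec is not None
    ]
    return sum(speeds) if speeds else None
```

Workers without a structured value are skipped in the sum rather than counted as zero, so a task whose workers report nothing renders as `-`.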
## API Changes
## Dashboard Payload
`GET /api/dashboard` must evolve from a light summary into a task-center payload.
It should return:
- summary cards
- quick-launch defaults
- task-center rows
- row detail summaries for tasks expanded by the UI
Each task row should include the task summary fields described above.
## New Playlist Sync Endpoint
Add:
- `POST /api/playlists/sync`
Behavior:
- validate `playlist_ids`
- create `sync_only`
- write `playlist_scope.playlist_ids`
- return created job summary
Existing playlist bulk endpoints remain:
- `mark-wanted`
- `unmark-wanted`
- `download`
- `sync-download`
## Job Detail Endpoint
`GET /api/jobs/{id}` remains the source of full detail.
The dashboard inline expansion may either:
- reuse the existing detail payload directly
- or consume a trimmed detail projection
The implementation may start by reusing the current payload for safety.
## Data Model Extensions
Prefer extending existing operations tables rather than introducing a second job schema.
### Job Summary Computation
The repository layer should compute dashboard-friendly projections rather than forcing templates to derive them ad hoc.
### Worker Progress Extension
`job_workers` state must be able to carry structured download progress.
This may be done by:
- adding typed columns
- or adding a compact JSON progress payload
Recommended preference:
- keep existing visible scalar columns
- add a small JSON payload if multiple dynamic throughput fields are needed
The implementation plan can choose the exact storage form.
## UI Controls
The dashboard task table should use compact icon-first controls:
- pause icon when the job is pausable
- resume icon when the job is resumable
- cancel `X` icon when the job is cancelable
- expand toggle for inline details
Low-frequency actions such as `retry item` remain inside expanded detail sections or the fallback detail page.
## Testing Strategy
## Scheduler Tests
Add tests that prove:
- only one download-lane job runs at a time
- a second download-lane job remains queued
- a general-lane `sync_only` job can run while a download-lane job is active
- `catalog_sync` is correctly classified into the download lane
## API Tests
Add tests for:
- `POST /api/playlists/sync`
- dashboard payload includes lane and task-summary fields
- dashboard renders compact action controls
- inline task detail data is available
## UI Rendering Tests
At minimum verify:
- dashboard contains the main task table
- dashboard no longer depends on a jump to detail as the primary control path
- playlists page contains `sync selected playlists`
- download tasks render speed fields and real values when available
## Regression Tests
Protect:
- existing pause, resume, cancel command flow
- `wanted_only=` empty-query compatibility
- playlist progress rendering
- task playlist progress rendering
## Rollout Plan
Recommended rollout order:
1. add dashboard-oriented task summary repository helpers
2. add lane-aware scheduling rules
3. add playlist bulk sync endpoint and button
4. redesign dashboard into the primary task center
5. add structured download throughput reporting
6. redeploy to NAS and verify live behavior
This order keeps correctness and scheduling changes ahead of cosmetic UI work.
## Risks
Primary technical risk:
- refactoring the runner from a single-global-job loop into a lane-aware multi-job scheduler
Risk reduction:
- keep stage executors intact
- concentrate changes in job selection and orchestration
- verify lane rules with focused tests before refining UI
Secondary risk:
- structured speed reporting may require touching downloader integration points across multiple providers
Risk reduction:
- start with download-stage worker instrumentation in `catalogsync`
- expose speed only when a real structured value is available
- degrade gracefully to `-` rather than inventing numbers
## Success Criteria
The design is successful when:
- operators can manage tasks primarily from `Dashboard`
- normal control flow does not require bouncing into `/jobs/{id}`
- multiple download-class jobs can be queued while only one runs
- collect and sync jobs are no longer unnecessarily blocked behind downloads
- selected playlists can be synced directly from the playlists page
- running download tasks show meaningful live throughput in the task center
@@ -0,0 +1,219 @@
# Catalogsync Task Tree Dashboard Design
## Goal
Replace the current Task Center detail tables with a stable tree view:
- task
- playlist
- song
The new Task Center must keep task nodes visible across status transitions such as `running -> paused -> completed`, and live refresh must update existing nodes instead of rebuilding the entire task table.
## Problem Statement
The current dashboard still feels unstable for two reasons:
1. The expanded task detail is rendered as a large HTML block containing `Summary`, `Stages`, `Workers`, `Running Items`, and `Playlist Progress`.
2. The frontend refresh path still calls `setTaskRows(...)`, which rebuilds the whole Task Center body and then rebinds all event handlers.
Even after paused tasks were kept in the query, this full redraw still recreates DOM nodes and causes visible flicker. It also makes the UI feel like a live report view instead of an operator task tree.
## Scope
### In Scope
- Rebuild only the `Task Center` section of `/dashboard`
- Keep the dashboard top cards for now:
- live snapshot
- summary
- quick actions
- create job
- playlist coverage
- Replace the task detail block with a three-level tree:
- task node
- playlist child node
- song child node
- Keep pause/resume/cancel actions on the task node
- Preserve expanded task and playlist state across refreshes
- Update task status, progress, and counts in place without full Task Center redraw
- Keep non-music resource labeling in song rows
### Out of Scope
- Redesigning the top dashboard cards
- Removing `/jobs`, `/songs`, or other pages
- Changing job execution semantics
- Changing download logic or retry semantics
## Chosen UI Structure
### Task Center Layout
`Task Center` becomes a single tree container instead of a table plus nested detail tables.
Each task node shows:
- expand/collapse toggle
- task display name
- task type
- task status
- scope summary
- primary progress text and progress bar
- active worker count
- lane label
- pause/resume button
- cancel button
When expanded, the task shows only its playlist children.
Each playlist node shows:
- expand/collapse toggle
- playlist name
- source label such as `qq #64`
- downloaded song count / total song count
- progress bar
- compact state summary:
- running
- pending
- failed
- skipped
When expanded, the playlist shows only its song children.
Each song node shows:
- sequence number
- song name
- singer summary
- platform/source id summary
- song status tag
- optional `非音乐资源` (non-music resource) tag
- status note
This makes the page behave more like a file explorer tree and removes the distracting intermediate tables.
## Data Model and API Expectations
### Task List
`list_task_center_rows()` must return recent tasks for all operator-visible lifecycle states, not just active ones.
The intended visible states are:
- `queued`
- `running`
- `pause_requested`
- `paused`
- `completed`
- `completed_with_errors`
- `failed`
- `canceled`
This ensures a task stays in the tree when its state changes; only its displayed status changes.
### Task Detail
`/api/jobs/{job_id}` may continue returning its current payload shape, but the dashboard will only consume:
- `job`
- `playlist_progress`
The frontend will ignore `summary`, `stages`, `workers`, and `running_items` for Task Center rendering.
### Playlist Songs
`/api/jobs/{job_id}/playlists/{playlist_id}/songs` remains the lazy-loaded source for song child nodes.
The existing fields are sufficient:
- `position`
- `song_name`
- `singers`
- `platform`
- `remote_song_id`
- `status`
- `status_note`
- `is_non_music_resource`
## Rendering Strategy
### Initial Render
The server-rendered HTML for `/dashboard` should render a task tree shell directly, not a table with hidden detail rows.
### Live Updates
The Task Center refresh path must switch from full `innerHTML` replacement to keyed DOM patching.
Rules:
1. Task nodes are keyed by `job_id`
2. Playlist nodes are keyed by `job_id + playlist_id`
3. Song nodes are keyed by `job_id + playlist_id + song_id/position`
4. Existing nodes are updated in place
5. Expanded/collapsed state is preserved in `dashboardState`
6. Status changes never collapse or remove a visible node by themselves
### Removal Policy
Nodes may be removed only when they are absent from the latest server payload because they have truly fallen out of the visible result window, not because they changed from active to paused/completed.
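The keyed patching rules can be illustrated independently of the DOM. The dashboard frontend runs in the browser, so the Python below only sketches the reconciliation logic; the node shape, the `key` field, and `patch_nodes` are hypothetical.

```python
def patch_nodes(existing: dict[str, dict], payload: list[dict]) -> dict[str, list[str]]:
    """Reconcile `existing` nodes in place against the latest payload rows.

    `existing` maps node key -> node state, where node state keeps a UI-only
    "expanded" flag. Rows are assumed to carry a "key" plus display fields.
    """
    changes: dict[str, list[str]] = {"updated": [], "inserted": [], "removed": []}
    seen = set()
    for row in payload:
        key = row["key"]
        seen.add(key)
        fields = {k: v for k, v in row.items() if k != "key"}
        if key in existing:
            # Update fields in place; the "expanded" flag is left untouched.
            existing[key].update(fields)
            changes["updated"].append(key)
        else:
            existing[key] = {"expanded": False, **fields}
            changes["inserted"].append(key)
    # Remove only nodes that the server no longer returns.
    for key in [k for k in existing if k not in seen]:
        del existing[key]
        changes["removed"].append(key)
    return changes
```

A status change such as `running -> paused` therefore shows up as an update to an existing key and never as a remove followed by an insert.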
## Refresh Model
### Dashboard Summary Refresh
The lightweight snapshot refresh may continue updating:
- live snapshot text
- summary numbers
- download stats
- playlist coverage
These sections can still use simple row replacement because they are small and not interactive.
### Task Tree Refresh
Task rows must be refreshed through a dedicated keyed patch function:
- update existing task header fields
- insert new task nodes
- remove missing task nodes only when no longer returned
- if a task is expanded, refresh its playlist subtree in place
- if a playlist is expanded, refresh its song subtree in place when fresh data arrives
The Task Center must no longer call a function that rebuilds the whole container on every poll.
## Error Handling
- Failed songs remain visible in the playlist subtree with their note
- Non-music resources remain visible and are labeled `非音乐资源` (non-music resource)
- If playlist song loading fails, show the error message only inside that playlist node
- If task detail loading fails, show the error only inside that task node
## Testing Strategy
### Automated
- repository test: task list includes completed jobs
- API test: dashboard HTML renders the tree shell instead of detail tables
- API test: dashboard data continues to expose task rows for live refresh
### Manual
- open `/dashboard`
- expand one paused task
- expand one playlist
- wait through multiple refresh cycles
- verify:
- expanded task stays expanded
- expanded playlist stays expanded
- no `Summary / Stages / Workers / Running Items` blocks appear under tasks
- task status changes update text only, without the whole Task Center flashing
## Assumptions
- The operator still wants the top dashboard cards retained for now
- Recent finished tasks should remain visible in the Task Center instead of disappearing immediately
- The current lazy-load model for song lists is acceptable as long as the node itself stays stable
@@ -0,0 +1,268 @@
# Playlist Export To Local ZIP Design
**Date:** 2026-04-18
## Goal
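Unify the user-facing meaning of `Export`:
- `Export` and `Export Selected`, as the user sees them, always mean "export to the local computer running the frontend"
- The `playlists/` directory on the NAS is kept, but only as a server-side cache and packaging source
- The export flow becomes:
  - first make sure the playlist directory exists on the NAS
  - then have the server package it as a ZIP
  - finally let the browser download it to the user's machine
## Current Problem
The current system mixes up two meanings:
- `Export Folder` / `Export Selected Playlists` on the page actually create or refresh `playlists/<playlist-directory>/` on the NAS
- Users naturally read "export" as "download to the computer where this browser is running"
- As a result:
  - exporting to a NAS directory and exporting to the frontend's local machine are not distinguished
  - the button names mislead users
  - bulk export to the local machine has not actually been implemented yet
## Target UX
### 1. Export means browser download
- Button label in the playlist detail modal:
  - `Export Folder` -> `Export`
- Bulk button label in the playlist list:
  - `Export Selected Playlists` -> `Export Selected`
To the user, both buttons mean the same thing:
- export to the current frontend's local machine
- the browser triggers a file download
### 2. NAS playlists directory becomes internal cache
- The NAS continues to maintain:
  - `playlists/<PlaylistName_PlaylistID>/playlist.yaml`
  - `.playlist_meta.json`
  - `covers/...`
- But this is no longer the main user-visible action
- When the user actually clicks `Export`:
  - if the NAS directory already exists and is usable, reuse it directly
  - if it does not exist, generate it first
  - then package it as a ZIP and return it to the browser
### 3. Single playlist export
- Clicking `Export` on a single playlist
- returns one ZIP
- The ZIP contains the full contents of that playlist's directory
Suggested ZIP file name: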
```text
playlist-<platform>-<playlist_id>-<sanitized_name>.zip
```
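ZIP structure:
```text
PlaylistName_PlaylistID/
  playlist.yaml
  .playlist_meta.json
  covers/
    playlist-cover.jpg
    song-1-xxxx.jpg
```
### 4. Multi-playlist export
- After selecting multiple playlists, click `Export Selected`
- One ZIP is returned
- The ZIP contains multiple playlist directories
Suggested ZIP file name: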
```text
playlists-export-YYYYMMDD-HHMMSS.zip
```
ZIP structure:
```text
playlists/
  PlaylistA_123/
    playlist.yaml
    .playlist_meta.json
    covers/
      ...
  PlaylistB_456/
    playlist.yaml
    .playlist_meta.json
    covers/
      ...
```
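The two suggested file-name patterns in sections 3 and 4 could be generated roughly as follows. The sanitizing rule is an assumption, since the design only names the `<sanitized_name>` placeholder; all three function names are hypothetical.

```python
import re
from datetime import datetime


def sanitize_name(name: str) -> str:
    """Replace characters that are unsafe in file names (assumed rule set)."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", name).strip("_")
    return cleaned or "playlist"


def single_zip_name(platform: str, playlist_id: str, name: str) -> str:
    """Build the single-playlist ZIP name from section 3."""
    return f"playlist-{platform}-{playlist_id}-{sanitize_name(name)}.zip"


def bundle_zip_name(now: datetime) -> str:
    """Build the multi-playlist ZIP name from section 4."""
    return f"playlists-export-{now:%Y%m%d-%H%M%S}.zip"
```

Non-ASCII playlist names pass through unchanged here; whether the real implementation should transliterate them is left open.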
## Export Readiness Rules
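Before export, playlists are routed by state:
- `downloaded`
  - make sure the NAS directory exists
  - then proceed to packaging
- `unsynced`
  - downloading and waiting inside the same HTTP request is not allowed
  - create a `sync_download` background job
  - the current request returns "queued, cannot be exported right now"
- `not_downloaded`
  - create a `download_only` background job
  - the current request returns "queued, cannot be exported right now"
- `partial`
  - create a `download_only` background job
  - the current request returns "queued, cannot be exported right now"
- `downloading`
  - do not create a duplicate job
  - return "processing, export again later"
Bulk export adds one stricter rule:
- A final ZIP download is returned only when every selected playlist is ready for export
- If any playlist in the selection needs to be synced or downloaded first, no partial ZIP is returned for this request
- This guarantees that the result of `Export Selected` always corresponds to the complete set selected this time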
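The readiness rules, including the all-or-nothing rule for bulk export, can be sketched as a pure planning function. The state names come from this design; `plan_bulk_export`, `JOB_FOR_STATE`, and the `jobs_to_create` key are hypothetical.

```python
READY_STATE = "downloaded"
# States that need a background job before export, and the job type to create.
JOB_FOR_STATE = {
    "unsynced": "sync_download",
    "not_downloaded": "download_only",
    "partial": "download_only",
}


def plan_bulk_export(states: dict[int, str]) -> dict:
    """Decide whether a selection can be zipped now or must be queued first."""
    ready = [pid for pid, state in states.items() if state == READY_STATE]
    blocked = [pid for pid, state in states.items() if state != READY_STATE]
    jobs: dict[str, list[int]] = {}
    for pid in blocked:
        job_type = JOB_FOR_STATE.get(states[pid])
        # "downloading" playlists are blocked but never get a duplicate job.
        if job_type is not None:
            jobs.setdefault(job_type, []).append(pid)
    return {
        # A ZIP is produced only when nothing in the selection is blocked.
        "status": "ready" if not blocked else "queued",
        "ready_playlist_ids": ready,
        "blocked_playlist_ids": blocked,
        "jobs_to_create": jobs,
    }
```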
## API Design
### Keep
- `GET /api/playlists/{playlist_id}/export-folder`
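  - kept
  - serves as the server-side directory refresh and lookup capability
  - not the primary export endpoint for end users
### Add
#### `GET /api/playlists/{playlist_id}/export.zip`
Behavior:
- read the playlist state
- if the playlist is ready for export:
  - make sure the NAS directory exists
  - package it into a temporary ZIP
  - return it as a binary download response
- if the playlist is not yet ready for export:
  - return `409`
  - describe the current state and the suggested next action in the response body
#### `POST /api/playlists/export-zip`
Request body: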
```json
{
  "playlist_ids": [1, 2, 3],
  "requested_by": "ops-console"
}
```
The response takes one of two forms:
1. Every selected playlist can be exported immediately
```json
{
  "status": "ready",
  "download_url": "/api/exports/bundles/<token>.zip",
  "playlist_ids": [1, 2, 3]
}
```
2. Some playlists in the selection cannot be exported immediately yet
```json
{
  "status": "queued",
  "message": "2 playlists queued for sync/download before export.",
  "download_job": {...},
  "sync_download_job": {...},
  "blocked_playlist_ids": [2, 3],
  "ready_playlist_ids": [1]
}
```
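Notes:
- `ready_playlist_ids` is status information only; it does not mean a "partial ZIP" is downloaded first
- Whenever the batch is not `status=ready`, the frontend does not trigger a local download
- The user should wait for the background jobs to finish and then click `Export` again
### Optional helper endpoint
#### `GET /api/exports/bundles/{token}.zip`
- downloads a prepared temporary ZIP
- the token points to the temporary packaging result
- lets the frontend request preparation first and then trigger the browser download
## Backend Packaging Strategy
### Why not build ZIP directly from DB payload
Assembling the ZIP on the fly from the database is not recommended:
- the YAML, covers, and directory structure are already materialized in the `playlists/` directory
- the system already has a playlist directory generation pipeline
- reusing the directory and then packaging it behaves more predictably and stays consistent with the NAS directory more easily
### Packaging flow
1. Receive the export request
2. Determine whether each playlist can be exported immediately
3. Call `ensure_playlist_artifacts_for_playlist(...)` for exportable playlists
4. Collect the playlist directory paths
5. Generate the ZIP in a temporary directory
6. Return it to the frontend as a download response
7. Delete the temporary ZIP after the request ends, or cache it briefly and clean it up later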
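Steps 4 and 5 of the packaging flow can be sketched with the standard library. This is a minimal illustration, not the project's implementation: `build_export_zip` and its `root` parameter are hypothetical, and the directory layout in the usage part is made up for the demonstration.

```python
import tempfile
import zipfile
from pathlib import Path


def build_export_zip(playlist_dirs: list[Path], zip_path: Path, root: str = "") -> Path:
    """Pack each playlist directory into one ZIP, keeping the directory name.

    `root` is an optional top-level folder inside the archive, for example
    "playlists" for multi-playlist bundles.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for playlist_dir in playlist_dirs:
            for file_path in sorted(playlist_dir.rglob("*")):
                if file_path.is_file():
                    # Keep "<playlist dir name>/..." as the path inside the ZIP.
                    relative = file_path.relative_to(playlist_dir.parent)
                    archive.write(file_path, (Path(root) / relative).as_posix())
    return zip_path


# Usage: build a bundle from one throwaway playlist directory.
with tempfile.TemporaryDirectory() as tmp:
    playlist_dir = Path(tmp) / "cache" / "PlaylistA_123"
    (playlist_dir / "covers").mkdir(parents=True)
    (playlist_dir / "playlist.yaml").write_text("songs: []\n", encoding="utf-8")
    (playlist_dir / "covers" / "playlist-cover.jpg").write_bytes(b"")
    zip_path = build_export_zip([playlist_dir], Path(tmp) / "bundle.zip", root="playlists")
    with zipfile.ZipFile(zip_path) as archive:
        bundle_names = sorted(archive.namelist())
```

Using a temporary directory means the ZIP disappears when the block exits, which matches the "delete after the request ends" option in step 7; the short-lived cache option would need its own cleanup.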
## Frontend Behavior
### Single playlist modal
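- Change the button label to `Export`
- On click:
  - call the single-playlist ZIP export endpoint
  - if the playlist can be exported immediately, the browser downloads it directly
  - if it cannot be exported immediately, show a status message
### Playlist list bulk export
- Change the button label to `Export Selected`
- On click:
  - call the bulk ZIP export endpoint
  - if export is possible immediately, automatically start downloading one ZIP
  - if sync or download is needed first, tell the user that background jobs were created
## Error Handling
- `404`
  - the playlist does not exist
- `409`
  - the playlist has not been synced or downloaded yet and cannot be exported immediately
- `500`
  - packaging failed
- A cover refresh failure should not fail the whole export; as long as the directory can be generated, packaging continues
## Naming Rules
- User-facing button labels use only `Export` / `Export Selected`
- If the page still needs a way to view the NAS path, it should be named separately:
  - `Show NAS Folder`
  - `Show Server Folder`
"Export to local" and "generate a directory on the NAS" must no longer share the single name `Export`.
## Testing
- Single-playlist `Export` returns a ZIP download
- Bulk `Export Selected` returns a ZIP containing multiple playlist directories
- An existing NAS directory is not regenerated
- Unsynced or not-downloaded playlists return queued/blocked instead of holding the HTTP request open for a long time
- Frontend button labels match actual behavior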
@@ -0,0 +1,93 @@
# Playlist Export On Download Design
**Date:** 2026-04-18
## Goal
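Move generation of the `playlists/` directory from the `sync` path to the "selected playlists download" path, and add an "export selected playlists" capability, so that already-downloaded playlists can have their export output regenerated on their own, while unsynced or not-downloaded playlists automatically go through `sync + download` before being exported.
## Current Problem
- `CatalogSyncService.sync_playlist_row()` currently writes `playlists/<PlaylistName_id>/` directly
- This means that merely syncing a playlist also generates an export directory, which does not match user expectations.
- When a downloaded playlist needs its `playlist.yaml` or covers regenerated, the only option is to rerun the old path, which is easily mistaken for re-downloading the songs.
## Target Behavior
### 1. Sync no longer writes playlist artifacts
- `sync` is responsible only for:
  - fetching the playlist's songs
  - backfilling playlist popularity
  - updating artist pool and song associations
- After `sync` completes, it no longer writes `playlists/` automatically
### 2. Scoped download writes playlist artifacts
- When the job is a "selected playlists" `download_only` or `sync_download` job, after the download stage finishes:
  - refresh `playlists/<PlaylistName_id>/` for the playlists in the job's scope
  - write the latest `playlist.yaml`
  - fetch the playlist cover and song covers
- This way, downloaded songs carry their `local_file_path` into the export directory without requiring a re-download of the whole library.
### 3. Add Export Selected Playlists action
- Add an `Export Selected Playlists` button to the playlists page.
- The backend routes by playlist state:
  - `unsynced` -> added to `sync_download`
  - `not_downloaded` / `partial` -> added to `download_only`
  - `downloaded` -> exported directly to `playlists/` from the current database state
- A single request may return all of the following together:
  - directly exported playlists
  - a newly created `download_only` job
  - a newly created `sync_download` job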
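The state-based routing in this section differs from an all-or-nothing export: one request can export some playlists and queue others. A minimal sketch, with `route_export` and the `export_now` key as hypothetical names:

```python
def route_export(states: dict[int, str]) -> dict[str, list[int]]:
    """Partition selected playlists by the action their state requires."""
    routes: dict[str, list[int]] = {
        "export_now": [],
        "download_only": [],
        "sync_download": [],
    }
    for playlist_id, state in states.items():
        if state == "downloaded":
            routes["export_now"].append(playlist_id)
        elif state == "unsynced":
            routes["sync_download"].append(playlist_id)
        elif state in ("not_downloaded", "partial"):
            routes["download_only"].append(playlist_id)
        # Other states (for example "downloading") are not covered by this
        # section of the design, so this sketch leaves them unrouted.
    return routes
```

Each non-empty job bucket would presumably become one job carrying `playlist_scope.playlist_ids`, which matches a response that holds at most one `download_job` and one `sync_download_job`.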
### 4. Single-playlist export remains available
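- `GET /api/playlists/{id}/export-folder` is kept.
- It only refreshes and returns the playlist directory based on the current database state.
- It is not responsible for triggering downloads automatically.
## Data Rules
- Whether a playlist counts as "downloaded" is still determined by the local `file_locations` in the database.
- `local_file_path` in `playlist.yaml` comes only from active local locations already in the database.
- Refreshing the export directory does not write new song download records; it only re-projects existing database state onto the file system.
## API Changes
- Keep: `GET /api/playlists/{playlist_id}/export-folder`
- Add: `POST /api/playlists/export`
Request body: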
```json
{
  "playlist_ids": [1, 2, 3],
  "requested_by": "ops-console"
}
```
Response body:
```json
{
  "exported_playlist_ids": [1],
  "exported_count": 1,
  "download_job": {"id": 11},
  "sync_download_job": {"id": 12}
}
```
## Implementation Notes
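- Reuse the existing playlist state determination logic so that neither frontend nor backend guesses states on its own.
- The trigger for writing the export directory sits in `OpsRunner` after the download stage completes, not after each individual song completes.
- Run automatic export only for download jobs that carry `playlist_scope.playlist_ids`, so that a "full download" does not refresh the export directories of the whole library as a side effect.
## Testing
- `sync_playlist_row()` no longer generates `playlists/`
- A scoped download job calls playlist export after it completes
- `POST /api/playlists/export` routes playlists in different states correctly
- Downloaded playlists are exported directly without creating unnecessary download jobs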
@@ -0,0 +1,463 @@
# Catalogsync Download Dual-Pool Pipeline Design
## Goal
Improve real download concurrency without changing the sync stage or introducing a sync-time download URL cache.
The current bottleneck is not the byte-transfer implementation itself. The real bottleneck is that each download worker performs two very different jobs in sequence:
1. resolve a usable download source and URL
2. transfer audio bytes and record the finished file
In production, source resolution often takes tens of seconds while the final audio transfer may take around one second. As a result, `DOWNLOAD_WORKERS=10` behaves like ten mixed workers waiting on resolve work instead of ten true download workers.
This design splits the download stage into a two-pool in-memory pipeline:
- `resolver pool`
- `download pool`
The sync stage remains unchanged. Songs are still stored as deferred snapshots and download URLs are still resolved at download time.
## Confirmed Decisions
The following points were confirmed during design discussion:
- do not change the sync stage
- do not introduce a sync-time download URL cache as part of this iteration
- focus only on download-stage behavior
- the target outcome is that download workers spend their time on actual downloads instead of long source-resolution work
- UI clarity matters:
- operators should be able to tell which workers are resolving and which workers are downloading
- existing database schema should be preserved if possible
## Scope
### In Scope
- split the download stage into resolver workers and downloader workers
- keep the existing job, stage, and item lifecycle model
- preserve existing deferred snapshot storage
- preserve current local file recording and quality detection behavior
- surface resolver activity and download activity clearly in worker state
- keep pause, cancel, and recovery semantics compatible with the current runner
### Out Of Scope
- changing playlist sync behavior
- persisting resolved download URLs across runs
- redesigning source ranking logic
- changing upload behavior
- changing the meaning of song uniqueness in the database
- introducing distributed workers or external queues
## Problem Statement
Current download-stage flow:
1. a runner worker claims a download item
2. the worker calls `CatalogDownloader.download_song_row(...)`
3. inside that flow, the same worker:
   - deserializes the deferred snapshot
   - resolves a usable source across multiple providers
   - downloads the final audio file
   - records the local file
This model creates two user-visible problems:
- most workers appear idle from a transfer perspective because they are blocked in source resolution
- byte-transfer concurrency is much lower than the configured worker count
Recent production measurements showed the pattern clearly:
- source resolution commonly takes about `77-83s`
- actual file download commonly takes about `1s`
So the current worker pool is structurally spending most of its time in the wrong phase.
## Approaches Considered
### Approach A: Keep Single Pool And Only Improve UI
Show resolver activity more clearly, but keep each worker as `resolve + download`.
Pros:
- smallest code change
- no pipeline coordination logic
Cons:
- does not materially improve true download concurrency
- preserves the main performance bottleneck
Decision:
- rejected for this task because it improves observability but not throughput
### Approach B: Implement Dual Pools Inside `CatalogDownloader`
Move queueing and split-pool logic into `CatalogDownloader`.
Pros:
- conceptually local to download code
- useful for non-ops batch paths
Cons:
- mismatches the current ops runner lifecycle
- complicates job item ownership, pause, cancel, and worker naming
- less natural for the NAS task center, which already manages workers at runner level
Decision:
- not preferred for this iteration
### Approach C: Implement Dual Pools At Download Stage Runner Level
Create a download-stage pipeline in the ops runner:
- resolver workers claim items and produce ready-to-download tasks
- downloader workers consume ready tasks and perform final transfer
Pros:
- fits current job/stage/item orchestration naturally
- keeps worker ownership explicit
- lets dashboard show separate resolver and downloader workers
- delivers real transfer concurrency gains without changing sync behavior
Cons:
- more control-flow complexity in the runner
- requires careful queue shutdown and pause/cancel handling
Decision:
- recommended
## Recommended Design
## High-Level Architecture
During a `download` stage, the runner will create a bounded in-memory queue:
- `ready_queue`
The stage will use two thread pools:
- `resolver pool`
- `download pool`
### Resolver Pool Responsibilities
- claim pending download items
- check whether the song is already downloaded
- build the download row
- resolve a usable `SongInfo` with a valid download URL
- publish a `ResolvedDownloadTask` into `ready_queue`
- mark the item failed immediately if resolution cannot produce a usable download target
### Download Pool Responsibilities
- consume `ResolvedDownloadTask` instances from `ready_queue`
- execute actual file download only
- emit transfer progress
- record local file metadata
- mark the item succeeded or failed
This separates long-latency provider resolution from short, bandwidth-heavy transfer work.
## New Internal Data Model
Introduce an internal in-memory task object for the stage, for example:
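```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class ResolvedDownloadTask:
    item_id: int  # job item claimed by the resolver
    row: dict[str, Any]  # download row built during resolution
    resolved_song_info: Any  # SongInfo carrying a valid download URL
    display_text: str
    target_library_root: Path
```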
This object is not persisted to the database in this iteration.
## Worker Model
The dashboard should show two worker families for a running download stage:
- `resolve-1`, `resolve-2`, ...
- `download-1`, `download-2`, ...
This is intentional. The operator should be able to distinguish:
- workers currently finding a usable source
- workers currently transferring bytes
`transfer_stats` should continue to count only workers with real transfer speed values.
## Download Stage Flow
### Step 1: Stage Startup
When the runner enters a `download` stage:
1. compute total worker budget from existing configuration
2. split it into resolver and downloader counts
3. create a bounded `ready_queue`
4. start resolver pool and downloader pool
### Step 2: Item Resolution
Each resolver worker loops until:
- no more claimable items remain
- pause or cancel is requested
- pipeline shutdown is triggered
For each claimed item:
1. load row data
2. skip immediately if already downloaded
3. emit resolver progress such as `resolving source qq (1/6)`
4. call a new downloader API that resolves but does not download
5. enqueue a `ResolvedDownloadTask` on success
6. mark failed on resolution failure
### Step 3: Pure Download Execution
Each downloader worker loops until:
- a shutdown sentinel is received
- pause or cancel is requested and the queue has drained according to the chosen shutdown policy
For each resolved task:
1. emit `starting download via <platform>`
2. monitor file growth and emit transfer stats
3. record the local file on success
4. mark the item succeeded or failed
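The three steps above can be sketched as a small two-pool pipeline. This is a minimal illustration under stated assumptions, not the runner's actual code: `run_download_stage`, the `resolve` and `download` callables, and the `_SENTINEL` marker are hypothetical names, and pause/cancel handling is omitted for brevity.

```python
import queue
import threading

_SENTINEL = object()  # hypothetical shutdown marker for downloader workers


def run_download_stage(items, resolve, download, resolver_count=1, downloader_count=2):
    """Feed resolved tasks through a bounded queue; return sorted (task, status) pairs."""
    pending = queue.Queue()
    for item in items:
        pending.put(item)
    ready_queue = queue.Queue(maxsize=downloader_count * 2)  # bounded on purpose
    results = []
    results_lock = threading.Lock()

    def resolver_worker():
        while True:
            try:
                item = pending.get_nowait()  # claim the next pending item
            except queue.Empty:
                return  # no more claimable items
            task = resolve(item)
            if task is None:
                with results_lock:
                    results.append((item, "failed"))  # fail fast, never enqueue
            else:
                ready_queue.put(task)  # blocks while downloaders are behind

    def downloader_worker():
        while True:
            task = ready_queue.get()
            if task is _SENTINEL:
                return
            ok = download(task)
            with results_lock:
                results.append((task, "succeeded" if ok else "failed"))

    resolvers = [
        threading.Thread(target=resolver_worker, name=f"resolve-{i + 1}")
        for i in range(resolver_count)
    ]
    downloaders = [
        threading.Thread(target=downloader_worker, name=f"download-{i + 1}")
        for i in range(downloader_count)
    ]
    for thread in resolvers + downloaders:
        thread.start()
    for thread in resolvers:
        thread.join()
    for _ in downloaders:
        ready_queue.put(_SENTINEL)  # one sentinel per downloader worker
    for thread in downloaders:
        thread.join()
    return sorted(results)
```

Sending one sentinel per downloader only after every resolver has joined means no task can be enqueued behind a sentinel, so the queue always drains before workers exit.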
## CatalogDownloader API Refactor
Keep the current public behavior but split the implementation into two explicit phases.
### New Methods
- `resolve_song_row(...) -> ResolvedDownloadPayload | None`
- `download_resolved_song(...) -> bool`
Where:
- `resolve_song_row(...)` handles snapshot deserialization, source resolution, target directory selection, and worker text for the resolver phase
- `download_resolved_song(...)` performs only final download, monitor setup, file recording, and quality detection
### Compatibility Method
Keep:
- `download_song_row(...)`
But turn it into a compatibility wrapper:
1. resolve
2. download
This preserves existing unit-test entry points and any non-ops call sites.
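A rough sketch of the wrapper shape, assuming the two phases can be called back to back. `DownloaderSketch` is a hypothetical stand-in for `CatalogDownloader`; the real method signatures are elided in this design, so the single `row` argument and the injected callables are illustration only.

```python
from typing import Any, Callable, Optional


class DownloaderSketch:
    """Hypothetical stand-in; the real CatalogDownloader signatures are not shown here."""

    def __init__(
        self,
        resolver: Callable[[dict], Optional[Any]],
        fetcher: Callable[[Any], bool],
    ) -> None:
        self._resolver = resolver
        self._fetcher = fetcher

    def resolve_song_row(self, row: dict) -> Optional[Any]:
        # Phase 1: resolve a usable payload without downloading anything.
        return self._resolver(row)

    def download_resolved_song(self, payload: Any) -> bool:
        # Phase 2: perform only the final transfer.
        return self._fetcher(payload)

    def download_song_row(self, row: dict) -> bool:
        # Compatibility wrapper: resolve, then download.
        payload = self.resolve_song_row(row)
        if payload is None:
            return False  # resolution failed, nothing to download
        return self.download_resolved_song(payload)
```

Existing callers keep using `download_song_row(...)`, while the runner can call the two phases from different pools.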
## Worker State Design
Resolver workers should update:
- `current_song_id`
- `current_display_text`
- `last_progress_text`
Example messages:
- `resolving source qq (1/6)`
- `resolving source kuwo (2/6)`
- `resolved via qq`
Downloader workers should update:
- `current_song_id`
- `current_display_text`
- `last_progress_text`
- `downloaded_bytes`
- `total_bytes`
- `speed_bytes_per_sec`
- `progress_percent`
Example messages:
- `starting download via qq`
- `12.00MB/48.00MB`
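Because resolver workers never set `speed_bytes_per_sec`, an aggregate such as `transfer_stats` can filter on that field alone. The following is a minimal sketch, assuming worker state is available as plain dictionaries; the function name and the two output keys are hypothetical.

```python
def aggregate_transfer_stats(workers):
    """Count only workers that report a real, non-zero transfer speed."""
    transferring = [
        worker for worker in workers
        if (worker.get("speed_bytes_per_sec") or 0) > 0
    ]
    return {
        "active_transfer_workers": len(transferring),
        "total_speed_bytes_per_sec": sum(
            worker["speed_bytes_per_sec"] for worker in transferring
        ),
    }
```

A resolver worker showing `resolving source qq (1/6)` has no speed value and is therefore ignored by the aggregate.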
## Concurrency Split
Do not require a schema change or mandatory new env vars for the first version.
Recommended default behavior:
- if the total download worker budget is `1`, a `1 resolver + 0 downloader` split is invalid, so fall back to the single-thread compatibility path
- if total is `2`, use `1 resolver + 1 downloader`
- if total is `>= 3`, use approximately `30% resolver` and `70% downloader`, with the resolver count capped at `3`
Initial recommended rule:
```text
resolver_workers = max(1, min(3, total_workers // 3))
download_workers = max(1, total_workers - resolver_workers)
```
For `DOWNLOAD_WORKERS=10`, this gives:
- `3 resolver`
- `7 downloader`
This is a reasonable first cut and avoids over-investing worker budget in resolution.
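The rule can be written as a small helper. This is a sketch only: the function name is hypothetical, and returning `None` is one assumed way to signal the single-thread compatibility path for a budget of `1`.

```python
from typing import Optional


def split_download_workers(total_workers: int) -> Optional[tuple[int, int]]:
    """Return (resolver_workers, download_workers), or None for the compatibility path."""
    if total_workers < 2:
        # A 1 + 0 split is invalid; the caller keeps the single-thread path.
        return None
    resolver_workers = max(1, min(3, total_workers // 3))
    download_workers = max(1, total_workers - resolver_workers)
    return resolver_workers, download_workers
```

Because the resolver count is capped at `3`, larger budgets go almost entirely to downloaders: a budget of `30` yields `3` resolvers and `27` downloaders.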
## Queue Design
Use a bounded in-memory queue to avoid resolver workers running too far ahead.
Recommended initial capacity:
- `download_workers * 2`
Why bounded:
- prevents unbounded memory growth
- keeps resolution work closer to actual download demand
- simplifies pause and cancel behavior
## Pause, Cancel, And Shutdown Behavior
### Pause
When pause is requested:
- resolver workers stop claiming new items
- downloader workers may finish in-flight downloads
- stage reconciliation remains based on existing item states
This matches current expectations better than attempting hard interruption of active downloads.
### Cancel
When cancel is requested:
- resolver workers stop claiming new items immediately
- downloader workers stop after their current task boundary where possible
- no new resolved tasks should be enqueued after cancellation is observed
### Queue Shutdown
After resolver workers finish, the runner should send explicit queue sentinels so downloader workers can exit cleanly once the queue drains.
## Failure Handling
### Resolution Failure
If resolution cannot produce a valid downloadable `SongInfo`:
- mark the item failed immediately
- do not enqueue it for download
### Download Failure
If pure download fails after resolution:
- mark the item failed
- preserve the existing error formatting model
### Resolver Success But Queue/Shutdown Race
If the pipeline is shutting down and a resolver has a resolved task ready:
- prefer not enqueuing new work after pause/cancel has been observed
- let the item remain in a recoverable state according to current reconciliation rules
The first implementation should prefer correctness over aggressive continuation.
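One way to honor this rule with a bounded queue is to publish with a short timeout and re-check a stop flag between attempts. The helper below is a sketch under assumptions: `try_publish`, the `stop_event` argument, and the poll interval are hypothetical, and what the caller does with an unpublished item is left to the existing reconciliation rules.

```python
import queue
import threading


def try_publish(ready_queue, task, stop_event, poll_seconds=0.05):
    """Return True if the task was enqueued, False if pause/cancel was observed first."""
    while not stop_event.is_set():
        try:
            # A short timeout keeps the resolver responsive to pause/cancel
            # even when the bounded queue is full.
            ready_queue.put(task, timeout=poll_seconds)
            return True
        except queue.Full:
            continue
    return False
```

A plain blocking `put()` would leave a resolver stuck on a full queue after downloaders have stopped consuming, which is the case this loop avoids.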
## Why This Improves Throughput
Under the current model, ten workers spend most of their time waiting on provider resolution.
Under the dual-pool model:
- a small resolver pool continues finding usable sources
- a larger downloader pool stays focused on byte transfer
This does not make provider resolution free, but it stops long resolution latency from occupying the same worker budget needed for real downloads.
The expected operator-visible result is:
- multiple downloader workers can show real transfer progress concurrently
- resolver workers remain visible as separate activity instead of appearing as fake download workers
## Testing Strategy
## Unit Tests
Extend downloader tests to cover:
- `resolve_song_row(...)` returning a resolved payload without downloading
- `download_resolved_song(...)` preserving existing progress and file-recording behavior
- compatibility wrapper `download_song_row(...)` still working
## Runner Tests
Add runner tests for:
- worker split calculation
- resolver workers feeding downloader workers through a queue
- successful completion of mixed resolved tasks
- pause and cancel behavior while the queue is non-empty
- clean worker shutdown after resolver completion
## Dashboard-Oriented Tests
Add ops tests to verify:
- resolver workers appear with resolver progress text
- downloader workers expose transfer metrics
- aggregate transfer stats ignore resolver-only workers
## Rollout Plan
1. refactor `CatalogDownloader` into resolve-only and download-only phases
2. add dual-pool execution path for the download stage in the runner
3. keep the old single-call wrapper for compatibility
4. update worker naming and dashboard expectations
5. run targeted NAS verification:
   - confirm simultaneous non-zero transfer speed on more than one downloader worker
   - confirm resolver workers remain visible separately
## Open Questions Resolved In This Design
- Should sync be changed to resolve URLs early?
  - No.
- Should this iteration add persistent URL caching?
  - No.
- Should resolver and downloader state share the same worker names?
  - No. Separate names are clearer and better match reality.
- Should the first version require schema changes?
  - No.
## Summary
The recommended change is to keep deferred snapshots exactly as they are and redesign only the download-stage execution model.
Instead of ten mixed workers doing `resolve + download`, the system should run a two-pool pipeline:
- a small resolver pool that turns deferred snapshots into ready download tasks
- a larger downloader pool that performs real file transfer
This is the smallest architecture change that directly targets the current bottleneck while preserving the existing sync model and database schema.
@@ -0,0 +1,324 @@
# Catalogsync Resolver Source Ranking Design
## Goal
Improve resolver throughput without sacrificing cross-run learning by introducing a persistent, isolated source-ranking store.
The resolver should keep treating the song's original platform as the preferred source, but fallback order should become adaptive instead of fixed. The adaptive order must:
- learn over time across jobs and restarts
- be grouped by original source instead of using one global ranking
- stay isolated from the main catalog business tables
- preserve the current "keep trying later sources if earlier ones fail" behavior
## Confirmed Decisions
The following points were confirmed during design discussion:
- the ranking model is grouped by original source
- statistics must persist across tasks and service restarts
- the statistics store must be isolated from the main business schema
- the original source is still tried first
- after the warmup threshold is reached, fallback should try the top two ranked sources first
- if the top two fallback sources fail, resolver must continue trying the remaining sources
## Scope
### In Scope
- add a dedicated resolver statistics SQLite side database
- record persistent fallback attempt and success statistics by `(origin_source, candidate_source)`
- use fallback statistics to reorder sources after a warmup threshold
- keep the original source as the first attempt
- preserve existing resolver matching and candidate selection logic within a single source
- cover side-database initialization, repository methods, ranking logic, and resolver behavior with tests
### Out Of Scope
- changing sync-stage behavior
- caching download URLs across runs
- changing song uniqueness rules
- replacing the current matching heuristics inside a source
- adding a UI for resolver statistics in this iteration
- distributed or external metrics storage
## Problem Statement
The current resolver still spends too much time in fallback traversal.
Today, resolver behavior is:
1. derive the original platform as `preferred_source`
2. try the preferred source first
3. if preferred-source fast return does not happen, continue through the configured fallback list
4. within fallback traversal, source order is static and does not learn from production outcomes
This causes two operational problems:
- fallback time is longer than necessary because low-yield sources keep being retried early
- the system does not accumulate knowledge from prior jobs, so every restart returns to the same static ordering
The result is that resolver throughput remains bursty even after the dual-pool pipeline work, because the ready queue is still fed by a fallback strategy that does not adapt.
## Approaches Considered
### Approach A: Global Ranking For All Sources
Keep one success-rate table for all candidate sources regardless of original platform.
Pros:
- simplest data model
- easiest ranking query
Cons:
- mixes very different source relationships
- large-volume platforms can dominate the ranking
- does not reflect that `qq -> kuwo` and `netease -> kuwo` may behave differently
Decision:
- rejected because the learning model should follow the original source
### Approach B: In-Memory Per-Run Learning Only
Track statistics only for the current job and discard them at task end.
Pros:
- no schema work
- easy to experiment with
Cons:
- restarts lose all learning
- long warmup every time
- directly conflicts with the requirement for cross-run reuse
Decision:
- rejected
### Approach C: Persistent Side Database Grouped By Original Source
Store statistics in a dedicated SQLite side database keyed by original source and fallback source.
Pros:
- matches the confirmed grouping model
- survives restarts and future jobs
- keeps analytics-style tables isolated from the main business schema
- easy to evolve independently from catalog tables
Cons:
- requires one more database file and repository
- adds coordination between resolver and statistics store
Decision:
- recommended
## Recommended Design
## High-Level Architecture
Add a dedicated resolver statistics store, for example:
- `resolver_stats.db`
This database is initialized separately from `catalogsync.db` and contains only resolver-learning tables.
The main download flow remains:
1. build `target_song_info`
2. determine `preferred_source`
3. try preferred source first
4. reorder fallback sources using persistent statistics when warmup criteria are met
5. try ranked top two fallback sources first
6. if still unresolved, continue with the remaining fallback sources in ranked order
The existing resolver still owns matching, candidate picking, and final candidate selection inside each source.
## Statistics Model
The learning key is:
- `origin_source`
- `candidate_source`
Where:
- `origin_source` is the normalized original platform for the song being resolved
- `candidate_source` is a fallback source actually attempted after the original source path failed
Statistics are recorded only for fallback attempts. Preferred-source attempts are not stored in this side database for the first iteration because the ranking problem is specifically about fallback order.
### Stored Counters
Each row should persist:
- `origin_source`
- `candidate_source`
- `attempt_count`
- `resolve_success_count`
- `last_attempt_at`
- `last_success_at`
- `created_at`
- `updated_at`
## Warmup and Ranking Rules
### Warmup Threshold
The warmup threshold is not a global song count. It is the total fallback sample count for a specific `origin_source`.
Example:
- `qq` fallback learning activates only after the sum of all `qq -> *` fallback attempts reaches `1000`
- `netease` fallback learning activates independently after the sum of all `netease -> *` fallback attempts reaches `1000`
### Ranking Formula
Use a smoothed success rate:
`(resolve_success_count + 1) / (attempt_count + 2)`
This avoids unstable rankings when sample counts are still low.
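A short worked example shows the effect of the smoothing. The function name is hypothetical; the arithmetic is exactly the formula above.

```python
def smoothed_success_rate(resolve_success_count: int, attempt_count: int) -> float:
    """Add-one smoothing: pulls low-sample rates toward 0.5."""
    return (resolve_success_count + 1) / (attempt_count + 2)


# A source with no samples starts at a neutral 0.5.
no_data = smoothed_success_rate(0, 0)

# One lucky success out of one attempt scores 2/3 instead of a raw 100%,
# so it ranks below a source with 90 successes out of 100 attempts (91/102).
one_lucky_hit = smoothed_success_rate(1, 1)
well_sampled = smoothed_success_rate(90, 100)
```

Without smoothing, the single lucky hit would rank above the well-sampled source.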
### Ranked Traversal
For a song with original source `origin_source`:
1. always try `preferred_source` first
2. if preferred source does not resolve a high-confidence downloadable result, enter fallback
3. if `origin_source` warmup threshold is not met:
   - keep the configured fallback order
4. if `origin_source` warmup threshold is met:
   - sort fallback candidates by smoothed success rate, highest first
   - preserve configured order as the tie-breaker
   - try the top two ranked fallback sources first
   - if both fail, continue with the remaining ranked fallback sources
This preserves completeness while improving average-case resolution speed.
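The ordering rules can be sketched as a pure function. Everything here is illustrative: `rank_fallback_sources` is a hypothetical name, `stats` is assumed to map each `candidate_source` to an `(attempt_count, resolve_success_count)` pair for a single `origin_source`, and the `1000` default mirrors the example threshold above.

```python
def rank_fallback_sources(configured, stats, warmup_threshold=1000):
    """Return fallback sources in the order the resolver should try them."""
    total_attempts = sum(attempts for attempts, _ in stats.values())
    if total_attempts < warmup_threshold:
        return list(configured)  # warmup not reached: keep configured order

    def rate(source):
        attempts, successes = stats.get(source, (0, 0))
        return (successes + 1) / (attempts + 2)

    # sorted() is stable even with reverse=True, so sources with equal
    # rates keep their configured relative order (the tie-breaker).
    return sorted(configured, key=rate, reverse=True)
```

Walking the returned list from the front tries the top two ranked sources first and then the rest, so no separate "top two" pass is needed in this sketch.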
## Resolver Flow Changes
Resolver source ordering should become a two-phase plan:
### Phase 1: Preferred Source
- derive `preferred_source` from the snapshot or row platform
- try preferred-source refresh
- try preferred-source search
- if a preferred-source high-confidence result is found, return immediately
### Phase 2: Ranked Fallback
- build fallback candidates from configured `download_sources` excluding `preferred_source`
- ask the resolver stats repository for the ranked order for this `origin_source`
- attempt fallback sources in that order
- after each fallback attempt:
  - record one attempt
  - if that source resolves a usable candidate, record one success and stop
- if a fallback source fails to produce a usable candidate, continue to the next source
The resolver should still stop at the first acceptable fallback success in this iteration rather than exhaustively scanning later sources for a possibly better file.
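Phase 2 reduces to a short loop. The sketch below assumes two injected callables, both hypothetical: `try_source(source)` returns a usable candidate or `None`, and `record(source, success)` forwards one attempt (and possibly one success) to the statistics store.

```python
def resolve_via_fallback(ordered_sources, try_source, record):
    """Try fallback sources in order; stop at the first usable candidate."""
    for source in ordered_sources:
        candidate = try_source(source)
        # Exactly one attempt is recorded per source tried; the success flag
        # tells the store whether to bump the success counter as well.
        record(source, candidate is not None)
        if candidate is not None:
            return source, candidate
    return None  # every fallback source failed
```

Sources after the first success are never attempted, so they accumulate no samples for that song.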
## Side Database Schema
The side database should stay minimal for the first version.
### Table: `resolver_source_stats`
- `origin_source TEXT NOT NULL`
- `candidate_source TEXT NOT NULL`
- `attempt_count INTEGER NOT NULL DEFAULT 0`
- `resolve_success_count INTEGER NOT NULL DEFAULT 0`
- `last_attempt_at TEXT`
- `last_success_at TEXT`
- `created_at TEXT DEFAULT CURRENT_TIMESTAMP`
- `updated_at TEXT DEFAULT CURRENT_TIMESTAMP`
- primary key: `(origin_source, candidate_source)`
Recommended indexes:
- primary key already covers lookup by `(origin_source, candidate_source)`
- index on `(origin_source)` for ranking queries
## Repository Boundary
Introduce a dedicated repository for the side database, for example:
- `ResolverStatsRepository`
Responsibilities:
- initialize side-database schema
- upsert attempt and success counters
- report total fallback samples for an origin source
- return ranked fallback candidates for an origin source given the configured fallback list
The main `CatalogRepository` should not absorb this responsibility. Keeping the side database behind a dedicated repository keeps the separation explicit and prevents statistics logic from leaking into core business persistence.
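A minimal sketch of such a repository using the standard-library `sqlite3` module. The class and method names are hypothetical, the DDL follows the table described above, and the upsert assumes a bundled SQLite of version 3.24 or newer (the first release with `ON CONFLICT ... DO UPDATE`).

```python
import sqlite3

_SCHEMA = """
CREATE TABLE IF NOT EXISTS resolver_source_stats (
    origin_source TEXT NOT NULL,
    candidate_source TEXT NOT NULL,
    attempt_count INTEGER NOT NULL DEFAULT 0,
    resolve_success_count INTEGER NOT NULL DEFAULT 0,
    last_attempt_at TEXT,
    last_success_at TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (origin_source, candidate_source)
)
"""

_UPSERT = """
INSERT INTO resolver_source_stats
    (origin_source, candidate_source, attempt_count, resolve_success_count,
     last_attempt_at, last_success_at)
VALUES (?, ?, 1, ?, CURRENT_TIMESTAMP, CASE WHEN ? THEN CURRENT_TIMESTAMP END)
ON CONFLICT(origin_source, candidate_source) DO UPDATE SET
    attempt_count = attempt_count + 1,
    resolve_success_count = resolve_success_count + excluded.resolve_success_count,
    last_attempt_at = excluded.last_attempt_at,
    last_success_at = COALESCE(excluded.last_success_at, last_success_at),
    updated_at = CURRENT_TIMESTAMP
"""


class ResolverStatsRepositorySketch:
    """Hypothetical repository for the side database; not the project's actual class."""

    def __init__(self, db_path=":memory:"):
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(_SCHEMA)  # schema is created on demand
        self._conn.commit()

    def record_attempt(self, origin_source, candidate_source, success):
        flag = 1 if success else 0
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(_UPSERT, (origin_source, candidate_source, flag, flag))

    def total_fallback_samples(self, origin_source):
        row = self._conn.execute(
            "SELECT COALESCE(SUM(attempt_count), 0) FROM resolver_source_stats"
            " WHERE origin_source = ?",
            (origin_source,),
        ).fetchone()
        return row[0]

    def counters(self, origin_source):
        rows = self._conn.execute(
            "SELECT candidate_source, attempt_count, resolve_success_count"
            " FROM resolver_source_stats WHERE origin_source = ?",
            (origin_source,),
        ).fetchall()
        return {candidate: (attempts, successes) for candidate, attempts, successes in rows}
```

A `sqlite3` connection is bound to the thread that created it by default, so a real implementation shared across resolver workers would likely need one connection per thread or a lock around a shared one.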
## Configuration and File Layout
Add a dedicated resolver statistics database path derived from the application root, for example:
- `<APP_HOME>/data/resolver_stats.db`
This path should be configurable but should default automatically so current operators do not need new setup work.
The service should initialize both:
- main catalog database
- resolver statistics side database
Service startup should not fail if the side database is empty; it should be created on demand.
## Error Handling
Resolver statistics must not become a single point of failure.
If the side database update fails:
- do not fail the actual download item
- log the statistics error
- continue resolver fallback using the best available in-memory ordering for that invocation
If the side database ranking query fails:
- fall back to configured source order
This keeps the ranking system opportunistic rather than mission-critical.
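The ranking-query half of this policy can be sketched as a thin guard. `safe_ranked_order` and the `rank_fn` callable are hypothetical names; the logger name is likewise an assumption.

```python
import logging

logger = logging.getLogger("resolver_stats")  # assumed logger name


def safe_ranked_order(configured, rank_fn):
    """Return rank_fn's ordering, or the configured order if ranking fails."""
    try:
        return list(rank_fn(configured))
    except Exception:
        # Statistics are opportunistic: log the error and keep resolving.
        logger.exception("resolver stats ranking failed; using configured order")
        return list(configured)
```

The same pattern applies to counter updates: wrap the write, log the failure, and let the download item proceed.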
## Testing Strategy
Tests should cover:
- side-database schema creation
- isolated side-database repository queries and updates
- warmup not reached:
  - configured fallback order is preserved
- warmup reached:
  - fallback order is re-ranked by per-origin-source statistics
- top-two-first behavior:
  - top two ranked fallback sources are attempted before the rest
- continuation behavior:
  - if top two fail, later sources are still attempted
- grouping behavior:
  - `qq` ranking does not affect `netease` ranking
- graceful degradation:
  - side-database failure falls back to configured order instead of failing the item
## Acceptance Criteria
- resolver statistics are stored in a dedicated SQLite side database rather than the main business database
- fallback statistics persist across jobs and service restarts
- ranking is grouped by original source
- before the warmup threshold, fallback order matches configured source order
- after the warmup threshold, top two fallback candidates for an origin source are tried first according to smoothed success rate
- if the top two fallback candidates fail, resolver still attempts the remaining fallback sources
- statistics-store failures do not fail the download item outright
- automated tests cover ranking, grouping, warmup, and fallback-to-configured-order behavior