Playlist File Run Design

Goal

Add a file-driven playlist execution path to musicdl.catalogsync so a user can provide a text file of playlist URLs and run the existing catalog sync and download pipeline against only those playlists.

The default behavior must remain unchanged when the new option is not used.

Scope

In Scope

  • Add --playlist-file to the existing run command
  • Support two input line formats:
    • raw playlist URL
    • platform,playlist_url
  • Ignore blank lines and comment lines beginning with #
  • Auto-detect netease, qq, or kuwo from URL when platform is omitted
  • Deduplicate repeated playlist URLs within the same input file
  • Import file playlists into the existing catalog tables
  • Run sync and download only for playlists referenced by the file
  • Keep song and file dedupe behavior exactly as it works today

Out of Scope

  • Incremental skip mode
  • New collect-mode behavior
  • New database tables for file imports
  • GUI integration
  • Upload automation

Constraints

  • Reuse the existing playlists, playlist_pools, and pool_playlists tables
  • Preserve current run behavior when --playlist-file is absent
  • Do not create duplicate playlist rows for the same (platform, remote_playlist_id)
  • Do not widen download scope to the full database when a playlist file is used
  • Keep implementation small and aligned with the current catalogsync package layout

User-Facing Behavior

Default Run Path

When --playlist-file is not provided:

  1. run collects playlist pools from configured sources
  2. run syncs playlists from the database
  3. run downloads pending songs from the database

This matches the current behavior exactly.

File-Driven Run Path

When --playlist-file <path> is provided:

  1. Skip collect
  2. Read and parse the file
  3. Normalize and deduplicate the playlist entries from the file
  4. Upsert those playlists into the existing catalog database
  5. Attach them to a dedicated pool row representing the source file import
  6. Sync only those playlist IDs
  7. Download only songs belonging to those playlist IDs
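
The two run paths can be sketched as a single dispatch function. This is a minimal illustration rather than the package's actual code: `pipeline` and all of its method names are hypothetical stand-ins for the existing collect, sync, and download steps.

```python
def run(pipeline, playlist_file=None):
    """Dispatch between the default path and the file-driven path.

    `pipeline` is a hypothetical object bundling the existing steps; none of
    these method names are taken from the catalogsync package.
    """
    if playlist_file is None:
        # Default path, unchanged: collect, then sync and download from the database.
        pipeline.collect()
        pipeline.sync_all()
        pipeline.download_pending()
        return None

    # File-driven path: collect is skipped and every later step is scoped to the file.
    entries = pipeline.parse_playlist_file(playlist_file)
    playlist_ids = pipeline.import_playlists(entries, source=playlist_file)
    pipeline.sync_playlists(playlist_ids)
    pipeline.download_for_playlists(playlist_ids)
    return playlist_ids
```

Passing the same `playlist_ids` list to both sync and download is what keeps the download scope from widening to the full database.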

Input File Rules

Each non-empty, non-comment line must use one of the two supported formats, a raw playlist URL or platform,playlist_url, for example:

https://music.163.com/#/playlist?id=17745989905
qq,https://y.qq.com/n/ryqq/playlist/7707261125

Parsing rules:

  • Leading and trailing whitespace is trimmed
  • Blank lines are ignored
  • Lines beginning with # are ignored
  • If a comma is present, split once into platform and url
  • If no platform is provided, infer it from the URL
  • Unsupported or unrecognized lines are reported and skipped
  • Repeated URLs in the same file are processed only once
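
The parsing rules above might be implemented roughly as follows. This is a sketch under assumptions: the names `PlaylistEntry`, `infer_platform`, and `parse_playlist_lines` are hypothetical, and the host-to-platform mapping (in particular the kuwo hosts) is a guess at how detection could work, not the package's real logic.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

SUPPORTED_PLATFORMS = {"netease", "qq", "kuwo"}

# Assumed host-to-platform mapping; the real detection logic may differ.
HOST_PLATFORMS = {
    "music.163.com": "netease",
    "y.qq.com": "qq",
    "kuwo.cn": "kuwo",
    "www.kuwo.cn": "kuwo",
}


@dataclass(frozen=True)
class PlaylistEntry:
    platform: str
    url: str


def infer_platform(url):
    """Return the platform for a URL, or None if the host is not recognized."""
    host = (urlparse(url).hostname or "").lower()
    return HOST_PLATFORMS.get(host)


def parse_playlist_lines(lines):
    """Return (entries, skipped), where skipped holds (line_number, text) pairs."""
    entries, skipped, seen = [], [], set()
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored silently
        platform, url = "", line
        if "," in line:
            # Split once: everything after the first comma is the URL.
            platform, url = (part.strip() for part in line.split(",", 1))
        platform = platform.lower() or infer_platform(url)
        if platform not in SUPPORTED_PLATFORMS or not url.startswith(("http://", "https://")):
            skipped.append((number, line))  # caller reports these as warnings
            continue
        if url in seen:
            continue  # repeated URL in the same file is processed only once
        seen.add(url)
        entries.append(PlaylistEntry(platform, url))
    return entries, skipped
```

Returning the skipped lines alongside the entries lets the caller both warn per line and fill in the run summary without re-reading the file.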

Architecture

The feature should be implemented as a narrow branch off the existing run workflow.

Recommended units:

  • A file parser helper that converts input lines into normalized playlist import entries
  • A service method that imports manual playlists into the existing playlist catalog
  • A service method that syncs only a provided list of playlist IDs
  • A downloader method that queues only songs reachable from a provided list of playlist IDs

This keeps the current full-database path intact while adding a targeted path for file-based execution.
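
As an illustration of the targeted downloader unit, the query below selects only songs reachable from the given playlist IDs that have no recorded file location. It assumes a SQLite database, a link table named `playlist_songs`, and a `song_id` column on `file_locations`; those names are assumptions, and the existing local-file checks are not reproduced here.

```python
import sqlite3


def pending_song_ids(conn: sqlite3.Connection, playlist_ids):
    """Return song IDs linked to the given playlists that have no file location.

    Table and column names are assumed for illustration only.
    """
    playlist_ids = list(playlist_ids)
    if not playlist_ids:
        return []  # never fall back to the whole database
    placeholders = ",".join("?" for _ in playlist_ids)
    rows = conn.execute(
        "SELECT DISTINCT ps.song_id "
        "FROM playlist_songs AS ps "
        "LEFT JOIN file_locations AS fl ON fl.song_id = ps.song_id "
        f"WHERE ps.playlist_id IN ({placeholders}) AND fl.song_id IS NULL "
        "ORDER BY ps.song_id",
        playlist_ids,
    ).fetchall()
    return [row[0] for row in rows]
```

Returning an empty list for an empty ID list is deliberate: an empty file-driven selection should queue nothing rather than everything.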

Data Model

No new tables are required.

The imported playlists should reuse:

  • playlists
  • playlist_pools
  • pool_playlists

Recommended pool representation:

  • pool_kind = manual_file
  • external_id = manual_file:<resolved file path>
  • name = Manual File Import: <filename>

This preserves provenance without changing the main playlist model.
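
A small helper could derive these pool fields from the file path. The function name `manual_file_pool` is hypothetical, and the sketch assumes "resolved file path" means the absolute path produced by `pathlib.Path.resolve()`.

```python
from pathlib import Path


def manual_file_pool(playlist_file):
    """Build the pool fields for a playlist file import (hypothetical helper)."""
    resolved = Path(playlist_file).resolve()
    return {
        "pool_kind": "manual_file",
        "external_id": f"manual_file:{resolved}",
        "name": f"Manual File Import: {resolved.name}",
    }
```

Because the external ID is built from the resolved path, re-running with the same file through a different relative path would map to the same pool row.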

Dedupe Behavior

Playlist Rows

Duplicate playlist rows must not be created. The playlists table is already unique on (platform, remote_playlist_id), so the import should upsert against that key rather than insert blindly.
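
One way to lean on that unique key is an insert-or-ignore followed by a lookup, sketched here with SQLite. The schema is a hypothetical minimal version of the playlists table, and `ensure_playlist` is an illustrative name; the real table has more columns and the real storage layer may differ.

```python
import sqlite3

# Hypothetical minimal schema; the real playlists table has more columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS playlists (
    id INTEGER PRIMARY KEY,
    platform TEXT NOT NULL,
    remote_playlist_id TEXT NOT NULL,
    url TEXT,
    UNIQUE (platform, remote_playlist_id)
);
"""


def ensure_playlist(conn, platform, remote_playlist_id, url):
    """Return the row id for the playlist, inserting it only if it is new."""
    conn.execute(
        "INSERT OR IGNORE INTO playlists (platform, remote_playlist_id, url) "
        "VALUES (?, ?, ?)",
        (platform, remote_playlist_id, url),
    )
    row = conn.execute(
        "SELECT id FROM playlists WHERE platform = ? AND remote_playlist_id = ?",
        (platform, remote_playlist_id),
    ).fetchone()
    return row[0]
```

Calling the helper twice with the same key returns the same ID, which is the list the sync and download steps then operate on.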

Songs

Repeated sync of the same playlist may re-run parsing, but songs must continue to upsert by (platform, remote_song_id) and playlist-song links must remain unique by (playlist_id, song_id).

Files

Downloads must continue to rely on the existing file_locations and local-file checks so already downloaded songs are not fetched again.

Error Handling

  • Missing playlist file path: fail fast with a clear CLI error
  • File exists but contains no valid playlist lines: fail fast with a clear CLI error
  • Invalid individual lines: warn and skip, continue processing the rest
  • Playlist parse failure for one playlist: log the failure, continue with the remaining playlists
  • Download failure for one song: preserve the existing downloader behavior
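
The split between fail-fast and continue-on-error cases could look like the sketch below. All names here (`PlaylistFileError`, `check_playlist_file`, `require_entries`, `sync_each`, and the logger name) are hypothetical; the CLI layer is assumed to turn `PlaylistFileError` into a clear error message and a non-zero exit.

```python
import logging
from pathlib import Path

log = logging.getLogger("catalogsync.playlist_file")  # hypothetical logger name


class PlaylistFileError(Exception):
    """Raised for conditions that should abort the run with a CLI error."""


def check_playlist_file(path):
    # Fail fast before any database work if the file cannot be read.
    if not Path(path).is_file():
        raise PlaylistFileError(f"playlist file not found: {path}")


def require_entries(path, entries):
    # Fail fast when parsing produced nothing usable.
    if not entries:
        raise PlaylistFileError(f"no valid playlist lines in {path}")


def sync_each(playlist_ids, sync_one):
    """Sync playlists one by one; log failures and keep going."""
    failed = []
    for playlist_id in playlist_ids:
        try:
            sync_one(playlist_id)
        except Exception:
            log.exception("sync failed for playlist %s", playlist_id)
            failed.append(playlist_id)
    return failed
```

Returning the failed IDs lets the caller mention them in the summary without interrupting the remaining playlists.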

Output

The file-driven run path should report a compact summary including:

  • total lines read
  • valid playlist entries
  • skipped invalid lines
  • deduplicated playlist count
  • synchronized song count
  • downloaded song count
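
For illustration, the summary could be carried in a small dataclass whose fields mirror the list above. The class name, field names, and output format are assumptions, not an existing catalogsync type.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Counters for the file-driven run path (hypothetical structure)."""
    total_lines: int = 0
    valid_entries: int = 0
    skipped_invalid_lines: int = 0
    deduplicated_playlists: int = 0
    synchronized_songs: int = 0
    downloaded_songs: int = 0

    def format(self):
        # One "label: value" line per counter, in declaration order.
        return "\n".join(
            f"{name.replace('_', ' ')}: {value}" for name, value in vars(self).items()
        )
```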

Tests

Add coverage for:

  • file parsing of URL-only and platform,url lines
  • blank lines and comment handling
  • same-file URL dedupe
  • unsupported line handling
  • run --playlist-file ... taking the file-driven branch instead of collect
  • manual playlist import into a manual_file pool
  • sync limited to provided playlist IDs
  • download limited to songs linked to provided playlist IDs
  • repeated execution not creating duplicate playlist rows or duplicate local file downloads

Acceptance Criteria

  • run --playlist-file <path> processes only playlists from the file
  • omitting --playlist-file preserves current behavior
  • duplicate URLs inside one file are processed once
  • repeated runs do not create duplicate playlist rows
  • repeated runs do not re-download songs whose files already exist locally
  • tests cover the file-driven branch and targeted sync/download behavior