Download Layout And NAS Deployment Design

Goal

Refine the current musicdl.catalogsync download flow so it can be deployed cleanly onto a NAS or any other Linux machine with:

  • a portable script layout
  • a machine-local .env configuration file
  • a dedicated music library root separate from scripts and runtime state
  • a download directory structure of platform/first_artist/filename
  • path semantics that can be reused later by the upload workflow

This design intentionally focuses on download and deployment only. Upload automation is deferred to the next sub-project.

Scope

In Scope

  • Introduce a portable deployment layout for NAS and other Linux targets
  • Separate application/runtime files from downloaded music files
  • Standardize local download paths as:
    • <LIBRARY_DIR>/<platform>/<first_artist>/<filename>
  • Preserve relative path semantics in file_locations.locator
  • Add machine-local configuration through config/catalogsync.env
  • Add bootstrap and runtime script conventions suitable for copying to other machines
  • Keep database and runtime files under the application home instead of the music library root
  • Ensure required directories are auto-created when bootstrapping or running

Out of Scope

  • 123 cloud upload implementation
  • Object storage upload implementation
  • Concurrent download
  • Concurrent upload
  • Cross-platform song canonicalization
  • GUI integration
  • Deletion or migration of existing remote file locations

Constraints

  • Reuse the existing musicdl.catalogsync package and CLI as much as possible
  • Keep the deployment scripts portable so they can be copied to another Linux machine
  • Do not hardcode NAS-only paths inside the application logic
  • Store machine-specific paths in configuration, not in source code
  • Keep file_locations.locator stable so the future upload phase can reuse the same relative paths

Deployment Model

Local Repo Versus Target Machine

There are two kinds of scripts:

  1. Bootstrap/deployment scripts that live in the repository and are run from the operator machine
  2. Runtime scripts that are copied onto the target machine and used there repeatedly

This avoids the circular problem of requiring a target-side script before the target-side directories exist.

Target Directory Layout

Recommended target layout:

/volume4/Music_Cloud/
├─ library/
└─ catalogsync/
   ├─ app/
   ├─ bin/
   ├─ config/
   ├─ data/
   ├─ inputs/
   └─ logs/

Responsibilities:

  • library
    • downloaded music files only
  • catalogsync/app
    • synced code, virtual environment, and application files
  • catalogsync/bin
    • target-side runtime scripts
  • catalogsync/config
    • machine-local configuration such as catalogsync.env
  • catalogsync/data
    • SQLite database
  • catalogsync/inputs
    • playlist files and other operator-provided inputs
  • catalogsync/logs
    • runtime logs

Configuration Model

Machine-Local Environment File

Each deployed machine should use a local config file (config/catalogsync.env under the application home), for example:

ROOT_DIR=/volume4/Music_Cloud
APP_HOME=/volume4/Music_Cloud/catalogsync
LIBRARY_DIR=/volume4/Music_Cloud/library

DB_PATH=/volume4/Music_Cloud/catalogsync/data/catalogsync.db
INPUT_DIR=/volume4/Music_Cloud/catalogsync/inputs
LOG_DIR=/volume4/Music_Cloud/catalogsync/logs

PYTHON_BIN=python3
VENV_DIR=/volume4/Music_Cloud/catalogsync/app/.venv

DOWNLOAD_LAYOUT=platform_first_artist

Configuration Rules

  • ROOT_DIR
    • optional convenience root for deployment layout
  • APP_HOME
    • runtime home for scripts, DB, logs, and inputs
  • LIBRARY_DIR
    • physical location of downloaded music files
    • may be different from ROOT_DIR
  • DB_PATH
    • defaults to <APP_HOME>/data/catalogsync.db
  • INPUT_DIR
    • defaults to <APP_HOME>/inputs
  • LOG_DIR
    • defaults to <APP_HOME>/logs
  • PYTHON_BIN
    • interpreter used by runtime scripts
  • VENV_DIR
    • target-side virtualenv path
  • DOWNLOAD_LAYOUT
    • first supported value: platform_first_artist

This keeps deployment portable:

  • copying to a new machine mainly requires updating catalogsync.env
  • moving the music library only requires updating LIBRARY_DIR
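
The defaulting rules above could be resolved roughly as in the following Python sketch. The function names parse_env_text and resolve_config are hypothetical, and treating APP_HOME and LIBRARY_DIR as the required variables is an assumption: the design only states which values are optional or have defaults. The parser handles plain KEY=VALUE lines only, not full shell syntax.

```python
from pathlib import PurePosixPath

# Assumption: these two variables have no documented default, so the sketch
# treats them as required.
REQUIRED_KEYS = ("APP_HOME", "LIBRARY_DIR")


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(values: dict) -> dict:
    """Fail fast on missing required values, then fill documented defaults."""
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise ValueError(
            "catalogsync.env is missing required variable(s): " + ", ".join(missing)
        )
    app_home = PurePosixPath(values["APP_HOME"])
    config = dict(values)
    # Defaults taken from the configuration rules in this design.
    config.setdefault("DB_PATH", str(app_home / "data" / "catalogsync.db"))
    config.setdefault("INPUT_DIR", str(app_home / "inputs"))
    config.setdefault("LOG_DIR", str(app_home / "logs"))
    return config
```

With this shape, a new machine only needs APP_HOME and LIBRARY_DIR in its env file; explicitly set values such as LOG_DIR still override the derived defaults.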

Download Path Design

Layout Rule

The first supported layout is:

<LIBRARY_DIR>/<platform>/<first_artist>/<filename>

Examples:

/volume4/Music_Cloud/library/netease/周杰伦/七里香.flac
/volume4/Music_Cloud/library/qq/林俊杰/江南.mp3

Artist Directory Rule

  • Use the first artist only
  • Do not create multi-artist directory names in the first version
  • If no artist is available, use a stable fallback such as Unknown Artist

This keeps paths shorter, more stable, and easier to reuse for upload.
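
A minimal sketch of the artist directory rule follows. It assumes the artist metadata arrives as a sequence of name strings, which this design does not specify, and the helper name first_artist_dir is hypothetical. Skipping blank entries before falling back is also an assumption.

```python
from typing import Iterable, Optional

# Fallback directory name suggested by the design for tracks without artist data.
FALLBACK_ARTIST = "Unknown Artist"


def first_artist_dir(artists: Optional[Iterable[str]]) -> str:
    """Return the directory name for the first usable artist.

    Only one artist is ever used, so multi-artist tracks do not produce
    combined directory names.
    """
    for name in artists or ():
        cleaned = (name or "").strip()
        if cleaned:
            return cleaned
    return FALLBACK_ARTIST
```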

Locator Rule

file_locations.locator must store a path relative to LIBRARY_DIR.

Examples:

netease/周杰伦/七里香.flac
qq/林俊杰/江南.mp3

This is important because the future upload phase will reuse the same relative path for:

  • cloud-drive locators
  • object-storage keys beneath a backend root prefix
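
The relationship between the relative locator and the physical path can be sketched as below. The helper names are hypothetical, and the sketch assumes the platform, artist directory, and filename have already been sanitized into single path components.

```python
from pathlib import PurePosixPath


def build_locator(platform: str, artist_dir: str, filename: str) -> str:
    """Build the value stored in file_locations.locator (relative to LIBRARY_DIR)."""
    return str(PurePosixPath(platform) / artist_dir / filename)


def local_path(library_dir: str, locator: str) -> str:
    """Resolve a locator to the physical file path on this machine."""
    return str(PurePosixPath(library_dir) / locator)


def locator_from_path(library_dir: str, absolute_path: str) -> str:
    """Recover the locator from an absolute path.

    Raises ValueError if the path is not inside library_dir.
    """
    return str(PurePosixPath(absolute_path).relative_to(PurePosixPath(library_dir)))
```

Because only the relative part is stored, moving the library to another LIBRARY_DIR changes the result of local_path without touching any locator in the database.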

Directory Creation Behavior

When bootstrapping or first running on a machine, the system should auto-create any missing directories with mkdir -p semantics.

Required directories:

  • <ROOT_DIR> (only when ROOT_DIR is set; it is optional)
  • <LIBRARY_DIR>
  • <APP_HOME>
  • <APP_HOME>/app
  • <APP_HOME>/bin
  • <APP_HOME>/config
  • <APP_HOME>/data
  • <APP_HOME>/inputs
  • <APP_HOME>/logs

Rules:

  • existing directories are reused without error
  • missing directories are created automatically
  • permission failures should produce a clear fatal error
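
The creation rules above map closely onto Path.mkdir with parents=True and exist_ok=True, which behaves like mkdir -p. The following is a sketch only; the function names are hypothetical, and exiting with SystemExit is one assumed way to surface a fatal error.

```python
from pathlib import Path
from typing import Iterable, List

# Subdirectories of APP_HOME listed in this design.
APP_SUBDIRS = ("app", "bin", "config", "data", "inputs", "logs")


def required_directories(root_dir, library_dir, app_home) -> List[Path]:
    """List every directory the bootstrap or runtime step must guarantee."""
    home = Path(app_home)
    return [Path(root_dir), Path(library_dir), home] + [home / s for s in APP_SUBDIRS]


def ensure_directories(directories: Iterable[Path]) -> None:
    """Create missing directories; reuse existing ones without error."""
    for directory in directories:
        try:
            directory.mkdir(parents=True, exist_ok=True)
        except OSError as exc:
            # Covers permission failures and a regular file occupying the path.
            raise SystemExit(f"fatal: cannot create directory {directory}: {exc}") from exc
```

Calling ensure_directories twice on the same list is safe, which lets both the bootstrap script and every runtime script run it unconditionally.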

Script Model

Repository-Side Bootstrap Scripts

The repository should contain deployment/bootstrap scripts that:

  • connect to the target machine
  • create the target directory layout
  • copy application files
  • create or refresh the runtime scripts
  • create a config template if missing

These scripts must not hardcode a single target path. Any path built into a script must be a default that the operator can override.

Target-Side Runtime Scripts

After bootstrap, the target machine should contain reusable runtime scripts under:

<APP_HOME>/bin

Initial examples:

  • download_all.sh
  • download_from_file.sh

Each runtime script should:

  • load config/catalogsync.env
  • ensure the required directories exist
  • use DB_PATH, INPUT_DIR, LOG_DIR, and LIBRARY_DIR
  • write logs to the configured log directory

CLI And Application Semantics

The current code uses --library-root as the download root. This design prefers moving toward a configuration-driven deployment model where:

  • runtime scripts supply the configured paths
  • the application writes downloads into LIBRARY_DIR
  • the DB lives under APP_HOME/data

The implementation may either:

  • keep --library-root internally for compatibility while runtime scripts pass LIBRARY_DIR
  • or introduce a cleaner root/app configuration layer as long as behavior stays aligned with this design

The important requirement is behavioral, not the exact CLI spelling:

  • scripts and runtime state must stay separated from music files
  • downloaded file locations must follow platform/first_artist/filename

Error Handling

  • Missing config file:
    • fail fast with a clear message pointing to catalogsync.env
  • Missing required env values:
    • fail fast with a clear message naming the missing variable
  • Missing artist data:
    • use fallback artist directory and continue
  • Invalid filename/path characters:
    • sanitize to a filesystem-safe name
  • Existing file in the destination path:
    • preserve current dedupe behavior through DB state and active local file records
  • Directory creation failure:
    • fail fast with an actionable error
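
The sanitization rule above could look like the following sketch. The set of replaced characters is an assumption (path separators, characters rejected by Windows and SMB shares, and control characters), and the helper name and the "untitled" fallback are hypothetical; the design only requires a filesystem-safe result.

```python
import re

# Assumed unsafe set: path separators, Windows/SMB-reserved characters,
# and ASCII control characters.
_UNSAFE_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def sanitize_component(name: str, fallback: str = "untitled") -> str:
    """Turn an artist name or filename into a single safe path component."""
    cleaned = _UNSAFE_CHARS.sub("_", name)
    # Leading/trailing dots and spaces are stripped so "." and ".." cannot
    # survive as path components.
    cleaned = cleaned.strip(" .")
    return cleaned or fallback
```

Non-ASCII names such as 周杰伦 pass through unchanged, so the example paths in this design are unaffected.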

Testing

Add or update coverage for:

  • path-building helper for platform/first_artist/filename
  • first-artist extraction behavior
  • artist fallback behavior
  • locator values remaining relative to LIBRARY_DIR
  • directory auto-creation for deployment/runtime helpers
  • runtime config loading from catalogsync.env
  • download flow recording the new relative locator format in file_locations

Acceptance Criteria

  • Downloads are stored under <LIBRARY_DIR>/<platform>/<first_artist>/<filename>
  • file_locations.locator stores the path relative to LIBRARY_DIR
  • Application/runtime files are separate from music files
  • A deployment can be copied to another Linux machine by adjusting catalogsync.env
  • Bootstrap/runtime behavior auto-creates the expected directory structure
  • Existing download logic still records local files into the catalog database
  • The resulting local relative paths are suitable for reuse by the later upload implementation