Download Layout And NAS Deployment Design
Goal
Refine the current musicdl.catalogsync download flow so it can be deployed cleanly onto a NAS or any other Linux machine with:
- a portable script layout
- a machine-local .env configuration file
- a dedicated music library root separate from scripts and runtime state
- a download directory structure of platform/first_artist/filename
- path semantics that can be reused later by the upload workflow
This design intentionally focuses on download and deployment only. Upload automation is deferred to the next sub-project.
Scope
In Scope
- Introduce a portable deployment layout for NAS and other Linux targets
- Separate application/runtime files from downloaded music files
- Standardize local download paths as:
<LIBRARY_DIR>/<platform>/<first_artist>/<filename>
- Preserve relative path semantics in file_locations.locator
- Add machine-local configuration through config/catalogsync.env
- Add bootstrap and runtime script conventions suitable for copying to other machines
- Keep database and runtime files under the application home instead of the music library root
- Ensure required directories are auto-created when bootstrapping or running
Out of Scope
- 123 cloud upload implementation
- Object storage upload implementation
- Concurrent download
- Concurrent upload
- Cross-platform song canonicalization
- GUI integration
- Deletion or migration of existing remote file locations
Constraints
- Reuse the existing musicdl.catalogsync package and CLI as much as possible
- Keep the deployment scripts portable so they can be copied to another Linux machine
- Do not hardcode NAS-only paths inside the application logic
- Store machine-specific paths in configuration, not in source code
- Keep file_locations.locator stable so the future upload phase can reuse the same relative paths
Deployment Model
Local Repo Versus Target Machine
There are two kinds of scripts:
- Bootstrap/deployment scripts that live in the repository and are run from the operator machine
- Runtime scripts that are copied onto the target machine and used there repeatedly
This avoids the circular problem of requiring a target-side script before the target-side directories exist.
Target Directory Layout
Recommended target layout:
/volume4/Music_Cloud/
├─ library/
└─ catalogsync/
   ├─ app/
   ├─ bin/
   ├─ config/
   ├─ data/
   ├─ inputs/
   └─ logs/
Responsibilities:
- library: downloaded music files only
- catalogsync/app: synced code, virtual environment, and application files
- catalogsync/bin: target-side runtime scripts
- catalogsync/config: machine-local configuration such as catalogsync.env
- catalogsync/data: SQLite database
- catalogsync/inputs: playlist files and other operator-provided inputs
- catalogsync/logs: runtime logs
Configuration Model
Machine-Local Environment File
Each deployed machine should use a local config file:
ROOT_DIR=/volume4/Music_Cloud
APP_HOME=/volume4/Music_Cloud/catalogsync
LIBRARY_DIR=/volume4/Music_Cloud/library
DB_PATH=/volume4/Music_Cloud/catalogsync/data/catalogsync.db
INPUT_DIR=/volume4/Music_Cloud/catalogsync/inputs
LOG_DIR=/volume4/Music_Cloud/catalogsync/logs
PYTHON_BIN=python3
VENV_DIR=/volume4/Music_Cloud/catalogsync/app/.venv
DOWNLOAD_LAYOUT=platform_first_artist
Configuration Rules
- ROOT_DIR: optional convenience root for deployment layout
- APP_HOME: runtime home for scripts, DB, logs, and inputs
- LIBRARY_DIR: physical location of downloaded music files; may be different from ROOT_DIR
- DB_PATH: defaults to <APP_HOME>/data/catalogsync.db
- INPUT_DIR: defaults to <APP_HOME>/inputs
- LOG_DIR: defaults to <APP_HOME>/logs
- PYTHON_BIN: interpreter used by runtime scripts
- VENV_DIR: target-side virtualenv path
- DOWNLOAD_LAYOUT: first supported value is platform_first_artist
This keeps deployment portable:
- copying to a new machine mainly requires updating catalogsync.env
- moving the music library only requires updating LIBRARY_DIR
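
The defaulting rules above can be sketched in Python as follows. This is a minimal illustration, not the package's actual loader: the function names parse_env_text and resolve_config are hypothetical, and treating APP_HOME and LIBRARY_DIR as the only mandatory keys is an assumption drawn from the list of defaults.

```python
from pathlib import PurePosixPath
from typing import Dict

# Assumption: only these two keys are mandatory; everything else has a default.
REQUIRED_KEYS = ("APP_HOME", "LIBRARY_DIR")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(values: Dict[str, str]) -> Dict[str, str]:
    """Fill in the documented defaults and fail fast on missing required keys."""
    for key in REQUIRED_KEYS:
        if not values.get(key):
            raise SystemExit(f"catalogsync.env: missing required variable {key}")
    app_home = PurePosixPath(values["APP_HOME"])  # targets are Linux machines
    resolved = dict(values)
    resolved.setdefault("DB_PATH", str(app_home / "data" / "catalogsync.db"))
    resolved.setdefault("INPUT_DIR", str(app_home / "inputs"))
    resolved.setdefault("LOG_DIR", str(app_home / "logs"))
    resolved.setdefault("DOWNLOAD_LAYOUT", "platform_first_artist")
    return resolved
```

With this shape, a config file that sets only APP_HOME and LIBRARY_DIR still yields a complete set of paths, which matches the goal of editing as little as possible when copying to a new machine.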
Download Path Design
Layout Rule
The first supported layout is:
<LIBRARY_DIR>/<platform>/<first_artist>/<filename>
Examples:
/volume4/Music_Cloud/library/netease/周杰伦/七里香.flac
/volume4/Music_Cloud/library/qq/林俊杰/江南.mp3
Artist Directory Rule
- Use the first artist only
- Do not create multi-artist directory names in the first version
- If no artist is available, use a stable fallback such as Unknown Artist
This keeps paths shorter, more stable, and easier to reuse for upload.
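
A minimal sketch of the layout and artist rules, assuming the artist metadata arrives as a list of strings and that names have already been made filesystem-safe. The helper names first_artist and build_relative_path are hypothetical and do not refer to existing functions in musicdl.catalogsync.

```python
from pathlib import PurePosixPath

FALLBACK_ARTIST = "Unknown Artist"  # fallback directory named in the design


def first_artist(artists):
    """Return the first non-empty artist name, or the stable fallback."""
    for name in artists or []:
        name = name.strip()
        if name:
            return name
    return FALLBACK_ARTIST


def build_relative_path(platform, artists, filename):
    """Build <platform>/<first_artist>/<filename> relative to LIBRARY_DIR."""
    return PurePosixPath(platform) / first_artist(artists) / filename
```

For example, build_relative_path("netease", ["周杰伦", "费玉清"], "七里香.flac") ignores the second artist, and an empty artist list falls back to the Unknown Artist directory.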
Locator Rule
file_locations.locator must store a path relative to LIBRARY_DIR.
Examples:
netease/周杰伦/七里香.flac
qq/林俊杰/江南.mp3
This is important because the future upload phase will reuse the same relative path for:
- cloud-drive locators
- object-storage keys beneath a backend root prefix
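
The locator rule can be illustrated with a short sketch. The names to_locator and to_object_key are hypothetical, and the object-key shape (a backend root prefix joined to the locator with a slash) is an assumption about the future upload phase, not a decided format.

```python
from pathlib import PurePosixPath


def to_locator(library_dir, absolute_path):
    """Return the path relative to LIBRARY_DIR.

    Raises ValueError if absolute_path is not located under library_dir.
    """
    relative = PurePosixPath(absolute_path).relative_to(PurePosixPath(library_dir))
    return relative.as_posix()


def to_object_key(locator, root_prefix=""):
    """Assumed key shape for a later upload backend: <root_prefix>/<locator>."""
    prefix = root_prefix.strip("/")
    return f"{prefix}/{locator}" if prefix else locator
```

Because the locator never contains LIBRARY_DIR, moving the library to another disk changes only the configuration and leaves stored locators valid.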
Directory Creation Behavior
When bootstrapping or first running on a machine, the system should auto-create any missing directories with mkdir -p semantics.
Required directories:
- <ROOT_DIR>
- <LIBRARY_DIR>
- <APP_HOME>
- <APP_HOME>/app
- <APP_HOME>/bin
- <APP_HOME>/config
- <APP_HOME>/data
- <APP_HOME>/inputs
- <APP_HOME>/logs
Rules:
- existing directories are reused without error
- missing directories are created automatically
- permission failures should produce a clear fatal error
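
A minimal sketch of this behavior using pathlib, where mkdir(parents=True, exist_ok=True) gives mkdir -p semantics. The function names required_dirs and ensure_dirs are hypothetical, and the wording of the fatal message is only an example.

```python
from pathlib import Path

# Subdirectories of APP_HOME listed in the target layout.
SUBDIRS = ("app", "bin", "config", "data", "inputs", "logs")


def required_dirs(root_dir, library_dir, app_home):
    """List every directory the design expects to exist."""
    app = Path(app_home)
    return [Path(root_dir), Path(library_dir), app, *(app / sub for sub in SUBDIRS)]


def ensure_dirs(paths):
    """Create missing directories; existing ones are reused without error."""
    for path in paths:
        try:
            path.mkdir(parents=True, exist_ok=True)
        except OSError as exc:  # includes PermissionError
            raise SystemExit(f"fatal: cannot create directory {path}: {exc}") from exc
```

Calling ensure_dirs twice on the same list is safe, so both the bootstrap scripts and every runtime script can run it unconditionally.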
Script Model
Repository-Side Bootstrap Scripts
The repository should contain deployment/bootstrap scripts that:
- connect to the target machine
- create the target directory layout
- copy application files
- create or refresh the runtime scripts
- create a config template if missing
These scripts must not hardcode a single target path internally beyond defaults that can be overridden.
Target-Side Runtime Scripts
After bootstrap, the target machine should contain reusable runtime scripts under:
<APP_HOME>/bin
Initial examples:
- download_all.sh
- download_from_file.sh
Each runtime script should:
- load config/catalogsync.env
- ensure the required directories exist
- use DB_PATH, INPUT_DIR, LOG_DIR, and LIBRARY_DIR
- write logs to the configured log directory
CLI And Application Semantics
The current code uses --library-root as the download root. This design prefers moving toward a configuration-driven deployment model where:
- runtime scripts supply the configured paths
- the application writes downloads into LIBRARY_DIR
- the DB lives under APP_HOME/data
The implementation may either:
- keep --library-root internally for compatibility while runtime scripts pass LIBRARY_DIR
- or introduce a cleaner root/app configuration layer as long as behavior stays aligned with this design
The important requirement is behavioral, not the exact CLI spelling:
- scripts and runtime state must stay separated from music files
- downloaded file locations must follow platform/first_artist/filename
Error Handling
- Missing config file:
  - fail fast with a clear message pointing to catalogsync.env
- Missing required env values:
  - fail fast with a clear message naming the missing variable
- Missing artist data:
  - use fallback artist directory and continue
- Invalid filename/path characters:
  - sanitize to a filesystem-safe name
- Existing file in the destination path:
  - preserve current dedupe behavior through DB state and active local file records
- Directory creation failure:
  - fail fast with an actionable error
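
As one possible reading of the sanitization rule, the sketch below replaces characters that are unsafe on common filesystems with an underscore. The function name sanitize_component, the exact character set, and the "untitled" fallback are all assumptions; the design only requires that the result be filesystem-safe.

```python
import re

# Assumption: a conservative set covering path separators, characters that
# are reserved on Windows/SMB shares, and ASCII control characters.
_UNSAFE = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def sanitize_component(name, fallback="untitled"):
    """Make a single path component (artist or filename) filesystem-safe."""
    cleaned = _UNSAFE.sub("_", name).strip(" .")
    return cleaned or fallback
```

Sanitizing each component separately keeps an artist name such as AC/DC from introducing an extra directory level, while non-ASCII names such as 周杰伦 pass through unchanged.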
Testing
Add or update coverage for:
- path-building helper for platform/first_artist/filename
- first-artist extraction behavior
- artist fallback behavior
- locator values remaining relative to LIBRARY_DIR
- directory auto-creation for deployment/runtime helpers
- runtime config loading from catalogsync.env
- download flow recording the new relative locator format in file_locations
Acceptance Criteria
- Downloads are stored under <LIBRARY_DIR>/<platform>/<first_artist>/<filename>
- file_locations.locator stores the path relative to LIBRARY_DIR
- Application/runtime files are separate from music files
- A deployment can be copied to another Linux machine by adjusting catalogsync.env
- Bootstrap/runtime behavior auto-creates the expected directory structure
- Existing download logic still records local files into the catalog database
- The resulting local relative paths are suitable for reuse by the later upload implementation