# Download Layout And NAS Deployment Design

## Goal

Refine the current `musicdl.catalogsync` download flow so it can be deployed cleanly onto a NAS or any other Linux machine with:

- a portable script layout
- a machine-local `.env` configuration file
- a dedicated music library root separate from scripts and runtime state
- a download directory structure of `platform/first_artist/filename`
- path semantics that can be reused later by the upload workflow

This design intentionally focuses on download and deployment only. Upload automation is deferred to the next sub-project.

## Scope

### In Scope

- Introduce a portable deployment layout for NAS and other Linux targets
- Separate application/runtime files from downloaded music files
- Standardize local download paths as:
  - `<LIBRARY_DIR>/<platform>/<first_artist>/<filename>`
- Preserve relative path semantics in `file_locations.locator`
- Add machine-local configuration through `config/catalogsync.env`
- Add bootstrap and runtime script conventions suitable for copying to other machines
- Keep database and runtime files under the application home instead of the music library root
- Ensure required directories are auto-created when bootstrapping or running

### Out of Scope

- 123 cloud upload implementation
- Object storage upload implementation
- Concurrent download
- Concurrent upload
- Cross-platform song canonicalization
- GUI integration
- Deletion or migration of existing remote file locations

## Constraints

- Reuse the existing `musicdl.catalogsync` package and CLI as much as possible
- Keep the deployment scripts portable so they can be copied to another Linux machine
- Do not hardcode NAS-only paths inside the application logic
- Store machine-specific paths in configuration, not in source code
- Keep `file_locations.locator` stable so the future upload phase can reuse the same relative paths

## Deployment Model

### Local Repo Versus Target Machine

There are two kinds of scripts:

1. Bootstrap/deployment scripts that live in the repository and are run from the operator machine
2. Runtime scripts that are copied onto the target machine and used there repeatedly

This avoids the circular problem of requiring a target-side script before the target-side directories exist.

### Target Directory Layout

Recommended target layout:

```text
/volume4/Music_Cloud/
├─ library/
└─ catalogsync/
   ├─ app/
   ├─ bin/
   ├─ config/
   ├─ data/
   ├─ inputs/
   └─ logs/
```

Responsibilities:

- `library`
  - downloaded music files only
- `catalogsync/app`
  - synced code, virtual environment, and application files
- `catalogsync/bin`
  - target-side runtime scripts
- `catalogsync/config`
  - machine-local configuration such as `catalogsync.env`
- `catalogsync/data`
  - SQLite database
- `catalogsync/inputs`
  - playlist files and other operator-provided inputs
- `catalogsync/logs`
  - runtime logs

## Configuration Model

### Machine-Local Environment File

Each deployed machine should use a local config file:

```bash
ROOT_DIR=/volume4/Music_Cloud
APP_HOME=/volume4/Music_Cloud/catalogsync
LIBRARY_DIR=/volume4/Music_Cloud/library
DB_PATH=/volume4/Music_Cloud/catalogsync/data/catalogsync.db
INPUT_DIR=/volume4/Music_Cloud/catalogsync/inputs
LOG_DIR=/volume4/Music_Cloud/catalogsync/logs
PYTHON_BIN=python3
VENV_DIR=/volume4/Music_Cloud/catalogsync/app/.venv
DOWNLOAD_LAYOUT=platform_first_artist
```

### Configuration Rules

- `ROOT_DIR`
  - optional convenience root for deployment layout
- `APP_HOME`
  - runtime home for scripts, DB, logs, and inputs
- `LIBRARY_DIR`
  - physical location of downloaded music files
  - may be different from `ROOT_DIR`
- `DB_PATH`
  - defaults to `<APP_HOME>/data/catalogsync.db`
- `INPUT_DIR`
  - defaults to `<APP_HOME>/inputs`
- `LOG_DIR`
  - defaults to `<APP_HOME>/logs`
- `PYTHON_BIN`
  - interpreter used by runtime scripts
- `VENV_DIR`
  - target-side virtualenv path
- `DOWNLOAD_LAYOUT`
  - first supported value: `platform_first_artist`

This keeps deployment portable:

- copying to a new machine mainly requires updating
`catalogsync.env`
- moving the music library only requires updating `LIBRARY_DIR`

## Download Path Design

### Layout Rule

The first supported layout is:

```text
<LIBRARY_DIR>/<platform>/<first_artist>/<filename>
```

Examples:

```text
/volume4/Music_Cloud/library/netease/周杰伦/七里香.flac
/volume4/Music_Cloud/library/qq/林俊杰/江南.mp3
```

### Artist Directory Rule

- Use the first artist only
- Do not create multi-artist directory names in the first version
- If no artist is available, use a stable fallback such as `Unknown Artist`

This keeps paths shorter, more stable, and easier to reuse for upload.

### Locator Rule

`file_locations.locator` must store a path relative to `LIBRARY_DIR`.

Examples:

```text
netease/周杰伦/七里香.flac
qq/林俊杰/江南.mp3
```

This is important because the future upload phase will reuse the same relative path for:

- cloud-drive locators
- object-storage keys beneath a backend root prefix

## Directory Creation Behavior

When bootstrapping or first running on a machine, the system should auto-create any missing directories with `mkdir -p` semantics.

Required directories:

- `<ROOT_DIR>`
- `<APP_HOME>`
- `<LIBRARY_DIR>`
- `<APP_HOME>/app`
- `<APP_HOME>/bin`
- `<APP_HOME>/config`
- `<APP_HOME>/data`
- `<APP_HOME>/inputs`
- `<APP_HOME>/logs`

Rules:

- existing directories are reused without error
- missing directories are created automatically
- permission failures should produce a clear fatal error

## Script Model

### Repository-Side Bootstrap Scripts

The repository should contain deployment/bootstrap scripts that:

- connect to the target machine
- create the target directory layout
- copy application files
- create or refresh the runtime scripts
- create a config template if missing

These scripts must not hardcode a single target path internally beyond defaults that can be overridden.

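
The directory creation rules above, and the bootstrap step that creates the target layout, could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `ensure_layout` and `APP_SUBDIRS` are hypothetical names that do not exist in `musicdl.catalogsync`, the optional `ROOT_DIR` is left out, and a real bootstrap script may well do the same thing in shell with `mkdir -p`.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical constant: subdirectories of the application home,
# taken from the target directory layout in this design.
APP_SUBDIRS = ("app", "bin", "config", "data", "inputs", "logs")


def ensure_layout(app_home: Path, library_dir: Path) -> list[Path]:
    """Create the library root and application directories if missing.

    Hypothetical helper, not part of the existing package.
    """
    targets = [library_dir, app_home, *(app_home / name for name in APP_SUBDIRS)]
    for target in targets:
        try:
            # parents=True plus exist_ok=True matches `mkdir -p`:
            # existing directories are reused without error.
            target.mkdir(parents=True, exist_ok=True)
        except OSError as exc:
            # Covers permission failures and a regular file occupying the path.
            raise SystemExit(f"cannot create directory {target}: {exc}") from exc
    return targets
```

Calling the helper twice is safe, so both the bootstrap scripts and each runtime script could invoke it unconditionally before doing any work.
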
### Target-Side Runtime Scripts

After bootstrap, the target machine should contain reusable runtime scripts under:

```text
<APP_HOME>/bin
```

Initial examples:

- `download_all.sh`
- `download_from_file.sh`

Each runtime script should:

- load `config/catalogsync.env`
- ensure the required directories exist
- use `DB_PATH`, `INPUT_DIR`, `LOG_DIR`, and `LIBRARY_DIR`
- write logs to the configured log directory

## CLI And Application Semantics

The current code uses `--library-root` as the download root. This design prefers moving toward a configuration-driven deployment model where:

- runtime scripts supply the configured paths
- the application writes downloads into `LIBRARY_DIR`
- the DB lives under `APP_HOME/data`

The implementation may either:

- keep `--library-root` internally for compatibility while runtime scripts pass `LIBRARY_DIR`
- or introduce a cleaner root/app configuration layer as long as behavior stays aligned with this design

The important requirement is behavioral, not the exact CLI spelling:

- scripts and runtime state must stay separated from music files
- downloaded file locations must follow `platform/first_artist/filename`

## Error Handling

- Missing config file:
  - fail fast with a clear message pointing to `catalogsync.env`
- Missing required env values:
  - fail fast with a clear message naming the missing variable
- Missing artist data:
  - use fallback artist directory and continue
- Invalid filename/path characters:
  - sanitize to a filesystem-safe name
- Existing file in the destination path:
  - preserve current dedupe behavior through DB state and active local file records
- Directory creation failure:
  - fail fast with an actionable error

## Testing

Add or update coverage for:

- path-building helper for `platform/first_artist/filename`
- first-artist extraction behavior
- artist fallback behavior
- locator values remaining relative to `LIBRARY_DIR`
- directory auto-creation for deployment/runtime helpers
- runtime config loading from `catalogsync.env`
- download flow recording the new relative locator format in `file_locations`

## Acceptance Criteria

- Downloads are stored under `<LIBRARY_DIR>/<platform>/<first_artist>/<filename>`
- `file_locations.locator` stores the path relative to `LIBRARY_DIR`
- Application/runtime files are separate from music files
- A deployment can be copied to another Linux machine by adjusting `catalogsync.env`
- Bootstrap/runtime behavior auto-creates the expected directory structure
- Existing download logic still records local files into the catalog database
- The resulting local relative paths are suitable for reuse by the later upload implementation

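
As an illustration of the path and locator criteria, a path-building helper might look like the sketch below. All names here (`build_locator`, `resolve_local_path`, `sanitize_component`, `FALLBACK_ARTIST`) are hypothetical and not taken from `musicdl.catalogsync`. The set of characters treated as unsafe and the representation of artists as a list of strings are assumptions made for the example; the existing sanitization behavior may differ.

```python
from __future__ import annotations

import re
from pathlib import Path, PurePosixPath

# Fallback directory name suggested by the artist directory rule.
FALLBACK_ARTIST = "Unknown Artist"

# Assumption: characters replaced inside a single path component.
_UNSAFE = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def sanitize_component(name: str) -> str:
    """Replace unsafe characters and trim surrounding spaces and dots."""
    return _UNSAFE.sub("_", name).strip(" .")


def build_locator(platform: str, artists: list[str], filename: str) -> str:
    """Return the locator relative to LIBRARY_DIR: platform/first_artist/filename."""
    # Use the first non-blank artist only; later artists are ignored.
    first = next((a for a in artists if a.strip()), "")
    artist = sanitize_component(first) or FALLBACK_ARTIST
    # PurePosixPath keeps "/" as the separator regardless of host OS,
    # so the same string can later serve as an upload key.
    return str(
        PurePosixPath(
            sanitize_component(platform), artist, sanitize_component(filename)
        )
    )


def resolve_local_path(library_dir: Path, locator: str) -> Path:
    """Join the machine-local library root with a stored relative locator."""
    return library_dir / locator
```

Keeping the locator as a plain relative string means that moving the library only changes the `library_dir` argument, while the values stored in `file_locations.locator` stay untouched.
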