# Download Layout And NAS Deployment Design

## Goal

Refine the current `musicdl.catalogsync` download flow so it can be deployed cleanly onto a NAS or any other Linux machine with:

- a portable script layout
- a machine-local `.env` configuration file
- a dedicated music library root separate from scripts and runtime state
- a download directory structure of `platform/first_artist/filename`
- path semantics that can be reused later by the upload workflow

This design intentionally focuses on download and deployment only. Upload automation is deferred to the next sub-project.

## Scope

### In Scope

- Introduce a portable deployment layout for NAS and other Linux targets
- Separate application/runtime files from downloaded music files
- Standardize local download paths as:
  - `<LIBRARY_DIR>/<platform>/<first_artist>/<filename>`
- Preserve relative path semantics in `file_locations.locator`
- Add machine-local configuration through `config/catalogsync.env`
- Add bootstrap and runtime script conventions suitable for copying to other machines
- Keep database and runtime files under the application home instead of the music library root
- Ensure required directories are auto-created when bootstrapping or running

### Out of Scope

- 123 cloud upload implementation
- Object storage upload implementation
- Concurrent download
- Concurrent upload
- Cross-platform song canonicalization
- GUI integration
- Deletion or migration of existing remote file locations

## Constraints

- Reuse the existing `musicdl.catalogsync` package and CLI as much as possible
- Keep the deployment scripts portable so they can be copied to another Linux machine
- Do not hardcode NAS-only paths inside the application logic
- Store machine-specific paths in configuration, not in source code
- Keep `file_locations.locator` stable so the future upload phase can reuse the same relative paths

## Deployment Model

### Local Repo Versus Target Machine

There are two kinds of scripts:

1. Bootstrap/deployment scripts that live in the repository and are run from the operator machine
2. Runtime scripts that are copied onto the target machine and used there repeatedly

This avoids the circular problem of requiring a target-side script before the target-side directories exist.

### Target Directory Layout

Recommended target layout:

```text
/volume4/Music_Cloud/
├─ library/
└─ catalogsync/
   ├─ app/
   ├─ bin/
   ├─ config/
   ├─ data/
   ├─ inputs/
   └─ logs/
```

Responsibilities:

- `library`
  - downloaded music files only
- `catalogsync/app`
  - synced code, virtual environment, and application files
- `catalogsync/bin`
  - target-side runtime scripts
- `catalogsync/config`
  - machine-local configuration such as `catalogsync.env`
- `catalogsync/data`
  - SQLite database
- `catalogsync/inputs`
  - playlist files and other operator-provided inputs
- `catalogsync/logs`
  - runtime logs

## Configuration Model

### Machine-Local Environment File

Each deployed machine should use a local config file:

```bash
ROOT_DIR=/volume4/Music_Cloud
APP_HOME=/volume4/Music_Cloud/catalogsync
LIBRARY_DIR=/volume4/Music_Cloud/library

DB_PATH=/volume4/Music_Cloud/catalogsync/data/catalogsync.db
INPUT_DIR=/volume4/Music_Cloud/catalogsync/inputs
LOG_DIR=/volume4/Music_Cloud/catalogsync/logs

PYTHON_BIN=python3
VENV_DIR=/volume4/Music_Cloud/catalogsync/app/.venv

DOWNLOAD_LAYOUT=platform_first_artist
```

### Configuration Rules

- `ROOT_DIR`
  - optional convenience root for deployment layout
- `APP_HOME`
  - runtime home for scripts, DB, logs, and inputs
- `LIBRARY_DIR`
  - physical location of downloaded music files
  - may be different from `ROOT_DIR`
- `DB_PATH`
  - defaults to `<APP_HOME>/data/catalogsync.db`
- `INPUT_DIR`
  - defaults to `<APP_HOME>/inputs`
- `LOG_DIR`
  - defaults to `<APP_HOME>/logs`
- `PYTHON_BIN`
  - interpreter used by runtime scripts
- `VENV_DIR`
  - target-side virtualenv path
- `DOWNLOAD_LAYOUT`
  - first supported value: `platform_first_artist`

This keeps deployment portable:

- copying to a new machine mainly requires updating `catalogsync.env`
- moving the music library only requires updating `LIBRARY_DIR`

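
The configuration rules above could be applied by a small loader along these lines. This is a minimal sketch, not the package's actual code: the function names (`parse_env_text`, `resolve_config`, `load_config`) are hypothetical, and treating `APP_HOME` and `LIBRARY_DIR` as the required keys is an assumption, since the design does not say which variables are mandatory.

```python
from pathlib import Path

# Assumption: the design does not list the mandatory keys; this sketch
# requires only the two that every other path depends on.
REQUIRED_KEYS = ("APP_HOME", "LIBRARY_DIR")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(values: dict[str, str]) -> dict[str, str]:
    """Fail fast on missing required keys, then fill the documented defaults."""
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise SystemExit(f"catalogsync.env is missing: {', '.join(missing)}")
    app_home = Path(values["APP_HOME"])
    defaults = {
        "DB_PATH": app_home / "data" / "catalogsync.db",
        "INPUT_DIR": app_home / "inputs",
        "LOG_DIR": app_home / "logs",
    }
    config = dict(values)
    for key, default in defaults.items():
        if not config.get(key):  # unset or empty falls back to the default
            config[key] = str(default)
    return config


def load_config(env_path: Path) -> dict[str, str]:
    """Read catalogsync.env from disk and resolve it."""
    if not env_path.is_file():
        raise SystemExit(f"Config file not found: {env_path} (expected catalogsync.env)")
    return resolve_config(parse_env_text(env_path.read_text(encoding="utf-8")))
```

With this shape, a file that sets only `APP_HOME` and `LIBRARY_DIR` still yields usable `DB_PATH`, `INPUT_DIR`, and `LOG_DIR` values derived from `APP_HOME`.
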
## Download Path Design

### Layout Rule

The first supported layout is:

```text
<LIBRARY_DIR>/<platform>/<first_artist>/<filename>
```

Examples:

```text
/volume4/Music_Cloud/library/netease/周杰伦/七里香.flac
/volume4/Music_Cloud/library/qq/林俊杰/江南.mp3
```

### Artist Directory Rule

- Use the first artist only
- Do not create multi-artist directory names in the first version
- If no artist is available, use a stable fallback such as `Unknown Artist`

This keeps paths shorter, more stable, and easier to reuse for upload.

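
The layout and artist rules above can be sketched as a small path-building helper. The names `first_artist` and `build_relative_path` are hypothetical, and the sketch assumes the artists are already available as a list of strings; how the catalog actually stores artist data is not specified here.

```python
from pathlib import PurePosixPath

# Fallback directory name suggested by the artist rule above.
FALLBACK_ARTIST = "Unknown Artist"


def first_artist(artists: list[str]) -> str:
    """Return the first non-empty artist name, or the stable fallback."""
    for name in artists:
        cleaned = name.strip()
        if cleaned:
            return cleaned
    return FALLBACK_ARTIST


def build_relative_path(platform: str, artists: list[str], filename: str) -> PurePosixPath:
    """Build <platform>/<first_artist>/<filename>, relative to LIBRARY_DIR."""
    return PurePosixPath(platform) / first_artist(artists) / filename
```

The helper returns a relative path on purpose: joining it onto `LIBRARY_DIR` gives the download destination, while the relative form is what the locator needs. Sanitizing the individual components is treated as a separate step.
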
### Locator Rule

`file_locations.locator` must store a path relative to `LIBRARY_DIR`.

Examples:

```text
netease/周杰伦/七里香.flac
qq/林俊杰/江南.mp3
```

This is important because the future upload phase will reuse the same relative path for:

- cloud-drive locators
- object-storage keys beneath a backend root prefix

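
As an illustration of the locator rule, the sketch below derives a locator from an absolute download path and then reuses it as an object-storage key. Both function names and the `root_prefix` parameter are hypothetical; the upload phase is out of scope, so the key format is only an assumption about how "beneath a backend root prefix" might look.

```python
from pathlib import PurePosixPath


def to_locator(library_dir: str, absolute_path: str) -> str:
    """Return the path relative to LIBRARY_DIR.

    Raises ValueError if the file does not live under library_dir.
    """
    return PurePosixPath(absolute_path).relative_to(PurePosixPath(library_dir)).as_posix()


def to_object_key(root_prefix: str, locator: str) -> str:
    """Place a locator beneath a backend root prefix (hypothetical key format)."""
    prefix = root_prefix.strip("/")
    return f"{prefix}/{locator}" if prefix else locator
```

Because the locator never contains `LIBRARY_DIR`, moving the library to another mount point changes only the configuration, not the stored rows.
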
## Directory Creation Behavior

When bootstrapping or first running on a machine, the system should auto-create any missing directories with `mkdir -p` semantics.

Required directories:

- `<ROOT_DIR>`
- `<LIBRARY_DIR>`
- `<APP_HOME>`
- `<APP_HOME>/app`
- `<APP_HOME>/bin`
- `<APP_HOME>/config`
- `<APP_HOME>/data`
- `<APP_HOME>/inputs`
- `<APP_HOME>/logs`

Rules:

- existing directories are reused without error
- missing directories are created automatically
- permission failures should produce a clear fatal error

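
One way to get `mkdir -p` semantics from Python is `Path.mkdir(parents=True, exist_ok=True)`. The sketch below is illustrative only: `required_directories` and `ensure_directories` are hypothetical helper names, and exiting via `SystemExit` is one possible way to surface a fatal error.

```python
from pathlib import Path

# Sub-directories of APP_HOME listed in the required directories above.
APP_SUBDIRS = ("app", "bin", "config", "data", "inputs", "logs")


def required_directories(root_dir, library_dir, app_home) -> list[Path]:
    """List every directory the deployment expects to exist."""
    home = Path(app_home)
    return [Path(root_dir), Path(library_dir), home, *(home / name for name in APP_SUBDIRS)]


def ensure_directories(directories: list[Path]) -> None:
    """Create missing directories; reuse existing ones without error."""
    for directory in directories:
        try:
            directory.mkdir(parents=True, exist_ok=True)
        except OSError as exc:
            # Covers permission failures and paths blocked by an existing file.
            raise SystemExit(f"Cannot create directory {directory}: {exc}") from exc
```

Calling `ensure_directories` twice in a row is safe, which lets both the bootstrap step and every runtime script run it unconditionally.
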
## Script Model

### Repository-Side Bootstrap Scripts

The repository should contain deployment/bootstrap scripts that:

- connect to the target machine
- create the target directory layout
- copy application files
- create or refresh the runtime scripts
- create a config template if missing

These scripts must not hardcode a single target path internally beyond defaults that can be overridden.

### Target-Side Runtime Scripts

After bootstrap, the target machine should contain reusable runtime scripts under:

```text
<APP_HOME>/bin
```

Initial examples:

- `download_all.sh`
- `download_from_file.sh`

Each runtime script should:

- load `config/catalogsync.env`
- ensure the required directories exist
- use `DB_PATH`, `INPUT_DIR`, `LOG_DIR`, and `LIBRARY_DIR`
- write logs to the configured log directory

## CLI And Application Semantics

The current code uses `--library-root` as the download root. This design prefers moving toward a configuration-driven deployment model where:

- runtime scripts supply the configured paths
- the application writes downloads into `LIBRARY_DIR`
- the DB lives under `APP_HOME/data`

The implementation may either:

- keep `--library-root` internally for compatibility while runtime scripts pass `LIBRARY_DIR`
- or introduce a cleaner root/app configuration layer as long as behavior stays aligned with this design

The important requirement is behavioral, not the exact CLI spelling:

- scripts and runtime state must stay separated from music files
- downloaded file locations must follow `platform/first_artist/filename`

## Error Handling

- Missing config file:
  - fail fast with a clear message pointing to `catalogsync.env`
- Missing required env values:
  - fail fast with a clear message naming the missing variable
- Missing artist data:
  - use fallback artist directory and continue
- Invalid filename/path characters:
  - sanitize to a filesystem-safe name
- Existing file in the destination path:
  - preserve current dedupe behavior through DB state and active local file records
- Directory creation failure:
  - fail fast with an actionable error

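
For the "sanitize to a filesystem-safe name" case, a sketch of one possible approach follows. The design does not define the sanitization policy, so everything here is an assumption: the helper name `sanitize_component`, the set of rejected characters, the `_` replacement, and the `untitled` fallback.

```python
import re

# Assumed policy: replace characters rejected by common filesystems
# (including Windows/SMB shares) and ASCII control characters.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_component(name: str, fallback: str = "untitled") -> str:
    """Make a single path component (artist or filename) filesystem-safe."""
    # Replace unsafe characters, then drop leading/trailing spaces and dots
    # so values such as ".." cannot escape the target directory.
    cleaned = _UNSAFE.sub("_", name).strip(" .")
    return cleaned or fallback
```

Applied per component rather than to the whole path, this keeps the `/` separators of `platform/first_artist/filename` intact while still neutralizing a `/` inside an artist name such as `AC/DC`.
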
## Testing

Add or update coverage for:

- path-building helper for `platform/first_artist/filename`
- first-artist extraction behavior
- artist fallback behavior
- locator values remaining relative to `LIBRARY_DIR`
- directory auto-creation for deployment/runtime helpers
- runtime config loading from `catalogsync.env`
- download flow recording the new relative locator format in `file_locations`

## Acceptance Criteria

- Downloads are stored under `<LIBRARY_DIR>/<platform>/<first_artist>/<filename>`
- `file_locations.locator` stores the path relative to `LIBRARY_DIR`
- Application/runtime files are separate from music files
- A deployment can be copied to another Linux machine by adjusting `catalogsync.env`
- Bootstrap/runtime behavior auto-creates the expected directory structure
- Existing download logic still records local files into the catalog database
- The resulting local relative paths are suitable for reuse by the later upload implementation