# Download Layout And NAS Deployment Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Change `musicdl.catalogsync` downloads to land under `LIBRARY_DIR/<platform>/<first_artist>/...`, preserve relative locators for later upload reuse, and add portable NAS/Linux deployment scripts plus `.env`-driven runtime layout.

**Architecture:** Add a small runtime/layout helper module for path building, safe filename components, config defaults, and directory creation. Reuse the existing downloader and CLI, but route download destinations through the new path helper and add deploy/runtime scripts under `scripts/catalogsync` so target machines can be bootstrapped and then run from `catalogsync/bin` with `catalogsync.env`.

**Tech Stack:** Python stdlib (`pathlib`, `dataclasses`, `tempfile`, `re`), `click`, existing `musicdl.catalogsync` modules, PowerShell, POSIX shell, `unittest`

---

### Task 1: Add runtime/layout helper tests and implementation

**Files:**
- Create: `musicdl/catalogsync/runtime.py`
- Create: `tests/catalogsync/test_runtime.py`

- [ ] **Step 1: Write the failing runtime/layout tests**

```python
import tempfile
import unittest
from pathlib import Path


class RuntimeLayoutTests(unittest.TestCase):
    def test_runtime_config_builds_defaults_from_root_dir(self):
        from musicdl.catalogsync.runtime import CatalogSyncRuntimeConfig

        config = CatalogSyncRuntimeConfig.from_mapping(
            {
                "ROOT_DIR": "/volume4/Music_Cloud",
                "PYTHON_BIN": "python3",
            }
        )

        # from_mapping() resolves every path, so resolve the expected root too;
        # on Windows, resolve() prepends the current drive to "/volume4/...".
        root = Path("/volume4/Music_Cloud").resolve()
        self.assertEqual(root / "catalogsync", config.app_home)
        self.assertEqual(root / "library", config.library_dir)
        self.assertEqual(root / "catalogsync" / "data" / "catalogsync.db", config.db_path)
        self.assertEqual("platform_first_artist", config.download_layout)

    def test_runtime_config_ensure_directories_creates_expected_tree(self):
        from musicdl.catalogsync.runtime import CatalogSyncRuntimeConfig

        with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as tmpdir:
            root_dir = Path(tmpdir) / "Music_Cloud"
            config = CatalogSyncRuntimeConfig.from_mapping({"ROOT_DIR": str(root_dir)})

            config.ensure_directories()

            self.assertTrue((root_dir / "library").is_dir())
            self.assertTrue((root_dir / "catalogsync" / "app").is_dir())
            self.assertTrue((root_dir / "catalogsync" / "bin").is_dir())
            self.assertTrue((root_dir / "catalogsync" / "config").is_dir())
            self.assertTrue((root_dir / "catalogsync" / "data").is_dir())
            self.assertTrue((root_dir / "catalogsync" / "inputs").is_dir())
            self.assertTrue((root_dir / "catalogsync" / "logs").is_dir())

    def test_build_download_relative_dir_uses_platform_and_first_artist(self):
        from musicdl.catalogsync.runtime import build_download_relative_dir

        relative_dir = build_download_relative_dir(
            platform="qq",
            singers="Singer A / Singer B",
        )

        self.assertEqual(Path("qq") / "Singer A", relative_dir)

    def test_build_download_relative_dir_falls_back_to_unknown_artist(self):
        from musicdl.catalogsync.runtime import build_download_relative_dir

        relative_dir = build_download_relative_dir(
            platform="netease",
            singers="",
        )

        self.assertEqual(Path("netease") / "Unknown Artist", relative_dir)
```

- [ ] **Step 2: Run the focused runtime/layout tests to verify they fail**

Run: `python -m unittest tests.catalogsync.test_runtime -v`
Expected: the tests fail (unittest reports `ERROR`) with an import error for `musicdl.catalogsync.runtime` or for the missing helper functions

- [ ] **Step 3: Implement the minimal runtime/layout helper module**

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from pathlib import Path


INVALID_PATH_CHARS_RE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_path_component(value: str, fallback: str) -> str:
    cleaned = INVALID_PATH_CHARS_RE.sub("_", (value or "").strip()).rstrip(". ")
    return cleaned or fallback


def pick_first_artist_name(singers: str | None) -> str:
    for candidate in re.split(r"\s*(?:/|,|&|\|)\s*", singers or ""):
        if candidate.strip():
            return sanitize_path_component(candidate, "Unknown Artist")
    return "Unknown Artist"


def build_download_relative_dir(platform: str, singers: str | None) -> Path:
    return Path(sanitize_path_component(platform, "unknown")) / pick_first_artist_name(singers)


@dataclass(slots=True)
class CatalogSyncRuntimeConfig:
    root_dir: Path
    app_home: Path
    library_dir: Path
    db_path: Path
    input_dir: Path
    log_dir: Path
    python_bin: str
    venv_dir: Path
    download_layout: str

    @classmethod
    def from_mapping(cls, mapping: dict[str, str]) -> "CatalogSyncRuntimeConfig":
        root_dir = Path(mapping["ROOT_DIR"]).resolve()
        app_home = Path(mapping.get("APP_HOME", root_dir / "catalogsync")).resolve()
        library_dir = Path(mapping.get("LIBRARY_DIR", root_dir / "library")).resolve()
        return cls(
            root_dir=root_dir,
            app_home=app_home,
            library_dir=library_dir,
            db_path=Path(mapping.get("DB_PATH", app_home / "data" / "catalogsync.db")).resolve(),
            input_dir=Path(mapping.get("INPUT_DIR", app_home / "inputs")).resolve(),
            log_dir=Path(mapping.get("LOG_DIR", app_home / "logs")).resolve(),
            python_bin=mapping.get("PYTHON_BIN", "python3"),
            venv_dir=Path(mapping.get("VENV_DIR", app_home / "app" / ".venv")).resolve(),
            download_layout=mapping.get("DOWNLOAD_LAYOUT", "platform_first_artist"),
        )

    def ensure_directories(self) -> None:
        for path in (
            self.root_dir,
            self.library_dir,
            self.app_home / "app",
            self.app_home / "bin",
            self.app_home / "config",
            self.app_home / "data",
            self.app_home / "inputs",
            self.app_home / "logs",
        ):
            path.mkdir(parents=True, exist_ok=True)
```

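One behavior of the separator regex is worth knowing before relying on it: every `/`, `,`, `&`, or `|` is treated as an artist separator, so an artist name that itself contains one of these characters is truncated at that point. The sketch below copies the two helpers from the Step 3 module so that it runs standalone and prints a few edge cases. The sample artist strings are illustrative assumptions, not values taken from the catalog.

```python
import re

# Copied from the Step 3 module so this sketch runs without the package.
INVALID_PATH_CHARS_RE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_path_component(value, fallback):
    cleaned = INVALID_PATH_CHARS_RE.sub("_", (value or "").strip()).rstrip(". ")
    return cleaned or fallback


def pick_first_artist_name(singers):
    for candidate in re.split(r"\s*(?:/|,|&|\|)\s*", singers or ""):
        if candidate.strip():
            return sanitize_path_component(candidate, "Unknown Artist")
    return "Unknown Artist"


# Illustrative inputs (assumptions): a multi-artist string, a name that
# contains a separator, a name with characters invalid on Windows, and
# two empty cases.
samples = ["Singer A / Singer B", "AC/DC", "Artist: Live?", "   ", None]
for sample in samples:
    print(repr(sample), "->", repr(pick_first_artist_name(sample)))
```

If truncated names such as `AC` for `AC/DC` are not acceptable, the separator set would need to be narrowed or an allowlist added; the plan as written does not address this case.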
- [ ] **Step 4: Re-run the focused runtime/layout tests**

Run: `python -m unittest tests.catalogsync.test_runtime -v`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add musicdl/catalogsync/runtime.py tests/catalogsync/test_runtime.py
git commit -m "feat: add runtime layout helpers"
```

### Task 2: Route downloader output through `platform/first_artist`

**Files:**
- Modify: `musicdl/catalogsync/downloader.py`
- Modify: `tests/catalogsync/test_services.py`

- [ ] **Step 1: Add failing downloader layout tests**

```python
# Add these methods to CatalogServiceTests. They rely on module-level imports
# in test_services.py: tempfile, pathlib.Path, types.SimpleNamespace, and
# unittest.mock.patch.
def test_catalog_downloader_records_platform_first_artist_locator(self):
    from musicdl.catalogsync.db import initialize_database
    from musicdl.catalogsync.downloader import CatalogDownloader
    from musicdl.catalogsync.models import CatalogSong
    from musicdl.catalogsync.repository import CatalogRepository

    class FakeClient:
        def download(self, song_infos, num_threadings=1, auto_supplement_song=False):
            save_path = Path(song_infos[0].work_dir) / "song-c.mp3"
            save_path.parent.mkdir(parents=True, exist_ok=True)
            save_path.write_bytes(b"fake-audio")
            return [SimpleNamespace(save_path=str(save_path))]

    with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as tmpdir:
        db_path = Path(tmpdir) / "catalogsync.db"
        library_root = Path(tmpdir) / "library"
        initialize_database(db_path, default_library_root=library_root).close()
        repo = CatalogRepository(db_path)
        repo.upsert_song(
            CatalogSong(
                platform="qq",
                remote_song_id="song-c",
                name="Song C",
                singers="Singer A / Singer B",
                ext="mp3",
                file_size_bytes=80,
                metadata={"snapshot": {"identifier": "song-c"}},
            )
        )
        downloader = CatalogDownloader(repository=repo)

        with patch("musicdl.catalogsync.downloader.deserialize_song_info", return_value=SimpleNamespace(singers="Singer A / Singer B")):
            with patch.object(downloader, "get_client", return_value=FakeClient()):
                downloader.download_pending(library_root=library_root, limit=1)

        location = repo._fetchone(
            "SELECT locator FROM file_locations ORDER BY id DESC LIMIT 1"
        )

        self.assertEqual("qq/Singer A/song-c.mp3", location["locator"])

def test_catalog_downloader_uses_unknown_artist_fallback_directory(self):
    from musicdl.catalogsync.db import initialize_database
    from musicdl.catalogsync.downloader import CatalogDownloader
    from musicdl.catalogsync.models import CatalogSong
    from musicdl.catalogsync.repository import CatalogRepository

    class FakeClient:
        def download(self, song_infos, num_threadings=1, auto_supplement_song=False):
            save_path = Path(song_infos[0].work_dir) / "song-a.flac"
            save_path.parent.mkdir(parents=True, exist_ok=True)
            save_path.write_bytes(b"fake-audio")
            return [SimpleNamespace(save_path=str(save_path))]

    with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as tmpdir:
        db_path = Path(tmpdir) / "catalogsync.db"
        library_root = Path(tmpdir) / "library"
        initialize_database(db_path, default_library_root=library_root).close()
        repo = CatalogRepository(db_path)
        repo.upsert_song(
            CatalogSong(
                platform="netease",
                remote_song_id="song-a",
                name="Song A",
                singers="",
                ext="flac",
                file_size_bytes=100,
                metadata={"snapshot": {"identifier": "song-a"}},
            )
        )
        downloader = CatalogDownloader(repository=repo)

        with patch("musicdl.catalogsync.downloader.deserialize_song_info", return_value=SimpleNamespace(singers="")):
            with patch.object(downloader, "get_client", return_value=FakeClient()):
                downloader.download_pending(library_root=library_root, limit=1)

        location = repo._fetchone(
            "SELECT locator FROM file_locations ORDER BY id DESC LIMIT 1"
        )

        self.assertEqual("netease/Unknown Artist/song-a.flac", location["locator"])
```

- [ ] **Step 2: Run the focused downloader tests to verify they fail**

Run: `python -m unittest tests.catalogsync.test_services.CatalogServiceTests.test_catalog_downloader_records_platform_first_artist_locator tests.catalogsync.test_services.CatalogServiceTests.test_catalog_downloader_uses_unknown_artist_fallback_directory -v`
Expected: FAIL because the downloader still writes `platform/filename`

- [ ] **Step 3: Implement the downloader layout change**

```python
from .runtime import build_download_relative_dir
```

```python
# Prefer the artist string on the deserialized song info and fall back to the
# catalog row. Note: row.get() assumes a dict-like row; a plain sqlite3.Row
# has no .get(), in which case use row["singers"] instead.
relative_dir = build_download_relative_dir(
    platform=row["platform"],
    singers=getattr(song_info, "singers", None) or row.get("singers"),
)
target_dir = target_root / relative_dir
target_dir.mkdir(parents=True, exist_ok=True)
song_info.work_dir = str(target_dir)
```

Keep the locator writeback based on the actual saved file, so the stored locator stays relative to the library root:

```python
saved_path = Path(saved_song.save_path)
relative_path = saved_path.relative_to(target_root).as_posix()
```

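Because the locator is stored relative to the library root and in POSIX form, it can be re-anchored under a different root later, for example when a later upload step runs on another machine. The following is a minimal sketch of that round trip; `to_locator` and `resolve_locator` are hypothetical helper names used only for illustration and are not part of the existing code, and the `/mnt/other/library` root is an assumed example path.

```python
from pathlib import Path, PurePosixPath


def to_locator(saved_path: Path, library_root: Path) -> str:
    """Return saved_path relative to library_root, using forward slashes."""
    # relative_to() raises ValueError if saved_path is outside library_root.
    return saved_path.relative_to(library_root).as_posix()


def resolve_locator(locator: str, library_root: Path) -> Path:
    """Re-anchor a stored locator under a (possibly different) library root."""
    # Split on POSIX separators so the locator works on any host platform.
    return library_root.joinpath(*PurePosixPath(locator).parts)


locator = to_locator(
    Path("/volume4/Music_Cloud/library/qq/Singer A/song-c.mp3"),
    Path("/volume4/Music_Cloud/library"),
)
print(locator)
print(resolve_locator(locator, Path("/mnt/other/library")))
```

Storing only the relative part means the database does not need to change if `LIBRARY_DIR` moves.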
- [ ] **Step 4: Re-run the focused downloader tests**

Run: `python -m unittest tests.catalogsync.test_services.CatalogServiceTests.test_catalog_downloader_records_platform_first_artist_locator tests.catalogsync.test_services.CatalogServiceTests.test_catalog_downloader_uses_unknown_artist_fallback_directory -v`
Expected: PASS

- [ ] **Step 5: Run the broader catalogsync tests affected by downloader changes**

Run: `python -m unittest tests.catalogsync.test_services tests.catalogsync.test_cli -v`
Expected: PASS

- [ ] **Step 6: Commit**

```bash
git add musicdl/catalogsync/downloader.py tests/catalogsync/test_services.py
git commit -m "feat: store downloads under platform and first artist"
```

### Task 3: Add portable deployment and runtime script templates

**Files:**
- Create: `scripts/catalogsync/bootstrap_to_linux.ps1`
- Create: `scripts/catalogsync/templates/catalogsync.env.example`
- Create: `scripts/catalogsync/templates/download_all.sh`
- Create: `scripts/catalogsync/templates/download_from_file.sh`
- Modify: `tests/catalogsync/test_runtime.py`

- [ ] **Step 1: Add failing tests for deployment template content**

```python
# Add these methods to RuntimeLayoutTests. The template paths are relative,
# so run the tests from the repository root.
def test_catalogsync_env_example_contains_required_keys(self):
    template = Path("scripts/catalogsync/templates/catalogsync.env.example").read_text(encoding="utf-8")
    self.assertIn("ROOT_DIR=", template)
    self.assertIn("APP_HOME=", template)
    self.assertIn("LIBRARY_DIR=", template)
    self.assertIn("DB_PATH=", template)
    self.assertIn("INPUT_DIR=", template)
    self.assertIn("LOG_DIR=", template)
    self.assertIn("DOWNLOAD_LAYOUT=platform_first_artist", template)

def test_runtime_script_template_uses_configured_library_dir(self):
    script = Path("scripts/catalogsync/templates/download_from_file.sh").read_text(encoding="utf-8")
    self.assertIn("LIBRARY_DIR", script)
    self.assertIn("INPUT_DIR", script)
    self.assertIn("musicdl.catalogsync.cli run", script)
```

- [ ] **Step 2: Run the focused runtime/template tests to verify they fail**

Run: `python -m unittest tests.catalogsync.test_runtime.RuntimeLayoutTests.test_catalogsync_env_example_contains_required_keys tests.catalogsync.test_runtime.RuntimeLayoutTests.test_runtime_script_template_uses_configured_library_dir -v`
Expected: the tests fail (unittest reports `ERROR` from `FileNotFoundError`) because the template files do not exist yet

- [ ] **Step 3: Add the deployment and runtime script templates**

`scripts/catalogsync/templates/catalogsync.env.example`:

```bash
ROOT_DIR=/volume4/Music_Cloud
APP_HOME=/volume4/Music_Cloud/catalogsync
LIBRARY_DIR=/volume4/Music_Cloud/library

DB_PATH=/volume4/Music_Cloud/catalogsync/data/catalogsync.db
INPUT_DIR=/volume4/Music_Cloud/catalogsync/inputs
LOG_DIR=/volume4/Music_Cloud/catalogsync/logs

PYTHON_BIN=python3
VENV_DIR=/volume4/Music_Cloud/catalogsync/app/.venv

DOWNLOAD_LAYOUT=platform_first_artist
```

`scripts/catalogsync/templates/download_all.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
APP_HOME="$(cd "${SCRIPT_DIR}/.." && pwd)"
CONFIG_FILE="${APP_HOME}/config/catalogsync.env"
source "${CONFIG_FILE}"

mkdir -p "${LIBRARY_DIR}" "${APP_HOME}/data" "${INPUT_DIR}" "${LOG_DIR}"

"${PYTHON_BIN}" -m musicdl.catalogsync.cli run \
  --db "${DB_PATH}" \
  --library-root "${LIBRARY_DIR}" \
  "$@"
```

`scripts/catalogsync/templates/download_from_file.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Require the playlist file as the first argument; report usage on stderr.
if [[ $# -lt 1 ]]; then
  echo "usage: $0 <playlist-file> [extra args...]" >&2
  exit 1
fi

PLAYLIST_FILE="$1"
shift

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
APP_HOME="$(cd "${SCRIPT_DIR}/.." && pwd)"
CONFIG_FILE="${APP_HOME}/config/catalogsync.env"
source "${CONFIG_FILE}"

mkdir -p "${LIBRARY_DIR}" "${APP_HOME}/data" "${INPUT_DIR}" "${LOG_DIR}"

"${PYTHON_BIN}" -m musicdl.catalogsync.cli run \
  --db "${DB_PATH}" \
  --library-root "${LIBRARY_DIR}" \
  --playlist-file "${PLAYLIST_FILE}" \
  "$@"
```

`scripts/catalogsync/bootstrap_to_linux.ps1` should start with the following parameters and remote directory list:

```powershell
param(
    # Named RemoteHost because $Host is a read-only automatic variable in
    # PowerShell and cannot be used as a parameter name.
    [string]$RemoteHost,
    [int]$Port = 22,
    [string]$User,
    [string]$RootDir = "/volume4/Music_Cloud"
)

$AppHome = "$RootDir/catalogsync"
$RemoteDirs = @(
    $RootDir,
    "$RootDir/library",
    "$AppHome/app",
    "$AppHome/bin",
    "$AppHome/config",
    "$AppHome/data",
    "$AppHome/inputs",
    "$AppHome/logs"
)
```

Then use `ssh` and `scp` to:

- create the remote directories
- copy the application files into `$AppHome/app`
- copy the shell script templates into `$AppHome/bin`
- copy `catalogsync.env.example` into `$AppHome/config/catalogsync.env.example` if missing

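The plan does not say how Python code would read `catalogsync.env`; the shell scripts simply `source` it. If the same file ever needs to feed `CatalogSyncRuntimeConfig.from_mapping`, a small parser is enough for the plain `KEY=VALUE` lines in the `catalogsync.env.example` template from this step. The sketch below is an assumption rather than part of the planned module: `parse_env_text` is a hypothetical helper, and it deliberately ignores shell features such as variable expansion and `export`.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse plain KEY=VALUE lines, skipping blank lines and # comments."""
    mapping: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        mapping[key.strip()] = value
    return mapping


# Sample content shaped like the template; values are illustrative.
sample = """
# catalogsync.env
ROOT_DIR=/volume4/Music_Cloud
LIBRARY_DIR="/volume4/Music_Cloud/library"

DOWNLOAD_LAYOUT=platform_first_artist
"""
config_values = parse_env_text(sample)
print(config_values)
```

The resulting dict could then be passed to `CatalogSyncRuntimeConfig.from_mapping(config_values)`, which only requires `ROOT_DIR` and derives the remaining paths from it.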
- [ ] **Step 4: Re-run the focused runtime/template tests**

Run: `python -m unittest tests.catalogsync.test_runtime -v`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add scripts/catalogsync tests/catalogsync/test_runtime.py
git commit -m "feat: add portable catalogsync deployment scripts"
```

### Task 4: Document the new layout and verify the full flow

**Files:**
- Modify: `docs/catalogsync.md`
- Modify: `README.md`

- [ ] **Step 1: Update user-facing docs with the new deployment layout**

Add:

- the `/volume4/Music_Cloud/library` versus `/volume4/Music_Cloud/catalogsync` split
- the `platform/first_artist` download layout
- the `catalogsync.env` example
- the `scripts/catalogsync/bootstrap_to_linux.ps1` usage
- the target-side `download_all.sh` and `download_from_file.sh` usage

- [ ] **Step 2: Run the full catalogsync unittest suite**

Run: `python -m unittest discover -s tests/catalogsync -v`
Expected: PASS

- [ ] **Step 3: Run a local smoke check for CLI help**

Run: `python -m musicdl.catalogsync.cli run --help`
Expected: output includes `--playlist-file`

- [ ] **Step 4: Inspect the generated diff**

Run: `git diff --stat`
Expected: only the planned runtime/layout/downloader/docs files changed

- [ ] **Step 5: Commit**

```bash
git add docs/catalogsync.md README.md
git commit -m "docs: describe NAS download layout workflow"
```