# Catalogsync Resolver Source Ranking Design

## Goal

Improve resolver throughput without sacrificing cross-run learning by introducing a persistent, isolated source-ranking store.

The resolver should keep treating the song's original platform as the preferred source, but fallback order should become adaptive instead of fixed. The adaptive order must:

- learn over time across jobs and restarts
- be grouped by original source instead of using one global ranking
- stay isolated from the main catalog business tables
- preserve the current "keep trying later sources if earlier ones fail" behavior

## Confirmed Decisions

The following points were confirmed during design discussion:

- the ranking model is grouped by original source
- statistics must persist across tasks and service restarts
- the statistics store must be isolated from the main business schema
- the original source is still tried first
- after the warmup threshold is reached, fallback should try the top two ranked sources first
- if the top two fallback sources fail, resolver must continue trying the remaining sources

## Scope

### In Scope

- add a dedicated resolver statistics SQLite side database
- record persistent fallback attempt and success statistics by `(origin_source, candidate_source)`
- use fallback statistics to reorder sources after a warmup threshold
- keep the original source as the first attempt
- preserve existing resolver matching and candidate selection logic within a single source
- cover side-database initialization, repository methods, ranking logic, and resolver behavior with tests

### Out Of Scope

- changing sync-stage behavior
- caching download URLs across runs
- changing song uniqueness rules
- replacing the current matching heuristics inside a source
- adding a UI for resolver statistics in this iteration
- distributed or external metrics storage

## Problem Statement

The current resolver still spends too much time in fallback traversal.

Today, the resolver behaves as follows:

1. derive the original platform as `preferred_source`
2. try the preferred source first
3. if the preferred-source fast return does not happen, continue through the configured fallback list
4. within fallback traversal, source order is static and does not learn from production outcomes

This causes two operational problems:

- fallback time is longer than necessary because low-yield sources keep being retried early
- the system does not accumulate knowledge from prior jobs, so every restart returns to the same static ordering

The result is that resolver throughput remains bursty even after the dual-pool pipeline work, because the ready queue is still fed by a fallback strategy that does not adapt.

## Approaches Considered

### Approach A: Global Ranking For All Sources

Keep one success-rate table for all candidate sources regardless of original platform.

Pros:

- simplest data model
- easiest ranking query

Cons:

- mixes very different source relationships
- large-volume platforms can dominate the ranking
- does not reflect that `qq -> kuwo` and `netease -> kuwo` may behave differently

Decision:

- rejected because the learning model should follow the original source

### Approach B: In-Memory Per-Run Learning Only

Track statistics only for the current job and discard them at task end.

Pros:

- no schema work
- easy to experiment with

Cons:

- restarts lose all learning
- long warmup every time
- directly conflicts with the requirement for cross-run reuse

Decision:

- rejected

### Approach C: Persistent Side Database Grouped By Original Source

Store statistics in a dedicated SQLite side database keyed by original source and fallback source.

Pros:

- matches the confirmed grouping model
- survives restarts and future jobs
- keeps analytics-style tables isolated from the main business schema
- easy to evolve independently from catalog tables

Cons:

- requires one more database file and repository
- adds coordination between resolver and statistics store

Decision:

- recommended

## Recommended Design

## High-Level Architecture

Add a dedicated resolver statistics store, for example:

- `resolver_stats.db`

This database is initialized separately from `catalogsync.db` and contains only resolver-learning tables.

The main download flow remains:

1. build `target_song_info`
2. determine `preferred_source`
3. try preferred source first
4. reorder fallback sources using persistent statistics when warmup criteria are met
5. try ranked top two fallback sources first
6. if still unresolved, continue the remaining fallback sources in ranked order

The existing resolver still owns matching, candidate picking, and final candidate selection inside each source.

## Statistics Model

The learning key is:

- `origin_source`
- `candidate_source`

Where:

- `origin_source` is the normalized original platform for the song being resolved
- `candidate_source` is a fallback source actually attempted after the original source path failed

Statistics are recorded only for fallback attempts. Preferred-source attempts are not stored in this side database for the first iteration because the ranking problem is specifically about fallback order.

### Stored Counters

Each row should persist:

- `origin_source`
- `candidate_source`
- `attempt_count`
- `resolve_success_count`
- `last_attempt_at`
- `last_success_at`
- `created_at`
- `updated_at`

## Warmup and Ranking Rules

### Warmup Threshold

The warmup threshold is not a global song count. It is the total fallback sample count for a specific `origin_source`.

Example:

- `qq` fallback learning activates only after the sum of all `qq -> *` fallback attempts reaches `1000`
- `netease` fallback learning activates independently after the sum of all `netease -> *` fallback attempts reaches `1000`
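
The per-origin warmup check can be sketched as follows. This is a minimal illustration, assuming the attempt counters are available as an in-memory mapping keyed by `(origin_source, candidate_source)`. The function names, the mapping shape, and the sample counts are hypothetical; `1000` is the example threshold used above.

```python
from typing import Mapping, Tuple

# Example threshold from the design text; assumed to be configurable.
WARMUP_THRESHOLD = 1000


def total_fallback_attempts(
    attempts: Mapping[Tuple[str, str], int], origin_source: str
) -> int:
    """Sum attempt counts over every `origin_source -> *` pair."""
    return sum(
        count
        for (origin, _candidate), count in attempts.items()
        if origin == origin_source
    )


def is_warmed_up(
    attempts: Mapping[Tuple[str, str], int],
    origin_source: str,
    threshold: int = WARMUP_THRESHOLD,
) -> bool:
    """Learning activates per origin, never from a global count."""
    return total_fallback_attempts(attempts, origin_source) >= threshold


# Hypothetical counters: qq has 1000 fallback samples, netease only 999.
attempts = {
    ("qq", "kuwo"): 700,
    ("qq", "netease"): 300,
    ("netease", "kuwo"): 999,
}

print(is_warmed_up(attempts, "qq"))       # True
print(is_warmed_up(attempts, "netease"))  # False
```

Because the sum is filtered by origin, heavy `qq` traffic cannot activate learning for `netease`.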

### Ranking Formula

Use a smoothed success rate:

`(resolve_success_count + 1) / (attempt_count + 2)`

This avoids unstable rankings when sample counts are still low. A source with no recorded attempts scores `0.5`, and a single success or failure cannot push a rate to `1.0` or `0.0`.
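
A short worked example of the formula, with hypothetical counters chosen only to show how the smoothing behaves at low and high sample counts. The function name is an assumption, not an existing API.

```python
def smoothed_success_rate(resolve_success_count: int, attempt_count: int) -> float:
    """Return (successes + 1) / (attempts + 2), the smoothed rate from the design."""
    if not 0 <= resolve_success_count <= attempt_count:
        raise ValueError("expected 0 <= resolve_success_count <= attempt_count")
    return (resolve_success_count + 1) / (attempt_count + 2)


no_data = smoothed_success_rate(0, 0)            # 1/2
one_lucky_hit = smoothed_success_rate(1, 1)      # 2/3, the raw rate would be 1.0
well_sampled = smoothed_success_rate(900, 1000)  # 901/1002, close to the raw 0.9

print(round(no_data, 3), round(one_lucky_hit, 3), round(well_sampled, 3))
# prints: 0.5 0.667 0.899
```

A source with one lucky hit therefore ranks below a source with a long record of 90% success, which a raw success rate would get wrong.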

### Ranked Traversal

For a song with original source `origin_source`:

1. always try `preferred_source` first
2. if the preferred source does not resolve a high-confidence downloadable result, enter fallback
3. if the warmup threshold for `origin_source` is not met:
   - keep the configured fallback order
4. if the warmup threshold for `origin_source` is met:
   - sort fallback candidates by smoothed success rate, highest first
   - preserve configured order as the tie-breaker
   - try the top two ranked fallback sources first
   - if both fail, continue with the remaining ranked fallback sources

This preserves completeness while improving average-case resolution speed.
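
The ordering rules above can be sketched as one pure function. This is an illustration under assumptions: the function name, the `(resolve_success_count, attempt_count)` tuple shape, and the source names `kugou`, `migu`, and `joox` are hypothetical. Since the top two sources are simply the first two entries of the ranked list, "top two first, then the rest" reduces to walking the list in order.

```python
from typing import List, Mapping, Sequence, Tuple

# (resolve_success_count, attempt_count) for one origin_source.
Counters = Tuple[int, int]


def rank_fallback_sources(
    configured: Sequence[str],
    stats: Mapping[str, Counters],
    warmed_up: bool,
) -> List[str]:
    """Order fallback sources for one origin; never drops a source."""
    if not warmed_up:
        # Before warmup the configured order is kept unchanged.
        return list(configured)

    def rate(source: str) -> float:
        # Sources without a row count as (0, 0), which scores 0.5.
        successes, attempts = stats.get(source, (0, 0))
        return (successes + 1) / (attempts + 2)

    # sorted() is stable, so equal rates keep their configured order.
    return sorted(configured, key=lambda source: -rate(source))


configured = ["kuwo", "kugou", "migu", "joox"]
qq_stats = {
    "kuwo": (300, 500),
    "kugou": (450, 500),
    "migu": (300, 500),  # ties with kuwo; kuwo stays ahead of migu
}

print(rank_fallback_sources(configured, qq_stats, warmed_up=False))
# ['kuwo', 'kugou', 'migu', 'joox']
print(rank_fallback_sources(configured, qq_stats, warmed_up=True))
# ['kugou', 'kuwo', 'migu', 'joox']
```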

## Resolver Flow Changes

Resolver source ordering should become a two-phase plan:

### Phase 1: Preferred Source

- derive `preferred_source` from the snapshot or row platform
- try preferred-source refresh
- try preferred-source search
- if a preferred-source high-confidence result is found, return immediately

### Phase 2: Ranked Fallback

- build fallback candidates from configured `download_sources` excluding `preferred_source`
- ask the resolver stats repository for the ranked order for this `origin_source`
- attempt fallback sources in that order
- after each fallback attempt:
  - record one attempt
  - if that source resolves a usable candidate, record one success and stop
- if a fallback source fails to produce a usable candidate, continue to the next source

The resolver should still stop at the first acceptable fallback success in this iteration rather than exhaustively scanning later sources for a possibly better file.
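
Phase 2 could look roughly like the loop below. It is a sketch, not the resolver's actual code: `try_source` and `record` are hypothetical callbacks standing in for the per-source matching logic and the statistics repository, and the source names are illustrative.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_ranked_fallback(
    origin_source: str,
    ranked_sources: Sequence[str],
    try_source: Callable[[str], Optional[str]],
    record: Callable[[str, str, bool], None],
) -> Optional[Tuple[str, str]]:
    """Try sources in ranked order and stop at the first usable candidate."""
    for candidate_source in ranked_sources:
        candidate = try_source(candidate_source)
        success = candidate is not None
        # One attempt is recorded per source tried, plus a success flag.
        record(origin_source, candidate_source, success)
        if success:
            return candidate_source, candidate
    return None  # every fallback source failed


# Hypothetical run: only migu can resolve the song.
available = {"migu": "candidate-from-migu"}
events: List[Tuple[str, str, bool]] = []

result = run_ranked_fallback(
    "qq",
    ["kugou", "kuwo", "migu", "joox"],
    try_source=available.get,
    record=lambda origin, candidate, ok: events.append((origin, candidate, ok)),
)

print(result)  # ('migu', 'candidate-from-migu')
print(events)  # three attempts recorded; joox is never tried
```

Sources after the first success receive no attempt record, so their counters only reflect cases where every earlier source failed.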

## Side Database Schema

The side database should stay minimal for the first version.

### Table: `resolver_source_stats`

- `origin_source TEXT NOT NULL`
- `candidate_source TEXT NOT NULL`
- `attempt_count INTEGER NOT NULL DEFAULT 0`
- `resolve_success_count INTEGER NOT NULL DEFAULT 0`
- `last_attempt_at TEXT`
- `last_success_at TEXT`
- `created_at TEXT DEFAULT CURRENT_TIMESTAMP`
- `updated_at TEXT DEFAULT CURRENT_TIMESTAMP`
- primary key: `(origin_source, candidate_source)`

Recommended indexes:

- the primary key already covers lookup by `(origin_source, candidate_source)`
- an index on `(origin_source)` for ranking queries; in SQLite this is optional, because the composite primary key index can already serve lookups on its leading column
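
The table above translates to the following DDL, shown with one possible counter update. This is a sketch using Python's standard `sqlite3` module. The index name, the function names, and the ISO-8601 UTC timestamp format are assumptions, and the `ON CONFLICT ... DO UPDATE` upsert assumes SQLite 3.24 or newer.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS resolver_source_stats (
    origin_source          TEXT NOT NULL,
    candidate_source       TEXT NOT NULL,
    attempt_count          INTEGER NOT NULL DEFAULT 0,
    resolve_success_count  INTEGER NOT NULL DEFAULT 0,
    last_attempt_at        TEXT,
    last_success_at        TEXT,
    created_at             TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at             TEXT DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (origin_source, candidate_source)
);
-- Index name is hypothetical; the primary key can also serve this lookup.
CREATE INDEX IF NOT EXISTS idx_resolver_source_stats_origin
    ON resolver_source_stats (origin_source);
"""

UPSERT = """
INSERT INTO resolver_source_stats
    (origin_source, candidate_source, attempt_count, resolve_success_count,
     last_attempt_at, last_success_at)
VALUES (:origin, :candidate, 1, :success, :now,
        CASE WHEN :success = 1 THEN :now END)
ON CONFLICT (origin_source, candidate_source) DO UPDATE SET
    attempt_count = attempt_count + 1,
    resolve_success_count = resolve_success_count + excluded.resolve_success_count,
    last_attempt_at = excluded.last_attempt_at,
    last_success_at = COALESCE(excluded.last_success_at, last_success_at),
    updated_at = CURRENT_TIMESTAMP
"""


def init_stats_schema(conn: sqlite3.Connection) -> None:
    """Create the table and index if missing; safe to call on every startup."""
    conn.executescript(SCHEMA)


def record_attempt(conn, origin: str, candidate: str, success: bool) -> None:
    """Count one fallback attempt and, when it resolved, one success."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            UPSERT,
            {"origin": origin, "candidate": candidate,
             "success": int(success), "now": now},
        )


conn = sqlite3.connect(":memory:")
init_stats_schema(conn)
init_stats_schema(conn)  # idempotent
record_attempt(conn, "qq", "kuwo", success=False)
record_attempt(conn, "qq", "kuwo", success=True)

row = conn.execute(
    "SELECT attempt_count, resolve_success_count FROM resolver_source_stats "
    "WHERE origin_source = ? AND candidate_source = ?",
    ("qq", "kuwo"),
).fetchone()
print(row)  # (2, 1)
```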

## Repository Boundary

Introduce a dedicated repository for the side database, for example:

- `ResolverStatsRepository`

Responsibilities:

- initialize side-database schema
- upsert attempt and success counters
- report total fallback samples for an origin source
- return ranked fallback candidates for an origin source given the configured fallback list

The main `CatalogRepository` should not absorb this responsibility. Keeping the side database behind a dedicated repository keeps the separation explicit and prevents statistics logic from leaking into core business persistence.
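
One way to make this boundary explicit is a small interface that the resolver depends on, with the SQLite-backed `ResolverStatsRepository` as one implementation. The sketch below is an assumption about shape only: the protocol name `ResolverStatsStore`, the method names and signatures, and the no-op `NullResolverStats` class are hypothetical and mirror the four responsibilities listed above.

```python
from typing import List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class ResolverStatsStore(Protocol):
    """What the resolver needs from the statistics side database."""

    def initialize(self) -> None: ...

    def record_attempt(
        self, origin_source: str, candidate_source: str, success: bool
    ) -> None: ...

    def total_fallback_samples(self, origin_source: str) -> int: ...

    def rank_fallback_sources(
        self, origin_source: str, configured: Sequence[str]
    ) -> List[str]: ...


class NullResolverStats:
    """No-op stand-in: learns nothing and always keeps the configured order."""

    def initialize(self) -> None:
        pass

    def record_attempt(
        self, origin_source: str, candidate_source: str, success: bool
    ) -> None:
        pass

    def total_fallback_samples(self, origin_source: str) -> int:
        return 0

    def rank_fallback_sources(
        self, origin_source: str, configured: Sequence[str]
    ) -> List[str]:
        return list(configured)


stats: ResolverStatsStore = NullResolverStats()
print(isinstance(stats, ResolverStatsStore))             # True
print(stats.rank_fallback_sources("qq", ["kuwo", "migu"]))  # ['kuwo', 'migu']
```

A no-op implementation like this is also convenient in resolver tests that should not touch a database file.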

## Configuration and File Layout

Add a dedicated resolver statistics database path derived from the application root, for example:

- `<APP_HOME>/data/resolver_stats.db`

This path should be configurable but should default automatically so current operators do not need new setup work.

The service should initialize both:

- main catalog database
- resolver statistics side database

Service startup should not fail if the side database is missing or empty; it should be created on demand.
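
A minimal sketch of the defaulting and create-on-demand behavior, assuming an optional configured path overrides the default. The function names and the `configured_path` parameter are hypothetical; the temporary directory stands in for `<APP_HOME>`.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import Optional


def resolve_stats_db_path(app_home: Path, configured_path: Optional[str] = None) -> Path:
    """Use the configured path when given, else <APP_HOME>/data/resolver_stats.db."""
    if configured_path:
        return Path(configured_path)
    return Path(app_home) / "data" / "resolver_stats.db"


def open_stats_db(path: Path) -> sqlite3.Connection:
    """Open the side database, creating the parent directory and file if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(path)


with tempfile.TemporaryDirectory() as tmp:
    app_home = Path(tmp)
    db_path = resolve_stats_db_path(app_home)
    default_rel = db_path.relative_to(app_home).as_posix()
    existed_before = db_path.exists()

    conn = open_stats_db(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS probe (x INTEGER)")
    conn.commit()
    conn.close()
    exists_after = db_path.exists()

    override = resolve_stats_db_path(app_home, "/var/lib/catalogsync/stats.db")

print(default_rel)                   # data/resolver_stats.db
print(existed_before, exists_after)  # False True
```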

## Error Handling

Resolver statistics must not become a single point of failure.

If the side database update fails:

- do not fail the actual download item
- log the statistics error
- continue resolver fallback using the best available in-memory ordering for that invocation

If the side database ranking query fails:

- fall back to configured source order

This keeps the ranking system opportunistic rather than mission-critical.
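
These rules could be enforced by thin wrappers around the repository calls, sketched below. The wrapper names, the logger name, and the simulated failures are hypothetical. The permutation check is an extra safeguard assumed here so that a faulty ranking can never drop a source.

```python
import logging
import sqlite3
from typing import Callable, List, Sequence

logger = logging.getLogger("resolver_stats")


def ranked_or_configured(
    origin_source: str,
    configured: Sequence[str],
    rank_fn: Callable[[str, Sequence[str]], Sequence[str]],
) -> List[str]:
    """Return the ranked order, or the configured order if ranking fails."""
    try:
        ranked = list(rank_fn(origin_source, configured))
    except Exception:  # statistics must never fail the download item
        logger.warning("ranking query failed for %s", origin_source, exc_info=True)
        return list(configured)
    if sorted(ranked) != sorted(configured):
        # A ranking that adds or drops sources would break completeness.
        logger.warning("ranking for %s is not a permutation", origin_source)
        return list(configured)
    return ranked


def record_quietly(record_fn: Callable[..., None], *args) -> bool:
    """Run a statistics update; log and swallow any failure."""
    try:
        record_fn(*args)
    except Exception:
        logger.warning("statistics update failed", exc_info=True)
        return False
    return True


def broken_rank(origin_source, configured):
    raise sqlite3.OperationalError("database is locked")  # simulated failure


def broken_record(*args):
    raise sqlite3.OperationalError("disk I/O error")  # simulated failure


configured = ["kuwo", "kugou", "migu"]
order = ranked_or_configured("qq", configured, broken_rank)
recorded = record_quietly(broken_record, "qq", "kuwo", True)

print(order)     # ['kuwo', 'kugou', 'migu']
print(recorded)  # False
```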

## Testing Strategy

Tests should cover:

- side-database schema creation
- isolated side-database repository queries and updates
- warmup not reached:
  - configured fallback order is preserved
- warmup reached:
  - fallback order is re-ranked by per-origin-source statistics
- top-two-first behavior:
  - top two ranked fallback sources are attempted before the rest
- continuation behavior:
  - if top two fail, later sources are still attempted
- grouping behavior:
  - `qq` ranking does not affect `netease` ranking
- graceful degradation:
  - side-database failure falls back to configured order instead of failing the item

## Acceptance Criteria

- resolver statistics are stored in a dedicated SQLite side database rather than the main business database
- fallback statistics persist across jobs and service restarts
- ranking is grouped by original source
- before the warmup threshold, fallback order matches configured source order
- after the warmup threshold, top two fallback candidates for an origin source are tried first according to smoothed success rate
- if the top two fallback candidates fail, resolver still attempts the remaining fallback sources
- statistics-store failures do not fail the download item outright
- automated tests cover ranking, grouping, warmup, and fallback-to-configured-order behavior