Catalogsync Resolver Source Ranking Design
Goal
Improve resolver throughput by introducing a persistent, isolated source-ranking store whose learning carries across runs.
The resolver should keep treating the song's original platform as the preferred source, but fallback order should become adaptive instead of fixed. The adaptive order must:
- learn over time across jobs and restarts
- be grouped by original source instead of using one global ranking
- stay isolated from the main catalog business tables
- preserve the current "keep trying later sources if earlier ones fail" behavior
Confirmed Decisions
The following points were confirmed during design discussion:
- the ranking model is grouped by original source
- statistics must persist across tasks and service restarts
- the statistics store must be isolated from the main business schema
- the original source is still tried first
- after the warmup threshold is reached, fallback should try the top two ranked sources first
- if the top two fallback sources fail, resolver must continue trying the remaining sources
Scope
In Scope
- add a dedicated resolver statistics SQLite side database
- record persistent fallback attempt and success statistics by (origin_source, candidate_source)
- use fallback statistics to reorder sources after a warmup threshold
- keep the original source as the first attempt
- preserve existing resolver matching and candidate selection logic within a single source
- cover side-database initialization, repository methods, ranking logic, and resolver behavior with tests
Out Of Scope
- changing sync-stage behavior
- caching download URLs across runs
- changing song uniqueness rules
- replacing the current matching heuristics inside a source
- adding a UI for resolver statistics in this iteration
- distributed or external metrics storage
Problem Statement
The current resolver still spends too much time in fallback traversal.
Today, resolver behavior is:
- derive the original platform as preferred_source
- try the preferred source first
- if preferred-source fast return does not happen, continue through the configured fallback list
- within fallback traversal, source order is static and does not learn from production outcomes
This causes two operational problems:
- fallback time is longer than necessary because low-yield sources keep being retried early
- the system does not accumulate knowledge from prior jobs, so every restart returns to the same static ordering
The result is that resolver throughput remains bursty even after the dual-pool pipeline work, because the ready queue is still fed by a fallback strategy that does not adapt.
Approaches Considered
Approach A: Global Ranking For All Sources
Keep one success-rate table for all candidate sources regardless of original platform.
Pros:
- simplest data model
- easiest ranking query
Cons:
- mixes very different source relationships
- large-volume platforms can dominate the ranking
- does not reflect that qq -> kuwo and netease -> kuwo may behave differently
Decision:
- rejected because the learning model should follow the original source
Approach B: In-Memory Per-Run Learning Only
Track statistics only for the current job and discard them at task end.
Pros:
- no schema work
- easy to experiment with
Cons:
- restarts lose all learning
- long warmup every time
- directly conflicts with the requirement for cross-run reuse
Decision:
- rejected
Approach C: Persistent Side Database Grouped By Original Source
Store statistics in a dedicated SQLite side database keyed by original source and fallback source.
Pros:
- matches the confirmed grouping model
- survives restarts and future jobs
- keeps analytics-style tables isolated from the main business schema
- easy to evolve independently from catalog tables
Cons:
- requires one more database file and repository
- adds coordination between resolver and statistics store
Decision:
- recommended
Recommended Design
High-Level Architecture
Add a dedicated resolver statistics store, for example:
resolver_stats.db
This database is initialized separately from catalogsync.db and contains only resolver-learning tables.
The main download flow remains:
- build target_song_info
- determine preferred_source
- try preferred source first
- reorder fallback sources using persistent statistics when warmup criteria are met
- try ranked top two fallback sources first
- if still unresolved, continue with the remaining fallback sources in ranked order
The existing resolver still owns matching, candidate picking, and final candidate selection inside each source.
Statistics Model
The learning key is:
(origin_source, candidate_source)
Where:
- origin_source is the normalized original platform for the song being resolved
- candidate_source is a fallback source actually attempted after the original source path failed
Statistics are recorded only for fallback attempts. Preferred-source attempts are not stored in this side database for the first iteration because the ranking problem is specifically about fallback order.
Stored Counters
Each row should persist:
- origin_source
- candidate_source
- attempt_count
- resolve_success_count
- last_attempt_at
- last_success_at
- created_at
- updated_at
Warmup and Ranking Rules
Warmup Threshold
The warmup threshold is not a global song count. It is the total number of fallback attempts recorded for a specific origin_source, summed over all of its candidate sources.
Example:
- qq fallback learning activates only after the sum of all qq -> * fallback attempts reaches 1000
- netease fallback learning activates independently after the sum of all netease -> * fallback attempts reaches 1000
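The per-origin warmup check can be sketched as follows. This is a minimal illustration that assumes attempt counts are available as a mapping keyed by (origin_source, candidate_source); the function names and the sample numbers are hypothetical, and only the threshold of 1000 comes from the example above.

```python
from typing import Dict, Tuple

WARMUP_THRESHOLD = 1000  # per-origin fallback sample count used in the example


def fallback_samples(attempts: Dict[Tuple[str, str], int], origin_source: str) -> int:
    """Sum fallback attempts over every candidate source of one origin source."""
    return sum(n for (origin, _), n in attempts.items() if origin == origin_source)


def warmup_reached(
    attempts: Dict[Tuple[str, str], int],
    origin_source: str,
    threshold: int = WARMUP_THRESHOLD,
) -> bool:
    """True once this origin source alone has accumulated enough samples."""
    return fallback_samples(attempts, origin_source) >= threshold


# Illustrative counts: qq has 700 + 300 = 1000 samples, netease only 400 + 150 = 550.
attempts = {
    ("qq", "kuwo"): 700,
    ("qq", "netease"): 300,
    ("netease", "kuwo"): 400,
    ("netease", "qq"): 150,
}
qq_ready = warmup_reached(attempts, "qq")
netease_ready = warmup_reached(attempts, "netease")
```

With these numbers qq switches to ranked fallback while netease keeps the configured order, because each origin source is counted separately.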
Ranking Formula
Use a smoothed success rate:
(resolve_success_count + 1) / (attempt_count + 2)
This avoids unstable rankings when sample counts are still low.
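To see why the smoothing matters, the formula can be written as a one-line function and compared against the raw success rate. The function name is an assumption made for this sketch; the formula itself is the one given above, which has the same form as Laplace's rule of succession.

```python
def smoothed_success_rate(resolve_success_count: int, attempt_count: int) -> float:
    """(successes + 1) / (attempts + 2), as specified in the ranking formula."""
    return (resolve_success_count + 1) / (attempt_count + 2)


# A source with no data sits at the neutral midpoint.
unseen = smoothed_success_rate(0, 0)          # 1 / 2

# One lucky success out of one attempt has a raw rate of 1.0 ...
lucky = smoothed_success_rate(1, 1)           # 2 / 3

# ... but it no longer outranks a source with 90 successes out of 100 attempts.
established = smoothed_success_rate(90, 100)  # 91 / 102
```

Under the raw rate, the single lucky attempt (1.0) would rank above the established source (0.9). With smoothing, the order flips, which is the stabilizing effect the formula is meant to provide at low sample counts.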
Ranked Traversal
For a song with original source origin_source:
- always try preferred_source first
- if preferred source does not resolve a high-confidence downloadable result, enter fallback
- if origin_source warmup threshold is not met:
  - keep the configured fallback order
- if origin_source warmup threshold is met:
  - sort fallback candidates by smoothed success rate, highest first
  - preserve configured order as the tie-breaker
  - try the top two ranked fallback sources first
  - if both fail, continue with the remaining ranked fallback sources
This preserves completeness while improving average-case resolution speed.
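The traversal rules above can be sketched as a pure ordering function. This is one possible reading, not the resolver's actual code: the function name, the counters shape, and the placeholder source name source_x are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# candidate_source -> (attempt_count, resolve_success_count) for ONE origin source
Counters = Dict[str, Tuple[int, int]]


def plan_fallback_order(
    configured: List[str],
    counters: Counters,
    warmup_threshold: int = 1000,
) -> List[str]:
    """Return the fallback order for one origin source."""
    total_samples = sum(attempts for attempts, _ in counters.values())
    if total_samples < warmup_threshold:
        return list(configured)  # warmup not met: keep the configured order

    def rate(source: str) -> float:
        attempts, successes = counters.get(source, (0, 0))
        return (successes + 1) / (attempts + 2)

    # sorted() is stable, so sources with equal rates keep their configured order.
    return sorted(configured, key=lambda source: -rate(source))


configured = ["kuwo", "netease", "source_x"]  # source_x is a placeholder name

ranked = plan_fallback_order(
    configured,
    {"kuwo": (600, 120), "netease": (300, 210), "source_x": (100, 20)},
)
cold = plan_fallback_order(configured, {"kuwo": (10, 0), "netease": (5, 5)})
tied = plan_fallback_order(configured, {"kuwo": (500, 100), "netease": (500, 100)})
```

Because the remaining sources are also tried in ranked order, "top two first, then the rest" yields the same sequence as walking the full ranked list; the top-two boundary only matters if an implementation wants to treat it specially, for example in logging. The tied case also shows a side effect of the formula: a source with no recorded attempts scores 0.5 and can rank ahead of sources with a measured low success rate.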
Resolver Flow Changes
Resolver source ordering should become a two-phase plan:
Phase 1: Preferred Source
- derive preferred_source from the snapshot or row platform
- try preferred-source refresh
- try preferred-source search
- if a preferred-source high-confidence result is found, return immediately
Phase 2: Ranked Fallback
- build fallback candidates from configured download_sources excluding preferred_source
- ask the resolver stats repository for the ranked order for this origin_source
- attempt fallback sources in that order
- after each fallback attempt:
  - record one attempt
  - if that source resolves a usable candidate, record one success and stop
- if a fallback source fails to produce a usable candidate, continue to the next source
The resolver should still stop at the first acceptable fallback success in this iteration rather than exhaustively scanning later sources for a possibly better file.
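A minimal sketch of the Phase 2 loop, assuming the per-source matching logic is exposed as a callable that returns a candidate or None. The names resolve_via_fallback, InMemoryStats, fake_try_source, and source_x are hypothetical stand-ins; the real resolver and repository interfaces may differ.

```python
from typing import Callable, Dict, List, Optional, Tuple


class InMemoryStats:
    """Stand-in for the stats repository: [attempts, successes] per key."""

    def __init__(self) -> None:
        self.counters: Dict[Tuple[str, str], List[int]] = {}

    def record_attempt(self, origin_source: str, candidate_source: str, success: bool) -> None:
        row = self.counters.setdefault((origin_source, candidate_source), [0, 0])
        row[0] += 1
        row[1] += int(success)


def resolve_via_fallback(
    origin_source: str,
    ranked_sources: List[str],
    try_source: Callable[[str], Optional[str]],
    stats: InMemoryStats,
) -> Optional[str]:
    """Try fallback sources in order; stop at the first usable candidate."""
    for candidate_source in ranked_sources:
        candidate = try_source(candidate_source)
        success = candidate is not None
        # One attempt is recorded per source tried, plus one success if it resolved.
        stats.record_attempt(origin_source, candidate_source, success)
        if success:
            return candidate
    return None  # every fallback source was tried and failed


def fake_try_source(source: str) -> Optional[str]:
    # Pretend only netease can resolve this song.
    return "candidate-from-netease" if source == "netease" else None


stats = InMemoryStats()
result = resolve_via_fallback("qq", ["kuwo", "netease", "source_x"], fake_try_source, stats)
```

In this run kuwo records one failed attempt, netease records one attempt and one success, and source_x is never tried because the loop stops at the first acceptable result.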
Side Database Schema
The side database should stay minimal for the first version.
Table: resolver_source_stats
- origin_source TEXT NOT NULL
- candidate_source TEXT NOT NULL
- attempt_count INTEGER NOT NULL DEFAULT 0
- resolve_success_count INTEGER NOT NULL DEFAULT 0
- last_attempt_at TEXT
- last_success_at TEXT
- created_at TEXT DEFAULT CURRENT_TIMESTAMP
- updated_at TEXT DEFAULT CURRENT_TIMESTAMP
- primary key: (origin_source, candidate_source)
Recommended indexes:
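- primary key already covers lookup by (origin_source, candidate_source)
- index on (origin_source) for ranking queries; this is likely redundant in SQLite, because the index backing the composite primary key already has origin_source as its leading column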
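The table and index described above translate to the following DDL, shown here through Python's sqlite3 module. The column definitions follow the list in this section; the index name and the helper function name are assumptions.

```python
import sqlite3

RESOLVER_STATS_SCHEMA = """
CREATE TABLE IF NOT EXISTS resolver_source_stats (
    origin_source         TEXT    NOT NULL,
    candidate_source      TEXT    NOT NULL,
    attempt_count         INTEGER NOT NULL DEFAULT 0,
    resolve_success_count INTEGER NOT NULL DEFAULT 0,
    last_attempt_at       TEXT,
    last_success_at       TEXT,
    created_at            TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at            TEXT DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (origin_source, candidate_source)
);
-- Index name is hypothetical; the design only asks for an index on origin_source.
CREATE INDEX IF NOT EXISTS idx_resolver_source_stats_origin
    ON resolver_source_stats (origin_source);
"""


def init_resolver_stats_schema(conn: sqlite3.Connection) -> None:
    """Create the side-database objects; safe to call on every startup."""
    conn.executescript(RESOLVER_STATS_SCHEMA)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for demonstration only
init_resolver_stats_schema(conn)
init_resolver_stats_schema(conn)    # second call is a no-op thanks to IF NOT EXISTS
columns = [row[1] for row in conn.execute("PRAGMA table_info(resolver_source_stats)")]
```

Using IF NOT EXISTS keeps initialization idempotent, which fits the requirement that the side database be created on demand.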
Repository Boundary
Introduce a dedicated repository for the side database, for example:
ResolverStatsRepository
Responsibilities:
- initialize side-database schema
- upsert attempt and success counters
- report total fallback samples for an origin source
- return ranked fallback candidates for an origin source given the configured fallback list
The main CatalogRepository should not absorb this responsibility. Keeping the side database behind a dedicated repository keeps the separation explicit and prevents statistics logic from leaking into core business persistence.
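The four responsibilities could map onto a class along these lines. This is a sketch under assumptions: the method names and signatures are invented for illustration, and the upsert is written as INSERT OR IGNORE followed by UPDATE so that it does not depend on a particular SQLite version.

```python
import sqlite3
from typing import List

_SCHEMA = """
CREATE TABLE IF NOT EXISTS resolver_source_stats (
    origin_source TEXT NOT NULL,
    candidate_source TEXT NOT NULL,
    attempt_count INTEGER NOT NULL DEFAULT 0,
    resolve_success_count INTEGER NOT NULL DEFAULT 0,
    last_attempt_at TEXT,
    last_success_at TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (origin_source, candidate_source)
)
"""


class ResolverStatsRepository:
    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)

    def initialize_schema(self) -> None:
        with self._conn:
            self._conn.execute(_SCHEMA)

    def record_attempt(self, origin_source: str, candidate_source: str, success: bool) -> None:
        """Count one fallback attempt, and one success if it resolved."""
        key = (origin_source, candidate_source)
        with self._conn:  # both statements commit together
            self._conn.execute(
                "INSERT OR IGNORE INTO resolver_source_stats"
                " (origin_source, candidate_source) VALUES (?, ?)",
                key,
            )
            self._conn.execute(
                "UPDATE resolver_source_stats SET"
                " attempt_count = attempt_count + 1,"
                " resolve_success_count = resolve_success_count + ?,"
                " last_attempt_at = CURRENT_TIMESTAMP,"
                " last_success_at = CASE WHEN ? THEN CURRENT_TIMESTAMP"
                " ELSE last_success_at END,"
                " updated_at = CURRENT_TIMESTAMP"
                " WHERE origin_source = ? AND candidate_source = ?",
                (int(success), int(success), *key),
            )

    def total_fallback_samples(self, origin_source: str) -> int:
        row = self._conn.execute(
            "SELECT COALESCE(SUM(attempt_count), 0) FROM resolver_source_stats"
            " WHERE origin_source = ?",
            (origin_source,),
        ).fetchone()
        return row[0]

    def ranked_candidates(
        self, origin_source: str, configured: List[str], warmup_threshold: int = 1000
    ) -> List[str]:
        if self.total_fallback_samples(origin_source) < warmup_threshold:
            return list(configured)
        rows = self._conn.execute(
            "SELECT candidate_source, attempt_count, resolve_success_count"
            " FROM resolver_source_stats WHERE origin_source = ?",
            (origin_source,),
        ).fetchall()
        counters = {source: (attempts, successes) for source, attempts, successes in rows}

        def rate(source: str) -> float:
            attempts, successes = counters.get(source, (0, 0))
            return (successes + 1) / (attempts + 2)

        # Stable sort: equal rates keep the configured order.
        return sorted(configured, key=lambda source: -rate(source))


repo = ResolverStatsRepository(":memory:")
repo.initialize_schema()
for _ in range(3):
    repo.record_attempt("qq", "kuwo", success=False)
repo.record_attempt("qq", "netease", success=True)

# A threshold of 4 is used only to keep the demo small.
qq_order = repo.ranked_candidates("qq", ["kuwo", "netease"], warmup_threshold=4)
netease_order = repo.ranked_candidates("netease", ["kuwo", "qq"], warmup_threshold=4)
```

The qq statistics reorder only the qq fallback list; netease has no samples of its own, so it keeps the configured order.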
Configuration and File Layout
Add a dedicated resolver statistics database path derived from the application root, for example:
<APP_HOME>/data/resolver_stats.db
This path should be configurable but should default automatically so current operators do not need new setup work.
The service should initialize both:
- main catalog database
- resolver statistics side database
Service startup should not fail if the side database does not exist yet or is empty; it should be created on demand.
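One way to derive the default path and create the database on demand is sketched below. The function names and the configured_path parameter are hypothetical; how the override is supplied (config file, environment variable, or otherwise) is left open here.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import Optional


def resolver_stats_db_path(app_home: str, configured_path: Optional[str] = None) -> Path:
    """Use the operator's override if given, else <APP_HOME>/data/resolver_stats.db."""
    if configured_path:
        return Path(configured_path)
    return Path(app_home) / "data" / "resolver_stats.db"


def open_resolver_stats_db(db_path: Path) -> sqlite3.Connection:
    """Create the parent directory if needed, then open (or create) the database."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(str(db_path))


# Demo against a throwaway directory standing in for <APP_HOME>.
with tempfile.TemporaryDirectory() as app_home:
    db_path = resolver_stats_db_path(app_home)
    conn = open_resolver_stats_db(db_path)
    parent_created = db_path.parent.is_dir()
    conn.close()
```

Defaulting from the application root means existing deployments pick up the side database without any new configuration.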
Error Handling
Resolver statistics must not become a single point of failure.
If the side database update fails:
- do not fail the actual download item
- log the statistics error
- continue resolver fallback using the best available in-memory ordering for that invocation
If the side database ranking query fails:
- fall back to configured source order
This keeps the ranking system opportunistic rather than mission-critical.
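The degradation rules can be sketched as two thin wrappers around the statistics store. The wrapper names, the logger name, and the BrokenStatsRepo stub are assumptions for illustration; the stub exists only to simulate a failing side database.

```python
import logging
import sqlite3
from typing import List

logger = logging.getLogger("resolver.stats")  # logger name is hypothetical


def ranked_or_configured(stats_repo, origin_source: str, configured: List[str]) -> List[str]:
    """Ask the stats store for a ranking; on any failure use the configured order."""
    try:
        return stats_repo.ranked_candidates(origin_source, configured)
    except Exception:  # deliberately broad: statistics must never fail the item
        logger.exception("ranking query failed for origin_source=%s", origin_source)
        return list(configured)


def record_attempt_safely(
    stats_repo, origin_source: str, candidate_source: str, success: bool
) -> None:
    """Record one attempt; log and swallow any statistics error."""
    try:
        stats_repo.record_attempt(origin_source, candidate_source, success)
    except Exception:
        logger.exception(
            "stats update failed for %s -> %s", origin_source, candidate_source
        )


class BrokenStatsRepo:
    """Test stub that fails every call, as a locked or corrupt database might."""

    def ranked_candidates(self, origin_source, configured):
        raise sqlite3.OperationalError("database is locked")

    def record_attempt(self, origin_source, candidate_source, success):
        raise sqlite3.OperationalError("database is locked")


order = ranked_or_configured(BrokenStatsRepo(), "qq", ["kuwo", "netease"])
record_attempt_safely(BrokenStatsRepo(), "qq", "kuwo", success=False)  # does not raise
```

Because the order is computed once per invocation and held in memory, a failed update partway through fallback does not change which sources are still tried.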
Testing Strategy
Tests should cover:
- side-database schema creation
- isolated side-database repository queries and updates
- warmup not reached:
  - configured fallback order is preserved
- warmup reached:
  - fallback order is re-ranked by per-origin-source statistics
- top-two-first behavior:
  - top two ranked fallback sources are attempted before the rest
- continuation behavior:
  - if top two fail, later sources are still attempted
- grouping behavior:
  - qq ranking does not affect netease ranking
- graceful degradation:
  - side-database failure falls back to configured order instead of failing the item
Acceptance Criteria
- resolver statistics are stored in a dedicated SQLite side database rather than the main business database
- fallback statistics persist across jobs and service restarts
- ranking is grouped by original source
- before the warmup threshold, fallback order matches configured source order
- after the warmup threshold, top two fallback candidates for an origin source are tried first according to smoothed success rate
- if the top two fallback candidates fail, resolver still attempts the remaining fallback sources
- statistics-store failures do not fail the download item outright
- automated tests cover ranking, grouping, warmup, and fallback-to-configured-order behavior