refactor: SVS shared thread pool with rental model (MOD-9881)#925
Open
refactor: SVS shared thread pool with rental model (MOD-9881)#925
Conversation
…l terminology** Rename `setNumThreads`/`getNumThreads` → `setParallelism`/`getParallelism` and `getThreadPoolCapacity` → `getPoolSize` across VectorSimilarity. Public info API fields (`numThreads`, `lastReservedThreads`, `NUM_THREADS`, `LAST_RESERVED_NUM_THREADS`) remain unchanged. No behavioral changes.
🛡️ Jit Security Scan Results✅ No security findings were detected in this PR
Security scan by Jit
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #925 +/- ##
==========================================
- Coverage 96.98% 96.91% -0.08%
==========================================
Files 129 129
Lines 7574 7651 +77
==========================================
+ Hits 7346 7415 +69
- Misses 228 236 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
thpool tests fix fp16 tests
align bm
rfsaliev
previously approved these changes
Apr 8, 2026
Collaborator
rfsaliev
left a comment
There was a problem hiding this comment.
LGFM, but some suggestions.
dor-forer
reviewed
Apr 9, 2026
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit df969a1. Configure here.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Overview
Replaces per-index SVS thread pools with a single shared thread pool that can be physically resized at runtime, and introduces a rental-based execution model for
parallel_for. This eliminates over-provisioning (previously K indexes × N threads = K×N OS threads) and enables the SVS pool to actually grow/shrink when the external pool capacity changes — something the old design could not do.Also adds a new C API
VecSim_UpdateThreadPoolSize()that combines write-mode toggling with pool resizing, and deprecatesSVSParams::num_threads.Key Guarantees and Assumptions
The fundamental scheduling invariant
The shared SVS thread pool has the same size as the RediSearch worker thread pool (
search-workers). Every SVS multi-threaded operation (update, GC) is dispatched viaSVSMultiThreadJob, which submits N reserve jobs to the RediSearch worker pool — one per worker thread. Only the threads that actually check in (are not busy with other work) participate in the SVS operation. This means:rent()requests can never exceed the pool capacity, because each rented SVS thread corresponds to a RediSearch worker that checked in via a reserve job. If a worker is busy, it doesn't check in, so it's never counted inavailableThreadsand never requested from the SVS pool.setParallelism(n)is always called withn = min(availableThreads, problemSize), whereavailableThreads ≤ poolSize(). This is why the assertionn <= pool->size()is safe.ThreadSlots are held viashared_ptr; shrinking drops the pool's reference but the renter keeps the slot (and its OS thread) alive until~RentedThreadsreleases it. Resize never blocks on in-flight operations.Other invariants
VecSimSVSThreadPoolImpl::instance()). There is nonullptrstate; write-in-place mode uses a pool of size 1 (0 worker threads, calling thread only).parallelism_ >= 1— the calling thread always participates, so parallelism can never be 0.ThreadSlots are held viashared_ptr; shrinking drops the pool's reference but the renter keeps the slot (and OS thread) alive until~RentedThreads.shared_ptr<atomic<size_t>> parallelism_;setParallelism()on one index never affects another.Relationship with RediSearch
VecSim_UpdateThreadPoolSize(N)when the worker count changes, so VecSim can create/destroy the matching number of OS threads.Changes from Old Implementation
VecSimSVSThreadPoolImplVecSimSVSThreadPoolImplshared across all indexes viashared_ptrresize()clamped a logical counterresize()spawns/destroys OS threadsparallel_forthreadingatomic<bool>setNumThreads(n)/resize(n)— mutated the poolsetParallelism(n)— sets a per-indexatomic<size_t>, pool unchangedcapacity()(max pre-allocated)poolSize()(current shared pool size)manage_exception_during_run()threw unconditionallymanage_workers_after_run()only throws if an error occurredAPI Changes
Renamed methods
setNumThreads(n)setParallelism(n)SVSIndex,SVSIndexBasevirtual interfacegetNumThreads()getParallelism()SVSIndex,SVSIndexBasevirtual interfacegetThreadPoolCapacity()getPoolSize()SVSIndex,SVSIndexBasevirtual interfaceRemoved methods
VecSimSVSThreadPool::capacity()poolSize()for shared pool sizeVecSimSVSThreadPool::resize()setParallelism()for per-index control,VecSimSVSThreadPool::resize(size_t)(static) for global poolNew C API
New internal APIs
VecSimSVSThreadPoolImpl::instance()shared_ptr<VecSimSVSThreadPoolImpl>by valueVecSimSVSThreadPool(void* log_ctx)VecSimSVSThreadPool::resize(size_t)VecSimSVSThreadPool::poolSize()VecSimSVSThreadPoolImpl::rent(count, log_ctx)RentedThreads)Deprecated
VecSim_SetWriteMode()VecSim_UpdateThreadPoolSize()VecSim_UpdateThreadPoolSize) and unit tests. Not called by RediSearch — RediSearch will use onlyVecSim_UpdateThreadPoolSizeSVSParams::num_threadsVecSim_UpdateThreadPoolSize()New types
ThreadSlot— wrapssvs::threads::Thread+atomic<bool> occupied; non-copyable, non-movableRentedThreads— RAII guard; move-only; releases slots via lock-free atomic stores in destructorbool isScheduledflag onSVSMultiThreadJob; its destructor callsendScheduledJob()on the shared pool (no separateScheduledJobTokenclass)Behavioral Changes
setParallelism(0)resize(0)silently clamped to 1setParallelism(n > poolSize)search-workersn <= pool->size()— scheduling bugn > parallelisminparallel_forThreadingExceptionifn > size_parallelism_initial valuelogCallback, asserts in debug, usesrented.count()for graceful degradation in releaseparallel_formanage_exception_during_run()always threwmanage_workers_after_run()only throws if an error actually occurredupdateSVSIndexWrapperlabels_to_moveis emptySVSParams::num_threadsset to non-zeroPublic info API fields — unchanged
The debug info field names
numThreadsandlastReservedThreads(and their string keysNUM_THREADS,LAST_RESERVED_NUM_THREADS) are unchanged. Their semantics shift slightly:numThreads→ now reportsgetPoolSize()(shared pool size, not per-index capacity)lastReservedThreads→ now reportsgetParallelism()(per-index parallelism for last operation)What Does NOT Change
deps/ScalableVectorSearch/) — no modificationsSVSMultiThreadJobpattern — still submits N reserve jobs to the external worker poolThreadstate machine — used as-is; lifecycle managed viashared_ptrFiles Changed
src/VecSim/algorithms/svs/svs_utils.hThreadSlot,RentedThreads, rental-basedVecSimSVSThreadPoolImpl, per-indexVecSimSVSThreadPoolwrappersrc/VecSim/algorithms/svs/svs.hnum_threadsdeprecation warningsrc/VecSim/algorithms/svs/svs_tiered.hupdateSVSIndexWrapper,SVSMultiThreadJobusesbool isScheduled+ destructor-basedendScheduledJob()instead ofScheduledJobToken,createJobsImplmerged intocreateJobssrc/VecSim/vec_sim.h/vec_sim.cppVecSim_UpdateThreadPoolSize()C APIsrc/VecSim/vec_sim_common.hSVSParams::num_threadsmarked as deprecated in commentstests/unit/test_svs_tiered.cppVecSim_UpdateThreadPoolSizein setup, removednum_threadsfrom params, updated assertionstests/unit/test_svs_fp16.cpptests/unit/test_svs.cppnumThreadsinfo field expects default; newNumThreadsParamIgnoredtesttests/benchmark/bm_utils.h/bm_vecsim_svs.hTesting
testThreadPoolrewritten to cover: default parallelism,setParallelismboundary assertions (ASSERT_DEATH),parallel_forwithn < parallelism, write-in-place mode, exception handlingTestDebugInfoThreadCount/TestDebugInfoThreadCountWriteInPlaceupdated:lastReservedThreadsnow starts at 1 (notnum_threads)ThreadsReservationtest usestieredIndexMock(num_threads)to properly size the shared poolNumThreadsParamIgnored— verifies that settingSVSParams::num_threadsemits a deprecation warning, does not affect the shared pool size, and that omitting it produces no warningnum_threads = 1inSVSParamsnow usetieredIndexMock(1)to resize the shared pool insteadMark if applicable
Note
High Risk
High risk because it rewrites SVS threading and job scheduling, introducing shared mutable concurrency primitives (rental, deferred shrink, per-index parallelism) that can affect correctness and stability under load.
Overview
SVS threading is refactored to a shared, physically resizable thread pool with a rental-based
parallel_for, allowing multiple indexes/jobs to concurrently rent disjoint worker threads and making shrink/grow safe during active operations.The SVS public/virtual threading knobs are renamed from threads to parallelism (e.g.,
get/setParallelism,getPoolSize),SVSParams::num_threadsis deprecated/ignored (with warning log when set), and SVS debug info now reports shared pool size asnumThreadsand last-used per-index parallelism aslastReservedThreads.Adds
VecSim_UpdateThreadPoolSize()C API to toggle write mode (0→ in-place,>0→ async) while resizing the shared SVS pool, updates tiered SVS update/GC scheduling to snapshot pool size and defer pool shrink until scheduled jobs complete, and expands/adjusts benchmarks and unit tests (including newtest_svs_threadpool.cpp) to cover resize/rental/deferred-shrink behavior and the deprecated param handling.Reviewed by Cursor Bugbot for commit df969a1. Bugbot is set up for automated code reviews on this repo. Configure here.