Remove bridge sync indexing cap #261
Merged
Summary
BRIDGE_SEARCH_INDEX_MAX_RESOURCE_IDS and BRIDGE_CACHE_REFRESH_MAX_RESOURCE_IDS env vars as compatibility aliases for batch size, not total limits.
Root Cause
The May 23 production bridge run imported 8,945 changed resources, but the targeted Elasticsearch refresh only processed the first 5,000 IDs. That left a tail of stale ES documents/facets even though the database rows had the corrected provider/publisher values.
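The gap described above can be illustrated with a minimal sketch. This is not the code from the PR: the function names `capped_ids`, `iter_batches`, and `resolve_batch_size`, the `resource-N` ID format, and the fallback behavior for unset or invalid env values are all assumptions made for illustration. Only the figures (8,945 changed resources, a 5,000 value) and the env var name come from the description.

```python
import os
from typing import Iterator, List, Sequence

DEFAULT_BATCH_SIZE = 5000  # the 5000 value, treated here as a per-batch size


def resolve_batch_size(env_var: str = "BRIDGE_SEARCH_INDEX_MAX_RESOURCE_IDS") -> int:
    """Hypothetical helper: read the legacy env var as a batch size.

    Assumption: unset, non-numeric, or non-positive values fall back to the default.
    """
    raw = os.environ.get(env_var, "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_BATCH_SIZE
    return value if value > 0 else DEFAULT_BATCH_SIZE


def capped_ids(ids: Sequence[str], limit: int) -> List[str]:
    """Capped behavior: keep only the first `limit` IDs and drop the tail."""
    return list(ids[:limit])


def iter_batches(ids: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Batched behavior: yield every ID exactly once, `batch_size` at a time."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


# Stand-in for the 8,945 changed resources from the May 23 run.
changed_ids = [f"resource-{n}" for n in range(8945)]

refreshed_with_cap = capped_ids(changed_ids, 5000)
batches = list(iter_batches(changed_ids, 5000))

print(len(changed_ids) - len(refreshed_with_cap))  # 3945 IDs never refreshed under the cap
print([len(batch) for batch in batches])           # [5000, 3945]
```

Under these assumptions, a hard cap leaves 3,945 IDs untouched, while batching covers all 8,945 in two passes. The same loop would split a 300K-item delta into 60 batches of 5,000.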
Implementation Notes
The fix keeps the operational guardrail that large deltas should be chunked. The 5000 value now controls batch size, so a 300K-item delta is processed as multiple batches rather than one giant request or a truncated first slice.
Validation
BTAA_SKIP_TEST_DB=true python -m pytest backend/tests/services/test_bridge_search_index.py backend/tests/services/test_bridge_cache_refresh.py backend/tests/services/test_bridge_sync_report.py
ruff format --check backend/app/services/bridge_sync/search_index.py backend/app/services/bridge_sync/cache_refresh.py backend/app/services/bridge_sync/report.py backend/tests/services/test_bridge_search_index.py backend/tests/services/test_bridge_cache_refresh.py backend/tests/services/test_bridge_sync_report.py
ruff check backend/app/services/bridge_sync/search_index.py backend/app/services/bridge_sync/cache_refresh.py backend/app/services/bridge_sync/report.py backend/tests/services/test_bridge_search_index.py backend/tests/services/test_bridge_cache_refresh.py backend/tests/services/test_bridge_sync_report.py
git diff --check
Refs #145