Summary
Add a background gap-fill sync mode that detects and fills missing blocks without requiring a full reindex. Currently, if blocks are missed (due to crashes, RPC timeouts, or partial failures), the only recovery path is manual intervention or reindexing from scratch.
Motivation
Atlas already tracks failed_blocks, but there's no automated mechanism to:
- Detect gaps in the blocks table (missing block numbers between indexed ranges)
- Retry failed blocks after the initial failure window
- Fill gaps that occur from crashes or restarts during indexing
Design
Gap Detection
Use SQL window functions to find missing block ranges:
    SELECT number + 1 AS gap_start,
           next_number - 1 AS gap_end
    FROM (
        SELECT number, LEAD(number) OVER (ORDER BY number) AS next_number
        FROM blocks
    ) gaps
    WHERE next_number - number > 1
    ORDER BY number DESC -- newest gaps first
    LIMIT 100;
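To make the query's semantics concrete, here is a minimal in-memory sketch of the same pairwise comparison in Rust. The `Gap` type and `find_gaps` function are hypothetical names used only for illustration; they are not part of Atlas, and the real detection is expected to run in SQL as shown above.

```rust
/// Inclusive range of missing block numbers (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Gap {
    pub start: u64,
    pub end: u64,
}

/// Finds holes between consecutive entries of `indexed`.
///
/// Assumption: `indexed` is sorted ascending and has no duplicates,
/// like `SELECT number FROM blocks ORDER BY number`.
/// Returns at most `limit` gaps, newest (highest) first, which mirrors
/// `ORDER BY number DESC LIMIT n` in the query.
pub fn find_gaps(indexed: &[u64], limit: usize) -> Vec<Gap> {
    indexed
        .windows(2) // each pair plays the role of (number, LEAD(number))
        .rev() // walk from the newest pair downwards
        .filter(|pair| pair[1] - pair[0] > 1)
        .map(|pair| Gap {
            start: pair[0] + 1,
            end: pair[1] - 1,
        })
        .take(limit)
        .collect()
}
```

For indexed blocks `[1, 2, 5, 6, 10]` this yields the gaps 7..=9 and then 3..=4. Like the SQL, it only sees holes between indexed blocks, not missing blocks below the lowest or above the highest one.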
Fast path: Before running the expensive window function, do a quick check:
SELECT (MAX(number) - MIN(number) + 1) - COUNT(*) AS missing_count FROM blocks;
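If the result is 0, skip the full scan. On an empty table the expression evaluates to NULL rather than 0, which should also be treated as "nothing to fill".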
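The fast-path decision could be wrapped roughly as follows. This is a sketch only: `missing_count` and `needs_full_scan` are hypothetical names, and the three inputs are assumed to come from a single `SELECT MIN(number), MAX(number), COUNT(*) FROM blocks`.

```rust
/// Mirrors `(MAX(number) - MIN(number) + 1) - COUNT(*)`.
///
/// Returns `None` when the table is empty (MIN/MAX are NULL in SQL) or
/// when the inputs are inconsistent (for example, count larger than the span).
pub fn missing_count(min: Option<u64>, max: Option<u64>, count: u64) -> Option<u64> {
    let (min, max) = (min?, max?);
    // Number of block heights between min and max, inclusive.
    let span = max.checked_sub(min)?.checked_add(1)?;
    span.checked_sub(count)
}

/// True only when at least one block between MIN and MAX is missing.
pub fn needs_full_scan(min: Option<u64>, max: Option<u64>, count: u64) -> bool {
    matches!(missing_count(min, max, count), Some(n) if n > 0)
}
```

Returning `Option` keeps the empty-table case explicit instead of folding it into 0.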
Fill Strategy
- Run as a separate background tokio task alongside the main indexer
- Fill newest gaps first (most likely to be needed by users)
- Configurable concurrency (e.g., 4 workers, separate from main indexer workers)
- Adaptive throttling: Pause gap-fill when the main realtime sync falls behind (e.g., lag > 10 blocks), resume when caught up
- Reuse existing fetch and write infrastructure
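The adaptive throttling rule can be sketched as a small state transition. All names here are hypothetical. The separate resume threshold (hysteresis) is an assumption added to avoid flapping around the pause limit; the proposal itself only says "resume when caught up".

```rust
/// Whether the gap-fill task is currently allowed to fetch blocks.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GapFillState {
    Running,
    Paused,
}

/// How far the realtime sync is behind the chain head, in blocks.
pub fn realtime_lag(chain_head: u64, last_indexed: u64) -> u64 {
    chain_head.saturating_sub(last_indexed)
}

/// Pauses when lag exceeds `pause_above` (e.g. 10) and resumes only once
/// lag has dropped to `resume_at_or_below` (an assumed second threshold).
pub fn next_state(
    current: GapFillState,
    lag: u64,
    pause_above: u64,
    resume_at_or_below: u64,
) -> GapFillState {
    match current {
        GapFillState::Running if lag > pause_above => GapFillState::Paused,
        GapFillState::Paused if lag <= resume_at_or_below => GapFillState::Running,
        other => other,
    }
}
```

The gap-fill loop would evaluate `next_state` before dispatching each batch to its workers, so a pause takes effect between batches rather than mid-fetch.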
Also retry failed_blocks
- Periodically scan the failed_blocks table for blocks that can be retried
- Exponential backoff based on retry_count
- Remove from failed_blocks on successful indexing
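One possible shape for the backoff calculation is shown below. The function names, the base delay, and the cap are all assumptions; the proposal only specifies that the delay grows exponentially with retry_count.

```rust
use std::time::Duration;

/// Delay before the next attempt: `base * 2^retry_count`, capped at `max`.
/// Saturating arithmetic keeps large retry counts from overflowing.
pub fn retry_delay(retry_count: u32, base: Duration, max: Duration) -> Duration {
    // 2^retry_count, clamped to u32::MAX once the shift would overflow.
    let factor = 1u32.checked_shl(retry_count).unwrap_or(u32::MAX);
    base.saturating_mul(factor).min(max)
}

/// True when enough time has passed since the last failed attempt.
pub fn is_due(
    elapsed_since_last_attempt: Duration,
    retry_count: u32,
    base: Duration,
    max: Duration,
) -> bool {
    elapsed_since_last_attempt >= retry_delay(retry_count, base, max)
}
```

With a 5-second base, a block with retry_count = 3 becomes eligible again after 40 seconds. The cap bounds the wait for blocks that keep failing.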
Considerations
- Gap-fill should use a separate rate limiter / semaphore budget to avoid starving the realtime sync
- Derived tables (ERC-20 balances, NFT ownership) need to be updated for gap-filled blocks
- The indexer_state.last_indexed_block watermark should not advance past gaps, or a separate "contiguous height" metric should be tracked
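A "contiguous height" could be derived as sketched here. This is an illustration under assumptions: `contiguous_height` is a hypothetical name, and the in-memory `BTreeSet` stands in for whatever query or incremental bookkeeping the indexer would actually use.

```rust
use std::collections::BTreeSet;

/// Highest block H such that every block in `watermark + 1 ..= H` is indexed.
/// Returns `watermark` unchanged if the very next block is missing.
pub fn contiguous_height(watermark: u64, indexed: &BTreeSet<u64>) -> u64 {
    let start = match watermark.checked_add(1) {
        Some(s) => s,
        None => return watermark, // already at u64::MAX
    };
    let mut height = watermark;
    // `range` iterates in ascending order, so stop at the first hole.
    for &n in indexed.range(start..) {
        if height.checked_add(1) == Some(n) {
            height = n;
        } else {
            break;
        }
    }
    height
}
```

With blocks `{1, 2, 3, 5, 6}` indexed and a watermark of 0, the contiguous height is 3. It moves to 6 only after block 4 has been gap-filled.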
References
- tidx implementation: sync/engine.rs (run_gapfill_loop) with LAG() gap detection, concurrent workers via JoinSet, adaptive throttling based on realtime lag