Commit 306c533
feat: replace SSTables with Parquet, add predicate partitioning and tiered cache (Phase 2)
Replace custom SSTable binary format with Apache Parquet columnar storage,
introduce vertical partitioning by predicate, and add a three-tier cache
(Caffeine heap -> local disk LRU -> S3).
Storage redesign:
- Parquet files on S3 with ZSTD compression and dictionary encoding
- Predicate-based partitioning (data/predicates/{id}/) eliminates the
  predicate column from files, tightening column statistics
- Three sort orders per partition (SOC, OSC, CSO) so that lookups led by
  subject, object, or context each have a matching sorted file
- Single MemTable in SPOC order, partitioned on flush
- JSON catalog with per-file column statistics for catalog-level pruning
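
The flush step described above (one MemTable, split by predicate, each partition written in three sort orders) can be sketched roughly as follows. This is only an illustration: `PredicatePartitioner`, `Quad`, and the use of `long` IDs are hypothetical names and assumptions, not code from this commit, and the Parquet writing itself is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; class, record, and method names are assumptions.
final class PredicatePartitioner {

    /** A quad of (assumed) dictionary-encoded IDs: subject, predicate, object, context. */
    record Quad(long s, long p, long o, long c) {}

    // The three per-partition sort orders named in the commit message.
    static final Comparator<Quad> SOC = Comparator.comparingLong(Quad::s)
            .thenComparingLong(Quad::o).thenComparingLong(Quad::c);
    static final Comparator<Quad> OSC = Comparator.comparingLong(Quad::o)
            .thenComparingLong(Quad::s).thenComparingLong(Quad::c);
    static final Comparator<Quad> CSO = Comparator.comparingLong(Quad::c)
            .thenComparingLong(Quad::s).thenComparingLong(Quad::o);

    /** Groups quads by predicate ID, so the predicate lives in the path rather than in the file. */
    static Map<Long, List<Quad>> partitionByPredicate(List<Quad> memTable) {
        Map<Long, List<Quad>> partitions = new TreeMap<>();
        for (Quad q : memTable) {
            partitions.computeIfAbsent(q.p(), k -> new ArrayList<>()).add(q);
        }
        return partitions;
    }

    /** Returns a sorted copy of one partition in the requested order. */
    static List<Quad> sorted(List<Quad> partition, Comparator<Quad> order) {
        List<Quad> copy = new ArrayList<>(partition);
        copy.sort(order);
        return copy;
    }

    /** Key prefix following the data/predicates/{id}/ layout from the commit message. */
    static String prefixFor(long predicateId) {
        return "data/predicates/" + predicateId + "/";
    }
}
```

Each partition would be passed to `sorted` once per order and written under `prefixFor(id)`. Because every row in such a file shares one predicate, the min/max statistics on the remaining columns cover a narrower range, which is what makes catalog-level pruning more selective.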
Cache system:
- L1: Caffeine heap cache (configurable, default 256 MB)
- L2: Local disk LRU cache (configurable, default 10 GB)
- L3: S3 source of truth
- Write-through on flush populates the cache, so newly flushed files
  avoid cold reads
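
A minimal sketch of the read path and write-through behaviour listed above. It swaps in plain `LinkedHashMap` LRU maps for both Caffeine and the disk cache, uses an in-memory map as a stand-in for S3, and bounds tiers by entry count rather than bytes; all names here are hypothetical and none of this is the commit's actual cache code.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; in-memory stand-ins replace Caffeine, the disk cache, and S3.
final class TieredReadSketch {

    /** Small access-ordered LRU map bounded by entry count (an assumption; real tiers are byte-bounded). */
    static final class Lru<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        Lru(int capacity) {
            super(16, 0.75f, true); // true = access order, so reads refresh recency
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    private final Map<String, byte[]> heap;                   // stands in for L1
    private final Map<String, byte[]> disk;                   // stands in for L2
    private final Map<String, byte[]> s3 = new HashMap<>();   // stands in for L3
    int s3Reads = 0;                                          // counts lookups that fell through to L3

    TieredReadSketch(int heapEntries, int diskEntries) {
        this.heap = new Lru<>(heapEntries);
        this.disk = new Lru<>(diskEntries);
    }

    /** Reads through L1, then L2, then L3, promoting the value into the faster tiers on the way back. */
    byte[] get(String key) {
        byte[] value = heap.get(key);
        if (value != null) {
            return value;
        }
        value = disk.get(key);
        if (value == null) {
            s3Reads++;
            value = s3.get(key);
            if (value == null) {
                return null; // not present in the source of truth
            }
            disk.put(key, value);
        }
        heap.put(key, value);
        return value;
    }

    /** Write-through on flush: store in L3 and populate both cache tiers. */
    void flush(String key, byte[] data) {
        s3.put(key, data);
        disk.put(key, data);
        heap.put(key, data);
    }
}
```

With write-through, a file that was just flushed is served from the heap or disk tier on its first read instead of requiring a round trip to S3.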
Compaction:
- L0->L1 merge when epoch count >= 8 per predicate
- L1->L2 merge when epoch count >= 4 per predicate
- Tombstone suppression at highest level
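
The thresholds above could be expressed as a small policy function. This sketch assumes "epoch count" means the number of flushed epochs present at a level for one predicate, that L2 is the highest level, and that an L1->L2 merge takes priority when both thresholds are met; the class and method names are hypothetical.

```java
// Hypothetical sketch of the compaction triggers; names and check order are assumptions.
final class CompactionPolicySketch {

    static final int L0_EPOCH_THRESHOLD = 8; // L0->L1 merge when epoch count >= 8
    static final int L1_EPOCH_THRESHOLD = 4; // L1->L2 merge when epoch count >= 4
    static final int HIGHEST_LEVEL = 2;      // assumed: L2 is the top level

    enum Action { NONE, MERGE_L0_TO_L1, MERGE_L1_TO_L2 }

    /** Decides the next compaction for a single predicate partition. */
    static Action next(int l0Epochs, int l1Epochs) {
        if (l1Epochs >= L1_EPOCH_THRESHOLD) {
            return Action.MERGE_L1_TO_L2;
        }
        if (l0Epochs >= L0_EPOCH_THRESHOLD) {
            return Action.MERGE_L0_TO_L1;
        }
        return Action.NONE;
    }

    /** Tombstones are only dropped when the merge output lands in the highest level. */
    static boolean suppressTombstones(int targetLevel) {
        return targetLevel == HIGHEST_LEVEL;
    }
}
```

Tombstones have to survive merges into lower levels because an older file at a higher level may still hold the deleted row; once the output is the highest level, nothing older remains for the tombstone to shadow.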
Hadoop dependency elimination:
- Zero Hadoop JARs in dependency tree
- PlainParquetConfiguration + custom SimpleCodecFactory bypass all
  Hadoop runtime paths
- 14 minimal stub classes in org.apache.hadoop.* satisfy parquet-hadoop
  JVM class loading requirements
Deleted: SSTable, SSTableWriter, Manifest (replaced by Parquet + Catalog)
All 529 tests pass.
1 parent e5379d5 commit 306c533
39 files changed
Lines changed: 3036 additions & 1178 deletions
File tree
- core/sail/s3
- src
- main/java/org
- apache/hadoop
- conf
- fs
- mapreduce
- lib/input
- mapred
- eclipse/rdf4j/sail/s3
- cache
- storage
- test/java/org/eclipse/rdf4j/sail/s3/storage
Lines changed: 15 additions & 0 deletions
Lines changed: 8 additions & 0 deletions
Lines changed: 8 additions & 0 deletions
Lines changed: 8 additions & 0 deletions
Lines changed: 10 additions & 0 deletions
Lines changed: 8 additions & 0 deletions
Lines changed: 8 additions & 0 deletions
Lines changed: 8 additions & 0 deletions