Fixes, performance improvements and new benchmarks by apirogov · Pull Request #33 · apirogov/tilezz

apirogov · 2026-04-22T16:05:43Z

No description provided.

PERF-1: is_zero/is_one now compare coeffs directly (no Vec alloc) PERF-2: zz_add/zz_sub/zz_neg return Self via in-place mutation (no Vec) PERF-3: scale() works in-place via zz_coeffs_mut (no Vec via zz_mul_scalar) PERF-4: re()/im() build result via zz_coeffs_mut (no Vec collect) PERF-5: conj() mutates a copy in place (no Vec collect) PERF-6: From conversion impls use zz_coeffs_mut (no Vec collect) PERF-7: GaussInt::pow uses binary exponentiation (O(log n) muls) PERF-8: zz_inv rationalization loop gets debug_assert for i64 overflow

New stringmatch module with linear-time suffix array construction using the SA-IS algorithm (Nong, Zhang, Chan 2011). Works on u32 integer sequences with a sentinel (0) as the last element. Includes 18 tests covering edge cases, known examples, and randomized comparison against naive suffix sort.

LCP: O(n) construction using Kasai's algorithm with rank array. RMQ: O(n log n) sparse table for O(1) range minimum queries. 19 new tests across both modules, all verified against naive implementations.

MatchIndex: generalized suffix array over i8 integer sequences. Concatenates strings with unique sentinels, builds SA/LCP/RMQ, tracks document/position arrays for per-string query support. Maximal matches: for any pair (i,j), enumerates all maximal common substrings via SA scan with running LCP minimum. Right-maximality is automatic (LCP = exact match length). Left-maximality checked per candidate. 21 new tests including brute-force comparison for random inputs and edge cases (empty strings, self-matches, negative values).

Domain layer for matching cyclic angle sequences along reverse complementary subsequences: - Doubles each string for cyclic wraparound matching - Indexes both double(s) and double(revcomp(s)) in MatchIndex - Maps positions back to original cyclic coordinates - Re-verifies cyclic left-maximality (linear left-max is wrong at the doubled string seam and for full-wrap matches) - Keeps only longest match per cyclic position pair 14 tests including spectre-like angles, cyclic wraparound, self-matches, and randomized comparison against brute-force.

…match

Phase 7: integrates CyclicMatchIndex with Rat geometric validation to find all valid glue operations between tiles. Two-phase approach: CMI-seeded multi-edge matches + single-edge brute force with heuristic pruning.

…alse rejection Brute-force validator tries ALL (ia,ib) pairs with try_glue and compares against TileSet::valid_glues for hexagon pairs (36 single-edge, all valid, same result), hexamino pairs (mixed multi/single-edge), spectre pairs (multi-edge + mystic), and mixed square+triangle pairs. Found and fixed bug: junction degeneracy check rejected jl==0 (collinear), but collinear junctions are valid — edges just merge into a longer edge. Only |jl|==hturn (180-degree fold-back) is a reliable rejection heuristic.

…r single-edge validation Empirical analysis of spectre self-match (159 unique intervals, 79 accepted, 80 rejected) showed: - |jl|==hturn (fold-back) heuristic was WRONG: 4 valid glues at (0,4),(0,8), (6,4),(6,8) were incorrectly rejected. All pass try_glue. - |jl|==hturn heuristic was USELESS: 0 out of 80 rejected cases had |jl|==hturn. - No junction-angle criterion works: all 79 accepted cases have jl<0 or jr<0, so any angle-based rejection would produce false positives. Single-edge Phase 2 now only uses the covered grid (skip positions already handled by Phase 1 multi-edge matches) + try_glue for geometric validation. No untested heuristic code remains.

When two tiles share a single edge, the interior angles at each junction vertex must not sum to more than a full circle. If they do (exterior angle sum < 0), the neighbor edges physically overlap, creating a self-intersection. If they sum to exactly 0, the edges are collinear — a multi-edge extension already handled by the CMI index. Only strictly positive sums (gap at both junctions) yield true single-edge candidates. This replaces the covered-grid approach with a cleaner two-phase design: Phase 1 seeds multi-edge candidates from the CMI index; Phase 2 enumerates all position pairs and applies the heuristic to reject locally inconsistent candidates before the global try_glue check.

…dd heuristic to valid_glues_containing - Extract get_new_match, build_glue_op, sort_by_interval helpers from duplicated try_seed closures in valid_glues and valid_glues_containing - Replace 3 identical mystic sequence literals with MYSTIC_ZZ12 constant - Add is_single_edge_candidate heuristic to valid_glues_containing's single-edge phase for consistency with valid_glues

…haustive containing query test

…negative offsets, add tests, remove ROADMAP.md

…double normalization

…nst brute-force try_glue

… construction

The CMI matches for (i,j) are asymmetric with (j,i) because try_glue is directional: a.try_glue((ia,ib), b) differs from b.try_glue((ib,ia), a). Missing the reverse direction caused 18 patches to be missed at size 8. Also adds CMI diagonal seeding (k in 0..=m.len per match), heuristic isolation tests, and patch_growth timing output.

…lo bug The junction gap check generalizes is_single_edge_candidate to arbitrary match lengths: at each junction vertex, check that the exterior angle sum is positive (physical gap). This rejects ~25% of candidates before the expensive Snake self-intersection check. Also fixes a bug in Rat::get_match where ext_end used self.len() instead of other.len() for the modulo, producing wrong B-side positions when tiles had different edge counts. This changed patch counts (e.g. hexagon size 8: 2260 -> 2243) but all results are now validated correct. Adds profile_glue binary for per-phase timing analysis.

Seed analysis showed the boundary extension (pa-1, pb+1) produced 0 unique matches across all 81 pairs of size-4 hexagon patches. Removed. Also reduced CMI diagonal seeding from all interior positions to endpoints only (k=0 and k=len) — validated correct against brute force through size 8. Interior positions produced no unique results beyond what overlapping CMI matches already cover.

Instrumented duplicate detection showed: - CMI phase has 2,704 internal duplicates (overlapping maximal matches) - CMI vs single-edge: 2,040 overlaps (single-edge re-finds CMI matches) - Single-edge phase has 0 internal duplicates The seen HashSet is needed at the CMI boundary but not within single-edge. Consolidated try_add_glue into try_glue_if_new, removed dead helpers, updated doc comments.

k=len seeds were past-the-end positions of CMI matches. Since CMI matches are maximal (can't extend forward), these positions don't match, making k=len seeds redundant. Verified by brute force through size 8: k=0-only produces identical results. This eliminates all 2,704 CMI-internal duplicates. The seen set is now only needed for CMI vs single-edge overlap (2,040 duplicates).

Split candidate generation into two disjoint phases: - Multi-edge (CMI): accepts len >= 2, junction gap >= 0 (reject overlap only) - Single-edge scan: accepts len == 1 only (gap > 0 rejects overlap + collinear) By construction, the phases produce disjoint result sets, so no deduplication HashSet is needed. Single function with a flag controls strictness. Removed heuristic flag and no-heuristic test helper.

…d_rats_for_pairs Binary decomposition was fundamentally incomplete: it misses patches that cannot be formed by power-of-2 splits (e.g. T tetromino at size 4). At size 8 square, binary found 365 vs naive's 693 (47% missed). Also adds valid_rats_for_pairs using BTreeSet instead of BTreeMap<Vec<GlueOp>> to avoid OOM on large tile pair counts.

…owth Binary decomposition missed patches that couldn't be formed by power-of-2 splits (e.g. T tetromino at size 4). Naive (k-1)+1 is universally complete. Results for square ZZ4 (was vs now): size 4: 6 vs 7 (missed T-tetromino) size 8: 365 vs 693 (binary missed 47%) size 10: 4252 vs 8808 (binary missed 52%)

Replaces layer-by-layer TileSet rebuild with a DFS that grows patches one tile at a time using monotone birth-counter ordering on boundary edges, eliminating the need to rebuild the CMI suffix array at each layer. Deduplicates via canonical Rat at each size level. Benchmark (hexagon, size 9): 8x speedup over the old TileSet approach (38s vs 307s), with identical results verified at all sizes. New files: - src/intgeom/growing.rs: GrowingPatch type, site enumeration, DFS - src/bin/patch_bench.rs: benchmark binary comparing both approaches

Replace brute-force O(n*m) matching with TileSet::valid_glues(0,1) using the existing CMI suffix array infrastructure. Keep GrowingPatch angles canonical via lex_min_rot, enabling direct position mapping between TileSet and GrowingPatch coordinate systems. 2x speedup over brute-force Redelmeier (19s vs 38s at hexagon size 9), 16x faster than the original layer-by-layer approach (307s).

Add --ring flag to patch_bench (zz4/zz12). Add ZZ4 square test verifying Redelmeier matches old TileSet approach up to size 6. Generic print_results eliminates ZZ12-only type constraint.

Add make_free() to mod out reflections (min(rat, rat.reflected())). Add test verifying counts match OEIS A000104 (free polyominoes without holes) up to size 9. Add --free flag to patch_bench. Validated: 1, 1, 2, 5, 12, 35, 107, 363, 1248 free polyominoes.

…idation Fixes the position tracing bug (y[0] was double-counted with a_xy junction fixup) and adds 15 unit tests covering geometry, matching, glue application, OEIS validation (through size 12), and comparison against general Redelmeier.

…ization - Replace std::HashSet with rustc-hash::FxHashSet for faster hashing (~10% improvement) - Add from_canonical_angles_unchecked() to Rat to skip redundant lex_min_rot normalization - This eliminates double-normalization in the hot path: try_apply already normalizes angles, but to_rat() was re-normalizing via prepare_seq() - Add pprof integration with flamegraph support for profiling - Total improvement: 25.7% faster (12.66s -> 9.41s at max-size 13)

The Rat is now cached when Patch is created (in from_seed and try_apply). to_rat() returns a clone of the cached Rat instead of recomputing. This adds slight complexity but ensures correctness and is marginally faster.

- Replace BTreeSet with FxHashSet for storing patches by size - This gives O(1) lookup/insert instead of O(log n) - Add Hash implementation for Rat to support FxHashSet - Improve size 14 from 36.87s to 31.03s (~16% faster) - Total improvement from original: ~38% faster (12.66s -> 7.84s at size 13)

- In try_apply, compute min(rat, reflected(rat)) before storing - This cuts enumerate calls in half (each free polyomino has 2 one-sided forms) - Size 14: 31.03s -> 27.75s (~10% faster) - Now storing free polyominoes directly, eliminating need for make_free at end - Update test to correctly validate OEIS A000105 free polyomino values

- Suppress dead_code for make_free and enumerate_onesideed (kept for API/compatibility) - Prefix unused variables with underscore - Add allow(dead_code) attribute to pub(crate) enumerate_sites

- Remove ring selection, now only supports ZZ12 - Default to hexagon tiles, size 7 - Cleaner code overall

- Remove old TileSet approach - Just run Redelmeier algorithm for benchmarks - Default to hexagons, size 9

- Add cached Rat to GrowingPatch (avoid recomputing on each to_rat call) - Use from_canonical_angles_unchecked to skip redundant lex_min_rot - Both changes significantly speed up general Redelmeier algorithm - patch_bench hex size 9: 18.8s (was 18.89s before these changes)

Switch result storage from BTreeSet to FxHashSet for O(1) amortized inserts. Add grow_redelmeier_free() that stores min(r, r.reflected()) during enumeration, halving the search space. Both changes port the key optimizations from the specialized polyomino benchmark.

The TileSet builds a full CyclicMatchIndex (SA-IS suffix array, LCP, sparse RMQ table) for every enumerate_sites call on just 2 tiles. Replace with direct O(n*m) angle comparison matching, identical to the optimized polyomino benchmark's compute_match. Snake validation is now done per-site in the caller instead of per-unique-shape in TileSet.

Track positions and visited set in GrowingPatch for O(1) overlap checks. Skip Snake validation for ZZ4 (vertex overlap is sufficient). Falls back to Snake check for other rings where segment crossing is possible.

Replace Ratio<i64>-based cyclotomic positions with scaled integer tuples (Pos2/Pos4/Pos8) for ZZ4, ZZ10, ZZ12. Eliminates cross-multiply + GCD overhead from every position addition, hash, and comparison in the hot path.

…ZZ12 Mirror existing intersect/is_ccw/is_between/wedge geometry exactly using scaled integer SR type instead of cyclotomic Ratio arithmetic. Only check new segments against kept parent segments. Skip Snake::try_from for ZZ12.

The pentagonal rings Q(sqrt(5), sqrt(10-2*sqrt(5))) have degree-4 real subfields with non-integer basis constants, so the existing signum_sum_sqrt_expr_4 (designed for integer roots like ZZ24's {1,2,3,6}) cannot apply. The new algorithm recursively reduces from Q(sqrt(5), sqrt(y)) to Q(sqrt(5)) by grouping as P + Q*sqrt(y) and computing sign(P^2 - Q^2*y) via the existing signum_sum_sqrt_expr_2.

Gates the direct cyclotomic intersection implementation behind a feature flag, keeping the integer Pos4 fast path as default. Also fixes trait resolution ambiguity by explicitly qualifying < as SymNum>::turn().

- Restore code comments in Snake::add_unsafe that were accidentally removed - Remove unnecessary u64 casts in pentagonal signum comparison - Remove needless borrows on seq references - Collapse nested if in snake validation check - Group GrowState struct to reduce grow_recursive_inner args (8→7) - Group ProfileStats struct to reduce polyomino grow_recursive args (9→7)

Gate pos4 intersection code with #[cfg(not(feature = "cyclotomic_intersect"))], fix cfg-gated test imports, replace wildcard enum import, use i64::from for lossless casts, fix needless_range_loop in tests, and fix polyomino test loop.

AI Bot added 30 commits April 22, 2026 18:06

optimize rat_enum performance

6a458cb

micro optimizations for unit grid implementation

189758f

update AGENTS.md

46eb761

Add LCP array (Kasai's) and sparse table RMQ

585ce6c

LCP: O(n) construction using Kasai's algorithm with rank array. RMQ: O(n log n) sparse table for O(1) range minimum queries. 19 new tests across both modules, all verified against naive implementations.

Update roadmap: phases 1-5 complete

de5d790

Add end-to-end tests, public API exports, and doc comments for string…

9440569

…match

Add TileSet with CMI-based glue discovery and single-edge heuristic

5ffdcee

Phase 7: integrates CyclicMatchIndex with Rat geometric validation to find all valid glue operations between tiles. Two-phase approach: CMI-seeded multi-edge matches + single-edge brute force with heuristic pruning.

Clean up unused test imports in tileset module

e2c0d46

Fix ignored doc test: enable CyclicMatchIndex example to compile and run

53b5079

Simplify valid_glues_containing to filter valid_glues results, add ex…

abbd589

…haustive containing query test

Quality audit: fix match_length full-match bug, fix cycle/slice_from …

916ecd9

…negative offsets, add tests, remove ROADMAP.md

Eliminate redundant get_match call: add try_glue_precomputed to skip …

010ea60

…double normalization

Add patch_growth binary: enumerate tile patches via binary decomposition

f44328b

Add --validate mode to patch_growth: cross-check TileSet results agai…

4386506

…nst brute-force try_glue

Fix degenerate glue panic: reject ±hturn junction angles before Snake…

343fe6d

… construction

Remove dead code: cfg(test) on heuristic helper, rename debug_ test

a0b873d

AI Bot added 27 commits April 22, 2026 18:06

Add ZZ4 fast path for Snake intersection checking via visited set

7b3de5a

Add --ring flag to patch_growth for ZZ4/ZZ12 ring selection

5da9777

Add ZZ4 ring support and square correctness test

90bcc08

Add --ring flag to patch_bench (zz4/zz12). Add ZZ4 square test verifying Redelmeier matches old TileSet approach up to size 6. Generic print_results eliminates ZZ12-only type constraint.

Cache Rat in Patch to avoid recreating it on each to_rat() call

877b6ca

The Rat is now cached when Patch is created (in from_seed and try_apply). to_rat() returns a clone of the cached Rat instead of recomputing. This adds slight complexity but ensures correctness and is marginally faster.

Add flamegraph.svg and perf.data to .gitignore

d674e12

Clean up clippy warnings

4a451cd

- Suppress dead_code for make_free and enumerate_onesideed (kept for API/compatibility) - Prefix unused variables with underscore - Add allow(dead_code) attribute to pub(crate) enumerate_sites

Simplify patch_bench to ZZ12 only

a49316c

- Remove ring selection, now only supports ZZ12 - Default to hexagon tiles, size 7 - Cleaner code overall

Remove unused import in growing.rs

335982c

Simplify patch_bench to Redelmeier only

7a2ea73

- Remove old TileSet approach - Just run Redelmeier algorithm for benchmarks - Default to hexagons, size 9

Add incremental vertex overlap detection to Redelmeier growth

aa1970c

Track positions and visited set in GrowingPatch for O(1) overlap checks. Skip Snake validation for ZZ4 (vertex overlap is sufficient). Falls back to Snake check for other rings where segment crossing is possible.

Use scaled integer positions instead of Ratio in GrowingPatch

7e98c6c

Replace Ratio<i64>-based cyclotomic positions with scaled integer tuples (Pos2/Pos4/Pos8) for ZZ4, ZZ10, ZZ12. Eliminates cross-multiply + GCD overhead from every position addition, hash, and comparison in the hot path.

Add cyclotomic_intersect feature flag for ZZ12 segment crossing checks

1852771

Gates the direct cyclotomic intersection implementation behind a feature flag, keeping the integer Pos4 fast path as default. Also fixes trait resolution ambiguity by explicitly qualifying < as SymNum>::turn().

apirogov force-pushed the agent-dev branch from 31e24f0 to 3412387 Compare April 22, 2026 16:07

apirogov merged commit 6eaa10a into main Apr 22, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes, performance improvements and new benchmarks#33

Fixes, performance improvements and new benchmarks#33
apirogov merged 63 commits intomainfrom
agent-dev

apirogov commented Apr 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

apirogov commented Apr 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant