Skip to content

fix(inline): terminate fixpoint when no inline survives verification — fixes #147 i64 hang (v1.1.3)#149

Merged
avrabe merged 2 commits into
mainfrom
fix/147-inline-livelock
May 30, 2026
Merged

fix(inline): terminate fixpoint when no inline survives verification — fixes #147 i64 hang (v1.1.3)#149
avrabe merged 2 commits into
mainfrom
fix/147-inline-livelock

Conversation

@avrabe
Copy link
Copy Markdown
Contributor

@avrabe avrabe commented May 30, 2026

Summary

Closes #147: on v1.1.2, loom optimize --passes inline hung indefinitely on a small i64 module (gale-ffi z_impl_k_sem_give u64-unpack + caller) — 0 panics, no completion. This is the last loom-side gate for gale's verified cross-language-LTO route.

Root cause — a livelock, not a slow solve

inline_functions loops "until no inline candidate remains." A candidate whose inline the Z3 verifier reverts (an i64 function whose inline-equivalence can't be proven) stays a candidate every iteration — its call site is restored on revert — so the pass live-locks: inline → verify reverts → inline → … forever. Verify itself returns in microseconds (the run spews counterexample / reverting rapidly, and the pass completes instantly under LOOM_Z3_MAX_INSTRUCTIONS=0). It surfaced only because v1.1.2's #145 fix removed the SortDiffers panic that used to abort the pass.

This corrects the earlier diagnosis — the 56-min / 4h CI matrices were this livelock spinning, not slow Z3.

Fix

  • inline_functions terminates when an iteration keeps no inline (made no net progress), tracked via a per-iteration kept_any flag (body changed AND not reverted). 64-iteration hard cap as a backstop. Legitimate inlining is unchanged (a kept inline → loop continues exactly as before).
  • Reverts the verify() watchdog-thread experiment from the investigation (the hang was never in verify); the Z3 solve-time bound stays.
  • Re-enables 4 of the 5 previously-#[ignore]'d i64 inline tests (they now run in ms). The 5th asserts the i64 helper is inlined, which the verifier soundly reverts as unprovable — real verified i64 inlining stays a Add a Z3 per-query timeout to the translation validator (bounds i64 verification; re-enable ignored i64 inline tests) #147 follow-up.

Verified locally

loom optimize merged.i64.wasm --passes inline --stats   # v1.1.3
✅ Optimization complete!   (exit 0; output passes wasm-tools validate;
                             the unprovable i64 fn reverted once, then terminates)

Inline test suite: 4 i64 tests pass in 0.02s; no regression on non-i64 inline tests.

Falsification

Wrong if loom optimize --passes inline still fails to terminate on an i64 module, or if the kept_any guard breaks a legitimate multi-pass inline (a kept inline must still let the loop continue). The gale reporter offered to re-test on landing.

Closes #147

avrabe added 2 commits May 30, 2026 16:18
…loses #147)

gale repro (#147): `loom optimize --passes inline` hung indefinitely
on a ~5 KB i64 module (z_impl_k_sem_give u64-unpack + caller) on
v1.1.2 — 0 panics, no completion.

Root cause: NOT a Z3 hang. The inline pass loops "until no inline
candidate remains", but a candidate whose inline the Z3 verifier
reverts (an i64 function whose inline-equivalence can't be proven)
stays a candidate every iteration — its call site is restored on
revert — so the pass livelocks: inline → verify reverts → inline →
… forever. Verify itself returns fast (confirmed: the run spews
"counterexample / reverting" rapidly, not a single slow solve, and
the pass completes instantly under LOOM_Z3_MAX_INSTRUCTIONS=0).

It only surfaced now because the #145 fix removed the SortDiffers
panic that previously aborted the pass; with the panic gone the
underlying livelock was exposed.

Fix (loom-core/src/lib.rs, inline_functions):
- Track whether each iteration KEEPS any inline (func body changed
  AND not reverted by verify_or_revert). An iteration that reverts
  everything made no progress -> break instead of re-spinning.
- Hard cap of 64 iterations as a backstop.
Legitimate inlining is unaffected (a kept inline sets kept_any ->
the loop continues exactly as before).

Also reverts the verify() watchdog-thread experiment from the
investigation (the hang was never in verify); the Z3 solve-time
bound in z3_config_with_timeout stays.

Verified locally: the gale repro now completes (gtimeout exit 0,
"Optimization complete", output passes wasm-tools validate) — the
unprovable i64 function is reverted once, then the pass terminates.

Closes #147
Trace: REQ-6
Bumps version 1.1.2 -> 1.1.3 and re-enables the four no-panic i64
inline regression tests that the #147 livelock fix unblocks (they now
run in ms; the one asserting i64 IS inlined stays #[ignore] — the
verifier soundly reverts the unprovable i64 inline, a #147 follow-up).

The earlier "i64 Z3 verification is slow" diagnosis was wrong: the
56-min / 4h CI matrices were the inline-pass livelock spinning, not
slow solves. With the livelock fixed the i64 inline tests complete in
milliseconds and the Test matrix should run normally again.

Closes #147.

Trace: REQ-6
@avrabe avrabe merged commit 550d3c0 into main May 30, 2026
15 of 22 checks passed
@avrabe avrabe deleted the fix/147-inline-livelock branch May 30, 2026 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a Z3 per-query timeout to the translation validator (bounds i64 verification; re-enable ignored i64 inline tests)

1 participant