Skip to content

feat(vm): plumb per-call source lines through the compiled dispatcher#355

Open
davydog187 wants to merge 7 commits into
mainfrom
feat/dispatcher-line-attribution
Open

feat(vm): plumb per-call source lines through the compiled dispatcher#355
davydog187 wants to merge 7 commits into
mainfrom
feat/dispatcher-line-attribution

Conversation

@davydog187

Copy link
Copy Markdown
Contributor

Summary

The compiled (dispatcher) engine now attributes source lines to native-call raise sites. Previously the encoder stripped :source_line opcodes for perf and dispatcher_call_function/5 published {nil, source} at native-call boundaries to avoid mis-attribution — so pairs(\"asdf\") errors and error(\"boom\") calls in compiled chunks omitted the line in their §6.1 prefix.

The fix bakes the line directly into the three call opcode tuples (:call_one/:call_zero/:call_multi) at encode time. The dispatcher's hot path is unchanged (the same single tuple-read per call); dispatcher_call_function/6 extracts the line and publishes it via current_position/0 so stdlib raise sites attribute correctly.

This is the deferred "B5d-v2" work referenced by both lib/lua/compiler/bytecode.ex:129 and the relaxed :compiled assertions in pcall_error_value_test.exs.

After

chunk = ~LUA\"\"\"
function foo()
  for i in pairs(\"asdf\") do
    print(i)
  end
end
\"\"\"c
{_, lua} = Lua.eval!(chunk)

Lua.call_function!(lua, [:foo], [])
# ** (Lua.RuntimeException) Lua runtime error: at -no-source-:2:
#
#   bad argument #1 to 'pairs' (table expected, got string)

(Line 2 is now populated where it was previously omitted.)

Design

  • Storage shape: line is the final slot of each call opcode tuple, not a parallel line_map on the prototype. Avoids the nested-body problem (calls inside if/for bodies sit in their own PC-space) and keeps tuple-read cost constant.
  • Encoder change (bytecode.ex): encode_list/3 tracks current_line while walking instructions. :source_line updates the line but stays stripped. annotate_line/2 appends the line to call opcodes only — other opcodes pass through unchanged.
  • Dispatcher (dispatcher.ex): each call-opcode pattern gains a line slot, passes it to Executor.dispatcher_call_function/6.
  • Executor (executor.ex): publishes {line, source} at the native-call boundary; uses the same line for nil-call / non-function-call type errors that previously hardcoded line: 0.

Out of scope

Line attribution for non-call dispatcher raise sites (binops, indexing, concat) is still deferred. Same gap, much wider surface — follow-up PR if needed.

Benchmarks (manual, Apple M1 Pro)

Bench main branch Δ
fib(30) chunk 0.63 ips 0.73 ips within noise
closures chunk 1.37 K ips 1.40 K ips within noise

Hot path unchanged: same tuple-read per call.

Test plan

  • Flipped :compiled branch suppression in pcall_error_value_test.exs and pcall_state_preservation_test.exs — both engines now match ^test\.lua:\d+: boom$.
  • Added regression in call_function_error_value_test.exs asserting call_function!/3 rich render contains regression.lua:2: for both outer and nested call positions.
  • Added bytecode-shape test pinning the line slot on call opcodes (outer and nested bodies).
  • Updated README doctest for pcall(function() error(\"nope\") end) — the §6.1 prefix now flows through the compiled engine.
  • mix test — full suite green (2263 passed, 19 skipped, +4 new tests).
  • mix docs --warnings-as-errors clean.
  • mix format clean.
  • No benchmark regression vs main on fib(30) or closures.

… boundary

`Lua.VM.ProtectedCall.error_value/1` had no clause for
`Lua.VM.ArgumentError`, so it fell through to `Exception.message/1` —
which calls `ErrorFormatter.format/3` and embeds ANSI escape codes plus
the `at <source>:<line>:` header in the returned reason. PR #337 plugged
this for the `:value`-bearing exceptions (`RuntimeError`, `TypeError`,
`AssertionError`) but `ArgumentError` builds its message from individual
fields and has no `:value` slot, so it was missed.

Affected both `Lua.call_function/3` and `pcall`/`xpcall`: any stdlib
bad-argument raise (e.g. `pairs("asdf")`) returned the rendered string
instead of the bare §6.1 error value.

Promote `ArgumentError.build_base/5` to a public `raw_message/1` and add
a clause to `error_value/1` that returns it. Regression tests added for
both protected-call boundaries.

Fixes #342.
`Lua.VM.ProtectedCall` is `@moduledoc false`, so the autolink from
`ArgumentError.raw_message/1`'s public doc tripped `mix docs
--warnings-as-errors` on CI.
`Lua.call_function!/3` raises `Lua.RuntimeException` whose `:original`
field carries the underlying VM exception verbatim. That's the escape
hatch for Elixir callers who want to pattern-match on the structured
error (e.g. `%Lua.VM.ArgumentError{function_name: "pairs", arg_num: 1,
expected: "table", got: "string"}`) instead of parsing the terse §6.1
string that `call_function/3` returns.

The contract was already implemented (RuntimeException.exception/1 line
66 stores `original: error`) but wasn't pinned by a test, so a future
refactor that wrapped or stringified the underlying exception would have
silently regressed it.
The dispatcher's call opcodes now carry the source line of their call
site, baked in at encode time. `dispatcher_call_function/6` publishes
that line via `current_position/0` so raise sites inside stdlib (e.g.
`error()`'s §6.1 prefix, `pairs("asdf")`'s bad-argument check) attribute
to the right line instead of suppressing the prefix.

Previously the encoder stripped `:source_line` opcodes to avoid one
no-op dispatch per source line (~228k extra cycles for fib(25)) and the
dispatcher published `{nil, source}` at native-call boundaries to avoid
mis-attribution. This was the deferred "B5d-v2" work referenced by both
the bytecode-encoder comment and a relaxed `:compiled` assertion in
`pcall_error_value_test.exs`.

The fix:

- Encoder tracks `current_line` while walking instructions; `:source_line`
  opcodes update it but stay stripped. `annotate_line/2` appends the
  line as a final slot to `:call_one`/`:call_zero`/`:call_multi` opcode
  tuples only. Other opcodes pass through unchanged.
- Dispatcher's three call-opcode patterns gain a `line` slot and pass it
  to `Executor.dispatcher_call_function/6`.
- Executor publishes `{line, source}` at the native-call boundary and
  uses the same line for the nil-call / non-function-call type errors.
- Nested bodies (inside `:test`, loops, generic_for) work because the
  encoder seeds a fresh `current_line` per nested encode and the nested
  bodies carry their own `:source_line` opcodes, so a call inside an
  `if`/`for` body still gets its own line baked in.

Tests:

- Flipped the `:compiled` branches in `pcall_error_value_test.exs` and
  `pcall_state_preservation_test.exs` that previously asserted
  suppression — both engines now match `^test\.lua:\d+: boom$`.
- Added regressions in `call_function_error_value_test.exs` and
  `bytecode_test.exs` pinning that calls in both outer and nested
  positions carry the right line.
- Updated the `pcall(function() error("nope") end)` doctest in README
  to reflect the §6.1 prefix that now flows through the compiled engine.

Benchmarks (manual, Apple M1 Pro):
- fib(30) main → 0.63 ips · branch → 0.73 ips (no regression)
- closures main → 1.37 K ips · branch → 1.40 K ips (no regression)

Hot path is unchanged: per-instruction lookup is gone, and call opcodes
gain one tuple slot read at the same dispatch step they already did.

Follow-up: line attribution for non-call dispatcher raise sites (binops,
indexing, concat) is still deferred — file as "B5d-v3" if needed.
@davydog187

Copy link
Copy Markdown
Contributor Author

🤖 Ultracode review — round 1

Major

  • [MAJOR] lib/lua/vm/executor.ex:443-444 — Native iterator raising error() during generic_for leaks line 0 on the compiled path (parity break)

    • Reproduced against the PR branch: for x in error, "deep", nil do end inside a pcall yields compiled test.lua:0: deep but interpreted test.lua:2: deep. dispatcher_call_value/4 hard-codes call_value(callable, args, proto, state, 0) (executor.ex:444), so the native-iterator clause publishes line 0 via set_position. The PR annotated the three :call opcodes but left @op_generic_for (bytecode.ex:353, shape {tag, base, var_regs, body}) carrying no line, and the bridge at dispatcher.ex:902/1283 has nothing but 0 to pass. The interpreter's generic_for threads a real line (executor.ex:644, :929). This is a literal :0: leaking into a rendered/pcall position prefix AND a compiled-vs-interpreted divergence in the error value — the exact LINE-ACCURACY/PARITY failure this PR exists to prevent. Fires for any native iterator that calls error() with a string mid-iteration.
    • Suggestion: Bake the source line into @op_generic_for (mirror annotate_line/2 for the call opcodes), thread it through dispatcher.ex:902 and :1283 into dispatcher_call_value, and have the native-iterator clause set_position(line, source) like dispatcher_call_function does. Add a regression test asserting compiled pcall value == interpreted for a native iterator raising error() during stepping (no :0:).
  • [MAJOR] test/lua/vm/pcall_error_value_test.exs:188 — Dual-engine error()-prefix tests use a loose :\d+: regex and never pin the line, leaving the PR's core guarantee untested

    • The PR's whole purpose is line ACCURACY on the compiled path, but both dual-engine prefix tests assert only err =~ ~r/^test\.lua:\d+: boom$/. I confirmed the real lines by running the chunks: pcall_error_value raises on line 2, pcall_state_preservation on line 4 (three statements precede error(), a stronger off-by-one detector). \d+ matches :1:, :3:, :99: under BOTH engines, so a compiled off-by-one or stale-line regression passes green — false safety for exactly the fixed failure mode. There is also no test asserting compiled line == interpreted line for the same in-VM pcall chunk; the only exact-line pins (regression.lua:2) run a single engine via Lua.eval!.
    • Suggestion: Pin exact lines: assert err == "test.lua:2: boom" (pcall_error_value_test.exs:188) and assert err == "test.lua:4: boom" (pcall_state_preservation_test.exs:70). Add one explicit compiled-vs-interpreted equality test running the same chunk through both engines and asserting byte-identical error strings.

Minor

  • [MINOR] test/lua/vm/pcall_error_value_test.exs:22-25 — Stale moduledoc claims the dispatcher suppresses the line prefix, contradicting the PR's new assertions

    • The moduledoc still reads "under the dispatcher the per-call line is not yet plumbed, so the prefix is suppressed rather than attributed to a stale line." This PR plumbs the line; the test body now asserts the prefix IS present under both engines (the :compiled -> assert err == "boom" branch was deleted in this very diff, line 188). The moduledoc now directly contradicts the assertions below it.
    • Suggestion: Update the moduledoc to state both engines now emit the source:line: prefix via the per-call line baked into the call opcode (mirror the inline comment at line 184-187).
  • [MINOR] lib/lua/vm/dispatcher.ex:77-79 — Dispatcher comment cites encode_list/2 and a B5d-v2 deferral that this PR removes

    • The comment reads "the bytecode encoder strips :source_line entries in encode_list/2. Line tracking for compiled prototypes is B5d-v2." Both halves are now false: the encoder is encode_list/3 (the /2 -> /3 rename IS in this diff) and tracks current_line to bake lines into call opcodes, and line tracking for compiled prototypes is exactly what this PR delivers. Per the repo rule against referencing plan ids in source, the dangling B5d-v2 deferral is doubly worth dropping now that it's also factually wrong.
    • Suggestion: Update to note :source_line is stripped in encode_list/3 with the rolling line baked into call opcodes; drop the "B5d-v2" deferral claim.

Nit

  • [NIT] test/lua/compiler/bytecode_test.exs:327 — Bytecode test hardcodes opcode 51 instead of the existing Bytecode.op_generic_for() accessor
    • New test code uses :erlang.element(1, op) == 51 to locate the generic_for opcode while the same test uses Bytecode.op_call_zero()/Bytecode.op_call_multi() accessors elsewhere. Bytecode.op_generic_for/0 exists (bytecode.ex:450). The magic number is inconsistent with adjacent style and would silently mislocate if opcodes are renumbered.
    • Suggestion: Replace == 51 with == Bytecode.op_generic_for().

The compiled :generic_for path hard-coded line 0 when invoking the
iterator, so a native iterator that raised mid-step (e.g.
`for x in error, "deep", nil do end`) leaked `test.lua:0:` while the
interpreter attributed `test.lua:2:` — a line-accuracy/parity break.

- Bake the for-statement line into the :generic_for opcode via
  annotate_line/2, mirroring the call opcodes.
- Thread it through the dispatcher's :generic_for dispatch and the
  :cps_generic_for resume marker into dispatcher_call_value/5, which
  passes it to call_value/5 so the native-iterator boundary publishes
  the right position.
- Pin exact lines in the dual-engine error()-prefix tests
  (test.lua:2 / test.lua:4) instead of a loose :\d+: regex, and add
  compiled-vs-interpreted byte-equality tests covering both error()
  and a native-iterator raise.
- Refresh the stale pcall_error_value moduledoc and the dispatcher
  encode_list/3 comment to match the plumbed-line reality.
@davydog187

Copy link
Copy Markdown
Contributor Author

🤖 Ultracode review — round 2

[MAJOR] lib/lua/compiler/bytecode.ex:337,344 — Compiled while/repeat condition body leaks :0: through pcall — error-VALUE parity break vs interpreter

The encoder strips :source_line and bakes a rolling current_line into call opcodes, but every nested-body encode_list call hardcodes the seed to 0. The while/repeat CONDITION body (cond_body) is pure expression code carrying NO :source_line opcode, so a native call that raises inside the condition bakes line 0. Reproduced under both engines via the PR's own run/strip_bytecode harness:

  • while error('wc') do end → compiled "test.lua:0: wc" vs interpreted "test.lua:2: wc"
  • repeat ... until error('rc') → compiled "test.lua:0: rc" vs interpreted "test.lua:3: rc"

The while_loop cond_body [get_upvalue, get_field, move, call] contains no :source_line opcode, while the outer {:source_line, 2} is in scope at the :while_loop opcode but discarded by the hardcoded 0 seed. Unlike the binop/index render-only issue, this is a genuine compiled-vs-interpreted parity break in a pcall-CAUGHT error VALUE (the §6.1 prefix is part of error()'s string value) — exactly the :0:-leak / pcall-parity failure the round-2 risk brief flags. The existing 95-test suite never exercises a native call inside a loop condition, so it is uncaught.

Suggestion: Thread the in-scope current_line into nested-body encoding instead of hardcoding 0. Give the body-encoding path access to current_line and seed encode_list(cond_body, [], current_line) for both :while_loop (bytecode.ex:337) and :repeat_loop (:344), so condition bodies inherit the line as the interpreter does (executor.ex:865/882). Seeding :test/:numeric_for/:generic_for bodies the same way is more robust (they're masked today only by their per-statement :source_line). Add a regression test asserting both engines agree for while error('x') do end and repeat until error('x').

[MINOR] lib/lua/vm/executor.ex:247 — Compiled binop/index/concat/compare raise sites render :0: in rich render and to_map (diverges from interpreter)

The dispatcher's non-call raise sites pass line: 0 to the safe_* helpers (executor.ex:247-310 arith/compare, 337-343 index, 463 concat). Because 0 is truthy, TypeError.exception/1's line = Keyword.get(opts,:line) || auto_line (type_error.ex:33) keeps 0 rather than falling back to current_position(). For function foo() local a=nil; return a+1 end, Lua.call_function! on the compiled path renders at t.lua:0: and error.original.line == 0.

NOTE: the pcall error VALUE is byte-identical across engines for binops (TypeError's §6.1 value carries no source:line: prefix) — both return "attempt to perform arithmetic on a nil value (local 'a')". So the divergence is confined to the rich render and to_map, it is PRE-EXISTING (these sites always passed 0) and explicitly scoped-out by the PR's own comments — not a regression. Flagged at minor because a literal :0: is user-visible and the PR title invites the expectation it's covered.

Suggestion: If deferring the full line-baking, pass nil instead of 0 from these dispatcher bridge sites (a one-token change per call site) so TypeError.exception/1 falls back to current_position() and the bogus :0: is replaced by no-prefix or the right line — matching how the pcall value path already drops the prefix. Otherwise extend the line-baking to the binop/index/concat/compare opcodes so the compiled rich render matches the interpreter.

[NIT] test/lua/compiler/bytecode_test.exs:614 — Magic opcode number 51 in new bytecode test instead of the public accessor

The new test uses Enum.find(fn op -> :erlang.element(1, op) == 51 end) to locate :generic_for, while the surrounding code in the same test correctly uses Bytecode.op_call_zero()/op_call_multi(). A public Bytecode.op_generic_for() accessor exists (bytecode.ex:454) and @op_generic_for is 51. The assert generic_for_op != nil does make a renumber fail loudly, so it's cosmetic — but it's an inconsistency introduced in this diff's own test code.

Suggestion: Replace == 51 with == Bytecode.op_generic_for() for consistency and renumber-safety.

The bytecode encoder stripped `:source_line` opcodes into a rolling
`current_line` baked into call opcodes, but every nested-body
`encode_list` call hardcoded the seed to 0. A while/repeat condition
body carries no `:source_line` opcode of its own, so a native call
raising inside the condition (e.g. `while error('x') do end`) baked
`:0:` into the §6.1 error VALUE, diverging from the interpreter, which
threads the enclosing line through the body. Route body-carrying opcodes
(`:test`, `:numeric_for`, `:while_loop`, `:repeat_loop`, `:generic_for`)
through `encode/2` so their nested bodies inherit the in-scope
`current_line`.

Also pass `nil` (not the truthy literal `0`) from the dispatcher's
non-call raise bridges (binop / compare / unop / index / set / concat /
self) so `TypeError`/`RuntimeError.exception/1` fall back to
`current_position/0` instead of rendering a bogus `:0:` in the rich
render and `to_map`. The pcall error value was already byte-identical
across engines for these; this aligns the rich render too. Per-opcode
line baking for these sites remains deferred to B5d-v2.

Adds while/repeat condition-body regressions asserting both engines
agree with no `:0:` leak.
@davydog187

Copy link
Copy Markdown
Contributor Author

🤖 Ultracode review — round 3

  • [MINOR] test/lua/call_function_error_value_test.exs:187 — No test pins compiled-vs-interpreted parity for the ArgumentError rendered line, nor outer-line restore after a nested native catch

    • Two PR-specific guarantees on the brief's highest-risk surfaces (PARITY and STATE) are unpinned. (1) The new ArgumentError line tests (call_function_error_value_test.exs:187-216) run only the compiled path — Lua.call_function/Lua.eval! with no engine flag and no bytecode stripping — so they assert error.original.line == 2 and regression.lua:2: for compiled only. The cross-engine block in pcall_error_value_test.exs:295-360 covers error()/native-iterator/while/repeat but contains NO ArgumentError case (e.g. pairs("asdf")), so an interpreter-vs-dispatcher divergence on the bad-argument render line would pass CI. No test anywhere compares the two engines for an ArgumentError line. (2) The position process-dict save/restore this PR adds (executor.ex set_position/prev_pos around the native_func call) is new STATE surface, but the existing 'nested pcall keeps mutations at every level' test (pcall_state_preservation_test.exs:311) asserts only heap effects [1,2,3,false] — never line numbers. No test asserts that after an inner native error is caught by an inner pcall, a subsequent OUTER native error reports the OUTER line. Both guarantees currently hold by inspection but nothing locks them.
    • Suggestion: Add (1) an ArgumentError case (e.g. pairs("asdf")) to the pcall_error_value_test.exs cross-engine block asserting the rendered/source:line: value is byte-identical under :compiled and :interpreted; and (2) a nested-pcall test asserting that after an inner native error() is caught, a subsequent outer native error() reports the OUTER line (exact test.lua:<outer>: ...), proving the position dict was restored rather than leaked.
  • [NIT] lib/lua/vm/executor.ex:238, 358, 435 — Reworked comments re-introduce plan id B5d-v2 (CLAUDE.md: no plan ids in source)

    • CLAUDE.md forbids plan ids in comments/@doc/@moduledoc. This PR rewrote three comment blocks (the diff replaces the old 0-line rationale with the new nil-line rationale) and the rewritten lines still carry B5d-v2. Verified via gh pr diff 355 | grep '^+.*B5d-v2': exactly three ADDED lines contain it — that is B5d-v2's subject (executor.ex:238), compiled prototypes is B5d-v2 (358), and baking for these opcodes is B5d-v2 (435). Because the PR edited these lines, the convention applies to them now. (The dispatcher.ex line-tracking comment was correctly updated to drop its plan id; this is the inconsistency.) Lowered to nit because the lines previously also bore a B5d-v2 reference, so this is a missed cleanup rather than a fresh violation.
    • Suggestion: Drop the B5d-v2 tokens from the three rewritten comments in executor.ex. Describe the deferred work by what it is, e.g. "per-instruction line baking for binop / index / concat raise sites is not yet implemented", mirroring the dispatcher.ex comment that this PR correctly de-plan-ified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant