Commit 42fe6f5
Add ActionDecoder: JEPA embedding space → structured tool calls
Implements the Action Decoder module that bridges VL-JEPA latent reasoning
back to concrete ATN tool calls (tool name + JSON arguments).
Architecture (LLaVA-style MLP adapter + small autoregressive decoder):
- MLPAdapter: 2-layer GELU MLP projects JEPA embeddings into decoder space
- SmallDecoder: autoregressive Transformer with causal self-attention and
  cross-attention to the JEPA context at every layer
- JSONConstrainer: byte-level token mask state machine enforcing valid JSON
  structure during generation (XGrammar-compatible integration path documented)
- ActionDecoder: composes all three with SimpleTokenizer and exposes a clean
  forward() (teacher forcing) / decode() (autoregressive) API
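To make the byte-level masking idea concrete, here is a minimal sketch of such a state machine. It is not the JSONConstrainer from this commit: the class name JsonByteMask, the helpers mask_logits and accepts, and the state names are all hypothetical, and it covers only a JSON subset (objects with string keys whose values are strings or nested objects, ASCII only, no whitespace, escapes, numbers, or arrays).

```python
# Printable ASCII bytes allowed inside a string (no '"' and no backslash).
STRING_BYTES = frozenset(b for b in range(0x20, 0x7F) if b not in (0x22, 0x5C))


class JsonByteMask:
    """Tracks a JSON-subset prefix and reports which bytes may come next."""

    def __init__(self):
        self.state = "START"
        self.depth = 0  # number of currently open objects

    def allowed(self):
        """Return the set of byte values that keep the prefix valid."""
        s = self.state
        if s == "START":
            return frozenset(b"{")
        if s == "KEY_OR_END":
            return frozenset(b'"}')
        if s == "KEY":
            return frozenset(b'"')
        if s in ("IN_KEY", "IN_STR"):
            return STRING_BYTES | frozenset(b'"')
        if s == "COLON":
            return frozenset(b":")
        if s == "VALUE":
            return frozenset(b'"{')
        if s == "AFTER_VALUE":
            return frozenset(b",}")
        return frozenset()  # DONE: the top-level object is closed

    def advance(self, byte):
        """Consume one byte and move to the next state."""
        if byte not in self.allowed():
            raise ValueError(f"byte {byte!r} not allowed in state {self.state}")
        s = self.state
        if s == "START" or (s == "VALUE" and byte == ord("{")):
            self.depth += 1
            self.state = "KEY_OR_END"
        elif s in ("KEY_OR_END", "AFTER_VALUE") and byte == ord("}"):
            self.depth -= 1
            self.state = "DONE" if self.depth == 0 else "AFTER_VALUE"
        elif s in ("KEY_OR_END", "KEY"):  # opening quote of a key
            self.state = "IN_KEY"
        elif s == "IN_KEY":
            if byte == ord('"'):
                self.state = "COLON"
        elif s == "COLON":
            self.state = "VALUE"
        elif s == "VALUE":  # opening quote of a string value
            self.state = "IN_STR"
        elif s == "IN_STR":
            if byte == ord('"'):
                self.state = "AFTER_VALUE"
        elif s == "AFTER_VALUE":  # a comma: another key must follow
            self.state = "KEY"


def mask_logits(logits, allowed):
    """Set the logit of every disallowed byte to -inf (256-entry byte vocab)."""
    return [x if i in allowed else float("-inf") for i, x in enumerate(logits)]


def accepts(text):
    """True if text is a complete object in the supported subset."""
    mask = JsonByteMask()
    try:
        for byte in text.encode("utf-8"):
            mask.advance(byte)
    except ValueError:
        return False
    return mask.state == "DONE"
```

During decoding, a loop would call mask_logits(logits, mask.allowed()) before picking the next byte and then mask.advance(byte) afterwards, so an invalid continuation can never be sampled.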
Includes a PoC config (~5M params, CPU-runnable) and a production config
(~500M params, single-GPU target). Output format matches the ATN trace
format {"tool": "<name>", "args": {...}} used by TraceEncoder/ActionEncoder.
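As an illustration of the output format, the sketch below parses and checks a decoded string of the form {"tool": "<name>", "args": {...}}. The function name parse_tool_call and the choice to tolerate extra top-level keys are assumptions made for this example; they are not taken from the commit.

```python
import json
from typing import Any, Dict, Tuple


def parse_tool_call(text: str) -> Tuple[str, Dict[str, Any]]:
    """Parse a decoded tool call and return (tool name, arguments).

    Raises ValueError if the text is not valid JSON or does not have
    a string "tool" field and an object "args" field.
    """
    obj = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(obj, dict):
        raise ValueError("tool call must be a JSON object")
    tool = obj.get("tool")
    args = obj.get("args")
    if not isinstance(tool, str):
        raise ValueError('"tool" must be a string')
    if not isinstance(args, dict):
        raise ValueError('"args" must be a JSON object')
    return tool, args


# Example with a made-up tool name and argument.
name, arguments = parse_tool_call('{"tool": "search", "args": {"q": "x"}}')
```

A check like this could run on decode() output before a call is dispatched, catching strings that are syntactically valid JSON but do not match the trace schema.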
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 0c2a1d4 commit 42fe6f5
1 file changed
Lines changed: 902 additions & 0 deletions