pi0.5: raise prompt token cap 128/180 -> 200 to match pi0.5 base training #478

Open: ElmoPA wants to merge 1 commit into main from elmo/pi05-token-cap-200
Conversation

ElmoPA commented May 31, 2026

Contributor

The PI prompt tokenizer was capping text at tokenizer_max_length=128 (a
legacy default carried over from the old build_tokenized_collate) while
the model was instantiated with max_token_len=180 -- both below pi0.5's
actual training length.

pi0.5 base was trained/released with max_token_len=200 (openpi Pi0Config
pi05 default; gs://openpi-assets/checkpoints/pi05_base). Raising both
knobs to 200 matches the pretrained checkpoint and is in-distribution
(Gemma uses RoPE, so there is no positional-table limit). This roughly
doubles the usable annotation budget once the proprio + control-mode
blocks are spliced into the prompt, and removes the truncation footgun
where an over-long prompt (truncation=True, right side) silently clipped
the trailing State / "Action:" anchor.

  • max_token_len: 180 -> 200
  • tokenizer_max_length: 128 -> 200
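
The truncation failure described above can be illustrated with a toy example. This is only a sketch under stated assumptions, not the repository's tokenizer: `toy_tokenize` and the prompt layout are hypothetical, whitespace splitting stands in for the real tokenizer, and the caps are scaled down from 128/200 for readability.

```python
def toy_tokenize(text, max_length):
    """Hypothetical stand-in tokenizer: split on whitespace, truncate on the right."""
    tokens = text.split()
    # Right-side truncation keeps the head of the prompt and drops the tail.
    return tokens[:max_length]


# Hypothetical prompt layout: annotation first, then the state block,
# then the trailing "Action:" anchor.
annotation = " ".join(f"word{i}" for i in range(10))
prompt = f"Task: {annotation} State: 12 34 56 Action:"

short = toy_tokenize(prompt, max_length=8)   # cap below the prompt length
full = toy_tokenize(prompt, max_length=32)   # cap above the prompt length

anchor_kept_short = "Action:" in short
anchor_kept_full = "Action:" in full
print(anchor_kept_short, anchor_kept_full)
```

With the smaller cap no error is raised; the anchor is simply absent from the token list, which is why the failure is easy to miss.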

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

ElmoPA commented May 31, 2026

Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

ElmoPA force-pushed the elmo/pi05-token-cap-200 branch from 1d1b74c to 5bbcc39 on May 31, 2026 19:15
@github-actions

Claude Code Review

Summary

Bumps pi0.5 prompt token cap from 128/180 → 200 to align with the pi0.5 base checkpoint's pretraining max_token_len=200, eliminating silent right-truncation of the State/"Action:" anchor.

Key concerns

  • Cached tokenization: If any precomputed prompt tokens or attention masks are cached on disk (e.g., in zarr) at the old shape (128 or 180), they will mismatch at load time. Worth confirming that tokenization is done online in the collate and not baked into a dataset artifact.
  • Memory/throughput regression: Prompt self-attention in Gemma is quadratic in sequence length, so going from 180 → 200 is ~23% more attention compute on the prefix ((200/180)² ≈ 1.23). This should be fine, but it is worth a quick sanity check on the current training config's per-GPU memory headroom (especially with action_horizon=100 + proprio block).
  • Consistency check: The two knobs (max_token_len on the model, tokenizer_max_length on the prompt builder) now agree at 200 — good. Make sure there isn't a third hardcoded 180/128 elsewhere (e.g., in egomimic/algo/ PI module, eval/inference configs, or a pi0.5_*.yaml variant that inherits/overrides this).
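
The ~23% figure in the memory/throughput concern can be reproduced with a short calculation. This is a back-of-the-envelope sketch: it assumes the cost of interest is the prefix self-attention score term, which scales with the square of the sequence length, and it ignores the terms that scale linearly. The function name is illustrative.

```python
def quadratic_ratio(old_len, new_len):
    """Ratio of self-attention score compute, assuming cost ~ sequence_length**2."""
    return (new_len / old_len) ** 2


ratio = quadratic_ratio(180, 200)
increase_pct = (ratio - 1.0) * 100.0

# Linear-cost terms (e.g. per-token MLP work) grow only with new_len / old_len.
linear_increase_pct = (200 / 180 - 1.0) * 100.0

print(f"quadratic term: +{increase_pct:.1f}%")
print(f"linear terms:   +{linear_increase_pct:.1f}%")
```

Because only part of the forward pass is quadratic in the prefix length, the end-to-end slowdown should fall somewhere between the two numbers; measuring a few training steps is the reliable check.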

Suggestions

  • grep -rnwE "180|128" egomimic/algo/ egomimic/hydra_configs/model/pi0* egomimic/scripts/ for stragglers (-w limits matches to whole numbers such as 128, not 1280), particularly in inference-time configs; a mismatch at eval would silently re-truncate.
  • Consider adding an assertion in the PI algo __init__ that tokenizer_max_length == model.max_token_len to prevent this class of drift from recurring.
  • A one-line note in the PR / config comment citing the openpi Pi0Config pi05 default would be valuable for the next person who wonders why 200.
  • If feasible, log the post-tokenization prompt length distribution for one training run to confirm 200 is actually sufficient (i.e., truncation rate ≈ 0) for current annotation + proprio + control_mode payloads.
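
Two of the suggestions above (the equality assertion and the truncation-rate check) could look roughly like the following. This is a sketch under assumptions: `check_token_caps` and `truncation_rate` are hypothetical helpers, not existing functions in the repository, and the prompt lengths are assumed to be measured before truncation is applied (otherwise every length would be at most the cap).

```python
def check_token_caps(tokenizer_max_length, model_max_token_len):
    """Fail fast if the prompt builder and the model disagree on the token cap."""
    if tokenizer_max_length != model_max_token_len:
        raise ValueError(
            f"tokenizer_max_length ({tokenizer_max_length}) != "
            f"model max_token_len ({model_max_token_len})"
        )


def truncation_rate(untruncated_lengths, cap):
    """Fraction of prompts whose untruncated token count exceeds the cap."""
    if not untruncated_lengths:
        return 0.0
    clipped = sum(1 for n in untruncated_lengths if n > cap)
    return clipped / len(untruncated_lengths)


# Made-up lengths for illustration only; real values would come from a training run.
example_lengths = [100, 150, 210, 190]

check_token_caps(200, 200)  # passes silently when the two knobs agree
rate_at_200 = truncation_rate(example_lengths, cap=200)
rate_at_128 = truncation_rate(example_lengths, cap=128)
print(rate_at_200, rate_at_128)
```

Calling `check_token_caps` once at construction time would surface drift between the two knobs as an immediate error instead of silent truncation.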

Verdict

Approve — small, well-justified config alignment with the upstream checkpoint. Just verify no cached tokenizations and no stray 180/128 constants before merging.


Reviewed by Claude · Review workflow
