Skip to content

Rollout annotation prompts#403

Open
rl2aloha wants to merge 2 commits into
elmo/pi-rollout-localfrom
elmo/rollout-cfg
Open

Rollout annotation prompts#403
rl2aloha wants to merge 2 commits into
elmo/pi-rollout-localfrom
elmo/rollout-cfg

Conversation

@rl2aloha

@rl2aloha rl2aloha commented May 8, 2026

Copy link
Copy Markdown
Contributor

Per-task / per-object prompt files used by rollout.py --annotation-path
(or the a <path> intervention command). Each file holds a single
natural-language instruction tokenized into the Task: ... block of
the pi prompt.

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

rl2aloha commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

rl2aloha and others added 2 commits May 29, 2026 18:25
Per-task / per-object prompt files used by `rollout.py --annotation-path`
(or the `a <path>` intervention command). Each file holds a single
natural-language instruction tokenized into the `Task: ...` block of
the pi prompt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PI's process_batch_for_training now owns prompt sampling, embodiment/
control-mode/State splicing, and tokenization (see PI._build_prompts /
_tokenize_prompts). The collate's only job is to stack tensors and
preserve list-valued keys.

- Use annotation_collate instead of build_tokenized_collate.
- Replace _build_collate_from_checkpoint_cfg with
  _apply_annotation_to_algo: override annotation_key="annotations",
  sampling_mode="first", default_prompt=<rollout annotation> on the
  loaded PI algo so rollout-time prompts deterministically flow through.
- Drop _debug_print_proprio_norm and the post-collate sampled_prompt
  debug print -- both inspected fields the collate no longer produces.
- Stop loading the hydra run snapshot in _load_policy; the only
  consumer was the removed collate builder.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown

Claude Code Review

Summary

Adds per-task/object prompt text files under egomimic/robot/annotations/ and refactors rollout.py to delegate prompt assembly/tokenization to PI.process_batch_for_training rather than rebuilding a tokenized collate from the checkpoint's hydra config.

Key concerns

  1. Silent fallback when model lacks expected attributes. _apply_annotation_to_algo does getattr(self.policy, "model", None) and silently returns if missing, and each attribute is set only if hasattr(...). If the algo's attribute names ever drift (e.g. prompt_default vs default_prompt), the rollout will run with the trained-in default prompt and the user's --annotation-path becomes a no-op with no warning. Recommend at least a print warning when none of the three attrs were settable, or when self.annotation is non-None but default_prompt couldn't be applied.

  2. load_annotation no longer fails loudly if algo lookup fails. Previously, providing an annotation mid-rollout would build a real tokenized collate. Now if _apply_annotation_to_algo silently no-ops, the user gets a "Loaded new annotation" success message but the prompt isn't actually used. Same fix as above.

  3. All prompts hardcode "left hand". Yet PolicyRollout supports arm in {"left", "right", "both"}. If someone points --annotation-path .../toy/pick_basic.txt at a right-arm rollout, the policy gets a contradictory instruction. Not a blocker (these are research scratch files) but worth a note in a README, or at minimum naming the directory annotations/left/....

  4. Inconsistent phrasing across files ("Using the left hand," vs "With left hand," vs "With the left hand,"). If the trained distribution favored one form, the others may degrade performance. Worth confirming against training data prompts.

  5. Removed debug print of sampled_prompt. That was actually useful for sanity-checking that the right prompt made it through. Since prompt assembly now happens deep inside PI.process_batch_for_training, consider adding an equivalent debug print there (gated) or in _apply_annotation_to_algo confirming what was set.

  6. No newline at end of any file. Minor, but f.read().strip() makes it harmless. Still, easy lint fix.

Suggestions

  • Add a warning in _apply_annotation_to_algo if self.annotation is not None but no model attribute was found to write default_prompt to — this is the silent-failure path that will burn someone.
  • Print the effective prompt once after _apply_annotation_to_algo (e.g. print(f"[rollout] PI default_prompt set to: {model.default_prompt!r}")) so it shows up in rollout logs.
  • Confirm annotation_collate preserves annotations as a list-of-lists per the docstring claim — worth a one-line unit test given how brittle the prompt-plumbing path is.
  • Consider annotations/{object}/left/{task}.txt layout for future right/bimanual prompts.
  • Add trailing newlines to the .txt files.

Verdict

Comment — text files are fine to merge; the rollout refactor is reasonable but the silent-failure paths around hasattr/getattr should get a warning print before this is relied on for real rollouts.


Reviewed by Claude · Review workflow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant