in-context-learning: human↔robot pairing + RICL on EgoVerse pi0.5#491
RyanPCo wants to merge 6 commits into
Claude Code Review
Review of PR #491: RICL on EgoVerse pi0.5
Summary: Adds retrieval-based in-context learning to pi0.5 via a thin …
Key concerns: 1. …
… annotations
Tooling + data for pairing human (aria) and robot (eva) pick_place episodes for
side-by-side eval.
- inspect_episode_metadata.py: read-only audit of app.episodes (scene/objects/
task_description coverage per embodiment) to choose a pairing strategy.
- pair_episodes_by_language.py: pull per-episode dense-language annotations from
R2, parse + normalize the manipulated-object set, and match aria scenes to eva
demos by object-set containment. Emits two tiers: the co-located "alignment
data set" true pairs (from the DB) and language-matched similar-task pairs.
- human_robot_pairs.json: generated pairs (both tiers).
The DB scene/objects columns are unusable for matching (objects="{None}", scene
has 2 values), so matching uses the annotations. Bulk aria/eva episodes share a
~50-object vocabulary but no identical scenes; only the alignment sets are truly
co-located.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
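The containment matching described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in pair_episodes_by_language.py: the `SYNONYMS` entries, the function names, and the direction of containment (an eva demo matches when its object set is a subset of the aria scene's) are all assumptions made for the example.

```python
# Hypothetical synonym map; the real one is described in human_robot_pairing.md.
SYNONYMS = {"mug": "cup", "coke can": "can"}


def normalize(objects):
    """Lower-case, strip, and map synonyms to one canonical object name."""
    canonical = set()
    for name in objects:
        key = name.strip().lower()
        canonical.add(SYNONYMS.get(key, key))
    return frozenset(canonical)


def match_by_containment(aria, eva):
    """Pair each aria scene with every eva demo whose objects it contains.

    `aria` and `eva` map episode ids to lists of annotated object names.
    Assumption: a match means eva_objects is a subset of aria_objects.
    """
    pairs = []
    for aria_id, aria_objs in aria.items():
        aria_set = normalize(aria_objs)
        for eva_id, eva_objs in eva.items():
            eva_set = normalize(eva_objs)
            # Skip demos with no parsed objects: the empty set is contained
            # in everything and would match every scene.
            if eva_set and eva_set <= aria_set:
                pairs.append((aria_id, eva_id))
    return sorted(pairs)


aria_scenes = {"aria_1": ["Mug", "bowl", "spoon"], "aria_2": ["plate"]}
eva_demos = {"eva_1": ["cup", "bowl"], "eva_2": ["plate", "fork"], "eva_3": []}
print(match_by_containment(aria_scenes, eva_demos))
```

Subset matching tolerates a scene that mentions more objects than a single demo manipulates, which is one plausible reading of the segment-count asymmetry mentioned in the findings doc.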
…re agents
Add egomimic/scripts/human_robot_pairing.md capturing the non-obvious data facts (R2 vs AWS-S3 access gotcha, junk scene/objects columns, the alignment co-located captures, annotation format/coverage, the eva/aria segment-count asymmetry that motivates containment matching, the object-name synonym map, and next steps) plus how to run the two scripts. Pointer added from top-level AGENTS.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
kNN-retrieved in-context demos injected into the pi0.5 prefix, reusing the existing zarr/Cartesian/trainHydra/PI stack.
No PI0Pytorch surgery: the flow prefix is image-count-agnostic and fully bidirectional, so PIRicl just appends k retrieved base_0_rgb images to the obs dict and splices discretized retrieved state/action into the prompt (same binning as the pi0.5 State block).
Cross-embodiment: bank=aria, query=eva, scoped by human_robot_pairs.json. eva (14-D) and aria (12-D, no gripper) share one 32-D action space via the converters.
Adds egomimic/ricl/ (retrieval DINOv3->cKDTree->top-k cache, conditioning, data collate, metrics + CPU tests), PIRicl algo, PIRiclEval (retrieval vs zero-context floor), RiclDataModuleWrapper, and pi0.5_ricl / cotrain_pi_ricl / eval_pi_ricl configs.
Logic CPU-verified (23 checks); finetune + DINOv3-at-scale + D0/D1 eval are GPU-cluster steps (see egomimic/ricl/README.md).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
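The retrieval step (embedding, nearest-neighbour lookup, per-query top-k cache) can be illustrated with a small standard-library sketch. It swaps the cKDTree for a brute-force L2 search so it runs without SciPy, and it uses toy 2-D vectors where the PR uses DINOv3 embeddings; `top_k` and `build_cache` are hypothetical names, not functions from egomimic/ricl/.

```python
import heapq
import math


def top_k(query, bank, k):
    """Return the indices of the k bank embeddings nearest to `query` (L2).

    Brute-force stand-in for a KD-tree query; fine for toy sizes only.
    """
    return heapq.nsmallest(
        k, range(len(bank)), key=lambda i: math.dist(query, bank[i])
    )


def build_cache(queries, bank, k):
    """Precompute the top-k bank indices for every query id."""
    return {qid: top_k(vec, bank, k) for qid, vec in queries.items()}


# Toy bank (think: aria frames) and queries (think: eva frames).
bank = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
queries = {"eva_frame_0": [0.9, 0.1], "eva_frame_1": [4.0, 4.0]}
cache = build_cache(queries, bank, k=2)
print(cache)
```

Caching the indices once per query keeps the embedding model and the neighbour search out of the training loop, which only has to load the cached bank frames.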
RICL's configs resolve against the pi0.5/pi-train setup. Bring in only the dependency-closure pieces needed to run on this cluster (PR #491 stays main-based; does not pull pi-train's pi.py/human.py code or mecka-only data configs):
- paths/default.yaml: dataset_dir -> /storage/project/r-dxu345-0/shared/egoverseS3ZarrDatasets
- model/pi0.5_base.yaml: pi05 checkpoint path + training recipe (inherited via pi0.5_bc_eva -> pi0.5_ricl)
- train_zarr_cartesian.yaml: cluster training settings (gpus, sample_frac, num_workers)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The `@override` runtime check (overrides 7.7.0) rejects the subclass method because the parent PI._build_prompts is annotated `-> list[str]` while the override had no return annotation (treated as None). This blocked PIRicl instantiation in any current env. The override already returns list[str]; just annotate it to match the parent.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
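The mismatch can be reproduced in miniature with the standard library alone. The sketch below is a simplified stand-in for the kind of signature comparison such a check performs, not the `overrides` library's actual logic; the classes are toy versions of PI and PIRicl, and `return_annotation_matches` is a hypothetical helper.

```python
import inspect


class PI:
    def _build_prompts(self, batch) -> list[str]:
        return ["task"] * len(batch)


class PIRiclBefore(PI):
    # Override without a return annotation: the state before the fix.
    def _build_prompts(self, batch):
        return ["task with demos"] * len(batch)


class PIRiclAfter(PI):
    # Same behaviour, annotated to match the parent: the state after the fix.
    def _build_prompts(self, batch) -> list[str]:
        return ["task with demos"] * len(batch)


def return_annotation_matches(child, parent, name):
    """True when child and parent declare the same return annotation."""
    child_ann = inspect.signature(getattr(child, name)).return_annotation
    parent_ann = inspect.signature(getattr(parent, name)).return_annotation
    return child_ann == parent_ann


print(return_annotation_matches(PIRiclBefore, PI, "_build_prompts"))
print(return_annotation_matches(PIRiclAfter, PI, "_build_prompts"))
```

Both subclasses return the same values at run time; only the declared signature differs, which is why adding the annotation is sufficient.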
830d3fc to d92ed0e
First end-to-end run of `model=pi0.5_ricl data=cotrain_pi_ricl` through
trainHydra (bank=aria -> query=eva). Fixes the integration gaps that blocked it;
fast_dev_run now completes clean (train step with retrieved bank frames +
PIRiclEval). See egomimic/ricl/SMOKE_TEST_BRINGUP.md.
- ZarrBankFrameProvider: build a ZarrDataset (bank keymap + transform_list) per
episode instead of reading post-transform keys off raw zarr. observations.state.ee_pose
/ actions_cartesian / base_0_rgb are produced at load time, never stored, so the
old direct-read raised KeyError. Wire bank_keymap/bank_transform_list through
RiclDataModuleWrapper; set them (Aria cartesian_pi keymap + cartesian transform)
in cotrain_pi_ricl.yaml.
- RiclQueryDataset: delegate unknown attrs to the wrapped dataset (set_norm_stats_from).
- MultiDataset._iter_leaves: unwrap dataset wrappers (.base) so key/shape inference
reaches the real leaves (else embodiment never registers -> ac_keys[emb] KeyError).
- trainHydra: accept RiclDataModuleWrapper in the datamodule assertion.
- cotrain_pi_ricl: re-point valid_datasets with the data.-prefixed interpolation
(eva_pi/cotrain_pi_base use a root-absolute ${train_datasets...} that doesn't
resolve once nested under `data`).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
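Two of the fixes above (attribute delegation in RiclQueryDataset and wrapper unwrapping in `_iter_leaves`) follow a common wrapper pattern, sketched here under assumptions. The class bodies are illustrative rather than the PR's code; `LeafDataset`, the `frame_idx` key, and `unwrap` are hypothetical, and only the `.base` attribute and the `set_norm_stats_from` name come from the commit message.

```python
class LeafDataset:
    """Hypothetical stand-in for a real leaf dataset."""

    def __init__(self):
        self.norm_stats = None

    def __len__(self):
        return 3

    def __getitem__(self, idx):
        return {"x": idx}

    def set_norm_stats_from(self, other):
        self.norm_stats = other


class RiclQueryDataset:
    """Wrapper that tags each item and forwards unknown attributes."""

    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        item = dict(self.base[idx])
        item["frame_idx"] = idx  # illustrative extra key
        return item

    def __getattr__(self, name):
        # Only called when normal lookup fails. Guard "base" so a wrapper
        # without an initialised .base (e.g. during copy) cannot recurse.
        if name == "base":
            raise AttributeError(name)
        return getattr(self.base, name)


def unwrap(dataset):
    """Follow .base links until the real leaf dataset is reached."""
    while hasattr(dataset, "base"):
        dataset = dataset.base
    return dataset


leaf = LeafDataset()
wrapped = RiclQueryDataset(leaf)
wrapped.set_norm_stats_from("stats")  # forwarded to the leaf
print(leaf.norm_stats, unwrap(wrapped) is leaf)
```

Forwarding through `__getattr__` avoids re-declaring every method of the wrapped dataset, while `unwrap` lets key and shape inference inspect the leaf directly.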
d92ed0e to 7e11df7

This branch adds the in-context learning line of work on top of main.

1. Human↔robot episode pairing (ed870df5, f2adfcd3)

Dense-language-annotation pairing between human (aria) and robot (eva) episodes for side-by-side eval: `egomimic/scripts/human_robot_pairs.json` (Tier-1 co-located alignment sets + Tier-2 similar-task object-overlap pairs), the matcher, and a findings doc for future agents.

2. RICL: retrieval-based in-context learning on pi0.5 (5b6a442a)

kNN-retrieved in-context demonstrations injected into the pi0.5 prefix, reusing the existing zarr / Cartesian / trainHydra / PI-algo / DINOv3 stack. For each query observation we retrieve the k nearest demos and add their `(image, state, action)` to the prefix so the policy can imitate a task it never trained on. Cross-embodiment: retrieval bank = aria (human), query = eva (robot).

Key point: no `PI0Pytorch` surgery. The flow pi0.5 `embed_prefix` iterates over all images in the observation and embeds the full prompt, with the entire prefix attended bidirectionally (and EgoVerse already feeds it a variable image set). So `PIRicl` only: (1) appends the k retrieved `base_0_rgb` frames as extra entries in the obs `images` dict, and (2) splices each retrieved demo's discretized `(state, action)` into the prompt text (same binning as the pi0.5 State block). eva (14-D) and aria (12-D, no gripper) already share one 32-D action space via the converters.

Added
- `egomimic/ricl/`: `retrieval` (DINOv3 → cKDTree → per-query top-k cache), `conditioning` (the prefix surgery), `data` (collate + frame-idx wrapper + bank provider), `metrics` (+ CPU unit tests for each)
- `egomimic/algo/pi_ricl.py` (`PIRicl(PI)`, three small overrides) and `egomimic/eval/pi_ricl_eval.py` (`PIRiclEval`: retrieval vs zero-context floor)
- `RiclDataModuleWrapper` in `pl_utils/pl_data_utils.py`; configs `model/pi0.5_ricl`, `data/cotrain_pi_ricl`, `evaluator/eval_pi_ricl`
- `egomimic/ricl/README.md`: design notes + cluster runbook

Status. All logic is CPU-verified locally (23 checks: retrieval smoke, conditioning, data collate, metrics; both Hydra configs validated; ruff-clean; imports resolve in the `emimic` env). The finetune (from the openpi pi0.5 base), DINOv3 embedding at scale, and the D0 (eva→eva sanity) → D1 (aria→eva) eval are GPU-cluster steps documented in the README; they can't run on the dev box (no `openpi`/CUDA).

🤖 Generated with Claude Code
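The two conditioning steps in the description (extra image entries plus discretized state/action text) might look roughly like the sketch below. Everything specific here is an assumption for illustration: 256 uniform bins over values normalized to [-1, 1], the `ctx_{i}_rgb` image keys, the "Demo i: State: ...; Action: ...;" text layout, and the names `discretize` and `add_context`. None of it is taken from egomimic/ricl/conditioning.

```python
import math

N_BINS = 256  # assumed bin count; values are assumed normalized to [-1, 1]


def discretize(values, n_bins=N_BINS):
    """Map each value in [-1, 1] to an integer bin in 0..n_bins-1 (clipped)."""
    bins = []
    for v in values:
        idx = math.floor((v + 1.0) / 2.0 * n_bins)
        bins.append(min(n_bins - 1, max(0, idx)))
    return bins


def add_context(obs, prompt, demos):
    """Return (new_obs, new_prompt) with retrieved demos added to both.

    Each demo is a dict with "image", "state", and "action" entries.
    The input `obs` is left unmodified.
    """
    images = dict(obs["images"])
    parts = []
    for i, demo in enumerate(demos):
        images[f"ctx_{i}_rgb"] = demo["image"]  # hypothetical key name
        state = " ".join(str(b) for b in discretize(demo["state"]))
        action = " ".join(str(b) for b in discretize(demo["action"]))
        parts.append(f"Demo {i}: State: {state}; Action: {action};")
    new_obs = {**obs, "images": images}
    return new_obs, " ".join(parts + [prompt])


obs = {"images": {"base_0_rgb": "query_image"}}
demos = [{"image": "demo_image", "state": [-1.0, 0.0], "action": [1.0]}]
new_obs, text = add_context(obs, "Task: pick up the cup", demos)
print(list(new_obs["images"]), text)
```

Because the change is confined to the observation dict and the prompt string, a model that already accepts a variable number of images needs no architectural modification, which is the point the description makes about avoiding `PI0Pytorch` surgery.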