add robotwin sim for train and eval by RyanPCo · Pull Request #503 · GaTech-RL2/EgoVerse

RyanPCo · 2026-06-16T19:25:46Z

No description provided.

RyanPCo · 2026-06-16T19:26:03Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

add robotwin sim for train and eval #503 👈 (View in Graphite)
in-context-learning: human↔robot pairing + RICL on EgoVerse pi0.5 #491
local: personal env overrides — Do not merge #493 : 1 other dependent PR (#494 )
6D rotation normalization to prevent large changes from euler #490
remove rotation bounds checks for norm stats #480 : 1 other dependent PR (#481 )
optional viz videos on train dataset #476
optional flag to generate validation videos for each task #475
pi train human new #473
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

- train_robotwin_ricl: CKPT_DIR -> repo-relative pi05_base_pytorch (the old /storage/project/r-dxu345-0 default was from a different cluster and does not exist here)
- drop accidental external/robocasa + external/robosuite gitlinks (no .gitmodules entries, empty; RoboTwin is the only sim we need)
- CLAUDE.md: Environment section MacBook -> Georgia Tech SLURM cluster

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- build_robotwin_bank_index.py: consolidated DINOv2 kNN index over a RoboTwinCorpus (HDF5 bank), built with the same .embed as eval's OnlineRetriever; output format matches build_embedding_index.build_retrieval_index. Includes a CPU unit test.
- train_robotwin_ricl.stage_full: save the train corpus's quantiles.json beside the checkpoints (eval normalizes identically via robotwin_policy).
- robotwin_policy: wire PIRicl tokenizer to the vendored pg_tokenizer (was None) so forward_eval can tokenize the spliced prompt; overridable via usr_args["tokenizer"].
- train_robotwin_ricl.sbatch: this-cluster launcher (hoffman-lab a40); the script uses a Lightning Trainer directly, not submitit.
- docs: refresh ricl/CLAUDE.md + robotwin_setup.md (cluster runbook, eval artifacts, selective sim install to avoid the torch==2.4.1 downgrade).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
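The requirement that the bank index and the eval-time retriever share one embedding function can be illustrated with a small sketch. This is a stdlib-only cosine kNN, not the DINOv2 index format produced by build_robotwin_bank_index.py; the names `l2_normalize`, `build_index`, and `query_knn` are hypothetical, and the toy 2-D vectors stand in for real image embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_index(embeddings):
    """Hypothetical index: a list of unit-normalized bank embeddings."""
    return [l2_normalize(v) for v in embeddings]


def query_knn(index, query, k):
    """Return the positions of the k bank entries most similar to the query.

    The query goes through the same normalization as the bank; if the two
    sides were embedded differently, the similarity scores would be meaningless.
    """
    q = l2_normalize(query)
    scored = [(sum(a * b for a, b in zip(row, q)), i) for i, row in enumerate(index)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # highest similarity first, ties by position
    return [i for _, i in scored[:k]]


# Toy bank of three 2-D "embeddings" (illustrative values only).
bank = build_index([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
neighbors = query_knn(bank, [2.0, 0.1], k=2)
```

In this toy case the query lies closest to the first bank entry, then the third, so `neighbors` is `[0, 2]`.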

Full fp32-AdamW fine-tune of the 3.6B pi0.5 model OOMs on this cluster's largest GPU (a40 44GB / l40s 48GB; no A100/H100/H200): optimizer state alone is ~29GB.

- train_robotwin_ricl.py: add --adam8bit (bitsandbytes AdamW8bit). Optimizer state drops from ~29GB to ~7GB, so the run fits (~22GB + activations). Default off (H200 keeps AdamW).
- train_robotwin_ricl.sbatch: --adam8bit, batch 2, PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True, a40 (l40s is GRES-restricted for interactive jobs).
- CLAUDE.md: venv is uv-managed (no pip -> `uv pip install`); GPU mem ceiling + 8-bit-AdamW workaround + overcap partition for extra/idle GPUs.

Verified: job steps with bf16-mixed AMP, action_loss ~1.5 and falling, no OOM.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
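The memory figures in this commit can be reproduced with back-of-the-envelope arithmetic. The sketch below assumes two moment buffers per parameter (standard for AdamW), decimal gigabytes, and no overhead for the 8-bit optimizer's quantization constants. Reading the ~22GB figure as fp32 weights plus 8-bit optimizer state is an assumption; the commit does not spell out its breakdown.

```python
PARAMS = 3.6e9  # parameter count quoted in the commit message
GB = 1e9        # decimal gigabytes (assumption)


def adam_state_gb(n_params, bytes_per_value):
    """AdamW keeps two moment buffers (first and second moment) per parameter."""
    return 2 * n_params * bytes_per_value / GB


fp32_state_gb = adam_state_gb(PARAMS, 4)   # fp32 moments: 4 bytes each
int8_state_gb = adam_state_gb(PARAMS, 1)   # 8-bit moments: 1 byte each
fp32_weights_gb = PARAMS * 4 / GB          # assumption: weights held in fp32

print(f"fp32 AdamW state:  {fp32_state_gb:.1f} GB")
print(f"8-bit AdamW state: {int8_state_gb:.1f} GB")
print(f"weights + 8-bit state: {fp32_weights_gb + int8_state_gb:.1f} GB")
```

Under these assumptions the estimates come out to 28.8 GB, 7.2 GB, and 21.6 GB, in line with the ~29GB, ~7GB, and ~22GB quoted above. Gradients and activations come on top of that.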

Surfaced by the first real RoboTwin closed-loop runs:

- robotwin_policy._load_algo: PIRicl is not an nn.Module, so set eval mode via nets.eval(), move nets to CUDA, and set algo.device (Lightning normally does this).
- get_action: wrap forward in bf16 autocast (model is bf16; training used bf16-mixed) and pass a dummy zero action so process_batch_for_training can infer (B, horizon) at inference (sample_actions ignores its values).
- eval_robotwin_ricl.sbatch: venv-aware launcher on a known-good node; TORCH_COMPILE_DISABLE=1 (eval-only, avoids the multi-min max-autotune) + EVAL_TEST_NUM bound.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Eval ran all episodes in SAPIEN (exit 0); the 250-step checkpoint scored 1/10 on beat_block_hammer (undertrained; the pipeline is the deliverable).

- eval_robotwin_ricl.sbatch: python -u (live progress; output was block-buffered).
- robotwin_setup.md: mark both TODOs done; document the selective sim install, the external/RoboTwin fork patches that let policy eval run without curobo (stub planner, in-process planner, skip expert_check, task-name instruction, no eval video), and the bad-node/CUDA-driver + uv-pip gotchas.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

robotwin_new_cluster_setup.md: reproducible end-to-end install (repo+submodules, uv venv, tokenizer, base ckpt, data, 8-bit-AdamW training, selective sim install + RoboTwin-fork patches, closed-loop eval) plus a gotchas/troubleshooting section and a verification ladder. Cross-linked from robotwin_setup.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- setup_multitask_robotwin.py: download aloha-agilex slices for disjoint train/eval task sets and symlink N demos/task into train_root/eval_root (<task>/{data, instructions}); hard-fails if the sets overlap (eval tasks must be unseen in train).
- robotwin_policy: no_incontext mode skips the OnlineRetriever and passes no ricl_retrieved_* keys, so PIRicl runs as base pi0.5 (plain-finetune eval baseline).
- train sbatch: ${NO_INCONTEXT:+--no-incontext} toggle.

Enables the canonical RICL test: train on N tasks, eval retrieval-vs-floor on held-out NEW tasks; plain trained on the SAME data is the baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
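The hard-fail on overlapping task sets could look roughly like the following. This is a minimal sketch, not the code in setup_multitask_robotwin.py; the function name `assert_disjoint` and the placeholder task names other than beat_block_hammer are hypothetical.

```python
def assert_disjoint(train_tasks, eval_tasks):
    """Raise if any eval task also appears in the train set.

    Held-out evaluation only means something when eval tasks are unseen
    during training, so an overlap is treated as a hard error rather than
    a warning.
    """
    overlap = sorted(set(train_tasks) & set(eval_tasks))
    if overlap:
        raise ValueError(f"eval tasks must be unseen in train; overlapping: {overlap}")
    return set(train_tasks), set(eval_tasks)


# Placeholder task names; only beat_block_hammer appears in this PR.
train_set, eval_set = assert_disjoint(
    ["beat_block_hammer", "task_a"],
    ["task_b", "task_c"],
)
```

Failing before any download or symlink step keeps a contaminated split from silently producing optimistic held-out numbers.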

- robotwin_policy: parse no_incontext from a string override (eval_policy --overrides passes strings), not just a yaml bool.
- eval_multitask_compare.sbatch: for each held-out task, build its DINOv2 bank index, then closed-loop eval the RICL model (within-task retrieval ON) and the plain model (retrieval OFF). Pass per-run paths via eval_policy --overrides. Queue with --dependency=afterany on the two training jobs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
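The string-override issue comes from a common Python pitfall: `bool("False")` is `True`, because any non-empty string is truthy. A minimal sketch of a parser that accepts both a YAML bool and a command-line string follows. The function name `parse_bool_override` and the accepted spellings are assumptions, not the code in robotwin_policy.

```python
_TRUE_STRINGS = {"1", "true", "yes", "on"}
_FALSE_STRINGS = {"0", "false", "no", "off", ""}


def parse_bool_override(value):
    """Accept a real bool (e.g. from YAML) or a string (e.g. from a CLI override)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()
        if text in _TRUE_STRINGS:
            return True
        if text in _FALSE_STRINGS:
            return False
    raise ValueError(f"cannot interpret {value!r} as a boolean")


# A naive bool() cast treats the string "False" as truthy.
naive = bool("False")
parsed = parse_bool_override("False")
```

Here `naive` is `True` while `parsed` is `False`, which is the difference between retrieval staying on by accident and the plain baseline actually running without it.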

RyanPCo mentioned this pull request Jun 16, 2026

in-context-learning: human↔robot pairing + RICL on EgoVerse pi0.5 #491

Draft

RyanPCo force-pushed the ryanco/robotwin-sim branch 3 times, most recently from e0d067e to 950a5f8 Compare June 16, 2026 20:44

RyanPCo changed the title ~~add robocasa sim for train and eval~~ add robotwin sim for train and eval Jun 17, 2026

RyanPCo force-pushed the ryanco/robotwin-sim branch from 98c08e7 to 41a1e0a Compare June 18, 2026 16:16

RyanPCo and others added 9 commits June 18, 2026 12:58

add robocasa sim for train and eval

4278106

RyanPCo force-pushed the ryanco/in-context-learning branch from b502f24 to bb62e15 Compare June 18, 2026 19:58

RyanPCo force-pushed the ryanco/robotwin-sim branch from 41a1e0a to 49687e5 Compare June 18, 2026 19:58

RyanPCo wants to merge 9 commits into ryanco/in-context-learning from ryanco/robotwin-sim


1 participant
