May 2026 Update from UT internal metamon#86

Merged
jakegrigsby merged 4 commits into main from release/may-2026
May 22, 2026

Conversation

Collaborator

@jakegrigsby jakegrigsby commented May 22, 2026

A large update that moves metamon development back to the public repo.

Datasets + Showdown Version

  • Everything has been synced up to May 20th 2026. Replays, usage stats, and "revealed teams" are all current as of May 19th.
  • New self-play dataset "pac-tauros" focusing on high-GXE Gen1OU play
  • New team sets built from recent replays and improved rating-dependent usage stat predictions. All legal as of today.
  • We are back in sync with current Pokémon Showdown; despite recent breaking changes to sim mechanics, metamon can run on public Showdown with minimal drop in performance.

Models

  • Lots of small Gen1OU specialist policies (V2A*) that were part of an ablation study, released for extra opponent diversity in data generation.
  • TaurosV0 was recently rank 1 on PokéAgent Gen1OU. It is an effort to specialize on high-ladder Gen1OU, and contributed to the (still half-baked) test-time ensembling pipeline that let metamon hit rank 1 on the public showdown ladder pretty consistently in late April.
  • New "grouped" observation space and associated TstepEncoder architecture.
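The test-time ensembling mentioned for TaurosV0 is not described in this PR. Purely as an illustration, one common way to ensemble policies at test time is to average their action distributions and pick the best legal action. The sketch below is generic Python: `ensemble_action`, its arguments, and the averaging rule are hypothetical, and are neither metamon's API nor necessarily the method used here.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_action(
    per_policy_logits: Sequence[Sequence[float]],
    legal_mask: Sequence[bool],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Hypothetical helper: mix several policies' action distributions.

    Each policy's logits are turned into probabilities, the probabilities
    are averaged (optionally weighted), illegal actions are zeroed out,
    and the result is renormalized. Returns (greedy action, mixed probs).
    """
    n_policies = len(per_policy_logits)
    if weights is None:
        weights = [1.0 / n_policies] * n_policies

    n_actions = len(legal_mask)
    mixed = [0.0] * n_actions
    for weight, logits in zip(weights, per_policy_logits):
        for i, prob in enumerate(softmax(logits)):
            mixed[i] += weight * prob

    # Zero out illegal actions, then renormalize over what is left.
    masked = [p if ok else 0.0 for p, ok in zip(mixed, legal_mask)]
    total = sum(masked)
    if total <= 0.0:
        raise ValueError("no legal action has positive probability")
    probs = [p / total for p in masked]

    best = max(range(n_actions), key=probs.__getitem__)
    return best, probs
```

Averaging probabilities rather than raw logits keeps one overconfident policy from dominating the mix; other combination rules (voting, value-weighted mixing) would fit the same interface.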

Other

  • Much improved pipeline (rl/evaluate) for launching local ladders for self-play, running head-to-head evals, and generating learning curves.
  • Model-based team prediction pipeline received a lot of internal work this spring but remains an experimental feature.
  • Many other minor improvements
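When reading head-to-head results like those the evaluation pipeline above produces, it helps to put an uncertainty range on the win rate, since a few hundred battles can still be noisy. The following is a minimal sketch, assuming results are available as plain win and game counts with draws excluded; `win_rate_interval` is a hypothetical helper and not part of rl/evaluate. It uses the standard Wilson score interval.

```python
import math
from typing import Tuple


def win_rate_interval(wins: int, games: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (win rate, lower bound, upper bound) using the Wilson score interval.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if games <= 0 or not 0 <= wins <= games:
        raise ValueError("need games > 0 and 0 <= wins <= games")

    p = wins / games
    z2 = z * z
    denom = 1.0 + z2 / games
    center = (p + z2 / (2.0 * games)) / denom
    half = z * math.sqrt(p * (1.0 - p) / games + z2 / (4.0 * games * games)) / denom
    return p, center - half, center + half
```

For example, 60 wins in 100 games gives an interval of roughly 0.50 to 0.69, so that sample alone only weakly supports calling one policy stronger than the other.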

@jakegrigsby jakegrigsby merged commit 1f04973 into main May 22, 2026
1 check passed