Commit 9f717a2
Mark Saroufim
Add MI355X details (#108)
* Add AMD February 2026 competition: mxfp4-mm, moe-mxfp4, mixed-mla
3 problems targeting MI355X from AMD-AIM/reference-kernels@20260209.
Also fixes eval.py regex to support underscored keys and booleans.
* Document known moe-mxfp4 non-determinism issue on MI355X
aiter's fused_moe kernel produces different results across calls with
identical inputs on gfx950, causing the reference submission to fail
correctness checks against itself.
* Update AMD problems with latest changes from AMD-AIM
- mixed-mla: Add tp (tensor parallel) parameter, variable num_heads,
qseqlen=4 prefill cases, updated test/benchmark shapes
- moe-mxfp4: Updated benchmark shapes with TP=4/TP=8 variants,
different batch sizes
- mxfp4-mm: Added m=32 benchmark, adjusted shape set
* Increase AMD problem timeouts to 30 minutes
aiter JIT compilation on first run can take 10+ minutes on MI355X,
causing test timeouts. Bump all timeouts to 1800s (30 min).
1 parent facc675 commit 9f717a2
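The moe-mxfp4 note above describes a reference submission failing correctness checks against itself because repeated calls on identical inputs disagree. A minimal sketch of how such a self-consistency check might look, assuming plain Python lists stand in for tensors. `outputs_match`, `is_deterministic`, `stable_kernel`, and `FlakyKernel` are hypothetical names for illustration, the tolerances are placeholders, and nothing here calls aiter or the actual eval.py.

```python
def outputs_match(a, b, rtol=1e-2, atol=1e-2):
    """Elementwise closeness check over two equal-length sequences of floats."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


def is_deterministic(kernel, inputs, repeats=5, rtol=1e-2, atol=1e-2):
    """Call `kernel` on the same inputs `repeats` times and compare to the first result."""
    baseline = kernel(inputs)
    return all(
        outputs_match(kernel(inputs), baseline, rtol=rtol, atol=atol)
        for _ in range(repeats - 1)
    )


def stable_kernel(xs):
    """Hypothetical stand-in for a kernel that is reproducible across calls."""
    return [2.0 * x for x in xs]


class FlakyKernel:
    """Hypothetical stand-in whose output drifts between calls on identical inputs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, xs):
        self.calls += 1
        drift = 0.5 * (self.calls % 2)  # alternates between 0.5 and 0.0
        return [2.0 * x + drift for x in xs]


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print("stable:", is_deterministic(stable_kernel, data))
    print("flaky:", is_deterministic(FlakyKernel(), data))
```

Running a check like this on the reference kernel before comparing submissions against it separates "the submission is wrong" from "the reference itself is not reproducible".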
17 files changed
Lines changed: 2417 additions & 0 deletions
File tree
- problems
  - amd_202602
    - mixed-mla
    - moe-mxfp4
    - mxfp4-mm