Commit 9f717a2

Author: Mark Saroufim
Add MI355X details (#108)
* Add AMD February 2026 competition: mxfp4-mm, moe-mxfp4, mixed-mla

  3 problems targeting MI355X from AMD-AIM/reference-kernels@20260209.
  Also fixes eval.py regex to support underscored keys and booleans.

* Document known moe-mxfp4 non-determinism issue on MI355X

  aiter's fused_moe kernel produces different results across calls with
  identical inputs on gfx950, causing the reference submission to fail
  correctness checks against itself.

* Update AMD problems with latest changes from AMD-AIM

  - mixed-mla: Add tp (tensor parallel) parameter, variable num_heads,
    qseqlen=4 prefill cases, updated test/benchmark shapes
  - moe-mxfp4: Updated benchmark shapes with TP=4/TP=8 variants,
    different batch sizes
  - mxfp4-mm: Added m=32 benchmark, adjusted shape set

* Increase AMD problem timeouts to 30 minutes

  aiter JIT compilation on first run can take 10+ minutes on MI355X,
  causing test timeouts. Bump all timeouts to 1800s (30 min).
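The non-determinism note in the commit message describes a kernel that disagrees with itself across calls on identical inputs. A minimal sketch of that kind of self-consistency check follows. It is not the repository's eval.py logic: `is_self_consistent`, `stable_kernel`, `DriftingKernel`, and the tolerance values are all hypothetical, and the "kernels" are plain Python stand-ins rather than aiter calls.

```python
import math
from typing import Callable, List, Sequence

Kernel = Callable[[Sequence[float]], List[float]]


def is_self_consistent(
    kernel: Kernel,
    inputs: Sequence[float],
    runs: int = 3,
    rel_tol: float = 1e-2,  # assumed tolerance, not taken from the repository
    abs_tol: float = 1e-3,  # assumed tolerance, not taken from the repository
) -> bool:
    """Return True if repeated calls on identical inputs agree within tolerance."""
    baseline = kernel(inputs)
    for _ in range(runs - 1):
        again = kernel(inputs)
        if len(again) != len(baseline):
            return False
        for expected, actual in zip(baseline, again):
            if not math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


def stable_kernel(xs: Sequence[float]) -> List[float]:
    """Toy deterministic kernel: same inputs always give the same outputs."""
    return [2.0 * x for x in xs]


class DriftingKernel:
    """Toy stand-in whose output depends on how many times it has been called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, xs: Sequence[float]) -> List[float]:
        self.calls += 1
        return [2.0 * x + 0.5 * self.calls for x in xs]


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print("stable:", is_self_consistent(stable_kernel, data))
    print("drifting:", is_self_consistent(DriftingKernel(), data))
```

A kernel that fails a check like this also fails any correctness test that compares a submission against a separately computed reference output, even when the submission is the reference itself, which matches the failure described above.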
1 parent facc675 commit 9f717a2

17 files changed

Lines changed: 2417 additions & 0 deletions


problems/amd_202602.yaml

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+name: AMD Developer Challenge February 2026
+deadline: "2026-03-15 06:00"
+description: "AMD Developer Challenge: MXFP4 matrix multiplication, Mixture-of-Experts, and Multi-head Latent Attention optimized for MI355X."
+problems:
+  - directory: amd_202602/mxfp4-mm
+    name: amd-mxfp4-mm
+    deadline: "2026-03-15 06:00"
+    gpus:
+      - MI355X
+  - directory: amd_202602/moe-mxfp4
+    name: amd-moe-mxfp4
+    deadline: "2026-03-15 06:00"
+    gpus:
+      - MI355X
+  - directory: amd_202602/mixed-mla
+    name: amd-mixed-mla
+    deadline: "2026-03-15 06:00"
+    gpus:
+      - MI355X
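The competition file repeats the same deadline and GPU list for each problem, so a small consistency check can catch copy-paste mistakes. The sketch below is illustrative only: the dictionary is a hand-written mirror of the YAML above (to stay within the standard library), `validate` is a hypothetical helper, and the deadline format string is an assumption inferred from the values, not a documented schema.

```python
from datetime import datetime
from typing import Any, Dict, List

# Assumed format, inferred from values such as "2026-03-15 06:00".
DEADLINE_FORMAT = "%Y-%m-%d %H:%M"

# Hand-written mirror of problems/amd_202602.yaml (description omitted).
competition: Dict[str, Any] = {
    "name": "AMD Developer Challenge February 2026",
    "deadline": "2026-03-15 06:00",
    "problems": [
        {"directory": "amd_202602/mxfp4-mm", "name": "amd-mxfp4-mm",
         "deadline": "2026-03-15 06:00", "gpus": ["MI355X"]},
        {"directory": "amd_202602/moe-mxfp4", "name": "amd-moe-mxfp4",
         "deadline": "2026-03-15 06:00", "gpus": ["MI355X"]},
        {"directory": "amd_202602/mixed-mla", "name": "amd-mixed-mla",
         "deadline": "2026-03-15 06:00", "gpus": ["MI355X"]},
    ],
}


def validate(comp: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in the config."""
    errors: List[str] = []
    final = datetime.strptime(comp["deadline"], DEADLINE_FORMAT)
    seen = set()
    for problem in comp["problems"]:
        name = problem["name"]
        if name in seen:
            errors.append(f"{name}: duplicate problem name")
        seen.add(name)
        # A problem should not stay open after the competition closes.
        if datetime.strptime(problem["deadline"], DEADLINE_FORMAT) > final:
            errors.append(f"{name}: deadline is after the competition deadline")
        if not problem.get("gpus"):
            errors.append(f"{name}: no GPUs listed")
    return errors


if __name__ == "__main__":
    print(validate(competition) or "config looks consistent")
```

Returning a list of messages rather than raising on the first failure lets a single run report every inconsistent entry.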
