Commit 0f80d7a
Add nvfp4 group gemm example. (#77)
* add nvfp4 group gemm example.
* extend reference code to support cases where m and n are not multiples of 128.
* support different problem sizes to match real group gemm use case.
* modify test
* reduce range to reduce inf
1 parent cc4fed5 commit 0f80d7a
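For readers unfamiliar with the term, the sketch below illustrates what a group GEMM with a different problem size per group looks like at the reference level. It is not code from this commit: the function names are hypothetical, plain Python floats stand in for nvfp4 data, and block scale factors are ignored entirely. The `round_up` helper assumes that 128 in the commit message refers to a tile extent, which the commit itself does not state.

```python
# Hypothetical illustration only; none of these names come from the commit.

def round_up(x, multiple):
    """Smallest multiple of `multiple` that is >= x (e.g. a padded tile extent)."""
    return ((x + multiple - 1) // multiple) * multiple


def matmul(a, b):
    """Plain reference product: a is m x k, b is k x n, both lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
        for i in range(m)
    ]


def group_gemm_reference(problems):
    """Run one independent GEMM per group; each group may have its own (m, n, k)."""
    return [matmul(a, b) for a, b in problems]


# Two groups with different shapes: 1x2 @ 2x1 and 3x2 @ 2x2.
problems = [
    ([[1.0, 2.0]], [[3.0], [4.0]]),
    ([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]], [[5.0, 6.0], [7.0, 8.0]]),
]
outputs = group_gemm_reference(problems)
```

Because each group is computed independently, no group's m or n needs to be a multiple of 128 in the reference. Under the tile-size assumption above, `round_up(130, 128)` gives the extent a tiled kernel would cover for m = 130.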
7 files changed
Lines changed: 1982 additions & 0 deletions
File tree
- problems/nvidia/nvfp4_group_gemm