issue/1207: Fix failing NVIDIA T1-1 operators #1208
Open
GordonYang1 wants to merge 1 commit into
Summary
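This PR fixes failures and instability on the NVIDIA backend for the operators from task T1-1 of the 2025 autumn operator competition.
Operators involved:
- `index_copy`
- `fmod`
- `logdet`
- `upsample_nearest`
- `logical_and`
- `logical_not`
- `addbmm`
- `gaussian_nll_loss`

This PR actually modifies and fixes the following operators:
- `index_copy`
- `fmod`
- `logdet`
- `upsample_nearest`
- `logical_and`
- `logical_not`

`addbmm` and `gaussian_nll_loss` are not modified by this PR; they were re-verified with repeated single-operator test runs and currently pass.
Changes
- `index_copy`: use `torch.randperm` to generate non-repeating indices. Duplicate indices caused the same position to be written more than once, which made the result nondeterministic.
- `fmod`: avoid generating 0 as a divisor in the tests, which prevents `NaN` vs `NaN` comparison failures when `equal_nan=False`.
- `logdet`: use stable positive-definite / diagonally dominant matrices as test inputs, while keeping coverage of the strided case.
- `upsample_nearest`: export `upsample_nearest`, and have 1D `interpolate(mode="nearest")` reuse the existing `upsample_nearest` path.
- `logical_and`: replace the missing `ntops.torch.logical_and` CUDA path with an implementation composed of `ne` + `bitwise_and`.
- `logical_not`: replace the missing `ntops.torch.logical_not` CUDA path with one that first uses `eq` to write into a temporary bool tensor, then writes the result back to `out` in an alias-safe way.
Test Result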
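The input-generation and fallback ideas listed under Changes can be sketched in plain PyTorch. This is only an illustration, not the PR's code and not test output: the helper names (`unique_index`, `nonzero_divisor`, `well_conditioned_spd`, `logical_and_via_ne`, `logical_not_via_eq`) are hypothetical, the value ranges are assumptions, and stock `torch` calls stand in for the `ntops` CUDA paths, whose internals are not shown here.

```python
import torch


def unique_index(dim_size: int, num_indices: int) -> torch.Tensor:
    """Pick distinct positions in [0, dim_size) so index_copy never
    writes the same slot twice (assumed helper, not from the PR)."""
    return torch.randperm(dim_size)[:num_indices]


def nonzero_divisor(shape, low: float = 0.5, high: float = 2.0) -> torch.Tensor:
    """Random divisor with |x| in [low, high), so fmod never sees 0.
    The range is an arbitrary choice for this sketch."""
    magnitude = torch.rand(shape) * (high - low) + low
    sign = torch.randint(0, 2, shape).to(magnitude.dtype) * 2 - 1
    return magnitude * sign


def well_conditioned_spd(n: int) -> torch.Tensor:
    """A @ A.T is positive semi-definite; adding n * I pushes every
    eigenvalue to at least n, so logdet is finite."""
    a = torch.rand(n, n)
    return a @ a.T + n * torch.eye(n)


def logical_and_via_ne(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """logical_and composed from ne + bitwise_and on bool tensors."""
    return torch.bitwise_and(torch.ne(a, 0), torch.ne(b, 0))


def logical_not_via_eq(x: torch.Tensor, out: torch.Tensor) -> torch.Tensor:
    """logical_not via eq: the result is fully materialized in a fresh
    bool tensor before anything is written, so out may alias x."""
    tmp = torch.eq(x, 0)
    out.copy_(tmp)  # copy_ casts bool to out's dtype
    return out
```

Materializing `tmp` before `out.copy_` is what makes the last helper safe when `out` and `x` share storage; the PR's actual write-back mechanism may differ.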
Related issue
Fixes #1207