refactor InfiniOps cpu runtime through InfiniRT #644

Draft
spike-zhu wants to merge 1 commit into master from feat/infinirt_runtime

@spike-zhu

Summary

Motivation

Closes #

Type of Change

  • feat — new feature / new operator / new platform
  • fix — bug fix
  • perf — performance improvement (no behavioral change)
  • refactor — code restructuring without behavior change
  • test — adding or fixing tests only
  • docs — documentation only
  • build / ci — build system or CI configuration
  • chore — tooling, formatting, or other non-code changes
  • Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

  • CPU (WITH_CPU)
  • NVIDIA (WITH_NVIDIA)
  • Iluvatar (WITH_ILUVATAR)
  • MetaX (WITH_METAX)
  • Cambricon (WITH_CAMBRICON)
  • Moore (WITH_MOORE)
  • Ascend (WITH_ASCEND)
  • PyTorch C++ bindings (WITH_TORCH)
  • Build system / CMake / CI
  • Python bindings / user-facing API

Test Results on Supported Platforms

| Platform | Built | pytest Result | Notes / Hardware |
| --- | --- | --- | --- |
| NVIDIA | | | |
| Iluvatar | | | |
| MetaX | | | |
| Cambricon | | | |
| Moore | | | |
| Ascend | | | |
Full `pytest` output (optional)
paste here

Benchmark / Performance Impact

Notes for Reviewers


Checklist

Every contributor must verify every item below before requesting
review. Tick each box only after the check has actually been performed —
do not tick speculatively. If an item truly does not apply, replace the
checkbox with N/A and briefly explain why in an inline comment.

Title, Branch, and Commits

  • PR title follows Conventional Commits (e.g. feat(nvidia): …, fix(cuda/gemm): …).
  • Branch name follows <type>/xxx-yyyy-zzzz where <type> matches the PR title's Conventional Commits type and words are joined with hyphens (see CONTRIBUTING.md §Branches).
  • Each commit message follows Conventional Commits.
  • Small PR is a single squashable commit; or, for a large PR, every commit is meaningful, well-formed, and independently reviewable (see CONTRIBUTING.md §Pull Requests).
  • No stray merge commits from master — the branch is rebased cleanly on top of the current master.
  • No fixup! / squash! / wip commits remain.

Scope and Design

  • Changes are minimal — nothing unrelated to the stated motivation was added (CONTRIBUTING.md §Code/General).
  • No dead code, commented-out blocks, debug prints, printf/std::cout/print(...) left behind, or TODO without an owner and issue link.
  • No unrelated formatting churn that would obscure the diff.
  • Public API changes (if any) are intentional, documented, and reflected in affected callers/tests.

General Code Hygiene (applies to all languages)

  • The code is self-explanatory; comments were added only where the why is non-obvious (CONTRIBUTING.md §Code/General).
  • Every modified or added file ends with a single trailing newline (CONTRIBUTING.md §Code/General).
  • No trailing whitespace, tab/space mixing, or stray BOMs.
  • Identifiers in comments and error messages are wrapped in backticks (e.g. the `seqlens_k` tensor) (CONTRIBUTING.md §Code/General).
  • All comments and error messages are in English (CONTRIBUTING.md §Code/General).
  • Comments and error messages are complete sentences — capitalized first letter, terminal punctuation — unless the language/framework convention says otherwise (CONTRIBUTING.md §Code/General; §Python).

C++ Specific (if C++ files changed)

  • Code follows the Google C++ Style Guide strictly.
  • clang-format (version 21, per .github/workflows/clang-format.yml) has been run against all modified .h, .cc, .cuh, and .mlu files; the diff is clean.
  • clang-tidy concerns (per .clang-tidy) have been reviewed — no new warnings beyond the existing baseline.
  • Operator parameter order is inputs first, outputs last; attributes are between inputs and outputs; naming follows PyTorch → ONNX → CUDA API precedence (CONTRIBUTING.md §C++).
  • No exceptions are thrown. Error paths use assert with messages that include at least __FILE__, __LINE__, and __func__ (CONTRIBUTING.md §C++).
  • Error and warning message wording follows the LLVM Coding Standards (CONTRIBUTING.md §C++).
  • Kernel files are named correctly: custom = kernel / kernel_v2 / …; well-known algorithms use the algorithm name; library-based implementations use the library name (CONTRIBUTING.md §C++).
  • Kernel and kernel launcher are in separate files: launcher .h, kernel follows platform conventions (e.g. .cuh + .cu) even when non-templated (CONTRIBUTING.md §C++).
  • Constructor initializer list order matches member declaration order (CONTRIBUTING.md §C++).
  • Exactly one blank line between classes, between classes and functions, and between functions (CONTRIBUTING.md §C++).
  • Exactly one blank line between members (functions and variables) within a class (CONTRIBUTING.md §C++).
  • Exactly one blank line before and after the contents of a namespace (CONTRIBUTING.md §C++).
  • New operators added via src/base/<op>.h (inheriting Operator<Op>) with platform implementations under src/<category>/<platform>/ inheriting the base (CONTRIBUTING.md §Adding an Operator).
  • No raw new/delete; RAII / smart pointers / existing allocators are used.

Python Specific (if Python files changed)

  • Code is PEP 8 compliant; ruff check passes cleanly on CI (see .github/workflows/ruff.yml).
  • ruff format --check passes cleanly — if not, run ruff format and commit the result.
  • Comments are complete English sentences, starting with a capital letter and ending with punctuation; Markdown backticks are used for code references (CONTRIBUTING.md §Python).
  • Framework-specific conventions (e.g. lowercase pytest.skip messages without terminal period) are honored where applicable (CONTRIBUTING.md §Python).
  • No blank line between the function signature and the body when there is no docstring or comment (CONTRIBUTING.md §Python).
  • A blank line is present before and after if, for, and similar control-flow statements (CONTRIBUTING.md §Python).
  • A blank line appears before each return, except when it directly follows a control-flow statement (CONTRIBUTING.md §Python).
  • Docstrings (if any) follow PEP 257 (CONTRIBUTING.md §Python).
  • Type hints are added / kept consistent with the surrounding code.

Testing

  • pytest was run locally on every supported platform that this PR can affect, and the results are recorded in the "Test Results" table above (CONTRIBUTING.md §Pull Requests).
  • For any platform that could not be tested, an explicit reason is given in the table and a reviewer with access has been tagged.
  • New functionality has matching tests under tests/ following tests/test_add.py / tests/test_gemm.py patterns (CONTRIBUTING.md §Adding an Operator).
  • Tests use pytest.mark.parametrize correctly: dependent parameters share one decorator (e.g. @pytest.mark.parametrize("dtype, rtol, atol", …)), independent parameters use separate decorators ordered by parameter declaration.
  • Where appropriate, pytest.mark.auto_act_and_assert is used and the test returns a Payload whose func and ref share the same calling convention.
  • Default dtype / device parameterization is relied on, or overridden with an explicit pytest.mark.parametrize when necessary.
  • Any new test that is flaky under parallelism is marked so, or documented to require pytest -n 1.
  • For bug fixes: a regression test has been added that fails on master and passes with this PR.

Build, CI, and Tooling

  • The project builds cleanly from a fresh directory with pip install .[dev] on at least one affected platform.
  • compile_commands.json still regenerates (CMake option CMAKE_EXPORT_COMPILE_COMMANDS=ON in pyproject.toml — required by the code-lint skill and clang-tidy -p).
  • New backends / devices have been added to auto-detection in CMakeLists.txt under if(AUTO_DETECT_DEVICES) and to if(AUTO_DETECT_BACKENDS) if applicable.
  • Only one CUDA-like GPU backend is selectable at a time — the existing mutual-exclusion check in CMakeLists.txt is not broken.
  • Both CI workflows (clang-format.yml, ruff.yml) are green locally (or expected to be green on CI).
  • No new runtime dependency was added without updating pyproject.toml's [project.optional-dependencies] (or justified in the PR description).

Documentation

  • README.md, CONTRIBUTING.md, or inline docs updated when behavior, build flags, or developer workflow changed.
  • New operators, new dispatch helpers, or new public utilities are documented (docstring, header comment, or an addition to CONTRIBUTING.md §Some Code Explanations).
  • Any user-visible breaking change is called out explicitly under "Motivation" and in the commit/PR title with a ! or BREAKING CHANGE: footer.

Security and Safety

  • No secrets, access tokens, internal URLs, customer data, or personal hardware identifiers have been committed.
  • Third-party code is license-compatible and attributed.
  • No unsafe pointer arithmetic, uninitialized reads, or missing bounds checks were introduced.

spike-zhu requested a review from a team on June 4, 2026 09:29
)
params.update(_find_optional_tensor_params(op_name))
params.update(_find_vector_tensor_params(op_name))
return params
Collaborator

Please check this against CONTRIBUTING.md; for example, there should be a blank line above the return here.

Collaborator

Could this file simply be deleted?

Comment thread src/native/cpu/device_.h
Collaborator

Could this file simply be deleted?

Comment thread src/native/cpu/runtime_.h
Collaborator

Could this file simply be deleted?

Comment thread CMakeLists.txt
option(GENERATE_PYTHON_BINDINGS "Generate Python bindings" OFF)
option(USE_EXISTING_GENERATED_WRAPPERS
"Build from existing generated wrapper sources instead of regenerating them" OFF)
option(INFINIOPS_MINIMAL_ADD_BINDINGS
Collaborator

What is this option for? If it is only used for testing during development, leave it out for now.

Comment thread CMakeLists.txt
option(GENERATE_OPERATOR_CALL_INSTANTIATIONS
"Generate explicit operator call instantiations" ON)
option(GENERATE_PYTHON_BINDINGS "Generate Python bindings" OFF)
option(USE_EXISTING_GENERATED_WRAPPERS
Collaborator

This feature is indeed needed, but it should not be submitted as part of this PR; it should be split out into a separate PR. It also looks rather one-size-fits-all: it should decide whether to regenerate by checking something like a hash, and then enable itself automatically.
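One way the hash-based check suggested here could look, as a rough sketch only. The function names, the stamp file, and the choice of SHA-256 are all assumptions for illustration; the PR excerpt does not show the actual wrapper generator or how it would be wired into CMake.

```python
import hashlib
from pathlib import Path


def compute_inputs_hash(input_paths):
    """Return a SHA-256 digest over the names and contents of `input_paths`."""
    digest = hashlib.sha256()

    for path in sorted(Path(p) for p in input_paths):
        digest.update(str(path).encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())

    return digest.hexdigest()


def needs_regeneration(input_paths, stamp_path):
    """Return `True` when the stored hash is missing or differs from the current one."""
    stamp = Path(stamp_path)
    current = compute_inputs_hash(input_paths)

    if not stamp.exists():
        return True

    return stamp.read_text().strip() != current


def record_generation(input_paths, stamp_path):
    """Store the current hash once the wrappers have been regenerated."""
    Path(stamp_path).write_text(compute_inputs_hash(input_paths))
```

With a check like this, the build would regenerate wrappers only when `needs_regeneration(...)` returns `True` and call `record_generation(...)` afterwards, so no manual switch is required.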

Comment thread src/runtime.h
Collaborator

Please confirm whether all three classes in this file are actually used. Some of them may only be used internally by InfiniRT and not by InfiniOps; if that is the case, the corresponding lines can be removed from this file.

Comment thread src/data_type.h
Comment on lines +10 to +15
using infini::rt::BFloat16;
using infini::rt::DataType;
using infini::rt::Float16;
using infini::rt::kDataTypeToDesc;
using infini::rt::kDataTypeToSize;
using infini::rt::kStringToDataType;
Collaborator

The order here is a bit odd. It should follow the order used before the replacement, with blank lines inserted where appropriate. Consider changing it to the following form:

using infini::rt::DataType;

using infini::rt::Float16;
using infini::rt::BFloat16;

using infini::rt::kDataTypeToSize;
using infini::rt::kDataTypeToDesc;
using infini::rt::kStringToDataType;

Comment thread src/CMakeLists.txt
list(FILTER BASE_SRCS EXCLUDE REGEX ".*tensor\\.cc$")
target_sources(infiniops PRIVATE ${BASE_SRCS})

set(INFINIOPS_EMPTY_SOURCE "${CMAKE_CURRENT_BINARY_DIR}/infiniops_empty.cc")
Collaborator

What is this infiniops_empty.cc for?

spike-zhu marked this pull request as draft on June 5, 2026 08:51

2 participants