[WIP] feat: support MLA and refactor MHA #163

Open
Chamberlain0w0 wants to merge 4 commits into master from feat/mla

Chamberlain0w0 commented May 29, 2026

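  1. Changes to the current MHA implementation
    a. The old TransformerConfig::attention_type = kStandard / kRoPE was a poor fit: Megatron and other open-source implementations usually use attn_type to distinguish self/cross attention. The setting is renamed to follow Megatron's --position-embedding-type, with possible values learned_absolute / rope / yarn / mrope / relative / none. The conditions that decide whether to create WPE and whether to apply RoPE are updated accordingly.
    b. Removed the two branches CausalSelfAttention::ForwardStandard / ForwardWithRoPE and merged them into a single unified Forward. GQA is also folded into the unified path.
    c. Moved ApplyRotaryEmbedding from a CausalSelfAttention member function into transformer utils.cc.
    d. The causal mask buffer is now initialized for both learned absolute and RoPE. If the caller does not pass a mask, Forward falls back to the internal causal mask. This is a small behavior unification for the case where the RoPE path is called directly without a mask.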

  2. Add the MLA module
    --TODO--
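
As a rough illustration of items 1a and 1d, the sketch below shows one way the position-embedding setting and the mask fallback could be modeled. It is not taken from the PR: the names PositionEmbeddingType, ParsePositionEmbeddingType, NeedsLearnedPositionTable, AppliesRotaryEmbedding, Mask, MakeCausalMask and ResolveMask are all hypothetical, the mask is a plain vector rather than a tensor, and treating only rope as the rotary path is an assumption.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical enum mirroring the values listed for --position-embedding-type.
enum class PositionEmbeddingType { kLearnedAbsolute, kRoPE, kYarn, kMRoPE, kRelative, kNone };

// Map the command-line spelling to the enum; throws on an unknown value.
PositionEmbeddingType ParsePositionEmbeddingType(const std::string &name) {
    if (name == "learned_absolute") return PositionEmbeddingType::kLearnedAbsolute;
    if (name == "rope") return PositionEmbeddingType::kRoPE;
    if (name == "yarn") return PositionEmbeddingType::kYarn;
    if (name == "mrope") return PositionEmbeddingType::kMRoPE;
    if (name == "relative") return PositionEmbeddingType::kRelative;
    if (name == "none") return PositionEmbeddingType::kNone;
    throw std::invalid_argument("unknown position embedding type: " + name);
}

// Only learned absolute embeddings need a WPE table.
bool NeedsLearnedPositionTable(PositionEmbeddingType type) {
    return type == PositionEmbeddingType::kLearnedAbsolute;
}

// Assumption: only plain rope takes the rotary path in this sketch;
// yarn / mrope would need their own handling.
bool AppliesRotaryEmbedding(PositionEmbeddingType type) {
    return type == PositionEmbeddingType::kRoPE;
}

// Row-major [seq_len x seq_len] mask: 1 = may attend, 0 = masked out.
using Mask = std::vector<int>;

// Build a lower-triangular causal mask: position i may attend to j <= i.
Mask MakeCausalMask(std::size_t seq_len) {
    Mask mask(seq_len * seq_len, 0);
    for (std::size_t i = 0; i < seq_len; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            mask[i * seq_len + j] = 1;
        }
    }
    return mask;
}

// Use the caller's mask when one is passed; otherwise fall back to the
// internal causal mask, regardless of the position embedding type.
Mask ResolveMask(const std::optional<Mask> &external, const Mask &internal_causal) {
    return external.has_value() ? *external : internal_causal;
}
```

With this shape, a caller that passes std::nullopt gets the causal mask on both the learned absolute and the RoPE path, which is the unified behavior item 1d describes.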

Chamberlain0w0 changed the title from "[WIP] feat: support MLA" to "[WIP] feat: support MLA and refactor MHA" Jun 2, 2026