
[release-v1.21.6] Add SplitTensorsTransform to QEFFAutoModel to preve…#968

Merged
quic-hemagnih merged 1 commit into quic:release/v1.21.6 from asmigosw:split_tensor_transform
May 7, 2026

asmigosw (Contributor) commented May 6, 2026

Add SplitTensorsTransform to QEFFAutoModel to prevent >2GB protobuf exports

FP16ClipTransform inlines external weights, causing large embedding
models (e.g. BAAI/bge-reranker-v2-m3) to exceed the 2GB ModelProto
parser limit in the AIC compiler.

Adding SplitTensorsTransform to _onnx_transforms spills large
initializers to sidecar *.onnx.data files. Updated existing tests
and added regression tests to verify external data spilling behavior.
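
As a rough illustration of the spilling idea described above, the sketch below moves large tensor payloads out of an in-memory model into a sidecar file and keeps only a small reference behind. It is not the QEfficient `SplitTensorsTransform` implementation, which is not shown in this PR description. The names `spill_large_tensors` and `load_external`, the 1024-byte threshold, and the dict-based model are assumptions made for the example; only the `location` / `offset` / `length` keys mirror how ONNX records externally stored tensor data.

```python
import os
import tempfile


def spill_large_tensors(tensors, data_path, size_threshold=1024):
    """Write payloads of at least `size_threshold` bytes to a sidecar file.

    `tensors` maps tensor name -> raw bytes (a stand-in for ONNX initializers).
    Returns (inline, external): `inline` keeps the small payloads as bytes,
    `external` maps each spilled name to its location, offset, and length.
    """
    inline, external = {}, {}
    offset = 0
    with open(data_path, "wb") as sidecar:
        for name, raw in tensors.items():
            if len(raw) < size_threshold:
                inline[name] = raw  # small tensors stay inside the model
                continue
            sidecar.write(raw)  # large tensors are appended to the sidecar
            external[name] = {
                "location": os.path.basename(data_path),
                "offset": offset,
                "length": len(raw),
            }
            offset += len(raw)
    return inline, external


def load_external(entry, base_dir):
    """Read one spilled payload back using its recorded offset and length."""
    with open(os.path.join(base_dir, entry["location"]), "rb") as f:
        f.seek(entry["offset"])
        return f.read(entry["length"])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        sidecar_path = os.path.join(workdir, "model.onnx.data")
        weights = {"bias": b"\x00" * 16, "embedding": b"\x01" * 4096}
        inline, external = spill_large_tensors(weights, sidecar_path)
        print(sorted(inline), external)
```

With this layout the serialized model itself only carries the small tensors plus the reference records, so its size no longer grows with the embedding table.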

…nt >2GB protobuf export issue

Signed-off-by: Asmita Goswami <asmigosw@qti.qualcomm.com>
quic-hemagnih (Contributor) commented

CI-Ready

asmigosw (Contributor, Author) commented May 7, 2026

CI-Ready

@quic-hemagnih quic-hemagnih merged commit e7dbee2 into quic:release/v1.21.6 May 7, 2026
4 checks passed