feat(keypoint-detection): enable ViTPose config/build/perf by jeon185 · Pull Request #905 · microsoft/winml-cli

jeon185 · 2026-06-16T18:19:42Z

Enables the 6 ViTPose keypoint-detection models from #284 to pass wmk config -> build -> perf (CPU and OpenVINO). Eval isn't included here - I'll do the accuracy side in a follow-up since it needs a couple of design decisions first.

Two things were blocking all 6 models:

1. wmk config failed with "Task 'keypoint-detection' not supported by TasksManager". Optimum has the ViTPose ONNX export config but no task->class entry for keypoint-detection, and AutoModelForKeypointDetection only covers SuperPoint. Added the (vitpose, keypoint-detection) -> VitPoseForPoseEstimation mapping, the same way we already do it for CLIP/SAM.
2. The plus checkpoints (MoE backbone) crashed during export with "dataset_index must be provided when using multiple experts". Optimum's VitPoseModelPatcher injects a constant dataset_index, but patch_model_for_export defaults model_kwargs to None, so the patcher crashed on init. Passing an explicit model_kwargs={} fixes that. The trace step (Step 3) was also running the model outside the patcher context, so I wrapped it the same way the export step already is.
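To illustrate the first fix, here is a minimal sketch of a (model type, task) override table consulted before a generic auto-class lookup. It is a stand-in under stated assumptions, not the code in this PR: class names are plain strings so the snippet runs without transformers installed, and `AUTO_CLASS_SUPPORTED` and `resolve_model_class` are hypothetical names. Only the `MODEL_CLASS_MAPPING` key and its target class come from the PR description.

```python
# Hypothetical stand-in for the generic task -> auto-class coverage.
# Per the PR description, the keypoint-detection auto class only covers SuperPoint.
AUTO_CLASS_SUPPORTED = {
    "keypoint-detection": {"superpoint"},
}

# Override table keyed by (model_type, task). The real mapping stores the
# class object; a string is used here to keep the sketch dependency-free.
MODEL_CLASS_MAPPING = {
    ("vitpose", "keypoint-detection"): "VitPoseForPoseEstimation",
}


def resolve_model_class(model_type: str, task: str) -> str:
    """Return the class name to load, preferring explicit overrides."""
    override = MODEL_CLASS_MAPPING.get((model_type, task))
    if override is not None:
        return override
    # Fall back to the generic auto class only if it covers this model type.
    if model_type in AUTO_CLASS_SUPPORTED.get(task, set()):
        return "AutoModelForKeypointDetection"
    raise KeyError(f"Task {task!r} not supported for model type {model_type!r}")
```

With the override in place, a ViTPose checkpoint resolves to its pose-estimation class, while a model type with neither an override nor auto-class coverage still fails loudly.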

The exporter change isn't ViTPose-specific - it helps any MoE model whose patcher injects forward args.
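The pattern behind the exporter change can be sketched with toy classes. Everything below is an assumption made for illustration: `ToyMoEModel` and `ToyPatcher` are hypothetical stand-ins, not Optimum's VitPoseModelPatcher, and the exact failure mode of the real patcher with model_kwargs=None may differ from the `TypeError` shown here.

```python
class ToyMoEModel:
    """Stand-in for a MoE backbone whose forward requires dataset_index."""

    def forward(self, pixel_values, dataset_index=None):
        if dataset_index is None:
            raise ValueError(
                "dataset_index must be provided when using multiple experts"
            )
        return (pixel_values, dataset_index)


class ToyPatcher:
    """Context manager that injects a constant dataset_index into forward."""

    def __init__(self, model, model_kwargs):
        # Toy failure mode: item assignment on None raises TypeError,
        # which is why the caller must pass an explicit dict.
        model_kwargs["dataset_index"] = 0
        self.model = model
        self.model_kwargs = model_kwargs
        self._orig_forward = model.forward

    def __enter__(self):
        orig, injected = self._orig_forward, self.model_kwargs

        def patched(*args, **kwargs):
            # Caller-supplied kwargs win over the injected constants.
            return orig(*args, **{**injected, **kwargs})

        self.model.forward = patched
        return self

    def __exit__(self, *exc_info):
        # Restore the original forward when leaving the context.
        self.model.forward = self._orig_forward
        return False


model = ToyMoEModel()
# Both the trace and the export forward passes run inside the context.
with ToyPatcher(model, model_kwargs={}):
    traced = model.forward("pixels")
```

Any forward pass that runs outside the `with` block, as the Step 3 trace did before this PR, does not see the injected argument and fails with the original error.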

Verified config/build/perf on all 6: vitpose-base-simple, vitpose-plus-small/base/large/huge, and synthpose-vitpose-huge-hf. Added unit tests for the mapping and the patcher model_kwargs handling.

One note: you still need to pass --task keypoint-detection explicitly for now - the task isn't auto-detected from the config yet. I left auto-detection out of this PR to keep it small; can add it here or as a follow-up if you'd prefer.

Refs #284.

ViTPose keypoint-detection models could not pass the wmk pipeline:

1. Task resolution: Optimum registers the ViTPose ONNX export config but has no task-to-class entry for keypoint-detection, and transformers' AutoModelForKeypointDetection only recognizes SuperPoint. Add MODEL_CLASS_MAPPING[(vitpose, keypoint-detection)] = VitPoseForPoseEstimation (models/hf/vitpose.py) so the resolver loads the correct class.
2. MoE export: the vitpose-plus checkpoints use a Mixture-of-Experts backbone whose patcher injects a constant dataset_index at export time. Optimum's patch_model_for_export defaults model_kwargs to None, so the patcher crashed on init. Pass an explicit model_kwargs={} in _get_optimum_patcher. Also wrap the Step 3 hierarchy trace in the same patcher context (it previously ran the model forward without the injected dataset_index, failing before export).

Verified config -> build -> perf on all 6 acceptance models in #284 (base-simple, plus-{small,base,large,huge}, synthpose-vitpose-huge-hf).

jeon185 requested a review from a team as a code owner June 16, 2026 18:19

jeon185 wants to merge 1 commit into main from feat/keypoint-detection-enablement
