Add OnnxDiscrepancyCheck speedup metric with default timing updates by xadupre · Pull Request #2502 · microsoft/Olive

xadupre · 2026-06-05T15:50:20Z

Describe your changes

Added speedup measurement for OnnxDiscrepancyCheck and updated behavior based on review feedback:

Changed timing_iterations default from 10 to 5.
If timing_iterations is set to 0, speedup measurement is skipped.
Added unit tests to validate the new default and the skip behavior.
Fixed test mocks to properly configure device attribute for compare_generation tests.
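The timing behavior described above can be sketched as follows. This is a minimal illustration, not the code from `discrepancy_check.py`: the helper names `mean_latency` and `measure_speedup`, the warmup default of 1, and the definition of speedup as reference time divided by candidate time are all assumptions made for the example. Only the `timing_iterations` default of 5 and the skip at 0 come from the description.

```python
import time
from typing import Callable, Optional


def mean_latency(fn: Callable[[], object], warmup_iterations: int, timing_iterations: int) -> float:
    """Average wall-clock seconds per call of fn, measured after warmup calls."""
    for _ in range(warmup_iterations):
        fn()  # warmup calls are executed but not timed
    start = time.perf_counter()
    for _ in range(timing_iterations):
        fn()
    return (time.perf_counter() - start) / timing_iterations


def measure_speedup(
    run_reference: Callable[[], object],
    run_candidate: Callable[[], object],
    warmup_iterations: int = 1,  # assumed default, not stated in the PR
    timing_iterations: int = 5,  # default stated in the PR
) -> Optional[float]:
    """Return reference_time / candidate_time, or None when timing is skipped."""
    if timing_iterations <= 0:
        # timing_iterations == 0 disables the measurement: nothing is executed
        return None
    ref = mean_latency(run_reference, warmup_iterations, timing_iterations)
    cand = mean_latency(run_candidate, warmup_iterations, timing_iterations)
    return ref / cand if cand > 0 else None
```

In this sketch a caller would pass two zero-argument callables, one wrapping the PyTorch forward pass and one wrapping the ONNX session run, and would treat a `None` result as "speedup not measured".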

Checklist before requesting a review

Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running lintrunner -a
Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

Copilot

Pull request overview

This PR enhances the OnnxDiscrepancyCheck pass by adding an inference speedup measurement (ONNX vs PyTorch) and introducing configurable warmup/timing iteration settings, with updated defaults and tests to validate the new behavior.

Changes:

Added warmup_iterations and timing_iterations config parameters (defaulting timing_iterations to 5) and implemented speedup measurement with an option to skip when timing_iterations=0.
Updated session/device setup to target the configured accelerator (with CPU fallback) and run PyTorch on the matched torch device.
Added unit tests to validate the new default and the skip behavior.
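The accelerator targeting with CPU fallback mentioned in the second item could look roughly like the sketch below. The function `resolve_device` and its device-to-provider mapping are hypothetical and are not taken from the PR; the provider strings themselves (`CPUExecutionProvider`, `CUDAExecutionProvider`) are standard ONNX Runtime names.

```python
from typing import Sequence, Tuple

# Standard ONNX Runtime provider names; the mapping itself is an illustrative assumption.
_PROVIDER_FOR_DEVICE = {
    "cpu": "CPUExecutionProvider",
    "gpu": "CUDAExecutionProvider",
}


def resolve_device(requested: str, available_providers: Sequence[str]) -> Tuple[str, str]:
    """Return (execution_provider, torch_device_string), falling back to CPU."""
    provider = _PROVIDER_FOR_DEVICE.get(requested.lower())
    if provider is None or provider not in available_providers:
        # Unknown device or provider not installed: fall back to CPU on both sides
        return "CPUExecutionProvider", "cpu"
    torch_device = "cuda" if provider == "CUDAExecutionProvider" else "cpu"
    return provider, torch_device
```

In practice the `available_providers` argument would typically come from `onnxruntime.get_available_providers()`, and the returned device string would be used to place the PyTorch reference model and its inputs on the matching device.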

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `olive/passes/onnx/discrepancy_check.py` | Adds speedup timing, new config params/defaults, and device-aware session/model handling. |
| `test/passes/onnx/test_discrepancy_check.py` | Adds tests for the new `timing_iterations` default and the “skip when 0” behavior. |

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

xadupre · 2026-06-11T17:45:59Z

@copilot test/passes/onnx/test_discrepancy_check.py:105: in test_compare_generation_returns_common_prefix_length
result = pass_instance.compare_generation(config, mock_ref_model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
olive/passes/onnx/discrepancy_check.py:389: in compare_generation
input_ids = input_ids.to(ref_model.device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E TypeError: to() received an invalid combination of arguments - got (MagicMock), but expected one of:
E * (torch.device device = None, torch.dtype dtype = None, bool non_blocking = False, bool copy = False, *, torch.memory_format memory_format = None)
E * (torch.dtype dtype, bool non_blocking = False, bool copy = False, *, torch.memory_format memory_format = None)
E * (Tensor tensor, bool non_blocking = False, bool copy = False, *, torch.memory_format memory_format = None)

Copilot · 2026-06-11T17:58:53Z

@copilot test/passes/onnx/test_discrepancy_check.py:105: in test_compare_generation_returns_common_prefix_length
result = pass_instance.compare_generation(config, mock_ref_model)
^^...

Fixed in 25c6ab4. Added mock_ref_model.device = torch.device("cpu") to both compare_generation test cases so that the mock returns a proper torch.device object instead of a MagicMock.
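The failure and its fix can be reproduced in isolation. The sketch below is dependency-free on purpose: the `Device` class is a hypothetical stand-in for `torch.device`, and `move_to` only imitates the argument strictness of `Tensor.to()`. Neither is part of the Olive test suite.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for torch.device, used to keep the sketch dependency-free."""

    type: str


def move_to(device):
    """Imitates the strictness of Tensor.to(): rejects anything that is not a real device."""
    if not isinstance(device, Device):
        raise TypeError(f"invalid device argument: {type(device).__name__}")
    return f"tensor on {device.type}"


mock_ref_model = MagicMock()

# Unconfigured attribute access on a MagicMock yields another MagicMock,
# which a strict API such as Tensor.to() rejects.
try:
    move_to(mock_ref_model.device)
    failed = False
except TypeError:
    failed = True

# Configuring the attribute explicitly, as the fix does with torch.device("cpu").
mock_ref_model.device = Device("cpu")
result = move_to(mock_ref_model.device)
```

With real PyTorch the equivalent line is the one quoted in the comment above, `mock_ref_model.device = torch.device("cpu")`.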

…ter and gathered for a dashboard (#2503)

## Describe your changes

The pass OnnxDiscrepancyCheck is used to gather metrics on canary models. These metrics need to be dumped so that the user, which can also be a dashboard, can inspect them in detail.

When `--test` is active, `add_discrepancy_check_pass` now injects `max_mae: 0.1` and `report_output_dir` (set to the run config's `output_dir`) into the discrepancy check pass configuration alongside `reference_model_path`.

Both `report_output_dir` and the CLI `save_discrepancy_check_results` helper normalize file paths to their parent directory to avoid writing artifacts under a path that looks like a file.

Fixed test regressions in `test_workflow_run_command_with_test_override` and `test_workflow_run_command_with_test_reuses_test_output_dir`:

- Set `mock_run.return_value = None` so `save_discrepancy_check_results` exits early at its null-guard check instead of attempting to serialize a `MagicMock` via `json.dumps`.
- Updated the `assert_called_once_with` expectation to include `max_mae` and `report_output_dir` in the expected discrepancy check pass config.

`OnnxDiscrepancyCheck` now raises a `ValueError` after persisting the report when any configured threshold is exceeded (numeric metric thresholds or generation token sequence minimum), failing the run instead of silently succeeding. This restores the documented "the pass fails if any configured threshold is exceeded" behavior for `--test` runs.

Replaced `print` statements in `olive/cli/base.py` with `logger.debug` calls to follow the project's logging conventions.

Merged upstream changes from main (`#2502`) that added device-aware ONNX session preparation and inference speedup measurement to `OnnxDiscrepancyCheck`. The merge combined the device/torch_device setup and enhanced `prepare_session` call from main with the `report_dir` normalization and reference model export introduced by this branch.

## Checklist before requesting a review

- [ ] Add unit tests for this change.
- [x] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [x] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

## (Optional) Issue link

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
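The "persist the report, then fail" ordering and the path normalization described in this commit message can be illustrated with a small sketch. Everything here is an assumption made for the example: the helper names, the report file name `discrepancy_report.json`, the JSON layout, and the use of a file suffix to decide that a path "looks like a file". The actual Olive implementation may differ on each point.

```python
import json
from pathlib import Path
from typing import Dict


def normalize_report_dir(path: str) -> Path:
    """Treat a path that looks like a file (has a suffix) as its parent directory."""
    p = Path(path)
    return p.parent if p.suffix else p


def save_report_then_check(
    metrics: Dict[str, float],
    thresholds: Dict[str, float],
    report_output_dir: str,
) -> Path:
    """Write the report first, then raise ValueError if any threshold is exceeded."""
    report_dir = normalize_report_dir(report_output_dir)
    report_dir.mkdir(parents=True, exist_ok=True)
    report_path = report_dir / "discrepancy_report.json"  # hypothetical file name
    report_path.write_text(json.dumps({"metrics": metrics, "thresholds": thresholds}, indent=2))

    # The check runs only after the report is on disk, so a failing run still leaves its data
    exceeded = [
        f"{name}={metrics[name]} > {limit}"
        for name, limit in thresholds.items()
        if name in metrics and metrics[name] > limit
    ]
    if exceeded:
        raise ValueError("thresholds exceeded: " + ", ".join(exceeded))
    return report_path
```

Writing before raising means a dashboard can still collect the metrics of a failed run, which appears to be the motivation given in the description.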

add speedup to OnnxDiscrepancyCheck

7713d07

xadupre commented Jun 5, 2026

Comment thread olive/passes/onnx/discrepancy_check.py

Copilot started work on behalf of xadupre June 5, 2026 17:15

Fix discrepancy speedup defaults and zero-iteration skip

510a86a

Copilot AI changed the title ~~add speedup to OnnxDiscrepancyCheck~~ Add OnnxDiscrepancyCheck speedup metric with default timing updates Jun 5, 2026

Copilot finished work on behalf of xadupre June 5, 2026 17:25

xadupre marked this pull request as ready for review June 5, 2026 17:28

Copilot AI review requested due to automatic review settings June 5, 2026 17:28

Copilot started reviewing on behalf of xadupre June 5, 2026 17:28

github-advanced-security AI found potential problems Jun 5, 2026

Comment thread test/passes/onnx/test_discrepancy_check.py Fixed

Comment thread test/passes/onnx/test_discrepancy_check.py Fixed

Copilot AI reviewed Jun 5, 2026

Comment thread olive/passes/onnx/discrepancy_check.py

Comment thread olive/passes/onnx/discrepancy_check.py

Copilot started work on behalf of xiaoyu-work June 5, 2026 20:40

Suppress protected-access pylint warnings in discrepancy check tests

3a43a02

Copilot finished work on behalf of xiaoyu-work June 5, 2026 20:47

Copilot AI requested a review from xiaoyu-work June 5, 2026 20:47

Potential fix for pull request finding

454d9a9

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot started work on behalf of xadupre June 8, 2026 12:43

xadupre and others added 2 commits June 8, 2026 14:43

Merge branch 'main' into xadupre/speedup

a1e7586

Fix generation tensor device handling

595bffc

Copilot finished work on behalf of xadupre June 8, 2026 12:50

Merge branch 'main' into xadupre/speedup

5629eda

Copilot started work on behalf of xadupre June 11, 2026 17:45

Copilot finished work on behalf of xadupre June 11, 2026 17:50

Copilot started work on behalf of xadupre June 11, 2026 17:51

Fix mock_ref_model.device in compare_generation tests

25c6ab4

Copilot finished work on behalf of xadupre June 11, 2026 17:59

xiaoyu-work approved these changes Jun 12, 2026

xiaoyu-work merged commit 86d7575 into main Jun 12, 2026
13 checks passed

xiaoyu-work deleted the xadupre/speedup branch June 12, 2026 22:05

Copilot AI mentioned this pull request Jun 13, 2026

dumps OnnxDiscrepancyCheck result on disc so they can be picked up later and gathered for a dashboard #2503

Merged

5 tasks

xiaoyu-work merged 8 commits into main from xadupre/speedup

5 participants