Skip to content

Convert instrumentation metric reporting to Pydantic models#192

Merged
AlexJones0 merged 3 commits into
lowRISC:masterfrom
AlexJones0:instrumentation_refactor_2
May 14, 2026
Merged

Convert instrumentation metric reporting to Pydantic models#192
AlexJones0 merged 3 commits into
lowRISC:masterfrom
AlexJones0:instrumentation_refactor_2

Conversation

@AlexJones0
Copy link
Copy Markdown
Contributor

@AlexJones0 AlexJones0 commented May 13, 2026

This PR is the third of a series of PRs to introduce instrumentation reporting to DVSim, which in this case also involves refactoring the existing instrumentation implementation to allow better modeling & static typing around this use case.

This PR features the main overhaul to the instrumentation data model logic. Unfortunately this PR is on the slightly larger side by necessity, as it would otherwise require a lot of extraneous code & commits to make the migration cleanly. But the code should hopefully be relatively easy to review. The main changes involved are:

  1. Replace all existing metric dataclasses with Pydantic models, and move them into a central instrumentation/records.py file. Stop re-exporting so many of these through the instrumentation/__init__.py package.
  2. Add the ability to independently retrieve the scheduler & job metrics from each instrumentation.
  3. Identify each instrumentation with a stable name attribute on the class itself.
  4. Instead of collapsing the reported instrumentation metrics into some nested dictionary of arbitrary data, construct a Pydantic model for the combined instrumentation report. Note that this does change the generated output instrumentation logic slightly (but it should be an improvement in readability, in general).

For the 1st commit, I recommend reviewing base.py, then records.py alongside (metadata.py, timing.py, resources.py), and finally __init__.py for the easiest time.

This will allow us to introduce reporting logic later on which can consume the instrumentation data whilst being confident in the static typing provided. Plugins can then still be supported via Pydantic's extra=allow config on the combined scheduler and job instrumentation data models.

See the commit messages for more information.

Copy link
Copy Markdown
Collaborator

@machshev machshev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AlexJones0! This looks like a nice tidy up...

@AlexJones0 AlexJones0 force-pushed the instrumentation_refactor_2 branch from 9692f19 to 0f1d1a5 Compare May 14, 2026 10:18
@AlexJones0 AlexJones0 marked this pull request as ready for review May 14, 2026 10:18
@AlexJones0
Copy link
Copy Markdown
Contributor Author

The latest force push is a rebase on master now that #189 has been merged, with no other changes.

Migrate from Python dataclasses to Pydantic models for instrumentation
metric reports, with a central `instrumentation/records.py` file
containing each of the definition. In a future commit, this will allow
us to easily make a statically typed Pydantic model for a full
instrumentation report of the combined results.

To promote better typing, this also changes the
`SchedulerInstrumentation` interface such that you can now separately
query the collected scheduler and job data, to provide a slightly more
intuitive interface. This change is then propagated across the various
built-in instrumentation types.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Rather than just relying on the class `__name__`, we will want a well-
defined field that matches the various fields we are using within our
Pydantic model. This will allow plugins to declare a 'name' that can
then be queried to get their specific metrics.

Note that there is now some overlap with the name that is used in the
instrumentation registry/factory, but that will be addressed in a future
commit.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Get rid of the existing logic that collapsed everything to arbitrary
dictionaries and ensure that the merged instrumentation report is a
well-formed Pydantic model that can then be confidently queried, with
extra attributes (instrumentation records) still allowed to support
custom instrumentations registered via e.g. plugins.

Note that there is slightly less flattening on the produced JSONs
(this is a design change) - each independent job (and the scheduler
itself) now contains many dictionaries, with one per instrumentation,
to effectively scope each instrumentation's contribution to the final
report. This doesn't impact readability too much and will importantly
allow us to load the reports back in the future.

The intention here is that built-in instrumentation types have a
well-defined field in the report model that can be queried, whereas any
additional plugin data is still made available through the Pydantic
`extra="allow"` configuration option.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
@AlexJones0 AlexJones0 force-pushed the instrumentation_refactor_2 branch from 0f1d1a5 to c9bbf0a Compare May 14, 2026 10:58
@AlexJones0
Copy link
Copy Markdown
Contributor Author

And the latest force-push just fixed a bad copy-paste in a docstring 😅

@AlexJones0 AlexJones0 added this pull request to the merge queue May 14, 2026
Merged via the queue into lowRISC:master with commit 0c6432a May 14, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants