Use also upper counter for more accurate benchmarking  #22

@giordano

We can use __builtin_ipu_get_scount_u together with __builtin_ipu_get_scount_l to get the full 64-bit cycle counter. The challenge is that the two halves cannot be read atomically: when the builtins are called back to back (ideally the upper first, then the lower), the lower counter may overflow between the two reads and increment the upper one. We therefore need to handle manually the case where the lower counter reads less than 12 (or 6?), since such a small value suggests that it has just wrapped around.
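One possible way to handle the wrap-around, sketched here as an illustration rather than as what this package does, is the common "read upper, read lower, read upper again" pattern: if the two upper reads differ, the lower half overflowed in between and the read is retried. This is a different technique from the threshold check on the lower counter mentioned above. The functions `sim_scount_u` and `sim_scount_l` are hypothetical stand-ins that simulate the hardware so the sketch can run anywhere, and the cost of 6 cycles per read is an assumption for illustration only; on an IPU the real builtins would be passed instead.

```c
#include <stdint.h>

/* Hypothetical simulation of the hardware: a 64-bit cycle counter that
   advances by a fixed amount on every 32-bit read. On a real IPU these two
   readers would be __builtin_ipu_get_scount_u and __builtin_ipu_get_scount_l. */
static uint64_t sim_cycles = 0;
static const uint64_t SIM_CYCLES_PER_READ = 6; /* assumed cost, illustrative only */

static uint32_t sim_scount_u(void) {
    uint32_t value = (uint32_t)(sim_cycles >> 32);
    sim_cycles += SIM_CYCLES_PER_READ;
    return value;
}

static uint32_t sim_scount_l(void) {
    uint32_t value = (uint32_t)(sim_cycles & 0xFFFFFFFFu);
    sim_cycles += SIM_CYCLES_PER_READ;
    return value;
}

typedef uint32_t (*read32_fn)(void);

/* Combine two non-atomic 32-bit reads into a consistent 64-bit value.
   If the upper half changed while the lower half was being read, the lower
   half wrapped around in between, so the whole read is repeated. */
static uint64_t read_cycles64(read32_fn read_upper, read32_fn read_lower) {
    uint32_t hi, lo, hi_again;
    do {
        hi = read_upper();
        lo = read_lower();
        hi_again = read_upper();
    } while (hi != hi_again);
    return ((uint64_t)hi << 32) | lo;
}
```

The retry costs one extra upper read per measurement, so the overhead would need to be subtracted consistently when benchmarking short code sections. Since the lower half wraps only once every 2^32 cycles, the loop body is expected to run a second time only rarely.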

    Labels

    code generation: Related to GPUCompiler code generation infrastructure
