Mlir Backend: Tensor Dialect Support #45
It is really cool! Congrats! When you say that you are waiting for an xDSL release, is it because you know that the feature will be added?
Thanks! Yeah, I contributed tensor.pad, but it's probably easiest to just wait for the next release to add it to xtc.
I will need some time to review this! Please ping me when it's ready!
@qaco Sorry for the delay. During my break I thought I had an idea to make the pad better, but it turns out it wouldn't work as I expected. The PR is ready to review. I'm currently working on making the compile time not insanely terrible, but the majority of the tensor dialect PR would be unchanged.
guillon left a comment
It's fine for me, @liamsemeria you can merge when ready.
Note for future work:
- may use the xDSL pad directly
- need to study performance issues with one-shot-bufferize
If there are further devs to do w.r.t. these points, we will make new PRs.
@guillon I don't have the option to merge on my end, but I rebased it on main and it's ready to merge.
@qaco The PR is ready to merge, can you check?
also added multi-output graphs and cleanup passes needed for fusion
also removed pre tensor lowering ir dump
…added using_tensors_hint also added matmul layout test, comments and readability changes
@liamsemeria why do we need an additional setting
@guillon It's needed to enable tensor-specific changes to the compiler passes: moving the vectorization lowering patterns to after bufferization, and using the FoldUnitDims pattern. If it were enabled by default, the major change would be a lingering transform.sequence (the post-vectorization patterns) in the [...] A way the [...] I can explain it more tomorrow if you need any clarification.
I need to understand first the
Should we make the integration of tensors transparent? For example: if nothing happens at the tensor level, the tensors are immediately bufferized, and then the buffer-level transformations occur.
Good point. Do you think it's feasible to have a single transform pass @liamsemeria?
Replaced using_tensor_hint with automatic tensor detection; this will be removed later, since doing so requires updating all memref tests, which is best done in a separate PR. Fixed an issue where UnitDims folding was applied to the entire function instead of only the specified vectorized loop.
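For readers following the thread, here is a minimal sketch of what "detecting tensors automatically" and the reordered passes could look like. It is not the code from this PR: `module_uses_tensors`, `plan_stages`, and the stage labels are hypothetical names, and the detection is a plain text search that assumes the IR is available as a string.

```python
import re

# Hypothetical helper, not part of xtc: matches ranked and unranked
# tensor types such as tensor<4x4xf32> or tensor<*xf32> in textual MLIR.
_TENSOR_TYPE = re.compile(r"\btensor<")


def module_uses_tensors(mlir_text: str) -> bool:
    """Return True if a tensor type appears anywhere in the IR text."""
    return _TENSOR_TYPE.search(mlir_text) is not None


def plan_stages(mlir_text: str) -> list:
    """Pick an illustrative stage order (labels are not real pass names)."""
    if module_uses_tensors(mlir_text):
        # Tensor path: vectorization lowering patterns run after bufferization.
        return ["transform", "bufferize", "vectorization_lowering"]
    # Memref path: no bufferization stage is needed.
    return ["transform_with_vectorization_lowering"]


memref_ir = "func.func @f(%a: memref<4x4xf32>) { return }"
tensor_ir = (
    "func.func @f(%a: tensor<4x4xf32>) -> tensor<4x4xf32> "
    "{ return %a : tensor<4x4xf32> }"
)
```

A text search like this can be fooled by a string attribute or comment that happens to contain `tensor<`; inspecting the operand and result types of the parsed module would be more robust.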
Motivation
Support for ops in the tensor dialect allows for tracking of producer-consumer relationships and broadcasting, which allow for operator fusion and element-wise operations respectively.
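As a rough illustration of why knowing the producer-consumer relationship matters, the sketch below contrasts an unfused producer/consumer pair with a fused version in plain Python. The functions and the bias-then-relu example are illustrative assumptions, not ops or code from this PR.

```python
def unfused(xs, bias):
    """Producer and consumer run separately, with an intermediate buffer."""
    tmp = [x + bias for x in xs]       # producer writes the intermediate
    return [max(t, 0.0) for t in tmp]  # consumer reads it back


def fused(xs, bias):
    """Same computation in one pass, with no intermediate buffer."""
    return [max(x + bias, 0.0) for x in xs]


xs = [-2.0, 0.5, 3.0]
```

Both functions return the same values; fusion only becomes a safe rewrite once the compiler can see that the intermediate has exactly this one consumer, which is the information the tensor dialect keeps explicit.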
Description
The mlir backend now has an option `use_tensor_dialect` that causes ops to be generated in the tensor dialect. The tensor dialect gets lowered into memref by a new bufferization pass after the transform pass is applied (can be printed with `print_bufferization_ir=True`).
Discussion
How the Tensor Dialect Affects the IR:
matmul and conv2d:
Bufferization results in exactly the same lowered MLIR as the memref dialect ops (at least in the unscheduled case).
relu:
Collapsing the shape to 1 dim requires the tensor to be expanded again afterwards (unlike the memref), which results in an extra memory allocation after bufferization. The relu for the tensor dialect is therefore non-collapsing, which is also required for consumer fusion to work properly in the future.
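The two relu variants can be compared with a small plain-Python sketch. Nested lists stand in for tensors here, so this only shows the shape of the computation (the collapsing route needs an extra expand step to restore the original shape) and says nothing about actual MLIR allocations. Both function names are illustrative.

```python
def relu_nd(x):
    """Non-collapsing form: apply ReLU element-wise, keeping the nesting."""
    if isinstance(x, list):
        return [relu_nd(v) for v in x]
    return max(x, 0.0)


def relu_collapsed(x2d):
    """Collapsing form: flatten to 1-D, apply ReLU, then expand back."""
    rows, cols = len(x2d), len(x2d[0])
    flat = [v for row in x2d for v in row]  # collapse to 1 dim
    flat = [max(v, 0.0) for v in flat]      # 1-D ReLU
    # Expand back to rows x cols: this extra step builds a new result.
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


x = [[-1.0, 2.0], [3.0, -4.0]]
```

Both routes produce the same values; the non-collapsing form simply never leaves the original shape.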
pad and unpad:
The tensor implementation uses a `linalg.generic`, which is needed for fusion. It has dynamic dims, which requires an update to the extra tools (mlir: updated extra-tools version #70) for the C backend to work properly.
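One way to picture pad and unpad as a computation over the output index space is sketched below in plain Python. This is an assumption made for illustration, not the body of the `linalg.generic` in this PR; `pad2d`, `unpad2d`, and their parameters are hypothetical. The output sizes are computed at run time from the input, loosely mirroring the dynamic dims mentioned above.

```python
def pad2d(x, low, high, value=0.0):
    """Pad a 2-D list with `value`; low/high are (rows, cols) pad amounts."""
    rows, cols = len(x), len(x[0])
    out_rows = rows + low[0] + high[0]
    out_cols = cols + low[1] + high[1]
    out = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            si, sj = i - low[0], j - low[1]  # map output index to source index
            inside = 0 <= si < rows and 0 <= sj < cols
            row.append(x[si][sj] if inside else value)
        out.append(row)
    return out


def unpad2d(y, low, high):
    """Inverse of pad2d: slice the padded border off again."""
    return [
        row[low[1]:len(row) - high[1]]
        for row in y[low[0]:len(y) - high[0]]
    ]


x = [[1.0, 2.0], [3.0, 4.0]]
padded = pad2d(x, low=(1, 1), high=(1, 1))
```

Here `padded` is a 4x4 list with the original values in the center, and `unpad2d(padded, (1, 1), (1, 1))` recovers `x`.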