preserve SIMD element type information#155005
Conversation
This comment has been minimized.
This comment has been minimized.
c1444a0 to
05b0368
Compare
This comment has been minimized.
This comment has been minimized.
c9dd7e6 to
68ed562
Compare
|
Some changes occurred in compiler/rustc_codegen_gcc Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 |
|
r? @nnethercote rustbot has assigned @nnethercote. Use Why was this reviewer chosen?The reviewer was selected based on:
|
| // CHECK: define [2 x <1 x ptr>] @pair_ptrx1_t([2 x <1 x ptr>] {{.*}} %0) | ||
| #[unsafe(no_mangle)] extern "C" fn pair_ptrx1_t(x: Pair<Simd<*const (), 1>>) -> Pair<Simd<*const (), 1>> { x } | ||
|
|
||
| // When it fits in a 128-bit register, it's passed directly. |
There was a problem hiding this comment.
the types above are actually used by neon intrinsics. Below are a couple that technically work but are unlikely to come up practically.
Then for any type is smaller than 128-bit padding is added which means the type information is lost (but I think that is needed for ABI reasons?). Larger types are passed indirectly, so the type information is not needed there (but we do still technically provide it, maybe it's useful elsewhere).
68ed562 to
fbab8b2
Compare
| Primitive::Pointer(_) => cx.type_ptr(), | ||
| }; | ||
|
|
||
| let len = self.size.bytes() / hint_vector_elem.size(cx).bytes(); |
There was a problem hiding this comment.
Is it worth asserting that this division is exact?
| /// by always picking i8) by codegen backends. | ||
| /// | ||
| /// The element kind is used to provide more accurate type information to the backend, which | ||
| /// helps with optimization (e.g. because it prevents extra bitcasts that obscure a pattern). |
There was a problem hiding this comment.
This comment is repetitive ("optimization" appears twice) and I found "The element kind" confusing -- I thought it was referring to something new, before I realized it's just referring back to hint_vector_elem.
A suggested alternative:
The
hint_vector_elemis strictly for optimization purposes. E.g. it can be used by a codegen backend to prevent extra bitcasts that obscure a pattern. Alternatively, it can be safely ignored by always pickingi8.
and provide it to LLVM for better optimization
fbab8b2 to
6f428df
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
@rustbot ready |
There was a problem hiding this comment.
LGTM. I'm no expert on this stuff so I will cc @nikic to give them a chance to look if they want.
|
Eh, good enough. Any genuine problems can be fixed in a follow-up. @bors r+ rollup |
Rollup merge of #155005 - folkertdev:simd-element-type-llvm, r=nnethercote preserve SIMD element type information Preserve the SIMD element type and provide it to LLVM for better optimization. This is relevant for AArch64 types like `int16x4x2_t`, see also llvm/llvm-project#181514. Such types are defined like so: ```rust #[repr(simd)] struct int16x4_t([i16; 4]); #[repr(C)] struct int16x4x2_t(pub int16x4_t, pub int16x4_t); ``` Previously this would be translated to the opaque `[2 x <8 x i8>]`, with this PR it is instead `[2 x <4 x i16>]`. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher. This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support. discussion at [#t-compiler > loss of vector element type information](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/loss.20of.20vector.20element.20type.20information/with/584483611)
Preserve the SIMD element type and provide it to LLVM for better optimization.
This is relevant for AArch64 types like
int16x4x2_t, see also llvm/llvm-project#181514. Such types are defined like so:Previously this would be translated to the opaque
[2 x <8 x i8>], with this PR it is instead[2 x <4 x i16>]. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher.This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support.
discussion at #t-compiler > loss of vector element type information