preserve SIMD element type information #155005
folkertdev wants to merge 1 commit into rust-lang:main
Some changes occurred in compiler/rustc_codegen_gcc
Some changes occurred in compiler/rustc_codegen_cranelift
cc @bjorn3
r? @nnethercote
rustbot has assigned @nnethercote.
// CHECK: define [2 x <1 x ptr>] @pair_ptrx1_t([2 x <1 x ptr>] {{.*}} %0)
#[unsafe(no_mangle)] extern "C" fn pair_ptrx1_t(x: Pair<Simd<*const (), 1>>) -> Pair<Simd<*const (), 1>> { x }

// When it fits in a 128-bit register, it's passed directly.
The types above are actually used by neon intrinsics. Below are a couple that technically work but are unlikely to come up in practice.
For any type smaller than 128 bits, padding is added, which means the type information is lost (but I think that is needed for ABI reasons?). Larger types are passed indirectly, so the type information is not needed there (but we do still technically provide it; maybe it's useful elsewhere).
Preserve the SIMD element type and provide it to LLVM for better optimization.
This is relevant for AArch64 types like int16x4x2_t, see also llvm/llvm-project#181514. Such types are defined like so:
Previously this would be translated to the opaque [2 x <8 x i8>], with this PR it is instead [2 x <4 x i16>]. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher.
This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support.
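To make "deinterleaving load" concrete, here is a rough scalar sketch of what such a load computes for a pair of 4-lane i16 vectors. The types `I16x4` and `I16x4x2` and the function `deinterleave_load` are hypothetical stand-ins invented for this illustration: the real `int16x4_t` is a `#[repr(simd)]` type, and this is not how the PR or stdarch implements the operation. The size check at the end only illustrates why the element type does not matter for the ABI: the aggregate occupies the same 16 bytes whether its lanes are described as i16 or as bytes.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a 64-bit vector of four i16 lanes.
/// A plain array is used (instead of `#[repr(simd)]`) so this compiles on stable Rust.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct I16x4([i16; 4]);

/// Hypothetical stand-in for a type shaped like `int16x4x2_t`: two vectors in a `repr(C)` struct.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct I16x4x2(I16x4, I16x4);

/// Scalar model of a two-way deinterleaving load: even-indexed elements
/// go to the first vector, odd-indexed elements to the second.
fn deinterleave_load(src: &[i16; 8]) -> I16x4x2 {
    let mut even = [0i16; 4];
    let mut odd = [0i16; 4];
    for lane in 0..4 {
        even[lane] = src[2 * lane];
        odd[lane] = src[2 * lane + 1];
    }
    I16x4x2(I16x4(even), I16x4(odd))
}

fn main() {
    let interleaved = [0, 10, 1, 11, 2, 12, 3, 13];
    let pair = deinterleave_load(&interleaved);
    assert_eq!(pair.0, I16x4([0, 1, 2, 3]));
    assert_eq!(pair.1, I16x4([10, 11, 12, 13]));

    // Same size whether the lanes are viewed as 2 x 4 x i16 or 2 x 8 x i8.
    assert_eq!(size_of::<I16x4x2>(), size_of::<[[u8; 8]; 2]>());
    println!("{:?}", pair);
}
```

A portable implementation would presumably express the same even/odd split with vector shuffles rather than a scalar loop; the loop is used here only because it runs unchanged on any target.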
discussion at #t-compiler > loss of vector element type information