I was enabling Miri tests in my SIMD wrapper crate and ran into UB for the vld3 intrinsics.
The error below is for vld3q_f32 which returns a float32x4x3_t.
test aarch64::tests::test_vld3q_f32 ... error: Undefined Behavior: memory access failed: attempting to access 64 bytes, but got alloc751281 which is only 48 bytes from the end of the allocation
--> \.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\..\..\stdarch\crates\core_arch\src\macros.rs:250:20
|
250 | let w: W = ptr::read_unaligned($ptr as *const W);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Undefined Behavior occurred here
|
::: \.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\..\..\stdarch\crates\core_arch\src\arm_shared\neon\generated.rs:22080:5
|
22080 | crate::core_arch::macros::deinterleaving_load!(f32, 4, 3, a)
| ------------------------------------------------------------ in this macro invocation
|
This comment on a std::simd private load function warns about the interaction of repr(simd) and read_unaligned.
/// This function is necessary since `repr(simd)` has padding for non-power-of-2 vectors (at the time of writing).
/// With padding, `read_unaligned` will read past the end of an array of N elements.
core_arch's Simd is repr(simd), which rounds the size up to the next power of two, so Simd<f32, 12> occupies 64 bytes instead of 48.
    #[repr(simd)]
    #[derive(Copy)]
    pub(crate) struct Simd<T: SimdElement, const N: usize>([T; N]);
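To make the size mismatch concrete, here is a minimal sketch of the arithmetic on stable Rust. It does not use repr(simd) itself (that needs a nightly feature); `padded_size` is a hypothetical helper that assumes the padded size is simply the payload size rounded up to the next power of two, as described above.

```rust
use std::mem::size_of;

/// Hypothetical helper, not part of stdarch: the size in bytes of an
/// N-lane vector of T, ASSUMING the layout is rounded up to the next
/// power of two. This mirrors the behaviour described in the issue and
/// is not a layout guarantee.
fn padded_size<T, const N: usize>() -> usize {
    (N * size_of::<T>()).next_power_of_two()
}

fn main() {
    // What the caller actually owns: 12 contiguous f32 values.
    let unpadded = size_of::<[f32; 12]>();
    // What a padded 12-lane vector would occupy under the assumption above.
    let padded = padded_size::<f32, 12>();
    println!(
        "array: {unpadded} bytes, padded vector: {padded} bytes, over-read: {} bytes",
        padded - unpadded
    );
}
```

Under that assumption the numbers match the Miri report: a 64-byte access against an allocation with only 48 bytes available.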
Thus, type W here, Simd<f32, 12>, is actually 64 bytes, so read_unaligned reads 16 bytes past the end of the data when a pointer to [f32; 12] is passed, which Miri reports as UB.
    ($elem:ty, $lanes:literal, 3, $ptr:expr) => {{
        use $crate::core_arch::macros::deinterleave_mask;
        use $crate::core_arch::simd::Simd;
        use $crate::{mem::transmute, ptr};

        type V = Simd<$elem, $lanes>;
        type W = Simd<$elem, { $lanes * 3 }>;

        let w: W = ptr::read_unaligned($ptr as *const W);

        let v0: V = simd_shuffle!(w, w, deinterleave_mask::<$lanes, 3, 0>());
        let v1: V = simd_shuffle!(w, w, deinterleave_mask::<$lanes, 3, 1>());
        let v2: V = simd_shuffle!(w, w, deinterleave_mask::<$lanes, 3, 2>());

        transmute((v0, v1, v2))
    }};
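One possible way to avoid the over-read is to read through a pointer to an array of exactly 12 elements, so that only 48 bytes are accessed, and deinterleave from there. The sketch below illustrates the idea on stable Rust with plain arrays. `deinterleaving_load_3x4` is a hypothetical stand-in, not the stdarch macro and not a proposed patch, and it uses scalar loops in place of simd_shuffle!.

```rust
use std::ptr;

/// Hypothetical stand-in for a 3-way deinterleaving load of 4-lane vectors.
/// It reads exactly 12 `f32` values (48 bytes), so no padding is touched.
///
/// # Safety
/// `ptr` must be valid for reads of 12 consecutive `f32` values.
unsafe fn deinterleaving_load_3x4(ptr: *const f32) -> [[f32; 4]; 3] {
    // The pointee type is `[f32; 12]`, so this access is 48 bytes wide,
    // unlike a read through a padded 64-byte vector type.
    let w: [f32; 12] = unsafe { ptr::read_unaligned(ptr as *const [f32; 12]) };

    // Element `lane * 3 + vec` of the interleaved input goes to
    // lane `lane` of output vector `vec`.
    let mut out = [[0.0f32; 4]; 3];
    for lane in 0..4 {
        for vec in 0..3 {
            out[vec][lane] = w[lane * 3 + vec];
        }
    }
    out
}

fn main() {
    // Interleaved input: 0.0, 1.0, ..., 11.0
    let data: [f32; 12] = std::array::from_fn(|i| i as f32);
    // SAFETY: `data` holds exactly 12 f32 values.
    let [v0, v1, v2] = unsafe { deinterleaving_load_3x4(data.as_ptr()) };
    println!("{v0:?} {v1:?} {v2:?}");
}
```

Because the read never extends past the 12 elements the caller provides, the out-of-bounds access from the Miri report does not arise in this version. Whether the same approach fits the real macro depends on how the loaded value is then placed into the padded vector type.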
Sources for the two snippets above: the Simd definition is stdarch/crates/core_arch/src/simd.rs, lines 39 to 41 in 949ae81; the macro arm is stdarch/crates/core_arch/src/macros.rs, lines 253 to 268 in 949ae81.