I'm trying to add an implementation of vec_extract for the s390x-unknown-linux-gnu target in stdarch:

https://godbolt.org/z/e65Mvf5vM

Turns into this LLVM

And generates the following assembly

    foo:
        nilf    %r2, 3             # bitwise and with 0b11 to prevent UB due to index out of bounds
        vlgvf   %r0, %v24, 0(%r2)  # put the element at index %r2 from the vector %v24 into %r0
        llgfr   %r2, %r0           # zero-extend that value, put it into %r2
        br      %r14               # ret

In particular, the vlgvf (f for fullword, there are variations for other widths) is the relevant instruction here. It extracts the value at the given index.

Contrary to most other targets, the index argument to a vec_extract does not need to be const. The std::intrinsics::simd::simd_extract function does need its index argument to be const, and therefore can't straightforwardly be used to implement vec_extract.
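As a rough illustration of the difference between the two index styles, the following stable-Rust sketch models a vector as a plain [u32; 4]. All three function names are hypothetical and exist only for this example; none of them is a stdarch or core API, and a plain array is not expected to produce the s390x vector instructions. extract_runtime mirrors the nilf / vlgvf / llgfr sequence above, extract_const mimics an extract whose lane must be a compile-time constant, and extract_dispatch shows why a runtime index cannot simply be forwarded to such a function.

```rust
/// Hypothetical model of an extract whose lane index must be a constant.
pub fn extract_const<const LANE: usize>(v: [u32; 4]) -> u32 {
    v[LANE]
}

/// Hypothetical model of the runtime-index sequence shown in the assembly.
pub fn extract_runtime(v: [u32; 4], idx: u32) -> u64 {
    let lane = (idx & 0b11) as usize; // nilf: mask the index into 0..4
    let elem = v[lane];               // vlgvf: fetch the selected element
    u64::from(elem)                   // llgfr: zero-extend to 64 bits
}

/// A runtime index can only reach the const-index function through
/// one call site per lane, each with its own literal constant.
pub fn extract_dispatch(v: [u32; 4], idx: u32) -> u32 {
    match idx & 0b11 {
        0 => extract_const::<0>(v),
        1 => extract_const::<1>(v),
        2 => extract_const::<2>(v),
        _ => extract_const::<3>(v),
    }
}
```

For example, extract_runtime([10, 20, 30, 40], 6) returns 30, because 6 & 0b11 is 2.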
Attempt 1
I tried simple field extraction:
https://godbolt.org/z/sbhYj316x
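The code behind the link is not reproduced in this text. As a rough sketch of what field extraction means here, assume a hypothetical wrapper VectorUnsignedInt around [u32; 4]. The real stdarch vector types are #[repr(simd)], which needs nightly, so this stable stand-in uses #[repr(C)] and should not be expected to produce the same codegen. The mask on the index is also an assumption, borrowed from the assembly above.

```rust
/// Hypothetical stand-in for a 4 x u32 vector type (not the stdarch type).
#[repr(C)]
#[derive(Clone, Copy)]
pub struct VectorUnsignedInt(pub [u32; 4]);

/// Index straight into the wrapped array with a runtime index.
/// The mask keeps the index in 0..4, so the bounds check cannot fail.
pub extern "C" fn foo(v: VectorUnsignedInt, idx: u32) -> u32 {
    v.0[(idx & 0b11) as usize]
}
```

With this stand-in, foo(VectorUnsignedInt([1, 2, 3, 4]), 7) returns 4, because 7 & 0b11 is 3.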
(extern "C" is used so that the vector is passed by-value, but we see the same assembly when the vector is created within the function)

Indexing into the underlying array in this way may soon be banned, though there is an alternative approach that does the same thing. Unfortunately, this version does not optimize well:

The portable-simd implementation of Index appears to be doing the same thing, and generates the same code https://godbolt.org/z/eecM6qdbW. That totally makes sense for most targets, because a pointer load is the best you can do.

Attempt 2
I did find that this version does optimize well
but that is kind of unwieldy, and while I can make it work for stdarch it won't work for portable_simd.

Solutions
I think there should be a way of indexing into a vector that emits an extractelement rather than a getelementptr. Semantically that seems more accurate (and might optimize better in some cases?), even though on most targets the generated assembly will be the same.

Some things I'm not sure about

- how big a deal this is for s390x (cc @uweigand do you know how important generating vlgvf is?)
- [...] repr(simd) types compiler-team#838 wants to fix
- [...] const value is a terrible idea on some/most targets and unimplemented by design