Implement SVE2 Xor, XorRotateRight#115891
Conversation
src/coreclr/jit/hwintrinsic.h
Outdated
There was a problem hiding this comment.
Why can't we just use HW_Category_ShiftRightByImmediate?
There was a problem hiding this comment.
I was trying to differentiate the two types of modifier from the start, just in case they do need to be handled differently in future. I'm not sure if they will though, they are very similar.
There was a problem hiding this comment.
in future.
let's introduce the new flag when there is a need. For now, just reuse HW_Category_ShiftRightByImmediate
src/coreclr/jit/hwintrinsicarm64.cpp
Outdated
There was a problem hiding this comment.
This is not needed. It will be handled by else if (category == HW_Category_ShiftRightByImmediate) logic above.
|
@kunalspathak I think these failures are unrelated. The one failing on WoA is related to threading. |
src/coreclr/jit/lowerarmarch.cpp
Outdated
There was a problem hiding this comment.
i don't think we need this as well. It should get handled in intrin.category == HW_Category_ShiftRightByImmediate condition above.
| HARDWARE_INTRINSIC(Sve2, BitwiseSelectRightInverted, -1, 3, {INS_sve_bsl2n, INS_sve_bsl2n, INS_sve_bsl2n, INS_sve_bsl2n, INS_sve_bsl2n, INS_sve_bsl2n, INS_sve_bsl2n, INS_sve_bsl2n, INS_invalid, INS_invalid}, HW_Category_SIMD, HW_Flag_Scalable|HW_Flag_SpecialCodeGen|HW_Flag_HasRMWSemantics) | ||
| HARDWARE_INTRINSIC(Sve2, ShiftLeftAndInsert, -1, 3, {INS_sve_sli, INS_sve_sli, INS_sve_sli, INS_sve_sli, INS_sve_sli, INS_sve_sli, INS_sve_sli, INS_sve_sli, INS_invalid, INS_invalid}, HW_Category_ShiftLeftByImmediate, HW_Flag_Scalable|HW_Flag_HasImmediateOperand|HW_Flag_HasRMWSemantics) | ||
| #define LAST_NI_Sve2 NI_Sve2_ShiftLeftAndInsert | ||
| HARDWARE_INTRINSIC(Sve2, Xor, -1, 3, {INS_sve_eor3, INS_sve_eor3, INS_sve_eor3, INS_sve_eor3, INS_sve_eor3, INS_sve_eor3, INS_sve_eor3, INS_sve_eor3, INS_invalid, INS_invalid}, HW_Category_SIMD, HW_Flag_Scalable|HW_Flag_SpecialCodeGen|HW_Flag_HasRMWSemantics) |
There was a problem hiding this comment.
remind me, but why we need HW_Flag_SpecialCodeGen for this one? Did you try without adding this flag and what problems we run into?
There was a problem hiding this comment.
It's because the instruction only accepts a D lane arrangement but the intrinsic attempts to pass B/H/S based on the BaseJitType. As it's bitwise it's OK to override it.
Maybe this could be generalized later with more flags to deal with a fixed lane arrangement, or the emitter could just be modified to be smarter about the options passed in. We'd need more space for flags if we wanted to go down the first route. For now I've just used SpecialCodeGen.
kunalspathak
left a comment
There was a problem hiding this comment.
Added some more comments.
@a74nh @kunalspathak
Contributing towards #115479