Ensure Vector256.Dot produces a V256 result#88712
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsThis resolves #88704
|
|
/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr outerloop, runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
CC. @dotnet/jit-contrib, this resolves #88704 and the last of the outerloop HWIntrinsic related failures. Remaining jitstress failures are in other areas. |
| @@ -4462,29 +4462,30 @@ GenTree* Lowering::LowerHWIntrinsicDot(GenTreeHWIntrinsic* node) | |||
| assert(comp->compIsaSupportedDebugOnly(InstructionSet_AVX)); | |||
|
|
|||
| // We will be constructing the following parts: | |||
There was a problem hiding this comment.
The issue was simply that we were returning a V128 still when we should've been returning a full V256
The old code was basically doing:
vdpps ymm0, ymm1, ymm2, -15
vextractf128 xmm1, ymm0, 1
vaddps xmm0, xmm0, xmm1The new code instead does:
vdpps ymm0, ymm1, ymm2, -1
vperm2f128 ymm1, ymm0, ymm0, 1
vaddps ymm0, ymm0, ymm1So we have the same general size of code and are simply swapping lower/upper rather than extracting upper, we then do a 256-bit add rather than a 128-bit add.
In a couple edge cases, we do end up with slightly better code due to how the register allocator handles the uses of the clones and so we see a very small -51 bytes disasm improvement.
|
|
||
| tmp2 = comp->gtNewSimdBinOpNode(GT_ADD, TYP_SIMD16, tmp3, tmp1, simdBaseJitType, 16); | ||
| BlockRange().InsertAfter(tmp1, tmp2); | ||
| NamedIntrinsic permute2x128 = (simdBaseType == TYP_DOUBLE) ? NI_AVX_Permute2x128 : NI_AVX2_Permute2x128; |
There was a problem hiding this comment.
the debug assert only checks AVX: assert(comp->compIsaSupportedDebugOnly(InstructionSet_AVX)); in this case, should it be updated to avx2?
There was a problem hiding this comment.
We have an assert a bit higher up already. We could replicate it here if you really want it to be.
This resolves #88704