Check Avx512BW.IsSupported in BitArray.CopyTo()#114818
Check Avx512BW.IsSupported in BitArray.CopyTo()#114818tannergooding merged 2 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @dotnet/area-system-collections |
| Vector128<byte> upperShuffleMask_CopyToBoolArray = Vector128.Create(0x02020202_02020202, 0x03030303_03030303).AsByte(); | ||
|
|
||
| if (Avx512F.IsSupported && (uint)m_length >= Vector512<byte>.Count) | ||
| if (Avx512BW.IsSupported && (uint)m_length >= Vector512<byte>.Count) |
There was a problem hiding this comment.
Since you're touching this anyways, would you want to update this to use the xplat APIs instead?
That is, swap these out:
Avx512BW.IsSupported->Vector512.IsHardwareAcceleratedAvx512BW.Shuffle(x, y)->Vector512.Shuffle(x, y)Avx512F.And(x, y)->x & yAvx512BW.Min(x, y)->Vector512.Min(x, y)Avx512F.Store(x, y)->y.Store(x)
Similar changes could also be made to the Avx2 path (using Vector256) and so on.
This change to Avx512BW is notably "more correct" than Avx512F, but notably it's not technically broken today since we require AVX512F+BW+CD+DQ+VL to be provided for any AVX512 support to exist. It's still goodness to fix, but moving away from the platform specific APIs altogether would be even better.
There was a problem hiding this comment.
About Vector512.Shuffle(x, y)
runtime/src/libraries/System.Collections/src/System/Collections/BitArray.cs
Lines 928 to 930 in 1a51788
It seems that Arm64 might not have the proper instruction to support
VectorXXX.Shuffle<byte>(x, y). Could it happen that VectorXXX.IsHardwareAccelerated is true but VectorXXX.Shuffle<byte>(x, y) has poor performance on some platforms?
There was a problem hiding this comment.
It seems that Arm64 might not have the proper instruction to support VectorXXX.Shuffle(x, y). C
That comment is very outdated, VectorXXX.Shuffle covers the usage and does the correct thing.
Could it happen that VectorXXX.IsHardwareAccelerated is true but VectorXXX.Shuffle(x, y) has poor performance on some platforms?
Not for the way we're using it here. Theoretically it could happen if you were on a machine from 2005 and didn't have SSSE3 but hit the Vector128.IsHardwareAccelerated code path. This isn't really a concern for anyone using the current versions of .NET and could be adjusted later if it was a concern.
There was a problem hiding this comment.
Today, Vector256.IsHardwareAccelerated strictly implies AVX2 support, and Vector512.IsHardwareAccelerated strictly implies AVX-512F+CD+DQ+BW+VL. Arm64 support for larger vectors comes in the form of Scalable Vector Extensions (SVE), which does not allow acceleration of the fixed-width vector types.
The Vector128 code in BitArray is outdated and could also use the xplat intrinsics.
There was a problem hiding this comment.
Shuffle attempts to normalize behavior across platforms, including zeroing out of range indices and allowing full vector (cross-lane) shuffle/permute. If you don't need that behavior, you can use the new Vector512.ShuffleNative, which will emit the expected vpshufb.
vpermb is actually AVX512_VBMI, not just AVX-512F+CD+DQ+BW+VL
This is not a problem. The baseline ISA set is required for any acceleration of Vector512, but JIT can opportunistically use any ISA supported by the hardware. One of the benefits of the xplat vector APIs is that JIT can optimize or polyfill using whatever instructions are available.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Since you're touching this anyways, would you want to update this to use the xplat APIs instead?
@tannergooding , should this PR be merged or closed in the interim?
There was a problem hiding this comment.
Lets take it, as it is technically correct and will ensure valid behavior in the case other runtimes use the same libraries and have their own distinct support. (There's no bug on RyuJIT today due to how we've defined things to work for ourselves)
|
/ba-g unrelated build failures, simply updated an Isa.IsSupported check to be more correct |
runtime/src/libraries/System.Collections/src/System/Collections/BitArray.cs
Line 855 in 7e297e6
runtime/src/libraries/System.Collections/src/System/Collections/BitArray.cs
Line 860 in 7e297e6
It's using
Avx512BW, so it should checkAvx512BW.IsSupportedfirst. CheckingAvx512F.IsSupportedis not enough, as there are (old) CPUs supporting Avx512F but not Avx512BW according to https://en.wikipedia.org/w/index.php?title=AVX-512&oldid=1281313848#CPUs_with_AVX-512.