JIT ARM64-SVE: add test coverage for *Faulting* cases to make sure inactive lanes are zeroed out (#105580)#106139
Conversation
…active lanes are zeroed out (dotnet#105580)
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
@tannergooding @kunalspathak - could you take a look please |
|
Fixes #105580. |
| where TElem : INumberBase<TElem> | ||
| { | ||
| for (var i = 0; i < firstOp.Length; i++) | ||
| TElem[] mask = new TElem[result.Length]; |
There was a problem hiding this comment.
can you please fix the alignment?
| private static readonly int LargestVectorSize = {LargestVectorSize}; | ||
|
|
||
| private static readonly int RetElementCount = Unsafe.SizeOf<{RetVectorType}<{RetBaseType}>>() / sizeof({RetBaseType}); | ||
| private static readonly int Op1ElementCount = Unsafe.SizeOf<{Op1VectorType}<{Op1BaseType}>>() / sizeof({Op1BaseType}); |
There was a problem hiding this comment.
i was hoping to see similar validation in other FirstFaulting.template.
There was a problem hiding this comment.
The other two FirstFaulting templates we have at the moment are SveGatherVectorFirstFaultingVectorBases.template and SveGatherVectorFirstFaultingIndices.template. Both have a similar validation already. Please check usage of the following variables.
You may also note that the ValidateFirstFaultingResult methods accept load masks as first parameter:
| return false; | ||
| } | ||
|
|
||
| public static bool CheckLoadVectorFirstFaultingBehavior<TMem, TElem, TFault>(TMem[] firstOp, TElem[] result, Vector<TFault> faultResult) |
There was a problem hiding this comment.
I believe this comment doesn't reference the line you intended. Could you check?
| return false; | ||
| } | ||
|
|
||
| public static bool CheckLoadVectorFirstFaultingBehavior<TMem, TElem, TFault>(TMem[] firstOp, TElem[] result, Vector<TFault> faultResult) |
There was a problem hiding this comment.
I believe this comment doesn't reference the line you intended. Could you check?
| {Op1VectorType}<{Op1BaseType}> loadMask = Sve.CreateTrueMask{Op1BaseType}(SveMaskPattern.All); | ||
| var op1 = {LoadIsa}.Load{Op1VectorType}(loadMask, ({Op1BaseType}*)(_dataTable.inArray1Ptr)); | ||
|
|
||
| // Force op1 (mask) to have the first and last element to be active. |
There was a problem hiding this comment.
any reason why this is needed?
There was a problem hiding this comment.
i still see this and was wondering why is it needed?
// Force loadMask to have the first and last element to be active.
There was a problem hiding this comment.
instead, you could use the Helpers.GetRandomMask*()
There was a problem hiding this comment.
This is required specifically for FirstFaulting APIs. The same was done for SveGatherVectorFirstFaulting by @TIHan here:
|
|
||
|
|
||
| private static TElem GetLoadVectorExpectedResultByIndex<TMem, TElem>(int index, TMem[] firstOp, TElem[] result) | ||
| private static TElem GetLoadVectorExpectedResultByIndex<TMem, TElem>(int index, TElem[] mask, TMem[] data, TElem[] result) |
There was a problem hiding this comment.
I believe this belongs to the following line, please fix me if this wasn't you intent: https://github.com/dotnet/runtime/pull/106139/files#diff-bb20484d905baf7f831f60e736751de79fffd6a4e09569cf0cfb86798104f7c4R8533
This is to support the public static bool CheckLoadVectorBehavior<TMem, TElem>(TMem[] data, TElem[] result) overload which is still used to validate results of some scenarios from SveLoadVectorFirstFaultingTest.template, i.e. RunBasicScenario_Load() or RunReflectionScenario_Load(). These scenarios populate the load mask with Sve.CreateTrueMask{RetBaseType}(SveMaskPattern.All). So yes, since this overload doesn't accept a mask parameter, it's implied that the mask is all active.
If the intent of #105580 was to make all scenarios in FirstFaulting templates including RunBasicScenario_Load() or RunReflectionScenario_Load() to use randomly populated masks, please let me know.
There was a problem hiding this comment.
If the intent of #105580 was to make all scenarios in FirstFaulting templates including RunBasicScenario_Load() or RunReflectionScenario_Load() to use randomly populated masks, please let me know.
The intent of #105580 is to make sure that inactive lanes does not fault (which we are already validating) and stays zero in destination. We should regardless use randomly populated mask, except at cases where it is restricted. The latr can be done as a follow-up PR.
There was a problem hiding this comment.
The intent of #105580 is to make sure that inactive lanes does not fault (which we are already validating) and stays zero in destination.
This is done by providing overloads for ValidateFirstFaultingResult() and helper methods which accept mask as a parameter.
We should regardless use randomly populated mask, except at cases where it is restricted.
This is done for RunBasicScenario_LoadFirstFaulting() from src/tests/JIT/HardwareIntrinsics/Arm/Shared/SveLoadVectorFirstFaultingTest.template by using _mask instead of Sve.CreateTrueMask{RetBaseType}(SveMaskPattern.All) in the method and populating it with Helpers.getMask*() instead of TestLibrary.Generator.Get*() (see the changes to src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_Arm.cs).
kunalspathak
left a comment
There was a problem hiding this comment.
All FirstFaulting APIs pass the stress testing:
I think it will be good to merge this PR once all the remaining FF APIs are implemented. With that, we will be able to get a final pass if this validation works on all FF APIs.
|
|
||
| if (mask[i] == TElem.Zero) | ||
| { | ||
| return TFault.One; |
There was a problem hiding this comment.
if mask[i] == 0, shouldn't this lane should also be TFault.Zero?
There was a problem hiding this comment.
This part calculates the expected FFR value.
Inactive elements will not cause a read from Device memory or signal a fault, and are set to zero in the destination vector.
Since there's no fault, the FFR element is 1. You can check the pseudo code here: https://docsmirror.github.io/A64/2023-06/ldff1b_z_p_br.html
There was a problem hiding this comment.
ah, thanks for pointing it out.
| TElem[] mask = new TElem[result.Length]; | ||
| Array.Fill(mask, TElem.One); | ||
|
|
||
| return GetLoadVectorExpectedResultByIndex(index, mask, data, result); |
There was a problem hiding this comment.
The original test for this validation is little convulated and hence a bit hard to follow. But here, we are filling all mask and then calling a method that checks for mask[index] == TElem.Zero...wondering why can't we just skip calling GetLoadVectorExpectedResultByIndex and just do return TElem.CreateTruncating(data[index]) here?
There was a problem hiding this comment.
Using TElem.CreateTruncating(data[index]); leads to code duplication. Should one wish to change anything at TElem GetLoadVectorExpectedResultByIndex<TMem, TElem>(int index, TElem[] mask, TMem[] data, TElem[] result) they will have to add the change here as well which is not desirable or guaranteed.
|
@mikabl-arm - since all APIs are implemented now, can you please run all the faulting test cases (including stress)? |
kunalspathak
left a comment
There was a problem hiding this comment.
LGTM assuming all firstfaulting APIs stress test pass.
|
Waiting confirmation from @mikabl-arm before we can merge.
|
All Output@JulieLeeMSFT , @kunalspathak , the PR is good to merge. |
|
@amanasifkhalid, CI tests are still running. Please merge after CI tests pass. |
All
*FirstFaulting*APIs pass the stress testing:Output