ARM64-SVE: Add checks for MOVPRFX sequencing (#105514)#106184
ARM64-SVE: Add checks for MOVPRFX sequencing (#105514)#106184amanasifkhalid merged 21 commits intodotnet:mainfrom
Conversation
|
@dotnet/arm64-contrib @kunalspathak |
kunalspathak
left a comment
There was a problem hiding this comment.
Suggested some changes. Also, can you also confirm all stress tests are passing?
src/coreclr/jit/emitarm64.cpp
Outdated
|
|
||
| case IF_SVE_CC_2A: // <Zdn>.<T>, <V><m> | ||
| case IF_SVE_FU_2A: // <Zda>.<T>, <Zn>.<T>, #<const> | ||
| assert(secondId->idReg1() != secondId->idReg2()); |
There was a problem hiding this comment.
i was hoping to see lot more validation for registers of firstId as listed in #100743 (comment) or something similar:
There was a problem hiding this comment.
i was hoping to see lot more validation for registers of firstId
That should be all at the bottom of the function. I'll double check that I have everything, because the text you have above is a much more verbose version than what I was using.
There was a problem hiding this comment.
looks like you have most of the things covered. We need to add validation for:
- MOVPRFX cannot be the last instruction in a sequence
- The predicated instruction at PC+4 must use the merging predicate. (not sure if this is taken care by the format checks)
- MOVPRFX destination register MUST be used as the destination register in the instruction at PC+4, "and is not allowed to be used in any other position other than destructive input"
There was a problem hiding this comment.
- MOVPRFX cannot be the last instruction in a sequence
Done.
- The predicated instruction at PC+4 must use the merging predicate. (not sure if this is taken care by the format checks)
Looking at all the formats, they are all /M or unpredicated. I've added some extra checking to make sure they correspond to the correct movprfx instruction.
- MOVPRFX destination register MUST be used as the destination register in the instruction at PC+4, "and is not allowed to be used in any other position other than destructive input"
That's done in the switch. Eg:
assert(secondId->idReg1() != secondId->idReg3());
I had to do it there because you need the format to know what types the registers are.
Change-Id: Ib78085c631ade5e923cddfda4def7efb362b4587
kunalspathak
left a comment
There was a problem hiding this comment.
Added some more comments on validation.
| assert(firstId->idReg2() == secondId->idReg2()); | ||
|
|
||
| // "and have the same maximum element size (ignoring a fixed 64-bit "wide vector" operand)," | ||
| assert(firstId->idInsOpt() == secondId->idInsOpt()); |
There was a problem hiding this comment.
can this be true for unpredicated version as well?
There was a problem hiding this comment.
The unpredicated movprfx doesn't care about element sizes, it is just MOVPRFX <Zd>, <Zn>.
Change-Id: I0a421eacf4895cfb2a30aaa6c483f4a7a3f17dd1
@a74nh I'm assuming you're getting a clean run now, right? |
|
Normal testing passes with tiered and untiered. Need to check stress testing
I'm not - but it's not due to this PR. I'll raise a bug and take a look at it - I have a reasonable idea of what it might be. This PR shouldn't need to be blocked on it though. |
| HWIntrinsicImmOpHelper helper(this, intrin.op3, node); | ||
|
|
||
| for (helper.EmitBegin(); !helper.Done(); helper.EmitCaseEnd()) | ||
| for (helper.EmitBegin(); !helper.Done(); helper.EmitCaseEnd(), (targetReg != op1Reg) ? 2 : 1) |
There was a problem hiding this comment.
What is this syntax doing? It looks like the result of the ternary operator isn't being used.
There was a problem hiding this comment.
That change should have been on the line above - the HWIntrinsicImmOpHelper constructor needs to take in the number of instructions that will be generated in each stage of the loop.
Got it, thanks! |
This was due to changes in #106125. I've fixed in this PR because it's related and simple enough - that PR changed the type of Also fixed up review comments. Just checking now that all stress testing passes now. |
amanasifkhalid
left a comment
There was a problem hiding this comment.
LGTM if stress tests pass. Thanks!
Full stress testing for all APIs passing now. |
Change request is out-of-date

As per the Arm Arm,
MOVPRFXinstructions must be followed by an SVE instruction of specific type, otherwise behaviour is undefined.Add a function which checks that two instructions in sequence are valid. For now this only cares when the first instruction is a
MOVPRFX, but the function could be expanded to check for any pairs, should we need to.Function is called after code generation in a loop which cycles through all
IGs and all instructions.This ensures the stream is valid after peephole type optimisations.
Alternatively, this function could be called from
emitInsSanityCheck().EMIT_BACKWARDS_NAVIGATIONcan then be used to get the previous instruction. However, defining this adds an extra variable toinstrDesc, reducing the small immediate size from 7bits to 2bits.Alternatively, this function could be inside every
emitIns_function, allowingemitLastInsto be used for the previous instruction. However, that is a lot of call sites and some could easily be missed. In addition a peephole optimisation could still occur after this point.Currently requires #106125 for all SVE tests to pass.
AddSequentialAcross tests disabled until #106180 is fixed.
Resolves: #105514