[RyuJit/WASM] Register allocation for multiply used operands #124481
AndyAyersMS merged 2 commits into dotnet:main
We define two new contracts:

1) Lowering to RA: multiply-used operands must be marked with a new LIR flag, so that RA can track when their lifetime begins in a forward walk without lookahead.
2) RA to codegen: SDSU nodes with valid registers on them must be assigned those registers via "local.tee"s.

Combined, these allow us to make arbitrary operands multi-use.

Different choices could have been made regarding these mechanisms. '1' could have been replaced by lookahead / some other clever scheme confined to RA. '2' could have been replaced by a new LOCAL_TEE node that would be responsible for producing the registers.

Since we will need this mechanism for most stores, and we need some customization for "genProduceReg" anyway due to the drop requirement, it seems to be acceptable in complexity terms to introduce these contracts. They are TP-positive.

Notably, this change makes the internal register mechanism unused. It can still be useful for "true" internal registers, ones unrelated to the operands, so it is left in place for now.
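
As a rough illustration of the two contracts described above, here is a self-contained toy model. It is not the code from this PR: `Node`, `multiplyUsed`, `AllocateRegisters`, and `GenerateCode` are hypothetical names invented for the sketch, and Wasm instructions are modeled as plain strings. It assumes lowering has already set the flag, and it never frees a register, which a real allocator would do at the last use.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these names are the real RyuJIT types or flags.
constexpr int NO_REG = -1;

struct Node
{
    std::string op;                   // Wasm instruction that computes the value
    bool        multiplyUsed = false; // hypothetical stand-in for the new LIR flag
    int         reg          = NO_REG; // Wasm local index assigned by RA, if any
};

// Contract 1 (lowering -> RA): one forward walk, no lookahead.
// The flag alone tells RA that a lifetime starts at this node.
void AllocateRegisters(std::vector<Node>& lir, int firstFreeLocal)
{
    int next = firstFreeLocal;
    for (Node& n : lir)
    {
        if (n.multiplyUsed)
        {
            assert(n.reg == NO_REG); // each node is defined exactly once
            n.reg = next++;
        }
    }
}

// Contract 2 (RA -> codegen): a node carrying a valid register is followed by
// "local.tee", which stores the value and also leaves it on the Wasm stack.
std::vector<std::string> GenerateCode(const std::vector<Node>& lir)
{
    std::vector<std::string> out;
    for (const Node& n : lir)
    {
        out.push_back(n.op);
        if (n.reg != NO_REG)
        {
            out.push_back("local.tee " + std::to_string(n.reg));
        }
    }
    return out;
}
```

In this sketch, a later consumer that needs the operand a second time would read it back with `local.get` on the same local; that part is omitted to keep the example short.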
@dotnet/jit-contrib
AndyAyersMS left a comment
Nice... a minimally intrusive way of ensuring we can reuse operands.
Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
LGTM
Seems like we will need internal registers for implementing some of the interlocked operations, since they need to refer to their results multiple times. Also, are you thinking that producing nodes will not try to avoid work if their results are unused? Say for something like an ignored interlocked.and, we could either always produce a result and then drop it, or not produce a result and generate less code, but WasmProduceReg would need to know.
The established pattern for such cases (where it is important for optimization purposes to distinguish used-vs-not-used) is to lower the node into a different opcode that is (Another pattern that's used is making existing nodes produce
Ref: #124298.