Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/jit-contrib |
|
Not sure what i'm doing wrong here yet. |
There was a problem hiding this comment.
Pull request overview
This PR implements infrastructure for WebAssembly block store operations in the RyuJIT compiler. It adds support for lowering block copy and initialization operations to use WASM-specific memory.copy and memory.fill instructions.
Changes:
- Adds
LowerBlockStoreimplementation for WASM target to handle block copy and initialization operations - Introduces two new WASM instructions (
memory_copyandmemory_fill) and their associated instruction format (IF_MEMCPY) - Adds emitter support for encoding and displaying the new IF_MEMCPY instruction format
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/coreclr/jit/lowerwasm.cpp | Implements LowerBlockStore to handle GT_STORE_BLK and GT_INIT_BLK operations, with partial infrastructure for memory.copy and memory.fill |
| src/coreclr/jit/instrswasm.h | Adds memory_copy and memory_fill instruction definitions with their opcodes |
| src/coreclr/jit/emitfmtswasm.h | Defines the IF_MEMCPY instruction format for memory operations with two memory indices |
| src/coreclr/jit/emitwasm.cpp | Implements encoding, size calculation, and display logic for IF_MEMCPY instruction format |
|
cc @adamperlin see the WasmLowering.GetSignature changes. I couldn't figure out what was wrong with the existing implementation so I rewrote it, but the rewrite isn't complete. It was generating signatures with zeroes in them and that produced invalid wasm modules. |
That's probably my fault: I revised that code to try and handle the return buffer placement. @MichalStrehovsky in the managed type system signatures is the |
If you mean the
|
The re-write looks good to me, appending to a list seems like an easier solution to ensure correctness. My only feedback would just be that we should probably add a comment (maybe in the header) explaining what the maximum number of parameters we might add when lowering the signature is (explaining the |
| GenTree* dstAddr = blkNode->Addr(); | ||
| GenTree* src = blkNode->Data(); | ||
|
|
There was a problem hiding this comment.
dstAddr is only used in a debug-only assert later in this method. In non-DEBUG builds this becomes an unused local and can trigger compiler warnings. Consider removing it, wrapping it in a debug-only helper, or using an UNUSED_VAR(dstAddr)-style pattern.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/coreclr/jit/lowerwasm.cpp:484
- The DEBUG logging should use JITDUMP/JITDUMPEXEC instead of raw printf to be consistent with existing JIT logging patterns. The suggested approach is to use
JITDUMP("node==[%06u] prev==[%06u]\n", Compiler::dspTreeID(node), Compiler::dspTreeID(prev));which eliminates the need for the#ifdef DEBUGwrapper.
JITDUMP("node==[%06u] prev==[%06u]\n", Compiler::dspTreeID(node), Compiler::dspTreeID(prev));
NYI_WASM("IR not in a stackified form");
}
// In stack order, the last operand is closest to its parent, thus put on top here.
node->VisitOperands([stack](GenTree* operand) {
stack->Push(operand);
return GenTree::VisitResult::Continue;
});
| @@ -168,48 +160,41 @@ public static WasmFuncType GetSignature(MethodDesc method) | |||
| { | |||
| explicitThis = true; | |||
| } | |||
| else | |||
| { | |||
| parameterCount += 1; // implicit this | |||
| } | |||
| } | |||
|
|
|||
| Span<WasmValueType> wasmParameters = new WasmValueType[parameterCount]; | |||
|
|
|||
| int index = 0; | |||
|
|
|||
| if (method.IsUnmanagedCallersOnly) // reverse P/Invoke | |||
| { | |||
| if (hasReturnBuffer) | |||
| { | |||
| wasmParameters[index++] = pointerType; | |||
| result.Add(pointerType); | |||
| } | |||
| } | |||
| else // managed call | |||
| { | |||
| wasmParameters[0] = pointerType; // Stack pointer parameter | |||
|
|
|||
| // Return buffer is first after this. | |||
| result.Add(pointerType); // Stack pointer parameter | |||
|
|
|||
| if (hasThis) | |||
| { | |||
| wasmParameters[index++] = pointerType; | |||
| result.Add(pointerType); | |||
| } | |||
|
|
|||
| if (hasReturnBuffer) | |||
| { | |||
| wasmParameters[index++] = pointerType; | |||
| result.Add(pointerType); | |||
| } | |||
|
|
|||
| wasmParameters[wasmParameters.Length - 1] = pointerType; // PE entrypoint parameter | |||
| } | |||
|
|
|||
| for (int i = explicitThis ? 1 : 0; i < signature.Length; i++) | |||
| { | |||
| wasmParameters[index++] = LowerType(signature[i]); | |||
| result.Add(LowerType(signature[i])); | |||
There was a problem hiding this comment.
This refactoring fixes a critical bug in the parameter ordering logic. In the old code, for managed calls with hasThis, the code would set wasmParameters[0] to SP, but then immediately overwrite it by setting wasmParameters[index++] to the this parameter (where index was still 0). This would cause the SP parameter to be lost and replaced with the this parameter. The new code correctly produces the order: SP, this (if any), return buffer (if any), regular params, PE. While this fix is correct, it represents a significant behavior change. Please verify that: 1) the old code was indeed buggy and never worked correctly for instance methods, or 2) there's an explanation for why index wouldn't be 0 in the managed call path. If this fixes a real bug, consider adding test coverage for instance methods (methods with implicit this).
|
@dotnet/jit-contrib PTAL |
AndyAyersMS
left a comment
There was a problem hiding this comment.
Looks good overall, just curious about the GC disabling part.
Add insns for memory copy and fill Remove NYIs so stuff can flow through Cleanup Checkpoint Fix opcodes Fix WasmLowering.GetSignature Maybe-working codegen for zeroing a struct jit-format Remove unused local Consume operands Fix erroneous fallthrough Address PR feedback Remove dead call NotImplementedException ArrayBuilder and comment Make initblk loop work
Remove else case
| else | ||
| { | ||
| assert(src->OperIs(GT_IND, GT_LCL_VAR, GT_LCL_FLD)); | ||
| src->SetContained(); |
There was a problem hiding this comment.
I don't see the handling for the local contained sources in codegen.
There was a problem hiding this comment.
What do I need to do that I'm not already doing? The codegen seems to work.
There was a problem hiding this comment.
nvm, I managed to get it to fail:
Z:\runtime\src\coreclr\jit\lowerwasm.cpp:499
Assertion failed 'NYI_WASM: IR not in a stackified form' in 'Program:zeroStructLocal()' during 'Lowering nodeinfo' (IL size 10; hash 0xcc3d1d52; MinOpts)
Will investigate.
There was a problem hiding this comment.
As an exercise I tried not containing src, but then we fail to generate code for a load of TYP_STRUCT (the GT_IND), so it seems like containing here is correct.
There was a problem hiding this comment.
@AndyAyersMS this is a real bug I think. Should I fix it before merging? Or is it fine to do in a follow-up?
| break; | ||
|
|
||
| case GenTreeBlk::BlkOpKindNativeOpcode: | ||
| genConsumeOperands(blkOp); |
There was a problem hiding this comment.
Have you tested this with a non-zero init value, so that GT_INIT_VAL is present? I suspect you need to augment genConsumeRegs slightly for that case to look through to its operand.
There was a problem hiding this comment.
I haven't, and I'm not sure how I would produce that by writing C#. I'll see if I can figure something out.
|
|
||
| #ifdef TARGET_WASM | ||
| case GenTreeBlk::BlkOpKindNativeOpcode: | ||
| printf(" (memory.copy|fill)"); |
There was a problem hiding this comment.
| printf(" (memory.copy|fill)"); | |
| printf(" (memory.%s)", tree->OperIsCopyBlkOp ? "copy" : "fill"); |
We can make it nicer.
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
src/coreclr/jit/codegenwasm.cpp
Outdated
| // genCodeForStoreBlk: Produce code for a GT_STORE_BLK node. | ||
| // | ||
| // Arguments: | ||
| // tree - the node |
There was a problem hiding this comment.
The parameter name in the documentation comment is tree, but the actual parameter is named blkOp. The documentation should be updated to match the actual parameter name for consistency.
| // tree - the node | |
| // blkOp - the node |
| // memory.copy | ||
| blkNode->gtBlkOpKind = GenTreeBlk::BlkOpKindNativeOpcode; |
There was a problem hiding this comment.
When the source of a copy operation is GT_LCL_VAR or GT_LCL_FLD (not GT_IND), and BlkOpKindNativeOpcode is chosen for memory.copy, verify that the code generation correctly emits the source address. The source is marked as contained (line 223), and for contained GT_LCL_VAR/GT_LCL_FLD, genConsumeRegs only updates liveness without emitting address calculation code. For contained GT_IND, the indir's address is consumed, but GT_LCL_VAR and GT_LCL_FLD don't have an address operand. If this scenario can occur (non-GC struct copy from local variable), special handling may be needed in genCodeForStoreBlk to emit the local variable's address, similar to how genCodeForLclAddr works.
| // memory.copy | |
| blkNode->gtBlkOpKind = GenTreeBlk::BlkOpKindNativeOpcode; | |
| // For non-GC struct copies, prefer a loop-based copy when the source is | |
| // a local variable or local field. Contained GT_LCL_VAR/GT_LCL_FLD nodes | |
| // do not have an address operand, so using BlkOpKindNativeOpcode (memory.copy) | |
| // would require special codegen support to materialize the source address. | |
| if (src->OperIs(GT_LCL_VAR, GT_LCL_FLD)) | |
| { | |
| blkNode->gtBlkOpKind = GenTreeBlk::BlkOpKindLoop; | |
| } | |
| else | |
| { | |
| // memory.copy | |
| blkNode->gtBlkOpKind = GenTreeBlk::BlkOpKindNativeOpcode; | |
| } |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| break; | ||
|
|
||
| case GenTreeBlk::BlkOpKindNativeOpcode: | ||
| genConsumeOperands(blkOp); |
There was a problem hiding this comment.
For CopyBlk operations, the source address is not being emitted to the WASM stack. The WASM memory.copy instruction expects three arguments on the stack: dest_addr, src_addr, and size (in that order). Currently, genConsumeOperands only emits the destination address because Data() is contained (marked as such in LowerBlockStore). For CopyBlk, Data() is GT_IND/GT_LCL_VAR/GT_LCL_FLD, and you need to extract and emit the source address similar to how genConsumeBlockSrc works in other architectures. If Data() is GT_IND, emit its child (the address); if it's GT_LCL_VAR or GT_LCL_FLD, emit the local address using genCodeForLclAddr.
| genConsumeOperands(blkOp); | |
| genConsumeOperands(blkOp); | |
| if (isCopyBlk) | |
| { | |
| GenTree* data = blkOp->Data(); | |
| assert(data != nullptr); | |
| if (data->OperIs(GT_IND)) | |
| { | |
| GenTree* addr = data->AsIndir()->Addr(); | |
| assert(addr != nullptr); | |
| genConsumeRegs(addr); | |
| } | |
| else if (data->OperIs(GT_LCL_VAR, GT_LCL_FLD)) | |
| { | |
| genCodeForLclAddr(data->AsLclVarCommon()); | |
| } | |
| else | |
| { | |
| unreached(); | |
| } | |
| } |
| break; | ||
|
|
||
| case GenTreeBlk::BlkOpKindNativeOpcode: | ||
| genConsumeOperands(blkOp); |
There was a problem hiding this comment.
For InitBlk operations with BlkOpKindNativeOpcode, the initialization value is not being emitted to the WASM stack. The WASM memory.fill instruction expects three arguments: dest_addr, value, and size. Currently, only dest_addr and size are emitted because Data() is GT_INIT_VAL which was marked as contained in LowerBlockStore. You need to extract and emit the actual init value (the child of GT_INIT_VAL) before emitting the size, similar to how it's done in genCodeForInitBlkLoop at lines 2026-2027. Note that genConsumeOperands won't emit the value because the GT_INIT_VAL wrapper is contained.
| genConsumeOperands(blkOp); | |
| genConsumeOperands(blkOp); | |
| // For initblk (memory.fill), the initialization value is wrapped in a contained | |
| // GT_INIT_VAL node, so genConsumeOperands(blkOp) will not have emitted it. | |
| // We must explicitly emit the child of GT_INIT_VAL here so that the WASM stack | |
| // has: dest_addr, value, size before the memory.fill instruction. | |
| if (!isCopyBlk) | |
| { | |
| GenTree* dataNode = blkOp->Data(); | |
| if ((dataNode != nullptr) && dataNode->OperIs(GT_INIT_VAL)) | |
| { | |
| GenTree* initVal = dataNode->gtGetOp1(); | |
| assert(initVal != nullptr); | |
| genCodeForTreeNode(initVal); | |
| } | |
| } |
This is sufficient to compile these three managed methods to wasm functions with native
memory.fillandmemory.copyinstructions: