[cdac] add v2 ExecutionManager contract for NibbleMap change #1

max-charlamb · 2024-11-07T01:37:24Z

Depends on: dotnet#108939

Changes

* Bumps the ExecutionManager contract version to 2.
* Refactors ExecutionManager-related files into a folder.
* Factors out nearly all of ExecutionManager_1 into ExecutionManagerBase.
* Adds ExecutionManager_2.
* Runs tests for both versions of the contract.

...gnostics.DataContractReader.Contracts/Contracts/ExecutionManager/Helpers/NibbleMapHelpers.cs

src/native/managed/cdacreader/tests/ExecutionManagerTests/NibbleMapTestBuilder.cs

elinor-fung

Were you also able to test out the E2E combined with your runtime nibble map change? Well, the part of the E2E that we have working - for JIT-ed methods, the method names should display correctly for something like !clrstack (should go through the ExecutionManager and the nibble map in order to get the method corresponding to the IPs).

src/native/managed/cdacreader/tests/ExecutionManagerTests/ExecutionManagerTests.cs

...ft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManager_1.cs

...ft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManager_2.cs

...t.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/Helpers/NibbleMap_2.cs

elinor-fung · 2024-11-07T23:05:01Z

...t.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/Helpers/NibbleMap_2.cs

+
+namespace Microsoft.Diagnostics.DataContractReader.ExecutionManagerHelpers;
+
+// CoreCLR nibblemap with O(1) lookup time.


Update docs/design/datacontracts/ExecutionManager.md to add version 2 with information about this nibble map?

Added some docs and an example. I'm not sure of the best way to add docs for the second version, so I added a section below explaining the differences and the new algorithm.
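For readers unfamiliar with the data structure, here is a minimal sketch of the general nibble-map idea: one 4-bit entry per fixed-size bucket of code, scanned backwards to find the method start that precedes an address. The bucket size, nibble encoding, packing order, and the names `GetNibble`, `MarkMethodStart`, and `FindMethodStart` are all assumptions made for illustration. This is not the code in `NibbleMap_2.cs`, and it uses a simple backward scan rather than the constant-time lookup this PR adds.

```csharp
using System;

// Assumed layout, for illustration only (not taken from this PR):
// - the code region is split into 32-byte buckets, one nibble per bucket
// - nibble 0 means "no method starts in this bucket"
// - nibble n in 1..8 means "a method starts at byte (n - 1) * 4 of the bucket"
// - eight nibbles are packed per uint, with bucket 0 in the lowest nibble
// - at most one method start is recorded per bucket
const int BytesPerBucket = 32;
const int NibblesPerUnit = 8;

// Reads the nibble that describes the given bucket.
uint GetNibble(uint[] map, int bucket)
{
    int shift = (bucket % NibblesPerUnit) * 4;
    return (map[bucket / NibblesPerUnit] >> shift) & 0xF;
}

// Records that a method starts at the given 4-byte-aligned offset.
void MarkMethodStart(uint[] map, int offset)
{
    int bucket = offset / BytesPerBucket;
    uint nibble = (uint)((offset % BytesPerBucket) / 4 + 1);
    int shift = (bucket % NibblesPerUnit) * 4;
    map[bucket / NibblesPerUnit] |= nibble << shift;
}

// Returns the start offset of the nearest method at or before `offset`,
// or -1 if no method start precedes it.
int FindMethodStart(uint[] map, int offset)
{
    int bucket = offset / BytesPerBucket;
    uint nibble = GetNibble(map, bucket);
    if (nibble != 0)
    {
        int start = bucket * BytesPerBucket + (int)(nibble - 1) * 4;
        if (start <= offset) return start;
    }
    // Walk backwards until a bucket with a recorded method start is found.
    for (bucket--; bucket >= 0; bucket--)
    {
        nibble = GetNibble(map, bucket);
        if (nibble != 0) return bucket * BytesPerBucket + (int)(nibble - 1) * 4;
    }
    return -1;
}

uint[] nibbleMap = new uint[4]; // covers 4 * 8 * 32 = 1024 bytes of code
MarkMethodStart(nibbleMap, 0x24);
MarkMethodStart(nibbleMap, 0x200);
Console.WriteLine(FindMethodStart(nibbleMap, 0x30));  // 36 (0x24)
Console.WriteLine(FindMethodStart(nibbleMap, 0x1FF)); // 36 (0x24), found by scanning back
Console.WriteLine(FindMethodStart(nibbleMap, 0x204)); // 512 (0x200)
Console.WriteLine(FindMethodStart(nibbleMap, 0x10));  // -1, nothing starts before 0x10
```

The backward scan makes this naive version linear in the distance to the previous method start. Per the comment in the diff, the version 2 map is designed so that lookup is O(1).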

...agnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManagerFactory.cs

max-charlamb · 2024-11-08T15:36:47Z

> Were you also able to test out the E2E combined with your runtime nibble map change? Well, the part of the E2E that we have working - for JIT-ed methods, the method names should display correctly for something like !clrstack (should go through the ExecutionManager and the nibble map in order to get the method corresponding to the IPs).

Yes, I used WinDbg and debugged that with VS to verify that !sos clrstack has the correct stack (function names) and that the new nibblemap is invoked.

* JIT: Introduce `LclVarDsc::lvIsMultiRegDest`

With recent work to expand returned promoted locals into `FIELD_LIST` the only "whole references" of promoted locals we should see is when stored from a multi-reg node. This is the only knowledge the backend should need for correctness purposes, so introduce a bit to track this property, and switch the backend to check this instead.

The existing `lvIsMultiRegRet` is essentially this + whether the local is returned. We should be able to remove this, but it is currently used for some heuristics in old promotion, so keep it around for now.

* JIT: Add some more constant folding in lowering

Add folding for shifts and certain binops that are now getting produced late due to returned `FIELD_LIST` nodes.

win-arm64 example:

```csharp
[MethodImpl(MethodImplOptions.NoInlining)]
static ValueTask<byte> Foo()
{
    return new ValueTask<byte>(123);
}
```

```diff
 G_M17084_IG02:  ;; offset=0x0008
             mov     x0, xzr
-            mov     w1, #1
-            mov     w2, wzr
-            mov     w3, #123
-            orr     w2, w2, w3,  LSL #16
-            orr     w1, w2, w1,  LSL #24
-            ;; size=24 bbWeight=1 PerfScore 4.00
+            mov     w1, #0x17B0000
+            ;; size=8 bbWeight=1 PerfScore 1.00
```

* Feedback
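The folded immediate in the win-arm64 diff above can be checked by hand. The short sketch below replays the removed `mov`/`orr` sequence on plain integers; the variable names simply mirror the registers and are not part of the JIT change.

```csharp
using System;

// Mirror the removed instructions: w1 = 1, w2 = 0 (wzr), w3 = 123.
uint w1 = 1, w2 = 0, w3 = 123;

w2 |= w3 << 16;        // orr w2, w2, w3, LSL #16  -> 0x007B0000
w1 = w2 | (w1 << 24);  // orr w1, w2, w1, LSL #24  -> 0x017B0000

Console.WriteLine($"0x{w1:X}"); // prints "0x17B0000"
```

The result matches the single `mov w1, #0x17B0000` that remains after folding.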

…otnet#114227)

Presence of `.cctor` in `Thread` can cause circular dependency if Lock needs to block while Thread .cctor has not run yet.

1. Lock needs to wait on a WaitHandle
2. WaitHandle needs Thread.CurrentThread
3. if Thread's .cctor has not run yet, it needs to run. (it is unusual for this to be the first use of Thread, but the activation pattern in dotnet#113949 made it possible)
4. .cctor needs to take a Lock, so we go to `#1`

Fixes: dotnet#113949

…more APIs) (#1…" (dotnet#120138) This reverts commit 1b4eff2. Fixes dotnet#120137

…ds from dotnet#27912 (Flow System.Text.Rune through more APIs)) (dotnet#120145)

* Fix tests from dotnet#117168
* Add `SyncTextWriter` overloads as well
* Add missing overloads to BroadcastingTextWriter
* Reapply "Add methods from dotnet#27912 (Flow System.Text.Rune through more APIs) (#1…" (dotnet#120138) This reverts commit be80737.
* Override the TextWrite Rune overloads in IndentedTextWriter

---------

Co-authored-by: Tarek Mahmoud Sayed <tarekms@microsoft.com>

…er (dotnet#123735)

From discussion, opting into enabling the crash chaining is more correct.

<s>The previously registered signal action/handler aren't guaranteed to return, so we lose out on notifying shutdown and creating a dump in those cases. Specifically, PROCCreateCrashDumpIfEnabled would be the last chance to provide the managed context for the thread that crashed. e.g. On Android CoreCLR, it seems that, by default, signal handlers are already registered by Android's runtime (/apex/com.android.runtime/bin/linker64 + /system/lib64/libandroid_runtime.so). Whenever an unhandled synchronous fault occurs, the previously registered handler will not return back to invoke_previous_action and aborts the thread itself, so PROCCreateCrashDumpIfEnabled will not be hit.</s>

## Sigsegv behavior Android CoreCLR vs other platforms

### Android CoreCLR

When intentionally writing to NULL (sigsegv) on Android CoreCLR, the previously registered signal handler goes down this path https://github.com/dotnet/runtime/blob/40e8c73b8f3b5f478a9bf03cf55c71d0608a8855/src/coreclr/pal/src/exception/signal.cpp#L454, and the thread aborts before hitting PROCNotifyProcessShutdown and PROCCreateCrashDumpIfEnabled.

### MacOS/Linux/NativeAOT(linux)

On MacOS, Linux, NativeAOT (Only checked linux at time of writing), the same intentional SIGSEGV will hit https://github.com/dotnet/runtime/blob/40e8c73b8f3b5f478a9bf03cf55c71d0608a8855/src/coreclr/pal/src/exception/signal.cpp#L431-L448 instead because there is no previously registered signal handler. In those cases, PROCCreateCrashDumpIfEnabled is hit and managed callstacks are captured in the dump.

## History investigation

From a github history dive, I didn't spot anything in particular requiring the previous signal handler to be invoked before PROCNotifyProcessShutdown + PROCCreateCrashDumpIfEnabled.

PROCNotifyProcessShutdown was first introduced in dotnet@1433c3f. It doesn't seem to state a particular reason for invoking it after the previous signal handler.

PROCCreateCrashDumpIfEnabled was added to signal.cpp in dotnet@7f9bd2c because the PROCNotifyProcessShutdown didn't create a crash dump. It doesn't state any particular reason for being invoked after the previously registered signal handler, and was probably just placed next to PROCNotifyProcessShutdown.

`invoke_previous_action` was introduced in dotnet@a740f65 and was refactoring while maintaining the order.

## Android CoreCLR behavior after swapping order

Locally, I have POC changes to emit managed callstacks in Android's PROCCreateCrashDumpIfEnabled.

```
01-28 17:26:40.951 2416 2440 F DOTNET : Native crash detected; attempting managed stack trace.
01-28 17:26:40.951 2416 2440 F DOTNET : {"stack":[
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x0","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x78d981145973","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x78d981145973","module":"0x0","offset":"0x73","name":"Program.ForceNativeSegv()"},
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x78d981141b60","module":"0x0","offset":"0x70","name":"Program.Main(System.String[])"}
01-28 17:26:40.951 2416 2440 F DOTNET : ]}
01-28 17:26:40.952 2416 2440 F DOTNET : Crash dump hook completed.
--------- beginning of crash
01-28 17:26:40.952 2416 2440 F libc : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 2440 (.dot.MonoRunner), pid 2416 (ulator.JIT.Test)
.....
01-28 17:26:46.882 2921 2921 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-28 17:26:46.882 2921 2921 F DEBUG : Build fingerprint: 'google/sdk_gphone64_x86_64/emu64xa:16/BE2A.250530.026.D1/13818094:user/release-keys'
01-28 17:26:46.882 2921 2921 F DEBUG : Revision: '0'
01-28 17:26:46.882 2921 2921 F DEBUG : ABI: 'x86_64'
01-28 17:26:46.882 2921 2921 F DEBUG : Timestamp: 2026-01-28 17:26:41.492831700-0500
01-28 17:26:46.882 2921 2921 F DEBUG : Process uptime: 20s
01-28 17:26:46.883 2921 2921 F DEBUG : Cmdline: net.dot.Android.Device_Emulator.JIT.Test
01-28 17:26:46.883 2921 2921 F DEBUG : pid: 2416, tid: 2440, name: .dot.MonoRunner >>> net.dot.Android.Device_Emulator.JIT.Test <<<
01-28 17:26:46.883 2921 2921 F DEBUG : uid: 10219
01-28 17:26:46.883 2921 2921 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000000
01-28 17:26:46.883 2921 2921 F DEBUG : Cause: null pointer dereference
01-28 17:26:46.883 2921 2921 F DEBUG : Abort message: 'CoreCLR: previous handler for '
01-28 17:26:46.883 2921 2921 F DEBUG : rax 0000000000000000 rbx 000078da87ffade0 rcx 0000000000000000 rdx 0000000000000001
01-28 17:26:46.884 1237 1297 I s.nexuslauncher: AssetManager2(0x78dd08cd9178) locale list changing from [] to [en-US]
01-28 17:26:46.903 2447 2594 I BugleNotifications: Creating notification input ids [CONTEXT im_entry_input="" im_notification_input="" im_settings_store_input="" im_final_input="" ]
01-28 17:26:46.905 2921 2921 F DEBUG : r8 00007ffcde5a8080 r9 34d9bb0e67871eb0 r10 000078ddb4111870 r11 0000000000000293
01-28 17:26:46.906 2921 2921 F DEBUG : r12 0000000000000001 r13 000078da87ffafa0 r14 0000000000000000 r15 000078da87ffaf18
01-28 17:26:46.906 2921 2921 F DEBUG : rdi 0000000000000000 rsi 0000000000000000
01-28 17:26:46.906 2921 2921 F DEBUG : rbp 000078da87ffac40 rsp 000078da87ffabc8 rip 000078ddb41118a2
01-28 17:26:46.906 2921 2921 F DEBUG : 2 total frames
01-28 17:26:46.906 2921 2921 F DEBUG : backtrace:
01-28 17:26:46.906 2921 2921 F DEBUG : #00 pc 000000000008f8a2 /apex/com.android.runtime/lib64/bionic/libc.so (memset_avx2+50) (BuildId: fcb82240218d1473de1e3d2137c0be35)
01-28 17:26:46.906 2921 2921 F DEBUG : #1 pc 0000000000049972 /memfd:doublemapper (deleted) (offset 0x111000)
```

Now there's a window to log managed callstacks before the original signal handler aborts and triggers a tombstone.

## Android Mono behavior

Mono provides two embedding APIs to configure signal and crash chaining https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/driver.c#L2864-L2894 that determine whether synchronous faults would chain https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-runtime.c#L3892-L3903

They would chain to the previous signal handler https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-posix.c#L193-L210 only after attempting to walk native and managed stacks https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-exceptions.c#L2992-L3012

## Alternatives

If there is any particular reason to preserve the order of sa_sigaction/sa_handler with respect to PROCNotifyProcessShutdown and PROCCreateCrashDumpIfEnabled for CoreCLR, a config knob can be added to allow Android CoreCLR to opt into the swapped ordering behavior. This may be in the form of config property key/values https://github.com/dotnet/runtime/blob/54ca569eb62800cdb725d776e3dd2e564028594d/src/coreclr/dlls/mscoree/exports.cpp#L237-L238 or `clrconfigvalues`. That way AndroidSDK/AndroidAppBuilder may opt in at build time.

Given that the history of the ordering didn't reveal any problems with swapping the order, we can fall back to this behavior if the order swap causes problems down the line. The other way around is more restrictive. Should we first introduce all the overhead to enable an opt-in/opt-out config knob, and later discover that no platforms need to invoke their previous handlers before PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled, it seems harder to justify removing the knob.

Max Charlamb added 3 commits November 6, 2024 20:36

add ExecutionManager contract version

3b5707c

add docs

0889e50

improve tests

a7de860

max-charlamb marked this pull request as ready for review November 7, 2024 18:24

max-charlamb changed the title ~~add ExecutionManager contract version~~ [cdac] add v2 ExecutionManager contract for NibbleMap change Nov 7, 2024

AaronRobinsonMSFT reviewed Nov 7, 2024


Max Charlamb added 2 commits November 7, 2024 14:12

_version -> Version

c81d162

comments

7bfc4e4

max-charlamb assigned max-charlamb and unassigned max-charlamb Nov 7, 2024

max-charlamb requested a review from AaronRobinsonMSFT November 7, 2024 19:25

elinor-fung reviewed Nov 8, 2024


max-charlamb mentioned this pull request Nov 8, 2024

Constant Time Lookup NibbleMap dotnet/runtime#108939

Merged

Max Charlamb added 2 commits November 8, 2024 11:11

comments

bd53df2

comments

b26197a

max-charlamb mentioned this pull request Nov 8, 2024

[cdac] add v2 ExecutionManager contract for NibbleMap change dotnet/runtime#109654

Merged

max-charlamb deleted the branch nibble-optimize June 10, 2025 16:37

max-charlamb closed this Jun 10, 2025

max-charlamb pushed a commit that referenced this pull request Oct 1, 2025

Revert "Add methods from dotnet#27912 (Flow System.Text.Rune through …

be80737

…more APIs) (#1…" (dotnet#120138) This reverts commit 1b4eff2. Fixes dotnet#120137

max-charlamb wants to merge 7 commits into nibble-optimize from nibble-cdac


3 participants

