[cDAC] Add infrastructure to run cDAC tests using CLRMD and dumps on Helix#124621
[cDAC] Add infrastructure to run cDAC tests using CLRMD and dumps on Helix#124621max-charlamb wants to merge 34 commits intodotnet:mainfrom
Conversation
The artifactFileName used $(archiveExtension) which resolves to the downloading platform's archive format (.zip on Windows, .tar.gz on Linux). When downloading cross-platform dumps, this produces the wrong filename — e.g., a Linux agent looks for CdacDumps_windows_x64.tar.gz but the artifact was uploaded from Windows as CdacDumps_windows_x64.zip. Hardcode the correct extension for each platform's artifact so the ExtractFiles glob matches regardless of which platform downloads it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The dump artifacts are consumed cross-platform, but the upload used the platform-specific archive format (zip on Windows, tar.gz on Linux). This caused the Linux agent to fail extracting Windows .zip artifacts because unzip is not installed on the build image. Force tar.gz for all dump artifact uploads so both platforms can extract them without additional tooling. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…s which does not exist on all platforms
|
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag |
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive infrastructure for running cDAC (Compact Data Access Component) dump-based integration tests using ClrMD and Helix. The infrastructure enables automated testing of cDAC contracts by generating crash dumps from debuggee applications and analyzing them with cDAC on various platforms.
Changes:
- Introduces a new test project (
Microsoft.Diagnostics.DataContractReader.DumpTests) with base classes for dump-based testing using ClrMD - Adds 7 debuggee applications that exercise different runtime contracts (Thread, GC, StackWalk, Loader, etc.)
- Implements MSBuild targets and Helix configuration for cross-platform dump generation and testing on CI
- Includes local development scripts for running dump tests outside of CI
Reviewed changes
Copilot reviewed 36 out of 36 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/native/managed/cdac/tests/DumpTests/DumpTestBase.cs | Base class for dump tests using IAsyncLifetime and runtime skip checks |
| src/native/managed/cdac/tests/DumpTests/ClrMdDumpHost.cs | ClrMD wrapper for dump loading and symbol resolution |
| src/native/managed/cdac/tests/DumpTests/Microsoft.Diagnostics.DataContractReader.DumpTests.csproj | Test project with Helix payload preparation target |
| src/native/managed/cdac/tests/DumpTests/DumpTests.targets | MSBuild targets for building debuggees and generating dumps |
| src/native/managed/cdac/tests/DumpTests/cdac-dump-helix.proj | Helix SDK project for sending dump generation and tests to Helix |
| src/native/managed/cdac/tests/DumpTests/*DumpTests.cs | Integration tests for various contracts (Thread, GC, Loader, etc.) |
| src/native/managed/cdac/tests/DumpTests/Debuggees/* | 7 debuggee console applications that crash to produce test dumps |
| src/native/managed/cdac/tests/DumpTests/RunDumpTests.ps1 | Local script for generating dumps and running tests |
| src/native/managed/cdac/tests/DumpTests/RunHelixLocally.ps1 | Script to simulate Helix workflow locally |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Thread.cs | Adds platform-conditional handling for UEWatsonBucketTrackerBuckets field |
| src/native/managed/cdac/tests/Microsoft.Diagnostics.DataContractReader.Tests.csproj | Excludes DumpTests directory from unit test project |
| eng/pipelines/runtime-diagnostics.yml | Adds DumpTest stage to CI pipeline with Helix integration |
| eng/Versions.props | Adds Microsoft.Diagnostics.Runtime version |
| eng/Subsets.props | Adds tools.cdacdumptests subset |
| <_TestsDir>$HELIX_WORKITEM_PAYLOAD/tests</_TestsDir> | ||
| <_DebuggeesDir>$HELIX_WORKITEM_PAYLOAD/debuggees</_DebuggeesDir> | ||
| <_DumpRoot>$HELIX_WORKITEM_PAYLOAD/dumps</_DumpRoot> | ||
| <_DumpGenCommands>@(_Debuggee->'mkdir -p $HELIX_WORKITEM_PAYLOAD/dumps/local/%(Identity) && DOTNET_DbgMiniDumpName=$HELIX_WORKITEM_PAYLOAD/dumps/local/%(Identity)/%(Identity).dmp $HELIX_CORRELATION_PAYLOAD/dotnet exec $HELIX_WORKITEM_PAYLOAD/debuggees/%(Identity)/%(Identity).dll || true', ' && ')</_DumpGenCommands> |
There was a problem hiding this comment.
Missing DOTNET_DbgEnableMiniDump environment variable in Unix dump generation command. The command sets DOTNET_DbgMiniDumpName but relies on the HelixPreCommand to set DOTNET_DbgEnableMiniDump=1. However, the environment variable should be explicitly set in the command to ensure it's inherited by the exec process.
The command should be:
DOTNET_DbgEnableMiniDump=1 DOTNET_DbgMiniDumpType=2 DOTNET_DbgMiniDumpName=... $HELIX_CORRELATION_PAYLOAD/dotnet exec ...
Or remove DOTNET_DbgMiniDumpName from the inline command and rely entirely on the HelixPreCommand exports.
| <_Debuggee Include="BasicThreads" /> | ||
| <_Debuggee Include="TypeHierarchy" /> | ||
| <_Debuggee Include="ExceptionState" /> | ||
| <_Debuggee Include="MultiModule" /> | ||
| <_Debuggee Include="GCRoots" /> | ||
| <_Debuggee Include="ServerGC" /> | ||
| <_Debuggee Include="StackWalk" /> |
There was a problem hiding this comment.
The debuggee list is duplicated in four locations (cdac-dump-helix.proj lines 56-62, DumpTests.targets lines 41-47, RunHelixLocally.ps1 line 94, RunDumpTests.ps1 line 112). This creates a maintenance burden and risk of inconsistency when adding or removing debuggees.
Consider defining the debuggee list in a single source of truth (e.g., DumpTests.targets) and importing it into the other files, or use MSBuild to discover debuggees dynamically by scanning the Debuggees directory for *.csproj files.
| // Create weak and strong handles | ||
| var weakRef = new WeakReference(new object()); |
There was a problem hiding this comment.
GCHandle leak: The GCHandles allocated at lines 30 and 35 are never freed before the FailFast crash. While this may be intentional for the test dump (to ensure handles are present in the dump), it's worth adding a comment explaining that the handles are deliberately not freed because the process will FailFast immediately.
Without this comment, it could be mistaken for a resource leak bug.
| // Create weak and strong handles | |
| var weakRef = new WeakReference(new object()); | |
| // Create weak and strong handles. | |
| var weakRef = new WeakReference(new object()); | |
| // Note: The GCHandles created above and below are intentionally not freed. | |
| // The process immediately terminates via Environment.FailFast so there is | |
| // no long-lived resource leak, and the dump tests rely on these handles | |
| // being present in the crash dump for GC root inspection. |
| - powershell: | | ||
| $testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*" | Select-Object -First 1 -ExpandProperty FullName | ||
| Write-Host "TestHost directory: $testhostDir" | ||
| Write-Host "##vso[task.setvariable variable=TestHostPayloadDir]$testhostDir" | ||
| displayName: 'Find TestHost Directory' |
There was a problem hiding this comment.
The PowerShell script in the pipeline template will fail on non-Windows platforms (Linux, macOS). The Find TestHost Directory step uses PowerShell with Windows path separators and cmdlets that aren't available on Unix systems.
This should be replaced with a cross-platform solution using either bash scripts for Unix platforms or a cross-platform PowerShell Core script that works on all platforms.
| # --- Helper: set dump env vars --- | ||
| function Set-DumpEnvVars($dumpFilePath) { | ||
| $env:DOTNET_DbgEnableMiniDump = "1" | ||
| $env:DOTNET_DbgMiniDumpType = "4" |
There was a problem hiding this comment.
Inconsistent dump type configuration: RunDumpTests.ps1 uses DOTNET_DbgMiniDumpType=4, while DumpTests.targets, RunHelixLocally.ps1, and cdac-dump-helix.proj all use DOTNET_DbgMiniDumpType=2.
According to .NET documentation:
- Type 2 = Mini dump with private data (smaller)
- Type 4 = Heap dump (includes heap, larger)
For consistency and to avoid confusion, all dump generation scripts and targets should use the same dump type value. Since the majority use type 2, either update RunDumpTests.ps1 to use type 2, or if heap dumps are needed for these tests, update all other files to use type 4.
| $env:DOTNET_DbgMiniDumpType = "4" | |
| $env:DOTNET_DbgMiniDumpType = "2" |
797c1bb to
d361c2e
Compare
| <ItemGroup Condition="'$(TargetOS)' == 'windows'"> | ||
| <HelixPreCommand Include="set DOTNET_DbgEnableMiniDump=1" /> | ||
| <HelixPreCommand Include="set DOTNET_DbgMiniDumpType=4" /> | ||
| <HelixPreCommand Include="set CDAC_DUMP_ROOT=%HELIX_WORKITEM_PAYLOAD%\dumps" /> | ||
| </ItemGroup> | ||
| <ItemGroup Condition="'$(TargetOS)' != 'windows'"> | ||
| <HelixPreCommand Include="export DOTNET_DbgEnableMiniDump=1" /> | ||
| <HelixPreCommand Include="export DOTNET_DbgMiniDumpType=4" /> | ||
| <HelixPreCommand Include="export CDAC_DUMP_ROOT=$HELIX_WORKITEM_PAYLOAD/dumps" /> |
There was a problem hiding this comment.
Dump generation here uses DOTNET_DbgMiniDumpType=4, but DumpTests.targets uses DOTNET_DbgMiniDumpType=2. This inconsistency means local/CI/Helix runs can produce structurally different dumps (and type 4 can be much larger). Consider aligning on a single dump type and documenting why that value is required for these tests.
| bool created = ContractDescriptorTarget.TryCreate( | ||
| contractDescriptor, | ||
| _host.ReadFromTarget, | ||
| writeToTarget: null!, |
There was a problem hiding this comment.
ContractDescriptorTarget.TryCreate requires a non-null writeToTarget delegate, but this passes null!. If any contract path attempts to write to target memory, this will throw a NullReferenceException instead of failing gracefully. Consider passing a stub delegate that returns a failure HRESULT (e.g., -1) to make dump targets explicitly read-only.
| writeToTarget: null!, | |
| writeToTarget: (_, _) => -1, |
|
|
||
| /// <summary> | ||
| /// The target operating system of the dump, resolved from the RuntimeInfo contract. | ||
| /// May be <c>null</c> if the contract is unavailable. |
There was a problem hiding this comment.
The XML docs say TargetOS “may be null”, but the property returns string.Empty when _targetOS is null. Either update the documentation to say it may be empty/unknown, or change the property type/behavior to actually surface null.
| /// May be <c>null</c> if the contract is unavailable. | |
| /// Will be an empty string if the contract is unavailable or the OS cannot be determined. |
| - script: $(Build.SourcesDirectory)$(dir).dotnet$(dir)dotnet$(exeExt) build | ||
| $(Build.SourcesDirectory)/src/native/managed/cdac/tests/DumpTests/Microsoft.Diagnostics.DataContractReader.DumpTests.csproj | ||
| /p:SkipDumpGeneration=true | ||
| /p:PrepareHelixPayload=true | ||
| /p:HelixPayloadDir=$(Build.SourcesDirectory)/artifacts/helixPayload/cdac | ||
| -bl:$(Build.SourcesDirectory)/artifacts/log/DumpTestPayload.binlog |
There was a problem hiding this comment.
The Helix work item only generates dumps under dumps/local/... (see cdac-dump-helix.proj), but the test payload currently includes Net10 test classes that will look for dumps/net10.0/... and fail with missing dump files. Pass /p:SkipDumpVersions=net10.0 when building/preparing the DumpTests payload (or update the Helix work item to generate net10.0 dumps as well).
| <Exec Command=""$(_TestHostDotNet)" exec "$(_DebuggeeDll)"" | ||
| WorkingDirectory="$(_DumpDir)" | ||
| IgnoreExitCode="true" | ||
| EnvironmentVariables="DOTNET_DbgEnableMiniDump=1;DOTNET_DbgMiniDumpType=2;DOTNET_DbgMiniDumpName=$(_DumpFile)" /> | ||
|
|
There was a problem hiding this comment.
Dump generation sets DOTNET_DbgMiniDumpType=2, but the Helix workflow and the provided PowerShell scripts use type 4. Using different dump types across workflows can cause tests to pass locally but fail on Helix (or vice versa). Align this value with the intended dump format for the contracts being tested.
| } | ||
|
|
||
| # --- Debuggees and versions --- | ||
| $allDebugees = @("BasicThreads", "TypeHierarchy", "ExceptionState", "MultiModule", "GCRoots", "ServerGC", "StackWalk") |
There was a problem hiding this comment.
Variable name typo: $allDebugees should be $allDebuggees for clarity (and to avoid propagating the misspelling in later edits).
| $allDebugees = @("BasicThreads", "TypeHierarchy", "ExceptionState", "MultiModule", "GCRoots", "ServerGC", "StackWalk") | |
| $allDebuggees = @("BasicThreads", "TypeHierarchy", "ExceptionState", "MultiModule", "GCRoots", "ServerGC", "StackWalk") | |
| $allDebugees = $allDebuggees |
| Write-Host " Running $name..." -ForegroundColor White -NoNewline | ||
|
|
||
| $env:DOTNET_DbgEnableMiniDump = "1" | ||
| $env:DOTNET_DbgMiniDumpType = "4" |
There was a problem hiding this comment.
This script sets DOTNET_DbgMiniDumpType to 4, while DumpTests.targets uses 2. Using different dump types across workflows can lead to inconsistent test behavior and dump sizes. Consider sharing a single constant/value across scripts/targets (or at least document why they differ).
| $env:DOTNET_DbgMiniDumpType = "4" | |
| # Keep in sync with DumpTests.targets (DumpMiniDumpType) | |
| $env:DOTNET_DbgMiniDumpType = "2" |
| [ConditionalFact] | ||
| public void Exception_ThreadHasCurrentException() | ||
| { | ||
| IThread threadContract = Target.Contracts.Thread; | ||
| Assert.NotNull(threadContract); | ||
|
|
||
| ThreadStoreData storeData = threadContract.GetThreadStoreData(); | ||
| Assert.True(storeData.ThreadCount > 0); | ||
| } |
There was a problem hiding this comment.
Test name doesn’t match what it verifies: this only asserts ThreadCount > 0 and does not validate any exception state. Either rename the test to reflect the assertion, or add an assertion that actually checks the presence of a current/last-thrown exception on a thread.
| public void StackWalk_HasMultipleFrames() | ||
| { | ||
| IThread threadContract = Target.Contracts.Thread; | ||
| IStackWalk stackWalk = Target.Contracts.StackWalk; | ||
|
|
||
| ThreadStoreData storeData = threadContract.GetThreadStoreData(); | ||
| ThreadData crashingThread = FindCrashingThread(threadContract, storeData); | ||
|
|
||
| IEnumerable<IStackDataFrameHandle> frames = stackWalk.CreateStackWalk(crashingThread); | ||
| List<IStackDataFrameHandle> frameList = frames.ToList(); | ||
|
|
||
| // The debuggee has Main → MethodA → MethodB → MethodC → FailFast, | ||
| // but the stack walk may include runtime helper frames and native transitions. | ||
| // We just assert there are multiple frames visible. | ||
| Assert.True(frameList.Count >= 1, | ||
| $"Expected at least 1 stack frame from the crashing thread, got {frameList.Count}"); | ||
| } |
There was a problem hiding this comment.
This test is named HasMultipleFrames but only asserts frameList.Count >= 1, which is the same guarantee as the earlier Count > 0 test. Either strengthen the assertion (e.g., require 2+ frames) or rename/remove to avoid redundant coverage.
The send-to-helix-inner-step.yml template call was missing the _Creator environment variable, causing 'Creator is required when using anonymous access' errors on .Open Helix queues. All other Helix jobs in the repo pass _Creator: dotnet-bot through the environment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| <Exec Command=""$(_TestHostDotNet)" exec "$(_DebuggeeDll)"" | ||
| WorkingDirectory="$(_DumpDir)" | ||
| IgnoreExitCode="true" | ||
| EnvironmentVariables="DOTNET_DbgEnableMiniDump=1;DOTNET_DbgMiniDumpType=2;DOTNET_DbgMiniDumpName=$(_DumpFile)" /> |
There was a problem hiding this comment.
The DOTNET_DbgMiniDumpType value is inconsistent across different files. DumpTests.targets uses type 2 (lines 127, 148), while cdac-dump-helix.proj (lines 100, 105), RunDumpTests.ps1 (line 253), and RunHelixLocally.ps1 (line 163) all use type 4. This will result in different dump sizes and potentially different content between local dumps generated by DumpTests.targets and dumps generated in Helix or via the PowerShell scripts.
According to .NET documentation:
- Type 2 = MiniDumpWithPrivateReadWriteMemory (includes all read-write data but no code pages)
- Type 4 = MiniDumpFilterTriage (includes only essential data for triage)
For cDAC dump testing, you likely want consistent heap dumps with full private memory. Consider using type 4 consistently if that's the intended dump type, or type 2 if you need more complete dumps. The difference could cause tests to pass locally but fail in Helix, or vice versa.
| EnvironmentVariables="DOTNET_DbgEnableMiniDump=1;DOTNET_DbgMiniDumpType=2;DOTNET_DbgMiniDumpName=$(_DumpFile)" /> | |
| EnvironmentVariables="DOTNET_DbgEnableMiniDump=1;DOTNET_DbgMiniDumpType=4;DOTNET_DbgMiniDumpName=$(_DumpFile)" /> |
| - powershell: | | ||
| $testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*" | Select-Object -First 1 -ExpandProperty FullName | ||
| Write-Host "TestHost directory: $testhostDir" | ||
| Write-Host "##vso[task.setvariable variable=TestHostPayloadDir]$testhostDir" | ||
| displayName: 'Find TestHost Directory' |
There was a problem hiding this comment.
This PowerShell script will fail on Linux and macOS platforms. The pipeline's cdacDumpPlatforms includes linux_x64, linux_arm64, which are non-Windows platforms.
The PowerShell cmdlets (Get-ChildItem, Select-Object) are not available in the standard bash shell used on Linux/macOS Helix agents. You need to add a conditional check and provide a bash equivalent script for non-Windows platforms.
For example:
- ${{ if eq(parameters.osGroup, 'windows') }}:
- powershell: |
$testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*" | Select-Object -First 1 -ExpandProperty FullName
Write-Host "TestHost directory: $testhostDir"
Write-Host "##vso[task.setvariable variable=TestHostPayloadDir]$testhostDir"
displayName: 'Find TestHost Directory'
- ${{ if ne(parameters.osGroup, 'windows') }}:
- bash: |
testhostDir=$(find $(Build.SourcesDirectory)/artifacts/bin/testhost/net* -maxdepth 0 -type d | head -n 1)
echo "TestHost directory: $testhostDir"
echo "##vso[task.setvariable variable=TestHostPayloadDir]$testhostDir"
displayName: 'Find TestHost Directory'| - powershell: | | |
| $testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*" | Select-Object -First 1 -ExpandProperty FullName | |
| Write-Host "TestHost directory: $testhostDir" | |
| Write-Host "##vso[task.setvariable variable=TestHostPayloadDir]$testhostDir" | |
| displayName: 'Find TestHost Directory' | |
| - ${{ if eq(variables['osGroup'], 'windows') }}: | |
| - powershell: | | |
| $testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*" | Select-Object -First 1 -ExpandProperty FullName | |
| Write-Host "TestHost directory: $testhostDir" | |
| Write-Host "##vso[task.setvariable variable=TestHostPayloadDir]$testhostDir" | |
| displayName: 'Find TestHost Directory' | |
| - ${{ if ne(variables['osGroup'], 'windows') }}: | |
| - bash: | | |
| testhostDir=$(find "$(Build.SourcesDirectory)/artifacts/bin/testhost/" -maxdepth 1 -type d -name "net*" | head -n 1) | |
| echo "TestHost directory: $testhostDir" | |
| echo "##vso[task.setvariable variable=TestHostPayloadDir]$testhostDir" | |
| displayName: 'Find TestHost Directory' |
| <Exec Command=""$(_PublishedExe)"" | ||
| WorkingDirectory="$(_DumpDir)" | ||
| IgnoreExitCode="true" | ||
| EnvironmentVariables="DOTNET_DbgEnableMiniDump=1;DOTNET_DbgMiniDumpType=2;DOTNET_DbgMiniDumpName=$(_DumpFile)" /> |
There was a problem hiding this comment.
The DOTNET_DbgMiniDumpType value is also inconsistent here. Line 148 uses type 2, but cdac-dump-helix.proj (lines 100, 105), RunDumpTests.ps1 (line 253), and RunHelixLocally.ps1 (line 163) all use type 4. This inconsistency means dumps generated for published runtimes (net10.0) will have different content than dumps generated in Helix or via the PowerShell scripts, potentially causing test failures or inconsistent behavior.
| EnvironmentVariables="DOTNET_DbgEnableMiniDump=1;DOTNET_DbgMiniDumpType=2;DOTNET_DbgMiniDumpName=$(_DumpFile)" /> | |
| EnvironmentVariables="DOTNET_DbgEnableMiniDump=1;DOTNET_DbgMiniDumpType=4;DOTNET_DbgMiniDumpName=$(_DumpFile)" /> |
ddebd8b to
16babf9
Compare
| bool foundExceptionThread = false; | ||
| TargetPointer currentThread = storeData.FirstThread; | ||
| while (currentThread != TargetPointer.Null) | ||
| { | ||
| ThreadData threadData = threadContract.GetThreadData(currentThread); | ||
| if (threadData.LastThrownObjectHandle != TargetPointer.Null) | ||
| { | ||
| foundExceptionThread = true; | ||
| break; | ||
| } | ||
| currentThread = threadData.NextThread; | ||
| } |
There was a problem hiding this comment.
This thread-list walk is unbounded. If NextThread cycles or points back into the list, the test can hang indefinitely. Add an iteration limit (e.g., based on ThreadStoreData.ThreadCount) and fail if exceeded to keep Helix runs from timing out.
| <_DebuggeesDir>$HELIX_WORKITEM_PAYLOAD/debuggees</_DebuggeesDir> | ||
| <_DumpRoot>$HELIX_WORKITEM_PAYLOAD/dumps</_DumpRoot> | ||
| <_DumpGenCommands>@(_Debuggee->'mkdir -p $HELIX_WORKITEM_PAYLOAD/dumps/local/%(Identity) && DOTNET_DbgMiniDumpName=$HELIX_WORKITEM_PAYLOAD/dumps/local/%(Identity)/%(Identity).dmp $HELIX_CORRELATION_PAYLOAD/dotnet exec $HELIX_WORKITEM_PAYLOAD/debuggees/%(Identity)/%(Identity).dll || true', ' && ')</_DumpGenCommands> | ||
| <_TestCommand>$HELIX_CORRELATION_PAYLOAD/dotnet exec --runtimeconfig $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.runtimeconfig.json --depsfile $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.deps.json $HELIX_WORKITEM_PAYLOAD/tests/xunit.console.dll $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.dll -xml testResults.xml -nologo</_TestCommand> |
There was a problem hiding this comment.
On Unix, each debuggee invocation ends with || true to ignore the expected FailFast exit code, but it also masks real failures (missing DLL, dotnet exec failure, etc.) and will continue to run tests with missing dumps. Consider checking that each expected dump file exists after running the debuggee and failing early with a clear error if it doesn't.
| bool created = ContractDescriptorTarget.TryCreate( | ||
| contractDescriptor, | ||
| _host.ReadFromTarget, | ||
| writeToTarget: null!, | ||
| _host.GetThreadContext, | ||
| additionalFactories: [], |
There was a problem hiding this comment.
ContractDescriptorTarget.TryCreate requires a non-null WriteToTargetDelegate, but the dump tests pass writeToTarget: null!. If any contract/target path calls WriteToTarget (now or in the future), this will crash with a NullReferenceException. Provide a stub write delegate that returns a failure code (e.g., -1) instead of passing null.
| # Prepare Helix payload: builds debuggees, copies test output + xunit.console.dll | ||
| - script: $(Build.SourcesDirectory)$(dir).dotnet$(dir)dotnet$(exeExt) build | ||
| $(Build.SourcesDirectory)/src/native/managed/cdac/tests/DumpTests/Microsoft.Diagnostics.DataContractReader.DumpTests.csproj | ||
| /p:SkipDumpGeneration=true |
There was a problem hiding this comment.
The Helix work item only generates dumps under .../dumps/local/<debuggee>/<debuggee>.dmp, but the test assembly includes *_Net10 test classes that will try to load .../dumps/net10.0/... unless CDAC_SKIP_VERSIONS is set. The payload preparation step should pass /p:SkipDumpVersions=net10.0 (as the local simulation script does) or the Helix work item should also generate net10.0 dumps; otherwise Helix will fail due to missing dump files.
| /p:SkipDumpGeneration=true | |
| /p:SkipDumpGeneration=true | |
| /p:SkipDumpVersions=net10.0 |
| public void StackWalk_HasMultipleFrames() | ||
| { | ||
| IThread threadContract = Target.Contracts.Thread; | ||
| IStackWalk stackWalk = Target.Contracts.StackWalk; | ||
|
|
||
| ThreadStoreData storeData = threadContract.GetThreadStoreData(); | ||
| ThreadData crashingThread = FindCrashingThread(threadContract, storeData); | ||
|
|
||
| IEnumerable<IStackDataFrameHandle> frames = stackWalk.CreateStackWalk(crashingThread); | ||
| List<IStackDataFrameHandle> frameList = frames.ToList(); | ||
|
|
||
| // The debuggee has Main → MethodA → MethodB → MethodC → FailFast, | ||
| // but the stack walk may include runtime helper frames and native transitions. | ||
| // We just assert there are multiple frames visible. | ||
| Assert.True(frameList.Count >= 1, | ||
| $"Expected at least 1 stack frame from the crashing thread, got {frameList.Count}"); | ||
| } |
There was a problem hiding this comment.
StackWalk_HasMultipleFrames currently asserts frameList.Count >= 1, which is the same condition already validated by StackWalk_CanWalkCrashingThread and contradicts the test name (“HasMultipleFrames”). This makes the test redundant and weak; either assert >= 2 (or a more meaningful minimum) or remove/merge the test.
| int count = 0; | ||
| TargetPointer currentThread = storeData.FirstThread; | ||
| while (currentThread != TargetPointer.Null) | ||
| { | ||
| ThreadData threadData = threadContract.GetThreadData(currentThread); | ||
| count++; | ||
| currentThread = threadData.NextThread; | ||
|
|
There was a problem hiding this comment.
The thread list walk has no iteration bound. If the dump has corruption (or the contract returns a cyclic list), this loop can hang the test run indefinitely (bad for Helix timeouts). Consider bounding iterations using storeData.ThreadCount (or a reasonable max) and failing if the bound is exceeded.
No description provided.