Instrumented clrjit causes random msbuild failures #67627

@EgorBo

The dotnet-optimization pipeline is currently blocked on this issue. I managed to make a minimal standalone repro (Linux-x64):

Create a repro.sh script with the following content:

#!/bin/bash
set -e

rm -rf repro
#SDK_TYPE=-pgo

# Step 1: download the latest dotnet SDK

mkdir repro
cd repro

wget -O dotnet-sdk.tar.gz https://dotnetbuilds.azureedge.net/public/Sdk/7.0.100-preview.4.22204.1/dotnet-sdk$SDK_TYPE-7.0.100-preview.4.22204.1-linux-x64.tar.gz
mkdir dotnet-sdk && tar -xvzf dotnet-sdk.tar.gz -C dotnet-sdk

# Step 2: download OrchardCore project 

git clone https://github.com/dotnet-perf-bot/orchardcore.git
cd orchardcore/src/OrchardCore.Cms.Web

# Step 3: Try to build OrchardCore with instrumented SDK:
DOTNET_MULTILEVEL_LOOKUP=0 DOTNET_SKIP_FIRST_TIME_EXPERIENCE=1 DOTNET_CLI_TELEMETRY_OPTOUT=1 ../../../dotnet-sdk/dotnet build -c Release

Run it - it should complete without errors.

Now uncomment the #SDK_TYPE=-pgo line to use the natively instrumented SDK and run again - observe random, meaningless msbuild failures, e.g.:

/home/egorbo/prj/repro/dotnet-sdk/sdk/7.0.100-preview.4.22204.1/Current/Microsoft.Common.props(29,5): error MSB4186: 
Invalid static method invocation syntax: "[MSBuild]::GetDirectoryNameOfFileAbove($(MSBuildProjectDirectory), 
'$(_DirectoryBuildPropsFile)')". Method '[MSBuild]::GetDirectoryNameOfFileAbove' not found. Static method invocation should 
be of the form: $([FullTypeName]::Method()), e.g. $([System.IO.Path]::Combine(`a`, `b`)). Check that all parameters are defined,
 are of the correct type, and are specified in the right order. 
[/home/egorbo/prj/repro/orchardcore/src/OrchardCore/OrchardCore/OrchardCore.csproj]
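Because the failures are random, a single run can pass by chance. A minimal sketch for estimating how often the build fails, assuming the repro layout above; `count_failures` is a hypothetical helper, not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper: run a command N times and print how many runs failed.
# Usage: count_failures <runs> <command> [args...]
count_failures() {
  local runs=$1
  shift
  local failures=0
  local i
  for ((i = 0; i < runs; i++)); do
    # Discard output; only the exit status matters here.
    if ! "$@" > /dev/null 2>&1; then
      failures=$((failures + 1))
    fi
  done
  echo "$failures"
}

# Example (run from orchardcore/src/OrchardCore.Cms.Web, as in the script above):
# count_failures 10 ../../../dotnet-sdk/dotnet build -c Release
```

Comparing the count for the normal SDK against the instrumented one gives a rough failure rate rather than a single pass/fail data point.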

After some experiments I came to the conclusion that the libclrjit.so binary causes it: I downloaded both the instrumented and the normal SDK, copied only libclrjit.so from the PGO SDK into the normal one, and removed the PGO SDK:

Replace clrjit from pgo
#!/bin/bash
set -e

rm -rf repro
# Step 1: download the latest dotnet SDK

mkdir repro
cd repro

wget -O dotnet-sdk-pgo.tar.gz https://dotnetbuilds.azureedge.net/public/Sdk/7.0.100-preview.4.22204.1/dotnet-sdk-pgo-7.0.100-preview.4.22204.1-linux-x64.tar.gz
wget -O dotnet-sdk.tar.gz https://dotnetbuilds.azureedge.net/public/Sdk/7.0.100-preview.4.22204.1/dotnet-sdk-7.0.100-preview.4.22204.1-linux-x64.tar.gz
mkdir dotnet-sdk && tar -xvzf dotnet-sdk.tar.gz -C dotnet-sdk
mkdir dotnet-sdk-pgo && tar -xvzf dotnet-sdk-pgo.tar.gz -C dotnet-sdk-pgo

cp -f dotnet-sdk-pgo/shared/Microsoft.NETCore.App/7.0.0-preview.4.22201.3/libclrjit.so dotnet-sdk/shared/Microsoft.NETCore.App/7.0.0-preview.4.22201.3/libclrjit.so
rm -rf dotnet-sdk-pgo

# Step 2: download OrchardCore project 

git clone https://github.com/dotnet-perf-bot/orchardcore.git
cd orchardcore/src/OrchardCore.Cms.Web

# Step 3: Try to build OrchardCore with the normal SDK plus the instrumented libclrjit.so:
DOTNET_MULTILEVEL_LOOKUP=0 DOTNET_SKIP_FIRST_TIME_EXPERIENCE=1 DOTNET_CLI_TELEMETRY_OPTOUT=1 ../../../dotnet-sdk/dotnet build -c Release
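Since the conclusion rests on libclrjit.so being the only changed file, it may be worth confirming that the copy actually took effect before the PGO SDK is deleted. A minimal sketch, where `same_contents` is a hypothetical helper and the paths are the ones used in the script above:

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the two files are byte-for-byte identical.
same_contents() {
  cmp -s "$1" "$2"
}

# Example, placed between the cp step and "rm -rf dotnet-sdk-pgo":
# RUNTIME_DIR=shared/Microsoft.NETCore.App/7.0.0-preview.4.22201.3
# same_contents "dotnet-sdk-pgo/$RUNTIME_DIR/libclrjit.so" \
#               "dotnet-sdk/$RUNTIME_DIR/libclrjit.so" \
#   || { echo "libclrjit.so was not replaced" >&2; exit 1; }
```

The same helper can be pointed at other native binaries in the two SDKs to see which ones differ between the normal and instrumented builds.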

cc @dotnet/jit-contrib @jkotas @davidwrighton. While I try to diagnose this further, any advice on what to check? Can instrumentation inside the JIT cause GC holes?

I tried running with the non-instrumented SDK and GCStress=0xC - no issues.
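The issue does not say how GCStress was set. One possible way, sketched here as an assumption, is the `DOTNET_GCStress` environment variable (older runtimes use the `COMPlus_` prefix instead); GC stress modes are generally described as a checked-runtime feature, so this may have no effect on a release SDK. `with_gcstress` is a hypothetical wrapper:

```shell
#!/bin/bash
# Hypothetical wrapper: run a command with GC stress mode 0xC requested via the
# environment. DOTNET_GCStress is an assumption about how the setting was
# applied; the issue text only says "GCStress=0xC".
with_gcstress() {
  env DOTNET_GCStress=0xC "$@"
}

# Example (run from orchardcore/src/OrchardCore.Cms.Web, as in the scripts above):
# with_gcstress ../../../dotnet-sdk/dotnet build -c Release
```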

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
