[mono] Optimize startup vtable setup#101312
Merged
kg merged 5 commits intodotnet:mainfrom Apr 23, 2024
…undant calls to it during application startup
lambdageek
reviewed
Apr 19, 2024
Member
lambdageek
left a comment
I'll come back and do a more thorough review. At first glance this looks good.
I'd prefer that if the release build is doing a fast path and returning some answer, the debug build should do the slow path and compare that the answer it gets matches the release build's fast answer.
I usually try to repro customer issues on a local debug build and it would be quite annoying if it was giving a different result.
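A minimal sketch of the pattern being requested here: the fast path returns the cached answer, and a checked build additionally recomputes the slow answer and compares the two. Everything in it is hypothetical (`implements_slow`, `implements_cached`, the integer "class" IDs, and the one-entry cache are illustration only, not the runtime's code); only the `ENABLE_CHECKED_BUILD` macro name comes from this thread.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an expensive check; counts how often it runs. */
static int slow_calls;

static bool implements_slow(int klass, int iface) {
    slow_calls++;
    return klass % iface == 0; /* arbitrary placeholder logic */
}

/* Hypothetical one-entry cache, just enough to show the pattern. */
static bool cache_valid;
static int cached_klass, cached_iface;
static bool cached_result;

static bool implements_cached(int klass, int iface) {
    if (cache_valid && cached_klass == klass && cached_iface == iface) {
#ifdef ENABLE_CHECKED_BUILD
        /* Checked builds: redo the slow path and insist it agrees with the cache. */
        if (implements_slow(klass, iface) != cached_result)
            abort();
#endif
        return cached_result;
    }

    /* Cache miss: take the slow path and remember the answer. */
    bool result = implements_slow(klass, iface);
    cache_valid = true;
    cached_klass = klass;
    cached_iface = iface;
    cached_result = result;
    return result;
}
```

With this shape, release and checked builds return the same value on every call; the checked build only pays extra to confirm it.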
This was referenced Apr 19, 2024
lambdageek
reviewed
Apr 22, 2024
Verify cache in checked builds
lambdageek
approved these changes
Apr 22, 2024
lambdageek
reviewed
Apr 22, 2024
This was referenced Apr 23, 2024
AustinWise
added a commit
to AustinWise/runtime
that referenced
this pull request
Apr 23, 2024
`ENABLE_CHECKED_BUILD` is defined to mean "Enable additional checks" and is enabled in checked and debug builds. Therefore this performance optimization should be enabled when `ENABLE_CHECKED_BUILD` is *not* defined. Ref: dotnet#101312
matouskozak
pushed a commit
to matouskozak/runtime
that referenced
this pull request
Apr 30, 2024
* Add new [ptr, ptr] -> ptr simdhash variant for caching * Cache mono_class_implement_interface_slow because we perform many redundant calls to it during application startup * Verify cache in checked builds
matouskozak
pushed a commit
to matouskozak/runtime
that referenced
this pull request
Apr 30, 2024
…ides (dotnet#101445) `ENABLE_CHECKED_BUILD` is defined to mean "Enable additional checks" and is enabled in checked and debug builds. Therefore this performance optimization should be enabled when `ENABLE_CHECKED_BUILD` is *not* defined. Ref: dotnet#101312
michaelgsharp
pushed a commit
to michaelgsharp/runtime
that referenced
this pull request
May 9, 2024
* Add new [ptr, ptr] -> ptr simdhash variant for caching * Cache mono_class_implement_interface_slow because we perform many redundant calls to it during application startup * Verify cache in checked builds
michaelgsharp
pushed a commit
to michaelgsharp/runtime
that referenced
this pull request
May 9, 2024
…ides (dotnet#101445) `ENABLE_CHECKED_BUILD` is defined to mean "Enable additional checks" and is enabled in checked and debug builds. Therefore this performance optimization should be enabled when `ENABLE_CHECKED_BUILD` is *not* defined. Ref: dotnet#101312
During startup (mostly in interpreted builds, but also a little bit in AOT) we spend a good chunk of time setting up vtables, and a lot of that time is spent in `mono_class_implement_interface_slow`. Once a check enters that slow path, all checks underneath it also stay on the slow path, which can result in a (small) exponential explosion of recursive checks that scan moderately large arrays, comparing A against B. The interface inheritance chains on BCL types are quite deep now in some cases thanks to things like generic arithmetic.
This PR adds a simple simdhash-based cache for `mono_class_implement_interface_slow`. In my testing it has a cache hit rate of ~60% during runs of System.Runtime.Tests and System.Text.Json.Tests, along with a cache hit rate of 40-50% on simpler applications. The number of expensive checks optimized out this way is fairly significant - tens of thousands on those test suites. Improvements from this should be more dramatic for more complex codebases.
The cache implementation is somewhat suboptimal - it will involve temporary allocations if multiple threads are racing to initialize vtables, and when the cache gets too big we have to clear it instead of pruning the oldest entries, which reduces the effective hit rate - but the memory usage is deterministic and based on my profiles the performance characteristics are good.
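To illustrate the trade-off described above, here is a rough sketch of a fixed-size cache keyed on a pair of pointers that is wiped wholesale once it fills up, rather than evicting its oldest entries. This is not the simdhash implementation: the names (`pair_cache`, `pair_cache_lookup`, `pair_cache_store`), the sizes, the hash function, and the linear-probing layout are all assumptions chosen for brevity, and it ignores threading entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes: slot count must be a power of two, and the entry
 * limit stays below it so probing always finds an empty slot. */
#define PAIR_CACHE_SLOTS 64
#define PAIR_CACHE_MAX_ENTRIES 48

typedef struct {
    const void *a, *b; /* key: the two pointers being compared */
    bool result;       /* cached answer */
    bool used;
} pair_cache_entry;

typedef struct {
    pair_cache_entry slots[PAIR_CACHE_SLOTS];
    size_t count;
} pair_cache;

/* Arbitrary mixing function, for illustration only. */
static size_t pair_hash(const void *a, const void *b) {
    uintptr_t h = (uintptr_t)a * 31u + (uintptr_t)b;
    h ^= h >> 7;
    return (size_t)h & (PAIR_CACHE_SLOTS - 1);
}

/* Returns true and writes *out on a hit; returns false on a miss. */
static bool pair_cache_lookup(const pair_cache *c, const void *a,
                              const void *b, bool *out) {
    size_t start = pair_hash(a, b);
    for (size_t n = 0; n < PAIR_CACHE_SLOTS; n++) {
        const pair_cache_entry *e = &c->slots[(start + n) & (PAIR_CACHE_SLOTS - 1)];
        if (!e->used)
            return false; /* empty slot ends the probe chain */
        if (e->a == a && e->b == b) {
            *out = e->result;
            return true;
        }
    }
    return false;
}

static void pair_cache_store(pair_cache *c, const void *a, const void *b,
                             bool result) {
    /* When full, drop everything: memory stays bounded, hit rate dips. */
    if (c->count >= PAIR_CACHE_MAX_ENTRIES)
        memset(c, 0, sizeof(*c));

    size_t start = pair_hash(a, b);
    for (size_t n = 0; n < PAIR_CACHE_SLOTS; n++) {
        pair_cache_entry *e = &c->slots[(start + n) & (PAIR_CACHE_SLOTS - 1)];
        if (e->used && e->a == a && e->b == b) {
            e->result = result; /* key already present: overwrite */
            return;
        }
        if (!e->used) {
            e->a = a;
            e->b = b;
            e->result = result;
            e->used = true;
            c->count++;
            return;
        }
    }
}
```

Clearing instead of pruning keeps the structure trivial (no deletion markers, no age tracking), at the cost of a burst of misses right after each wipe.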
This PR also disables `verify_class_overrides` for types inside corlib unless you're building for debug - @lambdageek pointed out that we don't really need to verify corlib types since csc should never generate invalid types for code under our control. This verification is a source of some of these redundant checks, though there are still plenty even with it disabled.