This repository was archived by the owner on Jan 23, 2023. It is now read-only.

@stephentoub stephentoub commented Sep 27, 2018

In .NET Core 2.1, I added a bunch of optimizations to async methods that are based on reusing the async state machine object itself for other purposes in order to avoid related allocations. One of those optimizations was using the boxed state machine itself as the continuation object that could be queued onto a Task; in the common case where the continuation could be executed synchronously, there would then be no further allocations. However, if the continuation needed to be run asynchronously (e.g. because the Task required it via RunContinuationsAsynchronously), the code would allocate a new work item object and queue that to the thread pool to execute. This also forced the state machine object to lazily allocate the Action delegate for its MoveNext method. This PR extends the system slightly to also cover that asynchronous execution case, by making the state machine box itself queueable to the thread pool. In doing so, it avoids that AwaitTaskContinuation allocation and also avoids forcing the delegate into existence. (As is the case for other optimizations, this one is only employed when ETW logging isn't enabled; if it is enabled, we need to flow more information, and supporting that here would penalize the non-logging case.)

Closes https://github.com/dotnet/coreclr/issues/20155
cc: @davidfowl, @kouvel, @tarekgh
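
As an illustration of the idea described above (not the code in this PR), here is a minimal sketch of an object that is queued to the thread pool directly instead of being wrapped in a separate work item or delegate. `StateMachineBox`, `ScheduleAsync`, and `Demo` are hypothetical names, and the sketch assumes a runtime where `IThreadPoolWorkItem` and `ThreadPool.UnsafeQueueUserWorkItem(IThreadPoolWorkItem, bool)` are public (.NET Core 3.0 and later); at the time of this PR the interface was internal to coreclr.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a boxed async state machine: one heap object that
// holds the method's state and also serves as the thread pool work item.
sealed class StateMachineBox : IThreadPoolWorkItem
{
    // Only used so the demo can wait for completion; not part of the technique.
    private readonly ManualResetEventSlim _done = new ManualResetEventSlim();
    private int _state;

    public int Result { get; private set; }

    // Plays the role of the state machine's MoveNext method.
    public void MoveNext()
    {
        _state++;
        Result = 42;
        _done.Set();
    }

    // The pool invokes Execute on this object directly, so no Action delegate
    // for MoveNext has to be created.
    void IThreadPoolWorkItem.Execute() => MoveNext();

    // Asynchronous continuation path: queue this same object rather than
    // allocating a wrapper work item around it.
    public void ScheduleAsync() =>
        ThreadPool.UnsafeQueueUserWorkItem(this, preferLocal: false);

    public void Wait() => _done.Wait();
}

static class Demo
{
    public static int Run()
    {
        var box = new StateMachineBox();
        box.ScheduleAsync();   // runs MoveNext on a thread pool thread
        box.Wait();
        return box.Result;
    }
}
```

By contrast, something like `ThreadPool.QueueUserWorkItem(_ => box.MoveNext())` would need a delegate (and typically a closure) in addition to the box, which is the kind of extra allocation the description above is trying to avoid.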

@stephentoub stephentoub added the tenet-performance (Performance related issue) label Sep 27, 2018
@stephentoub stephentoub added this to the 3.0 milestone Sep 27, 2018

@davidfowl davidfowl left a comment

LGTM

@stephentoub stephentoub merged commit 03ead51 into dotnet:master Sep 28, 2018
@stephentoub stephentoub deleted the atcbox branch September 28, 2018 00:13

JeffCyr commented Sep 28, 2018

Could IThreadPoolWorkItem, or a variation of it, be made public so that other async state machines defined outside coreclr can use the same technique?

@davidfowl

https://github.com/dotnet/corefx/issues/32485
