Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@stephentoub
Copy link
Member

When Tasks backed by delegates are created, an ExecutionContext is captured. When the task completes, its delegate is being cleared, but its ExecutionContext is not, which means if the Task is subsequently kept alive (e.g. stored in a cache), so too is its ExecutionContext, which can capture an arbitrary amount of ambient state via async locals. This commit augments the clearing of the delegate to similarly clear the ExecutionContext.

Related, async methods can also capture ExecutionContext when awaits yield, so this clears out that context as well. And as long as we're doing that, we may as well also clear the state machine state, so that any hoisted locals in the state machine aren't kept alive if the resulting task is kept alive. Not doing so previously was a conscious choice, in order to aid in debugging, but as we've heard of at least a couple of cases where it unexpectedly caused a leak, I'm going ahead and changing it.

Fixes https://github.com/dotnet/coreclr/issues/20273
Fixes https://github.com/dotnet/coreclr/issues/20244
cc: @kouvel, @tarekgh, @davidfowl, @benaadams

When Tasks backed by delegates are created, an ExecutionContext is captured.  When the task completes, its delegate is being cleared, but its ExecutionContext is not, which means if the Task is subsequently kept alive (e.g. stored in a cache), so too is its ExecutionContext, which can capture an arbitrary amount of ambient state via async locals.  This commit augments the clearing of the delegate to similarly clear the ExecutionContext.

Related, async methods can also capture ExecutionContext when awaits yield, so this clears out that context as well.  And as long as we're doing that, we may as well also clear the state machine state, so that any hoisted locals in the state machine aren't kept alive if the resulting task is kept alive.  Not doing so previously was a conscious choice, in order to aid in debugging, but as we've heard of at least a couple of cases where it unexpectedly caused a leak, I'm going ahead and changing it.
@davidfowl
Copy link
Member

I wonder if we should backport this to 2.1?

@davidfowl
Copy link
Member

Add unit tests?

@stephentoub
Copy link
Member Author

stephentoub commented Oct 7, 2018

Add unit tests?

In corefx, not in coreclr.
dotnet/corefx#32671

@stephentoub
Copy link
Member Author

stephentoub commented Oct 7, 2018

I wonder if we should backport this to 2.1?

For Task? I don't think so. This behavior has been here for close to a decade.

For async methods, I've only heard two comments about it, and it wasn't a showstopper for either. I don't think that meets the bar.

@stephentoub
Copy link
Member Author

@dotnet-bot test Ubuntu arm Cross Checked no_tiered_compilation_innerloop Build and Test please (https://github.com/dotnet/coreclr/issues/20296)

@dotnet dotnet deleted a comment from dotnet-bot Oct 7, 2018
@dotnet dotnet deleted a comment from dotnet-bot Oct 7, 2018
@dotnet dotnet deleted a comment from dotnet-bot Oct 7, 2018
@stephentoub
Copy link
Member Author

Unrelated build failures:
https://ci.dot.net/job/dotnet_coreclr/job/master/job/arm_cross_checked_ubuntu_innerloop_flow_prtest/5715/

Error compiling /ssd/j/workspace/dotnet_coreclr/master/arm_cross_checked_ubuntu_innerloop_tst_prtest/Tools/Microsoft.DotNet.Build.Tasks.dll: Could not load file or assembly 'Microsoft.Build.Framework, Version=15.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'. The system cannot find the file specified.
 (Exception from HRESULT: 0x80070002)
Error: file "/ssd/j/workspace/dotnet_coreclr/master/arm_cross_checked_ubuntu_innerloop_tst_prtest/Tools/Microsoft.DotNet.Build.Tasks.dll" or one of its dependencies was not found

and

Agent went offline during the build
ERROR: Connection was broken: java.util.concurrent.TimeoutException: Ping started at 1538919082438 hasn't completed by 1538919366987
	at hudson.remoting.PingThread.ping(PingThread.java:134)
	at hudson.remoting.PingThread.run(PingThread.java:90)

@stephentoub
Copy link
Member Author

@dotnet-bot test this please

@davidfowl
Copy link
Member

For Task? I don't think so. This behavior has been here for close to a decade.

Yea, but async locals didn't exist back then. I agree that we should wait for complaints though.

@stephentoub
Copy link
Member Author

but async locals didn't exist back then

They sort of did, they were just called CallContext.LogicalSetData and CallContext.LogicalGetData. 😄

@davidfowl
Copy link
Member

They sort of did, they were just called CallContext.LogicalSetData and CallContext.LogicalGetData. 😄

Yea, I wonder if they stored complex objects in there... The HttpContext was stored in the IllogicalCallContext 😄

// In case this is a state machine box with a finalizer, suppress its finalization
// as it's now complete. We only need the finalizer to run if the box is collected
// without having been completed.
if (AsyncMethodBuilderCore.TrackAsyncMethodCompletion)
Copy link
Member

@jkotas jkotas Oct 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Unrelated to this change) Would it make sense to move this into DebugFinalizableAsyncStateMachineBox so that this is not on the golden path? And also move new DebugFinalizableAsyncStateMachineBox into its own non-inlineable method so that the DebugFinalizableAsyncStateMachineBox type is built only if you have the tracing on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to move this into DebugFinalizableAsyncStateMachineBox so that this is not on the golden path?

Ideally, yes. I'd looked at that previously and didn't find a great way to do it without introducing another virtual onto the golden path (and I preferred the extra branch to the virtual invocation), but several things have changed since then, so there might be a better way now. I can look again.

And also move creation of new DebugFinalizableAsyncStateMachineBox into its own non-inlineable method so that the DebugFinalizableAsyncStateMachineBox type is built only if you have the tracing on.

You mean at https://github.com/dotnet/coreclr/pull/20294/files/08e4e4f6f222919ed6d0f0430fb6922dc9973508#diff-68dc11e2c63ba209e0f903d3824d8464R488? By "built", you're referring to JIT work? Sure, I can move that. I'll do it separately.

Copy link
Member

@jkotas jkotas Oct 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean at https://github.com/dotnet/coreclr/pull/20294/files/08e4e4f6f222919ed6d0f0430fb6922dc9973508#diff-68dc11e2c63ba209e0f903d3824d8464R488?

Yes, this one. The JIT triggers type building for all types directly referenced by the method even if they are on the cold path. It takes both time and memory - each DebugFinalizableAsyncStateMachineBox type costs several hundred bytes.

Copy link
Member Author

@stephentoub stephentoub Oct 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I'll move it out separately. Thanks.

@jkotas
Copy link
Member

jkotas commented Oct 9, 2018

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test please
@dotnet-bot test Windows_NT x64 Checked CoreFX Tests please

@stephentoub
Copy link
Member Author

@dotnet-bot test CentOS7.1 x64 Checked Innerloop Build and Test please (build never started)

@stephentoub stephentoub merged commit 39ab084 into dotnet:master Oct 9, 2018
@stephentoub stephentoub deleted the clearec branch October 9, 2018 17:44
A-And pushed a commit to A-And/coreclr that referenced this pull request Nov 20, 2018
When Tasks backed by delegates are created, an ExecutionContext is captured.  When the task completes, its delegate is being cleared, but its ExecutionContext is not, which means if the Task is subsequently kept alive (e.g. stored in a cache), so too is its ExecutionContext, which can capture an arbitrary amount of ambient state via async locals.  This commit augments the clearing of the delegate to similarly clear the ExecutionContext.

Related, async methods can also capture ExecutionContext when awaits yield, so this clears out that context as well.  And as long as we're doing that, we may as well also clear the state machine state, so that any hoisted locals in the state machine aren't kept alive if the resulting task is kept alive.  Not doing so previously was a conscious choice, in order to aid in debugging, but as we've heard of at least a couple of cases where it unexpectedly caused a leak, I'm going ahead and changing it.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants