Conversation

@VSadov (Member) commented Mar 27, 2020

Pinning scenarios in Sockets could be roughly classified into two buckets:

  1. external buffers passed in through public APIs
  2. internal buffers created by the library itself

This change addresses the second case: the buffers allocated internally.
Since we control these buffers, we can allocate them on the POH (Pinned Object Heap), which lets us skip some of the ceremony with pinning handles.
It also reduces the chance of random long-lived pinned objects interfering with compaction.
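
As a rough illustration of the difference, here is a minimal sketch (the names below are made up for illustration; this is not the actual Sockets code):

    using System;
    using System.Runtime.InteropServices;

    static class PinnedBufferSketch
    {
        // Before: allocate on the normal heap, then pin with a GCHandle that
        // must be tracked, validated, and eventually freed.
        static IntPtr PinWithHandle(byte[] buffer, out GCHandle handle)
        {
            handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
            return handle.AddrOfPinnedObject(); // caller must later call handle.Free()
        }

        // After: allocate directly on the Pinned Object Heap. The array can
        // never move, so no handle (and no cleanup) is needed.
        static IntPtr AllocateOnPoh(out byte[] buffer)
        {
            buffer = GC.AllocateUninitializedArray<byte>(256, pinned: true);
            return Marshal.UnsafeAddrOfPinnedArrayElement(buffer, 0);
        }
    }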

@VSadov force-pushed the pinned branch 4 times, most recently from 3aa8726 to e147a4e on March 30, 2020 01:05
@VSadov changed the title from "[WIP] use pinned arrays in more places" to "Use pinned arrays in Sockets" on Mar 30, 2020
@VSadov marked this pull request as ready for review on March 30, 2020 19:26
@VSadov requested review from jkotas and stephentoub on March 30, 2020 19:26
@jkotas (Member) commented Mar 30, 2020

It would be useful to get some performance numbers for this.

Do we have any networking performance benchmarks that exercise these paths? Are these hot paths in any of the TechEmpower benchmarks?

@VSadov (Member, Author) commented Mar 30, 2020

It would be useful to get some performance numbers for this.

Working on that.
I expected this question :-)

Are these hot paths in any of the TechEmpower benchmarks?

These look like long(ish)-term allocations/pins, typically spanning an async operation or even longer.
I'd expect that these are not "hot" paths in a throughput sense.

I ran some Kestrel scenarios in the debugger with these changes to see where the external buffers come from. They typically come from either the slab allocator (<4K buffers) or the shared pool (larger ones). I'm not 100% sure this is always the case, since I checked only a few scenarios, but it seems to be the general strategy, and it makes sense purely from an allocation-avoidance standpoint.

If these paths are touched by the benchmarks, they could be the only cases where we pin otherwise movable memory (this ignores fast pinning for fixed and P/Invoke).

Whether the effect of this change will show up in short benchmarks, I'm not sure. Fragmentation is typically a long-term problem.

@stephentoub (Member) commented:

Are these hot paths in any of the TechEmpower benchmarks?

I don't think so. The socket APIs being impacted here are either a) part of the legacy APM implementations, b) part of the UDP-related APIs, or c) part of vectored sends/receives. (c) is valuable, but I don't believe it shows up in any kestrel benchmarks. (b) may eventually be useful, but we've not done a lot yet around these UDP APIs and there's a lot of other low-hanging fruit. I wouldn't spend much time at all on (a).

Review comment (Member):

@VSadov why does it make sense to use POH for SocketAddress.Buffer?

Review comment (Member):

My question is aimed at understanding better when it is preferable to use the SOH vs. the POH for objects that will be pinned.

@VSadov (Member, Author) replied:

In this change I basically looked for arrays pinned via GCHandleType.Pinned.
With few exceptions, that is an indication that the object could be pinned for a nontrivial length of time, perhaps for its entire lifetime. SocketAddress.Buffer is one such array.

Mid/long-term pinning at random locations can have really bad effects on the Small Object Heap.
On the other hand, if the object is expected to end up in Gen1/Gen2 anyway, there is not much downside to allocating it on the POH.
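
As a minimal sketch of that rule of thumb (illustrative names, not the library's code):

    using System;

    class PinningLifetimeSketch
    {
        // Expected to live long (e.g. cached on a pooled object) and to stay
        // pinned: allocate on the POH up front, where it cannot interfere
        // with Small Object Heap compaction.
        private readonly byte[] _longLivedPinned =
            GC.AllocateUninitializedArray<byte>(4096, pinned: true);

        // Pinned only briefly around a native call: fixed is cheap and the
        // pin is released immediately, so a normal SOH allocation is fine.
        private static unsafe void ShortLivedPin(byte[] temp)
        {
            fixed (byte* p = temp)
            {
                // ... pass p to native code here ...
            }
        }
    }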

Review comment (Member):

Nit: this can be simplified now to:

_wsaMessageBufferPinned ??= GC.AllocateUninitializedArray<byte>(sizeof(Interop.Winsock.WSAMsg), pinned: true);
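
For reference, the ??= operator assigns only when the left-hand side is null, so the line above is shorthand for roughly:

    if (_wsaMessageBufferPinned == null)
    {
        _wsaMessageBufferPinned = GC.AllocateUninitializedArray<byte>(sizeof(Interop.Winsock.WSAMsg), pinned: true);
    }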

Review comment (Member):

Ditto:

_wsaRecvMsgWSABufferArrayPinned ??= GC.AllocateUninitializedArray<WSABuffer>(1, pinned: true);

@VSadov (Member, Author) commented Apr 18, 2020

Rebased and skipped all the changes to HttpListener.

@VSadov (Member, Author) commented Apr 18, 2020

Trying to hit this in the TE benchmarks was a bit of a dud.
I can see how the changed lines could originally cause pinning problems, but I am not enough of an expert in Web things to be sure the affected scenarios are hit frequently (or hit at all).

As I see from traces, there are lots of pinning handles created and destroyed (both regular and async), but those are for single buffers passed to the socket layer from outside, and such pinning is needed for correctness. Also, with Kestrel, what I typically see at the end of a handle is a manually managed chunk of memory from a slab in the Large Object Heap. From a heap-fragmentation point of view, such pinning is not a concern.

Performance-wise, I see no impact either.

@stephentoub (Member) commented:

there are lots of pinning handles created and destroyed (both regular and async)

Hmm. Which TE benchmark? I'm a little surprised by that. I thought they were using prepinned arrays, and then creating Memory instances from them marked as already pinned, such that Memory.Pin would be a no-op.

@VSadov (Member, Author) commented Apr 19, 2020

One example (a regular pinned handle in DbFortunes with Postgres):

Name
 || + system.private.corelib.il!ThreadPoolWorkQueue.Dispatch
 ||  + system.private.corelib.il!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[System.Threading.Tasks.VoidTaskResult,Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol+<ProcessRequests>d__217`1[System.__Canon]].MoveNext(class System.Threading.Thread)
 ||   + system.private.corelib.il!ExecutionContext.RunFromThreadPoolDispatchLoop
 ||    + microsoft.aspnetcore.server.kestrel.core.il!Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol+<ProcessRequests>d__217`1[System.__Canon].MoveNext()
 ||     + microsoft.aspnetcore.server.kestrel.core.il!HttpProtocol.ProcessRequest
 ||      + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||       + microsoft.aspnetcore.server.kestrel.core.il!Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol+<ProcessRequest>d__218`1[System.__Canon].MoveNext()
 ||        + benchmarks!FortunesRawMiddleware.Invoke
 ||         + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||          + benchmarks!Benchmarks.Middleware.FortunesRawMiddleware+<Invoke>d__4.MoveNext()
 ||           + benchmarks!Benchmarks.Data.RawDb.LoadFortunesRows()
 ||            + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||             + benchmarks!Benchmarks.Data.RawDb+<LoadFortunesRows>d__10.MoveNext()
 ||              + npgsql!NpgsqlCommand.ExecuteDbDataReaderAsync
 ||               + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||                + npgsql!Npgsql.NpgsqlCommand+<ExecuteDbDataReaderAsync>d__97.MoveNext()
 ||                 + npgsql!NpgsqlCommand.ExecuteReaderAsync
 ||                  + npgsql!NpgsqlCommand.ExecuteReaderAsync
 ||                   + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||                    + npgsql!Npgsql.NpgsqlCommand+<ExecuteReaderAsync>d__102.MoveNext()
 ||                     + npgsql!NpgsqlDataReader.NextResultAsync
 ||                      + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||                       + npgsql!Npgsql.NpgsqlDataReader+<NextResult>d__44.MoveNext()
 ||                        + npgsql!Npgsql.NpgsqlConnector+<>c__DisplayClass160_0.<DoReadMessage>g__ReadMessageLong|0(value class Npgsql.DataRowLoadingMode,bool,bool)
 ||                         + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||                          + npgsql!Npgsql.NpgsqlConnector+<>c__DisplayClass160_0+<<DoReadMessage>g__ReadMessageLong|0>d.MoveNext()
 ||                           + npgsql!Npgsql.NpgsqlReadBuffer+<>c__DisplayClass34_0.<Ensure>g__EnsureLong|0()
 ||                            + system.private.corelib.il!AsyncMethodBuilderCore.Start
 ||                             + npgsql!Npgsql.NpgsqlReadBuffer+<>c__DisplayClass34_0+<<Ensure>g__EnsureLong|0>d.MoveNext()
 ||                              + npgsql!AwaitableSocket.ReceiveAsync
 ||                               + system.net.sockets.il!Socket.ReceiveAsync
 ||                                + system.net.sockets.il!SocketAsyncEventArgs.DoOperationReceiveSingleBuffer
 ||                                 + system.net.sockets.il!SocketAsyncEventArgs.ProcessIOCPResultWithSingleBufferHandle
 ||                                  + system.private.corelib.il!System.Memory`1[System.Byte].Pin()
 ||                                   + system.private.corelib.il!System.Runtime.InteropServices.GCHandle..ctor(class System.Object,value class System.Runtime.InteropServices.GCHandleType)

@VSadov (Member, Author) commented Apr 19, 2020

There is also async pinning, which, I think, could be hard to avoid as long as Overlapped is used.

Here is an example of async pinning (in PlaintextNonPipelined):

   + coreclr!ThreadpoolMgr::CompletionPortThreadStart
   |+ coreclr!BindIoCompletionCallbackStub
   | + coreclr!BindIoCompletionCallbackStubEx
   |  + coreclr!ManagedThreadBase_DispatchOuter
   |   + coreclr!ManagedThreadBase_DispatchMiddle
   |    + coreclr!BindIoCompletionCallBack_Worker
   |     + coreclr!DispatchCallSimple
   |      + coreclr!CallDescrWorkerInternal
   |       + system.private.corelib.il!_IOCompletionCallback.PerformIOCompletionCallback
   |       |+ system.net.sockets!Socket.CompleteAccept
   |       ||+ system.private.corelib.il!System.Threading.Tasks.Task`1[System.__Canon].TrySetResult(!0)
   |       |||+ system.private.corelib.il!Task.RunContinuations
   |       ||| + system.private.corelib.il!AwaitTaskContinuation.RunOrScheduleAction
   |       |||  + system.private.corelib.il!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[System.__Canon,Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.SocketConnectionListener+<AcceptAsync>d__13].MoveNext(class System.Threading.Thread)
   |       |||  |+ system.private.corelib.il!ExecutionContext.RunInternal
   |       |||  | + microsoft.aspnetcore.server.kestrel.transport.sockets.il!Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.SocketConnectionListener+<AcceptAsync>d__13.MoveNext()
   |       |||  | |+ microsoft.aspnetcore.server.kestrel.transport.sockets.il!Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal.SocketConnection..ctor(class System.Net.Sockets.Socket,class System.Buffers.MemoryPool`1,class System.IO.Pipelines.PipeScheduler,class Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal.ISocketsTrace,value class System.Nullable`1,value class System.Nullable`1,bool)
   |       |||  | ||+ system.net.sockets!SocketAsyncEventArgs.InitializeInternals
   |       |||  | || + system.private.corelib.il!System.Threading.PreAllocatedOverlapped..ctor(class System.Threading.IOCompletionCallback,class System.Object,class System.Object)
   |       |||  | ||  + coreclr!AllocateNativeOverlapped
   |       |||  | ||   + coreclr!CreateHandleCommon
   |       |||  | ||    + coreclr!HndCreateHandle
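
For context, a minimal sketch of the pattern those frames correspond to; PreAllocatedOverlapped pins its pinData when it is constructed (the buffer and callback below are illustrative, not the SocketAsyncEventArgs internals):

    using System;
    using System.Threading;

    static class AsyncPinningSketch
    {
        static unsafe void OnCompleted(uint errorCode, uint numBytes, NativeOverlapped* pOverlapped)
        {
            // completion logic would run here
        }

        static unsafe void Example()
        {
            byte[] buffer = new byte[4096];

            // Constructing the PreAllocatedOverlapped allocates the native
            // OVERLAPPED and pins pinData for the object's lifetime; this
            // corresponds to the HndCreateHandle frames in the trace above.
            using var overlapped = new PreAllocatedOverlapped(OnCompleted, state: null, pinData: buffer);

            // The overlapped would then be used with a ThreadPoolBoundHandle
            // for native I/O; disposing it releases the pin.
        }
    }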

@stephentoub (Member) commented:

There is also async pinning, which, I think, could be hard to avoid as long as Overlapped is used. Here is an example of async pinning (in PlaintextNonPipelined):

In plaintext, are you seeing async pinning only in accepts or also in sends/recvs? It's the latter I'm more interested in. Accepts are rare in those benchmarks. Sends/recvs dominate.

@davidfowl (Member) commented:

Right, there should be no pinning for send and receive in those benchmarks.

@VSadov (Member, Author) commented Apr 19, 2020

That was the dominant pinning stack; other cases are less frequent.
If send/receive were pinning, I'd expect to see more stacks like that.

@VSadov (Member, Author) commented Apr 19, 2020

Yes, for plaintext they are all Accepts.
There are multiple pinning paths and it is not always an async handle, but one way or another they all seem to be handling Accept.

@benaadams (Member) commented:

The original Memory<T> needs to be created with the MemoryMarshal.CreateFromPinnedArray API to skip the GCHandle creation in Memory.Pin().
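
For example, a minimal sketch, under the assumption that the buffer is one we allocate ourselves:

    using System;
    using System.Buffers;
    using System.Runtime.InteropServices;

    class PrepinnedMemorySketch
    {
        static void Main()
        {
            // An array allocated on the POH never moves.
            byte[] buffer = GC.AllocateUninitializedArray<byte>(4096, pinned: true);

            // CreateFromPinnedArray records the "already pinned" promise in
            // the resulting Memory<byte>.
            Memory<byte> memory = MemoryMarshal.CreateFromPinnedArray(buffer, 0, buffer.Length);

            // Pin() can now return the address directly; no GCHandle is allocated.
            using MemoryHandle handle = memory.Pin();
        }
    }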

@VSadov (Member, Author) commented Apr 19, 2020

@benaadams - Also, in theory, if #33542 is implemented, MemoryMarshal could just check if an array happens to be in the pinned object heap.

@VSadov (Member, Author) commented Apr 19, 2020

So, what do we do with this PR?

It feels like the changes would be helpful to code that hits the allocating/pinning paths inside the Socket layer. At the very least, they reduce the ceremony around the handles used to pin those allocations.

However, seeing that the benchmarks avoid such code paths, which makes the effects hard to measure, I am not as excited about the change as I was when this started.

@stephentoub (Member) commented:

So, what do we do with this PR?

I suggest we keep the changes in SocketAsyncEventArgs and ditch the rest. I'm interested in those specifically because a) if you're creating a SocketAsyncEventArgs, you're likely keeping it around for a while and probably pooling it, and b) the changes reduce the size of that object by eliminating all the GCHandle fields.

If we end up being interested in the others, I think it'd be good to do them as individual PRs with dedicated scenario-based benchmarks to back them up. It doesn't need to be TE; those aren't necessarily real-world. :-)

@VSadov (Member, Author) commented Apr 19, 2020

@stephentoub

I suggest we keep the changes in SocketAsyncEventArgs and ditch the rest.

Sounds good!

There is some entanglement between the changes, though.
For example: if SocketAddress.Buffer is no longer pinned, then SocketAsyncEventArgs._socketAddressGCHandle will need to come back, along with all the machinery related to it (calls to PinSocketAddressBuffer(), validation, cleanup).

Basically, SocketAddress.Buffer is not by itself a part of SocketAsyncEventArgs, but having it prepinned allows us to drop some code and data in SocketAsyncEventArgs.

Also, if we keep SocketAddress.Buffer prepinned, then there is no point in bringing back pinning in OverlappedAsyncResult.

Do we actually want to "remove the changes that do not benefit SocketAsyncEventArgs and keep the rest"?

@stephentoub (Member) commented:

I'm not convinced that the changes to SocketAddress are good. There are a variety of places where temporary instances are created, are short-lived, etc. Creating those as gen2 is potentially concerning, and I don't know that the unknown wins from it outweigh the unknown losses. I think it'd be better to start with just the changes isolated to SAEA. That reasonable?

@VSadov (Member, Author) commented Apr 20, 2020

Rebased and skipped everything not related to SocketAsyncEventArgs.

@stephentoub (Member) left a review:

Thanks.

@stephentoub merged commit faebda7 into dotnet:master on Apr 20, 2020
@karelz added this to the 5.0.0 milestone on Aug 18, 2020
@ghost locked as resolved and limited conversation to collaborators on Dec 10, 2020