
Conversation

@tmds
Member

@tmds tmds commented May 19, 2020

triggered by #36214

@adamsitnik @stephentoub ptal

@ghost

ghost commented May 19, 2020

Tagging subscribers to this area: @dotnet/ncl
Notify danmosemsft if you want to be subscribed.

if (_tail == null)
{
    _state = QueueState.Ready;
    _sequenceNumber++;
Member Author

This change is unrelated. All code paths that set _state to QueueState.Ready increment _sequenceNumber. I'm not sure why it was not done at this location, so I've added it.
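For illustration, here is a minimal Go sketch of the invariant described above (the actual code is C# in SocketAsyncContext.Unix.cs; the type and names here are hypothetical): every transition into the Ready state also bumps the sequence number, so waiters can detect that the queue became ready after they last looked.

```go
package main

import "fmt"

type queueState int

const (
	waiting queueState = iota
	ready
	stopped
)

// opQueue is a hypothetical stand-in for the operation queue discussed above.
type opQueue struct {
	state queueState
	seq   int32
}

// setReady pairs the transition to Ready with a sequence-number bump, which
// is the invariant the comment above says every Ready-setting path must keep.
func (q *opQueue) setReady() {
	q.state = ready
	q.seq++ // invariant: Ready is never entered without bumping seq
}

func main() {
	q := &opQueue{state: waiting}
	q.setReady()
	fmt.Println(q.state == ready, q.seq) // true 1
}
```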

@tmds
Member Author

tmds commented May 19, 2020

@adamsitnik I hope to see at least the 3k rps improvement you measured in #36214, and I hope for some more because this is now used for both sends and receives.

@tmds
Member Author

tmds commented May 19, 2020

@adamsitnik for benchmarking: System.Net.Sockets.dll.tar.gz

@adamsitnik
Member

The CI fails with the following error:

System/Net/Sockets/SocketAsyncContext.Unix.cs(1571,49): error CS1503: (NETCORE_ENGINEERING_TELEMETRY=Build) Argument 2: cannot convert from 'out int' to 'out int?'

@tmds
Member Author

tmds commented May 25, 2020

@adamsitnik not sure if you saw the change to make CI pass? Here is the dll: System.Net.Sockets.dll.tar.gz

@adamsitnik
Member

@tmds the results
[image: benchmark results table]

I've uploaded jsonplatform trace to traces/36705_isready_nolock

@tmds
Member Author

tmds commented May 25, 2020

Thanks for benchmarking, Adam. For Citrine JSON platform, the improvement seems to be around the expected 3k rps (0.24%). Notably better improvement on ARM 32-core: 3.28%.

About the measured regressions: it doesn't make sense to me why a lock (two interlocked operations) would perform better than two Volatile.Reads.

I'm in favor of this change because the added complexity is low. The tiny improvement applies to every socket operation.
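As a rough sketch of the trade-off being discussed — a lock-guarded readiness check costs two interlocked (read-modify-write) operations for acquire and release, while a lock-free check is two plain atomic loads — here is a hypothetical Go analogue. This is not the actual C# code; the types and field names are illustrative.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

const (
	stateWaiting int32 = iota
	stateReady
	stateStopped
)

// lockedQueue: the lock-based variant. Lock/Unlock each cost an interlocked
// operation even though the check itself is trivial.
type lockedQueue struct {
	mu    sync.Mutex
	state int32
	seq   int32
}

func (q *lockedQueue) isReady() (bool, int32) {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.state == stateReady || q.state == stateStopped, q.seq
}

// lockFreeQueue: the lock-free variant, two plain atomic loads and no
// read-modify-write.
type lockFreeQueue struct {
	seq   atomic.Int32
	state atomic.Int32
}

func (q *lockFreeQueue) isReady() (bool, int32) {
	// Read the sequence number before the state, so a concurrent
	// "became ready" transition (which bumps the sequence) is not missed.
	seq := q.seq.Load()
	state := q.state.Load()
	return state == stateReady || state == stateStopped, seq
}

func main() {
	lf := &lockFreeQueue{}
	lf.state.Store(stateReady)
	ready, seq := lf.isReady()
	fmt.Println(ready, seq) // true 0

	lq := &lockedQueue{state: stateReady}
	ready2, _ := lq.isReady()
	fmt.Println(ready2) // true
}
```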

@adamsitnik
Member

About the measured regressions,

If you are talking about the Mono machine JSON results: the results produced on this machine are always quite unstable, so you can ignore them.

@tmds
Member Author

tmds commented May 29, 2020

Is this good to merge?

@stephentoub
Member

Is this good to merge?

In Adam's results, there are a bunch of regressions (not just on the mono machines) that are larger than the improvement in json on citrine. That's not a concern?

@tmds
Member Author

tmds commented Jun 2, 2020

In Adam's results, there are a bunch of regressions (not just on the mono machines) that are larger than the improvement in json on citrine. That's not a concern?

From the benchmark results it doesn't look like this is a change where every scenario wins, though I don't understand why two volatile reads wouldn't always be cheaper than two compare-exchanges.
Per Adam's comment we can ignore the results for Mono. I think the overall trend looks good then.

@stephentoub
Member

I think the overall trend looks good then.

I'm struggling to see that. Can you elaborate? There was a notable increase in one benchmark on ARM, but two of the three shown regressed on AMD, two of the three regressed on Perf 12 cores, and while there was a measurable improvement in Fortunes Platform on citrine, there was an almost commensurate regression in Fortunes Batching on citrine.

Is it possible this was just a bad run of the metrics?

While I agree in theory removing a lock should be helpful, at the same time lock-freedom generally makes code more complex and harder to reason about and maintain, and so I'd only want to take such a change if we can prove to ourselves it really moves things in the right direction meaningfully.

@tmds
Member Author

tmds commented Jun 2, 2020

@stephentoub I'm basing my opinion mostly on the ARM results and on the time spent inside IsReady in the traces. I'm also making some assumptions about the relative cost of Volatile.Read vs CompareExchange, which may be incorrect.

@adamsitnik what do you think about these regressions? Do we have an idea of how much the same benchmark varies across successive runs? Are we confident these are regressions?

I'm fine if you want to close this because the gain (0.24% on jsonplatform) is not worth the investigation. Though the gain on ARM seems worthwhile, and it will apply to any socket operation, not just TE 100% load cases.

@stephentoub
Member

I'm fine if you want to close this

I don't; I want it to be successful :) I just want to make sure we're only taking it when we can prove that it actually is.

-public bool IsReady(SocketAsyncContext context, out int observedSequenceNumber)
+// IsReady returns whether an operation can be executed immediately.
+// observedSequenceNumber must be passed to StartAsyncOperation.
+public bool IsReady(SocketAsyncContext context, out int? observedSequenceNumber)
Contributor

Can we keep observedSequenceNumber as a non-nullable int? It's only used when IsReady returns true, so we should be able to just return 0 or whatever when that's not the case. Seems like a non-nullable int would be slightly more efficient here.

Member Author

it's only used when IsReady returns true

When IsReady returns false, the null is used to ensure that StartAsyncOperation will retry the operation in case the queue becomes ready in the meantime. To avoid a nullable int here, the best value to use is _sequenceNumber - 1 (when returning false), because that will fail the sequence-number equality check in StartAsyncOperation.
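A minimal Go sketch of this encoding, with hypothetical names (the real code is C#): returning the decremented sequence number on the not-ready path guarantees the equality check in StartAsyncOperation fails, which forces the caller to retry the operation.

```go
package main

import "fmt"

// queue is a hypothetical stand-in for the operation queue; it is not the
// actual SocketAsyncContext code.
type queue struct {
	ready bool
	seq   int32
}

// isReady returns the observed sequence number alongside readiness. On the
// not-ready path it returns seq-1: pretending we observed the *previous*
// "became ready" event, which can never equal the current sequence number.
func (q *queue) isReady() (bool, int32) {
	observed := q.seq
	if q.ready {
		return true, observed
	}
	return false, observed - 1
}

// startAsyncOperation retries the operation if the queue became ready since
// the sequence number was observed (i.e. the equality check fails).
func (q *queue) startAsyncOperation(observedSeq int32) string {
	if q.seq != observedSeq {
		return "retry operation"
	}
	return "enqueue and wait"
}

func main() {
	q := &queue{ready: false, seq: 5}
	ok, observed := q.isReady()
	fmt.Println(ok, observed)                    // false 4
	fmt.Println(q.startAsyncOperation(observed)) // retry operation
}
```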

@benaadams
Member

.IsReady does currently show up under JIT_MonExit_Portable in the traces

[image: profiler trace]

{
    observedSequenceNumber = _sequenceNumber;
    bool isReady = (_state == QueueState.Ready) || (_state == QueueState.Stopped);
    observedSequenceNumber--;
Member

@stephentoub stephentoub Jun 8, 2020

Is there an enforced cap on the sequence number somewhere? I'm wondering what happens if, for example, this wraps around such that observedSequenceNumber is int.MinValue and observedSequenceNumber - 1 is int.MaxValue.

Member Author

There is no cap on the sequence number. The observed sequence number gets checked for equality in StartAsyncOperation. It is a wrapping counter for the number of times the queue becomes ready. Decrementing it is the same as pretending we 'observed' the previous time the queue became ready, so StartAsyncOperation treats the current ready state as new.
I think there is nothing special at int.{Min,Max}Value because we're doing ++, -- and != operations only.

Member

Ok, I wasn't sure if it was ever compared with <. If it's only ever compared with == seems fine. Out of curiosity, what would happen if 4B operations occurred between the time we read the value and compare it such that it fully wrapped and was again the same? I assume if that's a (super rare) problem, it's already a problem.

Member Author

@tmds tmds Jun 9, 2020

Out of curiosity, what would happen if 4B operations occurred between the time we read the value and compare it such that it fully wrapped and was again the same?

We would not be aware that events happened, and the socket operation would never complete. Since there is a limited number of cases that cause increments, we don't need to worry about reaching 4B increments.

Member

Again, not actually concerned, but trying to understand... re "Since there is a limited nr of cases that cause increments, we don't need to worry about reaching 4B increments", what would prevent one thread from stalling in between the get and the check, and other threads processing 4B operations in between?

@stephentoub stephentoub merged commit 8558ca7 into dotnet:master Jun 9, 2020
@karelz karelz added this to the 5.0.0 milestone Aug 18, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020