-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Speed up Array.Reverse by using ref reassignment #17891
Conversation
|
|
d28bd02 to
fe1f7b3
Compare
|
@jkotas Do you see any issue with having |
Update Array.Reverse to call it
fe1f7b3 to
f393251
Compare
|
Quick question to reviewers who are more familiar with the JIT and GC: I played around with two different patterns here. The first, dec length
movsxd length64, length
lea last, [first + length * stride]The second, movsxd length64, length
lea last, [first + length64 * stride - stride]Is it possible in theory for the JIT to generate something like the below in the second case? movsxd length64, length
lea last, [first + length64 * stride]
add last, -strideAnd if it does generate this, is it possible that immediately after the FYI, the reason I asked the above question is that this same pattern of "get a ref to just past the end of the array then back up" also appears in |
It would make
Yes, it is possible. It is not just theory. It is pretty likely that the JIT will generate something like this on non-Intel architectures that have less rich addressing modes.
byref that points right past the end of the an array is treated by the garbage collector as if it was pointing into the array. From ECMA-335 spec: Managed pointers can point to ... the address where an element just past the end of an array would be stored (for pointer indexes into managed arrays. We take advantage of this in number of places. |
src/mscorlib/src/System/Array.cs
Outdated
| ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array); | ||
| Reverse(array, 0, array.Length); | ||
|
|
||
| // The Span ctor might perform an unnecessary type check if T is a reference type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type check would be a correctness problem for co-variant arrays. Avoiding it is not just a performance optimization; it is required for correctness.
src/mscorlib/src/System/Array.cs
Outdated
| ThrowHelper.ThrowLengthArgumentOutOfRange_ArgumentOutOfRange_NeedNonNegNum(); | ||
| if (array.Length - index < length) | ||
| if ((uint)index > (uint)array.Length) | ||
| ThrowHelper.ThrowArgumentOutOfRange_IndexException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will change exception messages. The same argument validation is in non-generic Array.Reverse - we should keep the two in sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? MSDN explicitly documents the specific exception types thrown by the non-generic method, and I'd rather not make changes to the parameter validation for these older methods. At least with the generic method the particular exception types aren't explicitly documented, so we have a little more leeway to go between ArgumentException and ArgumentOutOfRangeException in order to simplify the validation logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not realized that this is changing exception types too. Have you gone through the exercise on how breaking is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code previously threw ArgumentOutOfRangeException if length was negative and ArgumentException if length was too large. Now it throws ArgumentException in both cases.
This shouldn't break anybody, and in the past when I've participated in full framework shiproom we wouldn't have considered this a breaking change. (The reason it wouldn't be considered a breaking change is that we're not introducing a code path that now throws where previously it didn't, and we're still throwing an exception type that would have been called out in the documentation were docs to exist.)
If you have specific guidelines I should go against send them along and I'll take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The breaking change rules are documented in https://github.com/dotnet/corefx/blob/master/Documentation/coding-guidelines/breaking-change-rules.md#exceptions . Your change is throwing less specific exception in some cases. This is not listed as allowed change.
Also, note you typically pick up the new generic overloads silently with recompilation. It makes the behavior divergence between the non-generic and generic overloads more questionable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the probability of breaking anybody here is low. However, we are not doing paper cut breaks unless the benefit is significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, reverted.
|
@dotnet-bot test Windows_NT x64 Checked corefx_baseline |
src/mscorlib/src/System/Array.cs
Outdated
| i++; | ||
| j--; | ||
| } | ||
| MemoryMarshal.CreateSpan(ref Unsafe.Add(ref Unsafe.As<byte, T>(ref array.GetRawSzArrayData()), index), length).Reverse(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be useful to measure how much slower is Array.Reverse going to for small arrays.
|
@jkotas Thanks for the advice so far! I'll get a new iteration out to address your feedback and collect some firmer numbers. One more idea - we could go even further and optimize various T widths. For example, if we're running as a 64-bit process but T has width 32 bits, we know that T can't contain references, so we can do something like this. (pseudo-code, not tested against edge cases) // assume first, last are "ref ulong"
do {
ulong temp = read_unaligned(ref last);
write_unaligned(ref last) = ror64(read_unaligned(ref first), 32);
write_unaligned(ref first) = ror64(temp, 32);
first = ref Unsafe.Add(ref first, 1);
last = ref Unsafe.Add(ref last, -1);
} while (Unsafe.IsAddressLessThanOrEqualTo(first, last));Obviously this would add complexity to the So the question: is this added complexity actually worth it? Do we have evidence that - say - trying to reverse an Edit: Reversing a |
For my day-to-day work it is quite common. It's about int-counters that are sorted ascending. In some parts of the algorithms the reverse order is useful, so Although I believe |
|
@dotnet-bot test Linux-musl x64 Debug Build please |
|
Is |
|
@jkotas It's the overload that takes only the |
Additionally fix integer underflow in non-generic Array.Reverse
|
Unit tests are failing because of the changes to parameter validation. The PR dotnet/corefx#29559 addresses the failing unit tests. |
|
@dotnet-bot test Windows_NT x64 Checked corefx_baseline |
|
@dotnet-bot test Windows_NT x64 Checked Build and Test (Jit - CoreFx) please |
|
I fixed most of the error / stddev issues with my benchmark runner. Old (System.Private.CoreLib as it is today)
New (latest iteration of the PR)
|
|
@dotnet-bot test Ubuntu x64 Checked corefx_baseline |
* Speed up Array.Reverse by using ref reassignment * Optimize MemoryExtensions.Reverse Signed-off-by: dotnet-bot <[email protected]>
* Speed up Array.Reverse by using ref reassignment * Optimize MemoryExtensions.Reverse Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>
* Speed up Array.Reverse by using ref reassignment * Optimize MemoryExtensions.Reverse Signed-off-by: dotnet-bot <[email protected]>
* Speed up Array.Reverse by using ref reassignment * Optimize MemoryExtensions.Reverse Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>
This takes advantage of Roslyn's new ref reassignment feature to speed up the inner loop in
Array.Reverse. Testing shows that for large arrays the reversal time is cut by approx. 20% from the baseline.Old loop codegen:
New loop codegen:
The codegen samples above are for reversing an
int[]. I saw similar performance gains for other types. Referential types likeobject[]don't get as large a benefit due to the card table checks, but there is still some speedup.I also took the opportunity to collapse the precondition checks using the recommended patterns for (index, count) parameters.