Additionally Vectorize string.IndexOfAny for value lengths 2,3,4,5 #19790

benaadams · 2018-08-31T14:46:25Z

Extends Vectorization for 1 value (.IndexOf) to single pass vectorization for 2, 3, 4 and 5 value searches in .IndexOfAny

string.IndexOfAny(char[] anyOf) // anyOf.lengths 2,3,4,5
string.IndexOfAny(char[] anyOf, int startIndex) // anyOf.lengths 2,3,4,5
string.IndexOfAny(char[] anyOf, int startIndex, int count) // anyOf.lengths 2,3,4,5

// where T : char
IndexOfAny<T>(this Span<T> span, T value0, T value1)
IndexOfAny<T>(this Span<T> span, T value0, T value1, T value2)
IndexOfAny<T>(this Span<T> span, ReadOnlySpan<T> values)
IndexOfAny<T>(this ReadOnlySpan<T> span, T value0, T value1)
IndexOfAny<T>(this ReadOnlySpan<T> span, T value0, T value1, T value2)
IndexOfAny<T>(this ReadOnlySpan<T> span, ReadOnlySpan<T> values)

https://gist.github.com/benaadams/097ec032f7be451c22eedf877967403c

               Method |       Runtime |  Position |       Mean |
 -------------------- | ------------- | --------- |-----------:|
- String_IndexOfAny_3 | .NET Core 2.1 |         1 |   6.440 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |         1 |   7.479 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |         7 |  11.690 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |         7 |  10.799 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |         8 |  12.955 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |         8 |  11.081 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |        15 |  18.339 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |        15 |  16.896 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |        16 |  19.499 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |        16 |  16.046 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |        31 |  33.041 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |        31 |  16.844 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |        32 |  33.141 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |        32 |  16.926 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |        63 |  61.875 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |        63 |  19.649 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |        64 |  61.532 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |        64 |  19.700 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |       127 | 102.849 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |       127 |  29.007 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |       128 | 104.652 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |       128 |  25.332 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |       255 | 187.974 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |       255 |  36.632 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |       256 | 187.738 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |       256 |  36.718 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |      1023 | 692.424 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |      1023 | 103.747 ns |
- String_IndexOfAny_3 | .NET Core 2.1 |      1024 | 694.512 ns |
+ String_IndexOfAny_3 | .NET Core 3.0 |      1024 | 103.974 ns |

src/System.Private.CoreLib/shared/System/String.Searching.cs

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs

tannergooding · 2018-09-01T14:11:07Z

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs

+                char* pEndCh = pCh + length;
+
+#if !netstandard11
+                if (Vector.IsHardwareAccelerated && length >= Vector<ushort>.Count * 2)


Out of scope for this PR, but it might be nice to try this with Hardware Intrinsics as well.

tannergooding · 2018-09-01T14:15:13Z

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs

+#if !netstandard11
+                if (Vector.IsHardwareAccelerated && length >= Vector<ushort>.Count * 2)
+                {
+                    // Figure out how many characters to read sequentially until we are vector aligned


Why start with a sequential read?

You should be able to do a single unaligned read, mask out bits that would be covered by the first aligned read, and then do the vectorized check (and I'm not even sure you need to mask for this algorithm).

I've just copied the IndexOf<char>(this Span<char> span, char value0) method and used the same multi-comparison with vectorized BitwiseOr as byte to widen it to 2 and 3 chars.

So an improved approach to vectorization, whether an unaligned start or using instrinsics directly should also apply to those.

Might be worth a tracking issue?

Might be worth a tracking issue?

Probably. I'll log one sometime Tuesday, if I don't get to it before then.

Not sure it would help. The vectorization is fast at discounting characters; but not fast at then finding a match where the sequential by itself would probably be faster - rather than doing an overlapping vector at start and overlapping vector at the end.

tannergooding · 2018-09-01T14:17:45Z

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs

+                {
+                    // Get the highest multiple of Vector<ushort>.Count that is within the search space.
+                    // That will be how many times we iterate in the loop below.
+                    // This is equivalent to: length = Vector<ushort>.Count * ((int)(pEndCh - pCh) / Vector<ushort>.Count)


I really wish the the JIT did this optimization.

tannergooding

LGTM

benaadams · 2018-09-01T23:40:55Z

@danmosemsft while testing this I encounted a lot of use via FileSystemName and 5 chars

private static readonly char[] s_wildcardChars =
{
    '\"', '<', '>', '*', '?'
};

Wondering if this should be extended for just 2-3 to go up to 5, rather than moving to a loop per item at 4+?

@tannergooding would there be enough xmm registers?

tannergooding · 2018-09-02T02:35:03Z

x64 likely has enough registers, but you'd want to look at the codegen in any case, since the effectiveness of the various registers likely depends on surrounding code and the platform abi (which would define, for example, if the registers have to be saved by the caller, before transferring to the callee)

eerhardt

Looks good. Thanks @benaadams.

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs

src/System.Private.CoreLib/shared/System/MemoryExtensions.cs

ahsonkhan · 2018-09-10T21:24:44Z

Wondering if this should be extended for just 2-3 to go up to 5

I noticed that, in some cases, a naiive loop searching through 4 elements (white space in json) was faster than using the vectorized IndexOfAny, so I think extended to 4/5 might be a good idea. Especially if some elements appear a lot more than others (and we can order them as such within the loop).

tannergooding · 2018-09-20T17:21:05Z

We are good to merge this now, correct?

benaadams · 2018-09-27T19:09:30Z

@dotnet-bot test

benaadams · 2018-09-27T19:12:00Z

@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Ubuntu x64 Checked corefx_baseline

benaadams · 2018-09-27T22:12:54Z

Unrelated

MESSAGE:
Assert.Throws() Failure
Expected: typeof(System.OverflowException)
Actual: (No exception was thrown)
+++++++++++++++++++
STACK TRACE:
at System.Runtime.InteropServices.Tests.SafeBufferTests.
  Initialize_NumBytesTimesSizeOfEachElement_ThrowsOverflowException() in /mnt/j/workspace/dotnet_coreclr/master/jitstress/x64_checked_ubuntu_corefx_baseline_prtest/_/fx/src/System.Runtime.InteropServices/
  tests/System/Runtime/InteropServices/SafeBufferTests.cs:line 32

Issue filed https://github.com/dotnet/corefx/issues/32520

benaadams · 2018-09-28T03:27:57Z

Assert failure(PID 5308 [0x000014bc], Thread: 8072 [0x1f88]): 
CONTRACT VIOLATION by GetValueTypeForGUID 
at "d:\j\workspace\x64_checked_w---d7295605\src\vm\interoputil.cpp" @ 2862

benaadams · 2018-09-28T03:28:35Z

@tannergooding think so

benaadams · 2018-10-10T03:18:35Z

Added tests for 4,5 values dotnet/corefx#32734 so they can be also vectorized

benaadams · 2018-10-10T04:55:32Z

Added vectorization for 4 and 5 values

benaadams · 2018-10-10T05:19:21Z

src/System.Private.CoreLib/shared/System/MemoryExtensions.cs

+                ref var valueRef = ref Unsafe.As<T, char>(ref MemoryMarshal.GetReference(values));
+                if (values.Length == 5)
+                {
+                    // Length 5 is a common length for FileSystemName expression (", <, >, *, ?) and in preference to 2 as it has an explicit overload


@JeremyKuhne helpful to have this first? FileSystemName switches between two different length arrays 5 and 2

Sorry, missed this. I hope it becomes less common for 5 in the future, but for now it is by far and away the most common case. The new default when using EnumerationOptions is the simple type (i.e. 2).

benaadams · 2018-10-10T05:25:58Z

Should wait for the tests dotnet/corefx#32734

benaadams · 2018-10-10T06:00:22Z

OSX failing on files

Test Result (5 failures / +5)
file_io.GetFileSize.test1
file_io.GetFileSizeEx.test1
file_io.SetFilePointer.test5
file_io.SetFilePointer.test6
file_io.SetFilePointer.test7

benaadams · 2018-10-13T03:00:33Z

@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Ubuntu x64 Checked corefx_baseline

benaadams · 2018-10-13T03:47:16Z

Ah, corefx tests are failing due to https://github.com/dotnet/coreclr/issues/20405

benaadams · 2018-10-17T01:10:16Z

@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Ubuntu x64 Checked corefx_baseline

benaadams · 2018-10-17T03:32:50Z

Windows_NT x64 Checked corefx_baseline issue is unrelated and previously reported https://github.com/dotnet/coreclr/issues/20373

Thread: 5960 [0x1748]): CONTRACT VIOLATION by GetValueTypeForGUID 
at "d:\j\workspace\x64_checked_w---d7295605\src\vm\interoputil.cpp" @ 2862

MODE_COOPERATIVE encountered while thread is in preemptive state.

benaadams · 2018-10-17T21:52:51Z

We are good to merge this now, correct?

@tannergooding yes, tests are now in corefx dotnet/corefx#32734 and pass with these changes. Feedback has been addressed.

tannergooding · 2018-10-22T20:32:30Z

Thanks for the work @benaadams

…otnet#19790) * Vectorize string.IndexOfAny * Vectorize string.IndexOfAny [4,5] * Feedback * Call order preference

…otnet/coreclr#19790) * Vectorize string.IndexOfAny * Vectorize string.IndexOfAny [4,5] * Feedback * Call order preference Commit migrated from dotnet/coreclr@018df68

benaadams commented Aug 31, 2018

View reviewed changes

src/System.Private.CoreLib/shared/System/String.Searching.cs Outdated Show resolved Hide resolved

krwq reviewed Aug 31, 2018

View reviewed changes

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs Outdated Show resolved Hide resolved

krwq reviewed Aug 31, 2018

View reviewed changes

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs Show resolved Hide resolved

danmoseley assigned tannergooding Sep 1, 2018

tannergooding reviewed Sep 1, 2018

View reviewed changes

tannergooding approved these changes Sep 1, 2018

View reviewed changes

benaadams mentioned this pull request Sep 1, 2018

Add specific tests for {RO}Span.IndexOfAny<char> dotnet/corefx#32069

Merged

eerhardt approved these changes Sep 10, 2018

View reviewed changes

eerhardt requested a review from ahsonkhan September 10, 2018 20:23

ahsonkhan reviewed Sep 10, 2018

View reviewed changes

src/System.Private.CoreLib/shared/System/SpanHelpers.Char.cs Outdated Show resolved Hide resolved

ahsonkhan reviewed Sep 10, 2018

View reviewed changes

src/System.Private.CoreLib/shared/System/MemoryExtensions.cs Show resolved Hide resolved

ahsonkhan approved these changes Sep 10, 2018

View reviewed changes

eerhardt mentioned this pull request Sep 11, 2018

Replace DV data type system with .NET standard type system. dotnet/machinelearning#863

Merged

benaadams mentioned this pull request Oct 10, 2018

{RO}Span<char> add IndexOfAny 4,5 tests change many tests to 6+ dotnet/corefx#32734

Merged

benaadams added 3 commits October 10, 2018 04:39

Vectorize string.IndexOfAny

847e72b

Vectorize string.IndexOfAny [4,5]

0d33e74

Feedback

36d7f6d

benaadams force-pushed the Vectorize-string.IndexOfAny branch from 656f621 to 36d7f6d Compare October 10, 2018 04:52

benaadams changed the title ~~Vectorize string.IndexOfAny [2,3]~~ Vectorize string.IndexOfAny [2,3,4,5] Oct 10, 2018

Call order preference

737de39

benaadams commented Oct 10, 2018

View reviewed changes

benaadams changed the title ~~Vectorize string.IndexOfAny [2,3,4,5]~~ [WIP] Vectorize string.IndexOfAny [2,3,4,5] Oct 10, 2018

benaadams changed the title ~~[WIP] Vectorize string.IndexOfAny [2,3,4,5]~~ [WIP] Additionally Vectorize string.IndexOfAny for value lengths 2,3,4,5 Oct 10, 2018

benaadams changed the title ~~[WIP] Additionally Vectorize string.IndexOfAny for value lengths 2,3,4,5~~ Additionally Vectorize string.IndexOfAny for value lengths 2,3,4,5 Oct 13, 2018

tannergooding merged commit 018df68 into dotnet:master Oct 22, 2018

benaadams deleted the Vectorize-string.IndexOfAny branch October 22, 2018 20:42

benaadams mentioned this pull request Nov 1, 2018

Single pass Vectorize Span<byte>.IndexOfAny(..., ReadOnlySpan<byte>) for 2,3 lengths #20738

Merged

A-And pushed a commit to A-And/coreclr that referenced this pull request Nov 20, 2018

Additionally Vectorize string.IndexOfAny for value lengths 2,3,4,5 (d…

f423807

…otnet#19790) * Vectorize string.IndexOfAny * Vectorize string.IndexOfAny [4,5] * Feedback * Call order preference

benaadams mentioned this pull request Jan 31, 2020

Error SafeBufferTests Initialize_NumBytesTimesSizeOfEachElement_ThrowsOverflowException dotnet/runtime#27497

Closed

GrabYourPitchforks mentioned this pull request Nov 12, 2020

Consider doing "naive" search in IndexOfAny if string's length is under a certain threshold dotnet/runtime#6586

Closed

Additionally Vectorize string.IndexOfAny for value lengths 2,3,4,5 #19790

Additionally Vectorize string.IndexOfAny for value lengths 2,3,4,5 #19790

Uh oh!

Conversation

benaadams commented Aug 31, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tannergooding left a comment

Choose a reason for hiding this comment

Uh oh!

benaadams commented Sep 1, 2018

Uh oh!

tannergooding commented Sep 2, 2018

Uh oh!

eerhardt left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ahsonkhan commented Sep 10, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tannergooding commented Sep 20, 2018

Uh oh!

benaadams commented Sep 27, 2018

Uh oh!

benaadams commented Sep 27, 2018

Uh oh!

benaadams commented Sep 27, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benaadams commented Sep 28, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benaadams commented Sep 28, 2018

Uh oh!

benaadams commented Oct 10, 2018

Uh oh!

benaadams commented Oct 10, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

benaadams commented Oct 10, 2018

Uh oh!

benaadams commented Oct 10, 2018

Uh oh!

benaadams commented Oct 13, 2018

Uh oh!

benaadams commented Oct 13, 2018

Uh oh!

benaadams commented Oct 17, 2018

Uh oh!

benaadams commented Oct 17, 2018

Uh oh!

benaadams commented Oct 17, 2018

Uh oh!

tannergooding commented Oct 22, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

benaadams commented Aug 31, 2018 •

edited

Loading

ahsonkhan commented Sep 10, 2018 •

edited

Loading

benaadams commented Sep 27, 2018 •

edited

Loading

benaadams commented Sep 28, 2018 •

edited

Loading