Adding Regex.EnumerateMatches #67794
Note: This is a reminder that when your PR modifies a ref *.cs file and adds or modifies public APIs, please make sure the API implementation in the src *.cs file is documented with triple-slash comments, so the PR reviewers can sign off on that change.
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Issue Details
Adding an EnumerateMatches method, which returns an enumerator that iterates over the matches in a passed-in span. The operation is amortized allocation-free.
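As a rough illustration of the API shape described above, here is a minimal sketch of consuming the enumerator. It assumes the surface used in the benchmark in this PR (Regex.EnumerateMatches taking a ReadOnlySpan<char> and yielding ValueMatch values with Index and Length). The class name, helper name, and sample text are hypothetical and exist only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EnumerateMatchesSketch
{
    // Hypothetical helper: counts the matches whose first character
    // is a lowercase ASCII letter, without materializing Match objects.
    public static int CountLowercaseInitial(Regex regex, ReadOnlySpan<char> input)
    {
        int count = 0;
        foreach (ValueMatch match in regex.EnumerateMatches(input))
        {
            // ValueMatch carries only an index and a length, so the
            // matched text is read back from the original span.
            if (match.Length == 0)
            {
                continue; // skip empty matches so the indexer below is safe
            }

            char first = input[match.Index];
            if (first >= 'a' && first <= 'z')
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        // Sample pattern and text are assumptions made for this sketch.
        var regex = new Regex(@"\b\w+\b");
        string text = "Lorem ipsum dolor Sit amet";
        Console.WriteLine(CountLowercaseInitial(regex, text.AsSpan()));
    }
}
```

Because each ValueMatch holds only positions, the caller slices or indexes the input span itself, which is what lets the enumeration avoid per-match allocations.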
Here is a quick benchmark I wrote to see how this compares with the existing way of iterating over a MatchCollection using Regex.Matches:

// The regex pattern used is "\b\w+\b" and the input is a five-paragraph lorem ipsum string.
[Benchmark(Baseline = true)]
public int MatchCollection()
{
    int x = 0;
    for (int i = 0; i < 1000; i++)
    {
        // Existing approach: Regex.Matches yields a Match object per match.
        foreach (Match match in regex.Matches(loremIpsum))
        {
            if (match.ValueSpan[0] >= 'a' && match.ValueSpan[0] <= 'z')
                x++;
        }
    }
    return x;
}

[Benchmark]
public int MatchEnumerator()
{
    int x = 0;
    ReadOnlySpan<char> span = loremIpsum.AsSpan();
    for (int i = 0; i < 1000; i++)
    {
        // New approach: each ValueMatch exposes only Index and Length,
        // so the matched text is sliced from the input span.
        foreach (ValueMatch word in regex.EnumerateMatches(span))
        {
            if (span.Slice(word.Index, word.Length)[0] >= 'a' && span.Slice(word.Index, word.Length)[0] <= 'z')
                x++;
        }
    }
    return x;
}

And the results are:
@olsaarik, in the current NonBacktracking code, it would benefit from knowing that indexes are needed but not captures. Will that still be the case after your upcoming fixes?
…rateMatches and cleaning up some code.
Force-pushed: ad317d4 to aba9a54
Force-pushed: 876c140 to ac0ca12
Fixes #65011
Fixes #23602