`string_extract_if`: initial implementation#154583

GrigorenkoPV · 2026-03-30T12:49:28Z

Tracking issue: #154318

rustbot · 2026-03-30T22:55:16Z

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

Owners of files modified in this PR: libs
libs expanded to 7 candidates

rustbot · 2026-04-01T20:31:05Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Mark-Simulacrum · 2026-04-04T15:27:43Z

library/alloc/src/string/extract_if.rs

+    /// During the iteration, the underlying vector consists of:
+    /// - A valid UTF-8 prefix (`valid_prefix.len()` bytes)
+    ///   of characters that we iterated over and didn't extract.
+    /// - A middle portion of `bytes_removed` initialized bytes that might not be valid UTF-8.


Since we only ever remove a full char, how can we break the UTF-8 property? AFAICT, the Vec should always be initialized and a valid String. The removal operations this performs should correspond directly to calling `String::drain(start..end)` after a sequence of false returns. That is a tiny bit less efficient, since `drain` re-checks that the start and end indices fall on UTF-8 char boundaries, but that check is O(1), so it shouldn't be much slower.
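To make the `String::drain` equivalence concrete, here is a minimal sketch in safe Rust. It is not the code from this PR: the function name `extract_chars_if_via_drain`, its signature, and the choice to return the removed chars as a `String` are assumptions made for illustration only. Because each removal takes out exactly one whole char, the string stays valid UTF-8 after every step.

```rust
/// Hypothetical helper (not the PR's API): removes every char for which
/// `pred` returns true and returns the removed chars in order.
fn extract_chars_if_via_drain(s: &mut String, mut pred: impl FnMut(char) -> bool) -> String {
    let mut removed = String::new();
    // Byte offset of the next char to inspect; always on a char boundary.
    let mut idx = 0;
    while idx < s.len() {
        let ch = s[idx..].chars().next().expect("idx < len, so a char exists");
        let len = ch.len_utf8();
        if pred(ch) {
            // `drain` checks that both ends are char boundaries, then shifts
            // the tail down when the returned iterator is dropped.
            removed.extend(s.drain(idx..idx + len));
        } else {
            idx += len;
        }
    }
    removed
}
```

With this sketch, calling it on `"héllo wörld"` with `|c| !c.is_ascii()` leaves `"hllo wrld"` in the string and returns `"éö"`. Each `drain` shifts the whole tail, so the sketch is quadratic in the worst case; it only illustrates the semantics and the UTF-8 argument, not a recommended implementation.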

I think the current implementation doesn't actually optimize to copy larger chunks only when needed. It looks like we copy each char individually (likely via a call to memmove), which seems likely to be pretty inefficient?
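One way to picture "copy larger chunks only when needed" is the sketch below. It is an assumption-laden illustration, not the PR's implementation: `extract_chars_if_chunked` and `utf8_len` are hypothetical names, and the code works on the string's bytes with safe `copy_within` rather than on the iterator type under review. Kept chars accumulate into a pending run, and the run is moved with a single copy only when a removal (or the end of the string) forces it.

```rust
/// Length in bytes of a UTF-8 sequence, given its first byte.
/// Hypothetical helper; only called on the first byte of a char in valid UTF-8.
fn utf8_len(first: u8) -> usize {
    match first {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4,
    }
}

/// Hypothetical helper (not the PR's API): removes chars matching `pred`,
/// moving each run of kept chars with one bulk copy.
fn extract_chars_if_chunked(s: &mut String, mut pred: impl FnMut(char) -> bool) -> String {
    // Take the bytes out; if `pred` panics, `s` is simply left empty.
    let mut bytes = std::mem::take(s).into_bytes();
    let mut removed = String::new();
    let mut write = 0; // end of the compacted, kept prefix
    let mut run_start = 0; // start of the pending run of kept chars
    let mut read = 0; // next unread byte; always on a char boundary
    while read < bytes.len() {
        let len = utf8_len(bytes[read]);
        // The unread region has not been written to, so it is still valid UTF-8.
        let ch = std::str::from_utf8(&bytes[read..read + len])
            .expect("unread region is valid UTF-8")
            .chars()
            .next()
            .expect("slice holds exactly one char");
        if pred(ch) {
            // Flush the pending run, skipping the copy if it is already in place.
            if write != run_start {
                bytes.copy_within(run_start..read, write);
            }
            write += read - run_start;
            removed.push(ch);
            run_start = read + len;
        }
        read += len;
    }
    // Flush the final run of kept chars.
    if write != run_start {
        bytes.copy_within(run_start..read, write);
    }
    write += read - run_start;
    bytes.truncate(write);
    *s = String::from_utf8(bytes).expect("only whole chars were removed");
    removed
}
```

In this sketch the number of copies is bounded by the number of removals plus one, instead of one copy per kept char. The final `String::from_utf8` re-validates the whole buffer, which a real implementation would presumably avoid; it is kept here so the sketch stays entirely in safe code.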

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 30, 2026

GrigorenkoPV force-pushed the string-extract-if branch from dddcb99 to 5f26749 Compare March 30, 2026 16:14

GrigorenkoPV force-pushed the string-extract-if branch from 5f26749 to ad0847d Compare March 30, 2026 22:00

GrigorenkoPV force-pushed the string-extract-if branch from ad0847d to ecf94cd Compare March 30, 2026 22:49

GrigorenkoPV marked this pull request as ready for review March 30, 2026 22:55

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 30, 2026

rustbot assigned Mark-Simulacrum Mar 30, 2026

rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 30, 2026

GrigorenkoPV force-pushed the string-extract-if branch from ecf94cd to ec0173f Compare April 1, 2026 20:30

string_extract_if: initial implementation

c1d594c

GrigorenkoPV force-pushed the string-extract-if branch from ec0173f to c1d594c Compare April 1, 2026 20:31

Mark-Simulacrum reviewed Apr 4, 2026

Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 4, 2026
