cut: fix -s flag for newline delimiter and optimize memory allocation #11143

akervald · 2026-02-27T13:04:42Z

Fixes & Improvements

-s logic fix: Add the missing only_delimited check so that non-delimited lines are properly suppressed.
Field-Level Streaming: Replace whole-file split().collect() with a memchr-powered loop. This shifts memory complexity from O(Total File Size) to O(Max Field Size) - as "OOM-safe" as the specification allows.
Zero-Allocation Skipping: Bypass unselected fields using BufReader::consume() to avoid heap copies.
Sequential Pointer Tracking: Replace nested loops and segments.get() lookups with a single-pass range_idx pointer that synchronizes "Skip" and "Keep" paths in one linear sweep.
Early Exit: Terminate I/O immediately once the highest requested field is processed.
Edge Case Support: Correctly handle single lines lacking a trailing newline.
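
For readers following the list above, here is a minimal sketch of how field-level streaming, copy-free skipping, a single range pointer, and early exit can fit together. It is not the code from this PR: `cut_newline_fields` is a hypothetical helper, `Iterator::position` stands in for `memchr` so the example needs only the standard library, and the sketch assumes that `ranges` is sorted, non-overlapping, and 1-based inclusive, and that any newline in the input counts as "delimited" for `-s`.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical helper (not the uutils implementation): streams
/// newline-delimited fields and writes the selected ones, one per line.
/// `ranges` must be sorted, non-overlapping, 1-based and inclusive.
fn cut_newline_fields<R: BufRead, W: Write>(
    mut reader: R,
    ranges: &[(usize, usize)],
    only_delimited: bool,
    out: &mut W,
) -> io::Result<()> {
    let mut field = 1; // 1-based index of the field being read
    let mut range_idx = 0; // first range that can still match
    let mut cur: Vec<u8> = Vec::new(); // holds at most one field

    loop {
        // Advance the range pointer past ranges that end before this field.
        while range_idx < ranges.len() && ranges[range_idx].1 < field {
            range_idx += 1;
        }
        // Early exit: nothing left to select. Field 1 is always read so we
        // learn whether the input contains a delimiter at all.
        if range_idx == ranges.len() && field > 1 {
            return Ok(());
        }
        let keep = range_idx < ranges.len() && ranges[range_idx].0 <= field;
        // Without -s an undelimited input is echoed whole, so field 1 must
        // be buffered even when it is not selected.
        let buffer = keep || (field == 1 && !only_delimited);
        cur.clear();

        let mut found_delim = false;
        let mut saw_bytes = false;
        loop {
            let chunk = reader.fill_buf()?;
            if chunk.is_empty() {
                break; // end of input
            }
            saw_bytes = true;
            // `position` stands in for memchr here.
            let (end, used) = match chunk.iter().position(|&b| b == b'\n') {
                Some(pos) => {
                    found_delim = true;
                    (pos, pos + 1)
                }
                None => (chunk.len(), chunk.len()),
            };
            if buffer {
                cur.extend_from_slice(&chunk[..end]);
            }
            // Skipped fields are consumed without being copied anywhere.
            reader.consume(used);
            if found_delim {
                break;
            }
        }

        if field == 1 && !found_delim {
            // No delimiter anywhere: suppress under -s, otherwise echo.
            if !only_delimited && saw_bytes {
                out.write_all(&cur)?;
                out.write_all(b"\n")?;
            }
            return Ok(());
        }
        if !found_delim && !saw_bytes {
            return Ok(()); // input ended right after a trailing newline
        }
        if keep {
            out.write_all(&cur)?;
            out.write_all(b"\n")?;
        }
        if !found_delim {
            return Ok(()); // last field had no trailing newline
        }
        field += 1;
    }
}
```

Called with the input `a\nb\nc\nd\n`, the ranges `[(2, 2), (4, 4)]`, and `only_delimited = true`, this sketch writes `b\nd\n`. Peak memory is bounded by the largest buffered field rather than by the file size, which is the property the list above describes.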

Benchmarks

10,000,000 records (seq 1 10000000 > bench_input.txt), base M1 Pro chip.

Case 1: Filtered Selection with Early Exit (`-s -d $'\n' -f 2,1024,4096`)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`gcut`	330.8 ± 5.6	326.5	341.5	169.60 ± 30.28
`./cut_old`	437.1 ± 9.1	430.7	456.6	224.09 ± 40.10
`./cut_new`	2.0 ± 0.3	1.3	3.3	1.00

Result: ~224x faster than cut_old, ~170x faster than GNU cut.

Case 2: Full File Read / Base Throughput (`-s -d $'\n' -f 1-10000000`)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`gcut`	676.7 ± 5.1	673.7	689.3	3.96 ± 0.13
`./cut_old`	527.0 ± 16.8	517.9	573.0	3.08 ± 0.14
`./cut_new`	171.1 ± 5.5	168.9	192.1	1.00

Result: ~3x faster than cut_old, ~4x faster than GNU cut.

References

github-actions · 2026-02-27T13:17:04Z

GNU testsuite comparison:

GNU test failed: tests/cut/cut. tests/cut/cut is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/misc/io-errors is no longer failing!
Congrats! The gnu test tests/tail/tail-n0f is now passing!

codspeed-hq · 2026-02-27T15:58:47Z

Merging this PR will improve performance by 45.19%

⚡ 1 improved benchmark
✅ 301 untouched benchmarks
🆕 2 new benchmarks
⏩ 42 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
🆕	Simulation	`cut_fields_newline_delim`	N/A	189.8 µs	N/A
🆕	Memory	`cut_fields_newline_delim`	N/A	67.8 KB	N/A
⚡	Memory	`cut_fields_custom_delim`	67.8 KB	46.7 KB	+45.19%

Comparing akervald:fix-cut-newline-s-flag (fe4e36b) with main (f335d14)

¹ 42 benchmarks were skipped, so the baseline results were used instead.

github-actions · 2026-02-27T16:57:33Z

GNU testsuite comparison:

GNU test failed: tests/cut/cut. tests/cut/cut is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/unexpand/bounded-memory is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/tail-n0f is now passing!

github-actions · 2026-02-28T06:15:40Z

GNU testsuite comparison:

GNU test failed: tests/cut/cut. tests/cut/cut is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cut/bounded-memory is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/tail-n0f is now passing!

github-actions · 2026-02-28T09:57:38Z

GNU testsuite comparison:

GNU test failed: tests/cut/cut. tests/cut/cut is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/tail/follow-name (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/resolution (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.
Note: The gnu test tests/dd/no-allocate is now being skipped but was previously passing.
Note: The gnu test tests/pr/bounded-memory is now being skipped but was previously passing.
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Note: The gnu test tests/env/env-signal-handler was skipped on 'main' but is now failing.

github-actions · 2026-02-28T12:14:07Z

GNU testsuite comparison:

GNU test failed: tests/cut/cut. tests/cut/cut is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/follow-name (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/resolution (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cut/cut-huge-range is now being skipped but was previously passing.
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Note: The gnu test tests/env/env-signal-handler was skipped on 'main' but is now failing.

github-actions · 2026-02-28T14:25:54Z

GNU testsuite comparison:

GNU test failed: tests/cut/cut. tests/cut/cut is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/date/resolution (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/symlink (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/misc/io-errors is no longer failing!
Congrats! The gnu test tests/timeout/timeout-group is no longer failing!
Note: The gnu test tests/seq/seq-epipe is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/tail-n0f is now passing!

github-actions · 2026-02-28T16:56:14Z

GNU testsuite comparison:

GNU test failed: tests/cut/cut. tests/cut/cut is passing on 'main'. Maybe you have to rebase?
Note: The gnu test tests/tail/pipe-f is now being skipped but was previously passing.
Congrats! The gnu test tests/cp/link-heap is now passing!
Congrats! The gnu test tests/seq/seq-epipe is now passing!

github-actions · 2026-02-28T18:28:49Z

GNU testsuite comparison:

Skip an intermittent issue tests/date/resolution (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/misc/io-errors is no longer failing!
Congrats! The gnu test tests/tail/retry is no longer failing!

akervald · 2026-02-28T18:56:56Z

Hi @cakebaker, the tests passed, but the benchmark failed due to an infrastructure issue. Could you please re-run that job? Thanks!

akervald · 2026-02-28T23:36:13Z

@sylvestre I noticed Attempt №3 was cancelled. Since I don't have permissions to trigger the CI/CD jobs myself, could you let me know if there’s a specific fix I need to make, or if you could re-run the checks when the environment is ready? Thanks!

github-actions · 2026-03-02T06:24:19Z

GNU testsuite comparison:

Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/rm/isatty is no longer failing!
Note: The gnu test tests/rm/many-dir-entries-vs-OOM is now being skipped but was previously passing.

akervald · 2026-03-02T20:13:16Z

@cakebaker Switching this to Draft. I've found a performance regression in the hot loop: iter().any() is too slow when handling complex or numerous ranges. I'm going to optimize the range-filtering logic so that we keep the performance lead over GNU cut before I ask for a final review.
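
To illustrate the kind of cost described here, the sketch below contrasts a per-field `iter().any()` scan with a cursor that only moves forward. It is one possible shape for such an optimization, not necessarily what was committed: `FieldRange`, `is_selected_scan`, and `RangeCursor` are hypothetical names, and the cursor assumes sorted, non-overlapping ranges queried with increasing field numbers.

```rust
/// Inclusive, 1-based field range (hypothetical stand-in for the real type).
struct FieldRange {
    low: usize,
    high: usize,
}

/// Per-field linear scan: every field pays for a walk over all ranges.
fn is_selected_scan(ranges: &[FieldRange], field: usize) -> bool {
    ranges.iter().any(|r| r.low <= field && field <= r.high)
}

/// Forward-only cursor over sorted, non-overlapping ranges.
/// Fields must be queried in increasing order.
struct RangeCursor<'a> {
    ranges: &'a [FieldRange],
    idx: usize,
}

impl<'a> RangeCursor<'a> {
    fn new(ranges: &'a [FieldRange]) -> Self {
        Self { ranges, idx: 0 }
    }

    /// Total work across all calls is O(fields + ranges), because `idx`
    /// never moves backwards.
    fn is_selected(&mut self, field: usize) -> bool {
        while self.idx < self.ranges.len() && self.ranges[self.idx].high < field {
            self.idx += 1;
        }
        self.idx < self.ranges.len() && self.ranges[self.idx].low <= field
    }

    /// True once no later field can be selected, which allows an early exit.
    fn exhausted(&self) -> bool {
        self.idx == self.ranges.len()
    }
}
```

With many ranges, the scan costs O(fields × ranges) overall, while the cursor visits each range once.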

github-actions · 2026-03-03T06:35:54Z

GNU testsuite comparison:

Skip an intermittent issue tests/date/date-locale-hour (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)

github-actions · 2026-03-03T07:17:28Z

GNU testsuite comparison:

Skip an intermittent issue tests/date/date-locale-hour (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/symlink (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)

github-actions · 2026-03-03T10:49:48Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/symlink (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.

github-actions · 2026-03-03T12:42:55Z

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/symlink (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)

akervald · 2026-03-03T13:23:49Z

@cakebaker should be ready for a review/merge

github-actions · 2026-03-03T15:29:47Z

GNU testsuite comparison:

Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/symlink (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.
Note: The gnu test tests/rm/many-dir-entries-vs-OOM is now being skipped but was previously passing.

github-actions · 2026-03-03T15:44:59Z

GNU testsuite comparison:

Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/symlink (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.

github-actions · 2026-03-04T06:38:34Z

GNU testsuite comparison:

Skip an intermittent issue tests/date/date-locale-hour (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tail/retry is no longer failing!
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.

github-actions · 2026-03-04T07:35:54Z

GNU testsuite comparison:

Skip an intermittent issue tests/date/date-locale-hour (fails in this run but passes in the 'main' branch)
Note: The gnu test tests/cut/bounded-memory is now being skipped but was previously passing.
Note: The gnu test tests/seq/seq-epipe is now being skipped but was previously passing.

cakebaker · 2026-03-04T10:05:36Z

Thanks for your PR!

cakebaker reviewed Feb 27, 2026

Comment thread tests/by-util/test_cut.rs Outdated

cakebaker reviewed Feb 27, 2026

Comment thread tests/by-util/test_cut.rs

akervald closed this Feb 27, 2026

akervald reopened this Feb 27, 2026

akervald marked this pull request as draft February 28, 2026 10:36

akervald marked this pull request as ready for review February 28, 2026 11:58

akervald requested a review from cakebaker February 28, 2026 12:03

cakebaker reviewed Mar 1, 2026

Comment thread tests/by-util/test_cut.rs Outdated

cakebaker reviewed Mar 1, 2026

Comment thread tests/by-util/test_cut.rs Outdated

cakebaker reviewed Mar 1, 2026

Comment thread tests/by-util/test_cut.rs Outdated

akervald requested a review from cakebaker March 2, 2026 12:14

akervald marked this pull request as draft March 2, 2026 20:13

akervald marked this pull request as ready for review March 2, 2026 21:15

cakebaker reviewed Mar 3, 2026

Comment thread src/uu/cut/src/cut.rs Outdated

cakebaker reviewed Mar 3, 2026

Comment thread src/uu/cut/src/cut.rs Outdated

cakebaker reviewed Mar 3, 2026

Comment thread src/uu/cut/src/cut.rs Outdated

cakebaker reviewed Mar 3, 2026

Comment thread src/uu/cut/src/cut.rs Outdated

cut: fix -s flag for newline delimiter and improve performance

fe4e36b

- Fixed the -s flag incorrectly suppressing output when the delimiter is a newline.
- Improved performance in cut_fields_newline_char_delim.
- Updated tests to match GNU cut behavior for newline delimiters.

akervald requested a review from cakebaker March 4, 2026 08:30

cakebaker merged commit 9bbb58b into uutils:main Mar 4, 2026
163 checks passed

cakebaker mentioned this pull request Mar 6, 2026

cut: fix -s flag ignored when delimiter is newline #10037

Closed

BrewTestBot mentioned this pull request Mar 8, 2026

uutils-coreutils 0.7.0 Homebrew/homebrew-core#271284

Merged

1 task

moonfruit mentioned this pull request Mar 9, 2026

uutils-selected 0.7.0 moonfruit/homebrew-tap#493

Closed
