Improve alloc Vec::retain_mut performance#149784
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
|
@fereidani: 🔑 Insufficient privileges: not in try users |
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Improve alloc `Vec::retain_mut` performance
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (060d14c): comparison URL. Overall result: ❌ regressions - no action neededBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -3.0%, secondary 6.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 3.0%, secondary -2.8%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.1%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 470.452s -> 473.09s (0.56%) |
This comment has been minimized.
This comment has been minimized.
|
is it ready for another perf run? |
|
@Kivooeo Yes, Thank you! |
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Improve alloc `Vec::retain_mut` performance
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (3ec64be): comparison URL. Overall result: ❌✅ regressions and improvements - no action neededBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -2.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 1.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 472.272s -> 473.265s (0.21%) |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
I also applied your requested changes for #150067 , our other similar PR to improve same algorithm for |
|
Sure thing! It is done. Thank you for reviewing this PR. |
|
Alright, let's merge this... |
|
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing b7bcaa5 (parent) -> bd33b83 (this PR) Test differencesShow 199 test diffs199 doctest diffs were found. These are ignored, as they are noisy. Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard bd33b83cfdbac9bffa3b04aaef95ec97827909a9 --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (bd33b83): comparison URL. Overall result: ❌ regressions - no action needed@rustbot label: -perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -2.5%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.7%, secondary -1.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 0.0%, secondary 0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 474.563s -> 473.344s (-0.26%) |
Hi,
While reading the rustc source code, I noticed it uses
smallvecandthin-vecin many places. I started reviewing those crates, optimized theirretain_mutimplementation, and then realized they were using the exact same algorithm asalloc::vec::Vecwith less unsafe So now I’m back here with a PR for the standard library 😂.In my benchmarks, this version is noticeably faster when
retain_mutactually removes elements (thanks to fewer pointer operations, it just advanceswrite_index), while performing identically to the current implementation when nothing is removed.Let’s see if bors likes this change or not.