libtest: use binary search for --exact test filtering#154865

sunshowers · 2026-04-06T04:06:34Z

When --exact is passed in, use binary search for O(f log n) lookups instead of an O(n) linear scan, under the assumption that f << n (which is true for the most relevant cases).

This is important for Miri, where the interpreted execution makes the linear scan very expensive.

I measured this against a repo with 1000 empty tests, running cargo +stage1 miri nextest run test_00 (100 tests) under hyperfine:

Before (linear scan): 49.7s ± 0.6s
After (binary search): 41.9s ± 0.2s (-15.7%)

I also tried a few other variations (particularly swapping matching tests to the front of the list + truncating the list), but the index + swap_remove approach proved to be the fastest.

Questions:

To be conservative, I've assumed that test_main can potentially receive an unsorted list of tests. Is this assumption correct?

rustbot · 2026-04-06T04:06:43Z

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

Owners of files modified in this PR: libs
libs expanded to 7 candidates
Random selection from Mark-Simulacrum, ibraheemdev

The test array is sorted by name at compile time. When `--exact` is passed in, use binary search for O(f log n) lookups instead of an O(n) linear scan, under the assumption that f << n (which is true for the most relevant cases). This is important for Miri, where the interpreted execution makes the linear scan very expensive. I measured this against a repo with 1000 empty tests, running `cargo +stage1 miri nextest run test_00` (100 tests) under hyperfine: * Before (linear scan): 49.7s ± 0.6s * After (binary search): 41.9s ± 0.2s (-15.7%) I also tried a few other variations (particularly swapping matching tests to the front of the list + truncating the list), but the index + swap_remove approach proved to be the fastest. Questions: - [ ] To be conservative, I've assumed that test_main can potentially receive an unsorted list of tests. Is this assumption correct?

asder8215 · 2026-04-06T05:59:05Z

library/test/src/lib.rs

+    // invalidate indexes we haven't visited yet.
+    let mut result = Vec::with_capacity(indexes.len());
+    for &idx in indexes.iter().rev() {
+        result.push(tests.swap_remove(idx));


I have no feedback here, but I just wanted to say this is the first time I've seen swap_remove being used 👀

And it's really cool how you avoid an allocation/making a copy of the TestDescAndFn (not that you could since I don't see TestDescAndFn implement the Clone trait) through this method since the index order is sorted and deduplicated, so moving the TestDescAndFn items in backward index order would always give you your filtered TestDescAndFn in reverse order. You don't have to worry about the wrong item being taken since it gets replaced by the last item in the Vec, which is not something you are taking anyways.

I'll keep a note of this in mind if I need to do something like this in the future.

asder8215 · 2026-04-06T06:39:32Z

library/test/src/lib.rs

    let owned_tests: Vec<_> = tests.iter().map(make_owned_test).collect();
-    test_main(&args, owned_tests, None)
+    // Tests are sorted by name at compile time by mk_tests_slice.
+    let tests = TestList::new(owned_tests, TestListOrder::Sorted);


I think we can keep the test_main(&args, owned_tests, None) invocation here, and just construct the TestList inside test_main(), where test_main() would call test_main_inner(). This would reduce code duplication a bit since both test_main_static() and test_main_static_abort() seem to be the only functions that invoke test_main() and they both have a TestListOrder::Sorted. I might be incorrect though since I notice that test_main() is calling on test_main_with_exit_callback(), which denotes the test list order as unsorted, but I'm not sure why it needs to be marked as unsorted for test_main() when the test_main_static_* functions are seemingly only one using it.

Ideally, I would have prefer test_main() to keep with calling on test_main_with_exit_callback() and not need to introduce a test_main_inner() through, making it take a TestList instead of Vec<TestDescAndFn>. However, I can see that librustdoc/doctest.rs does utilize test_main_with_exit_callback(), though I don't see why a TestList object can not be constructed there.

Yeah, I definitely thought of that, as well as the related approach of making TestList be a newtype wrapper which always ensures sortedness. Happy to hear opinions before I go about making these bigger changes.

At the moment, I believe that within this repo, merged doctests pass in a potentially-unsorted list. But I wasn't sure how much these methods are callable outside the repo.

At the moment, I believe that within this repo, merged doctests pass in a potentially-unsorted list. But I wasn't sure how much these methods are callable outside the repo.

Looking at librustdoc/doctest.rs, which is the only file (as far as my Go to references are telling me and a quick text search on codebase) outside this file that calls test_main_with_exit_callback, it looks like it sorts the test list as well?

// We need to call `test_main` even if there is no doctest to run to get the output // `running 0 tests...`. if ran_edition_tests == 0 || !standalone_tests.is_empty() { standalone_tests.sort_by(|a, b| a.desc.name.as_slice().cmp(b.desc.name.as_slice())); test::test_main_with_exit_callback(&test_args, standalone_tests, None, || { let times = times.times_in_secs(); // We ensure temp dir destructor is called. std::mem::drop(temp_dir.take()); if let Some((total_time, compilation_time)) = times { test::print_merged_doctests_times(&test_args, total_time, compilation_time); } }); }

But also since these are public methods accessible in the test crate, I'm not sure if people would need to use these methods and expect the passed in test array to handle both sorted/unsorted (which I think making TestList be a newtype wrapper that ensures sortedness is a good idea). The test module docs:

//! Support code for rustc's built in unit-test and micro-benchmarking //! framework. //! //! Almost all user code will only be interested in `Bencher` and //! `black_box`. All other interactions (such as writing tests and //! benchmarks themselves) should be done via the `#[test]` and //! `#[bench]` attributes. //! //! See the [Testing Chapter](../book/ch11-00-testing.html) of the book for more //! details.

feels like it implies that people won't really use the test_main_* methods. Are these methods used in nextest?

Never mind, I do see it in librust/doctest/runner.rs here

Are these methods used in nextest?

No, nextest doesn't insert itself into the build process in any way.

How do you think I should proceed with this?

At the moment, I believe that within this repo, merged doctests pass in a potentially-unsorted list.

I was reading through, librustdoc/lib.rs, librustdoc/doctest/markdown , and librustdoc/doctest.rs, librustdoc/doctest/runner, and I think it does sort the merged doctests already.

In librustdoc/doctest.rs, that's what I'm assuming the iterative loop for run_tests(...) (lines 352-371) does for us (correct me if I'm wrong though). If that's true, then it seems like all usage of test_main* functions pass in a sorted test list.

But I wasn't sure how much these methods are callable outside the repo.
At least within librustdoc, the test list seems to be in sorted order, but because test_main* are publicly accessible functions via the test crate, it's possible for people to pass in tests that are unsorted.

How do you think I should proceed with this?

To answer this question, I prefer what you mentioned earlier here:

as well as the related approach of making TestList be a newtype wrapper which always ensures sortedness

We wouldn't need the TestListOrder enum with this approach, and can construct a TestList type in each of the test_main* functions. test_main_static* would invoke test_main underneath the hood like before. The test list being sorted internally would have to be documented for these functions as well.

Moreover, I feel like the TestList newtype should not be a publicly accessible struct, and should remain as an internal helper struct within the rust repo (pub(crate) visibility?); I can't see in what situations people would create a TestList struct and use it when all the test crate function uses TestDescAndFn (and we can't change the signature of test_main* since it would cause a breaking change -- though it is marked as experimental so not sure what policy is on that -- + means that more librustdoc code needs to be modified).

The consequence to this change is that some changes in librustdoc might need to be made (probably separately) to prevent redundant sorting, or there should be a note somewhere that there's some redundant sorting occurring in librustdoc.

Those are my thoughts and how I would probably proceed with this.

rustbot assigned Mark-Simulacrum Apr 6, 2026

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 6, 2026

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Apr 6, 2026

sunshowers mentioned this pull request Apr 6, 2026

Handle tests that should be run within the same process nextest-rs/nextest#27

Open

3 tasks

This comment has been minimized.

Sign in to view

sunshowers force-pushed the binary-search-test branch from 1e41231 to bbb9e3b Compare April 6, 2026 04:11

asder8215 reviewed Apr 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

libtest: use binary search for --exact test filtering#154865