Streamline QueryVTableUnerased into GetQueryVTable rust-lang/rust#152841
@bors try @rust-timer queue
View on GitHub
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
View on GitHub
⌛ Trying commit 3680e69 with merge b46d288…
To cancel the try build, run the command @bors try cancel.
Workflow: https://github.com/rust-lang/rust/actions/runs/22179731736
- The
NAMEassociated constant will be removed by #152747; I've left it alone in this PR to make rebasing easier.
compiler/rustc_query_impl/src/plumbing.rs · outdated
| 633 | /// Marker type that implements [`QueryVTableGetter`] for this query. |
|
| 634 | pub(crate) enum VTableGetter {} |
|
| 635 | |
|
| 636 | impl<'tcx> QueryVTableGetter<'tcx, FLAGS> for VTableGetter { |
This is a trait with a single implementer (per query).
Would it be hard to refactor it into a free (per query) function somehow?
(Similarly to how restore_val was removed from the trait's interface.)
Using a free function seems tricky, because the generics needed per callee would get more complicated, and that's something I'm trying to cut down on.
compiler/rustc_query_impl/src/lib.rs · outdated
| 236 | fn query_dispatcher(tcx: TyCtxt<'tcx>) -> SemiDynamicQueryDispatcher<'tcx, C, FLAGS>; |
|
| 237 | |
|
| 238 | fn restore_val(value: C::Value) -> Self::UnerasedValue; |
|
| 235 | fn query_dispatcher(tcx: TyCtxt<'tcx>) -> SemiDynamicQueryDispatcher<'tcx, Self::Cache, FLAGS>; |
If the trait method produces a query dispatcher, then I'd call the trait GetQueryDispatcher, so both say what it does and follow trait naming conventions.
Or keep "vtable", but rename SemiDynamicQueryDispatcher to SemiDynamicQueryVTable because that's also what it is.
... or SemiStatic, because it's still mostly dynamic :)
The name SemiDynamicQueryDispatcher is a relic from before the QueryDispatcher trait was removed, so for consistent naming I'd prefer to move away from “dispatcher” and towards ”vtable”.
I have some ideas for how to rename SemiDynamicQueryDispatcher, and possibly reduce the number of explicit FLAGS parameters everywhere, but that needs some more experimentation while I figure out what things should look like.
One of the candidates I have in mind for SemiDynamicQueryDispatcher is QueryVTableWithFlags, but I was also thinking about using that name for a trait that hides the flags in an associated constant, so that callees would accept impl QueryVTableWithFlags and not have to have a const FLAGS generic at all.
- Thankfully most of this discussion is now irrelevant after #152791.
View on GitHub
Queued b46d288 with parent fbd6934, future comparison URL.
There is currently 1 preceding artifact in the queue.
It will probably take at least ~1.1 hours until the benchmark run finishes.
View on GitHub
Finished benchmarking commit (b46d288): comparison URL.
Overall result: ❌ regressions - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.
@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
0.3% | [0.2%, 0.4%] | 9 |
| Regressions ❌ (secondary) |
0.3% | [0.1%, 0.5%] | 35 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
- | - | 0 |
| All ❌✅ (primary) | 0.3% | [0.2%, 0.4%] | 9 |
Max RSS (memory usage)
Results (secondary 1.0%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
- | - | 0 |
| Regressions ❌ (secondary) |
5.5% | [5.5%, 5.5%] | 1 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
-3.5% | [-3.5%, -3.5%] | 1 |
| All ❌✅ (primary) | - | - | 0 |
Cycles
Results (primary 2.2%, secondary 9.1%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
2.2% | [2.2%, 2.2%] | 1 |
| Regressions ❌ (secondary) |
9.1% | [9.1%, 9.1%] | 1 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
- | - | 0 |
| All ❌✅ (primary) | 2.2% | [2.2%, 2.2%] | 1 |
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 481.318s -> 480.997s (-0.07%)
Artifact size: 397.87 MiB -> 397.77 MiB (-0.03%)
The current iteration only changes the trait itself and encode_query_results, without touching the DepKindVTable functions. I'm hoping that this will be perf-neutral.
@bors try @rust-timer queue
View on GitHub
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
View on GitHub
⌛ Trying commit 73e18de with merge f16cfc0…
To cancel the try build, run the command @bors try cancel.
Workflow: https://github.com/rust-lang/rust/actions/runs/22208763469
View on GitHub
Queued f16cfc0 with parent ef70767, future comparison URL.
There are currently 0 preceding artifacts in the queue.
It will probably take at least ~1.0 hours until the benchmark run finishes.
Finished benchmarking commit (f16cfc0): comparison URL.
Overall result: ❌✅ regressions and improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
- | - | 0 |
| Regressions ❌ (secondary) |
1.5% | [1.5%, 1.5%] | 1 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
-0.7% | [-0.7%, -0.7%] | 1 |
| All ❌✅ (primary) | - | - | 0 |
Max RSS (memory usage)
Results (primary 3.7%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
10.0% | [10.0%, 10.0%] | 1 |
| Regressions ❌ (secondary) |
- | - | 0 |
| Improvements ✅ (primary) |
-2.6% | [-2.6%, -2.6%] | 1 |
| Improvements ✅ (secondary) |
- | - | 0 |
| All ❌✅ (primary) | 3.7% | [-2.6%, 10.0%] | 2 |
Cycles
Results (primary 2.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
2.5% | [2.5%, 2.5%] | 1 |
| Regressions ❌ (secondary) |
- | - | 0 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
- | - | 0 |
| All ❌✅ (primary) | 2.5% | [2.5%, 2.5%] | 1 |
Binary size
Results (primary 0.0%, secondary 0.0%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
0.0% | [0.0%, 0.0%] | 6 |
| Regressions ❌ (secondary) |
0.0% | [0.0%, 0.1%] | 4 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
- | - | 0 |
| All ❌✅ (primary) | 0.0% | [0.0%, 0.0%] | 6 |
Bootstrap: 496.246s -> 482.423s (-2.79%)
Artifact size: 397.86 MiB -> 397.86 MiB (-0.00%)
View on GitHub
☔ The latest upstream changes (presumably #152747) made this pull request unmergeable. Please resolve the merge conflicts.
This looks reasonable to me. The perf regression is surprising, I wonder if #152791 will make a difference?
View on GitHub
☔ The latest upstream changes (presumably #152791) made this pull request unmergeable. Please resolve the merge conflicts.
Let's see what perf looks like after removing const FLAGS.
@bors try @rust-timer queue
View on GitHub
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
View on GitHub
⌛ Trying commit 2570d18 with merge f8e0d28…
To cancel the try build, run the command @bors try cancel.
Workflow: https://github.com/rust-lang/rust/actions/runs/22332404321
View on GitHub
Queued f8e0d28 with parent b3869b9, future comparison URL.
There is currently 1 preceding artifact in the queue.
It will probably take at least ~1.0 hours until the benchmark run finishes.
compiler/rustc_query_impl/src/plumbing.rs · resolved
| 325 | 326 | } |
| 326 | 327 | |
| 327 | pub(crate) fn encode_query_results<'a, 'tcx, Q, C: QueryCache>( |
|
| 328 | query: &'tcx QueryVTable<'tcx, C>, |
Can you still pass in query here? That is simpler than calling Q::query_vtable. Is it possible to avoid the Q generic, and just have C and V?
Yeah, in hindsight the code I had was influenced by how things looked before some of the other recent cleanups.
Finished benchmarking commit (f8e0d28): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
- | - | 0 |
| Regressions ❌ (secondary) |
- | - | 0 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
-0.3% | [-0.3%, -0.3%] | 1 |
| All ❌✅ (primary) | - | - | 0 |
Max RSS (memory usage)
Results (primary 1.1%, secondary 4.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
1.1% | [1.1%, 1.1%] | 1 |
| Regressions ❌ (secondary) |
4.5% | [4.5%, 4.5%] | 1 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
- | - | 0 |
| All ❌✅ (primary) | 1.1% | [1.1%, 1.1%] | 1 |
Cycles
Results (secondary -3.0%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| mean | range | count | |
|---|---|---|---|
| Regressions ❌ (primary) |
- | - | 0 |
| Regressions ❌ (secondary) |
- | - | 0 |
| Improvements ✅ (primary) |
- | - | 0 |
| Improvements ✅ (secondary) |
-3.0% | [-3.0%, -3.0%] | 1 |
| All ❌✅ (primary) | - | - | 0 |
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 479.41s -> 479.13s (-0.06%)
Artifact size: 395.68 MiB -> 397.73 MiB (0.52%)
These changes should have no perf effects, and the most recent perf run only shows noise, so this PR seems safe for rollup.
@bors rollup=maybe
View all comments
QueryDispatcherUnerasedis an awkward name for an awkward trait.We can make it a bit more straightforward by removing its responsibility for erasing query values, and by observing that its only real responsibility beyond that is to know how to obtain the vtable for a particlar query from
tcx.