experiment: match lowering: Change the column selection strategy in pick_test() to potentially reduce redundant tests#154753
experiment: match lowering: Change the column selection strategy in pick_test() to potentially reduce redundant tests#154753jakubadamw wants to merge 2 commits intorust-lang:mainfrom
match lowering: Change the column selection strategy in pick_test() to potentially reduce redundant tests#154753Conversation
|
Some changes occurred in match lowering cc @Nadrieril |
|
rustbot has assigned @petrochenkov. Use Why was this reviewer chosen?The reviewer was selected based on:
|
|
r? me curious to see the results! |
This comment has been minimized.
This comment has been minimized.
…dant tests
Previously, `pick_test()` always selected the first match pair of the first
candidate. This could lead to suboptimal match trees when many arms share
a value for a field that is not listed first in the pattern. For
example, in `match p { P { x: 1, y: 0 }, P { x: 2, y: 0 }, .. }`, the
compiler would switch on `x` first and then redundantly check `y == 0`
in every branch.
Now, `pick_test()` iterates over the first candidate's match pairs and picks
the one whose testable value is shared across the largest number of
consecutive candidates. In the example above, `y: 0` appears in all
arms (consecutive count = 3), so we test `y` first, check it once, and
then switch on `x` - producing a smaller match tree.
@Nadrieril, thanks! So I’ve just fixed the results by having the selection prefer left-most indices when there’s a tie around how many consecutive rows can be branched into at once, which means the behaviour is unchanged for probably most of patterns. Would you be able to initiate a perf run on this? |
|
@bors try @rust-timer queue |
|
@jakubadamw: 🔑 Insufficient privileges: not in try users |
This comment has been minimized.
This comment has been minimized.
|
@Nadrieril, are you in power to initiate the perf run by any chance? 🙂 That’d be much appreciated. |
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
experiment: `match` lowering: Change the column selection strategy in `pick_test()` to potentially reduce redundant tests
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (2d09f56): comparison URL. Overall result: ❌✅ regressions and improvements - please read:Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 2.8%, secondary 1.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 1.8%, secondary 2.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 0.1%, secondary 0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 489.264s -> 489.811s (0.11%) |
|
Oh well, at least we know. 🙂 @Nadrieril, thanks for your help, closing this. |
|
Wait, these are compile-time benches. It's not the worst diff I've seen, it could be worth it on runtime perf. I thoughts there were some runtime benches in rustc-perf but it seems they didn't run? Or are never ran? I'm not sure. |
|
I think they ran. Go to tab Runtime and tick Show non-relevant results. I don't know the accuracy of these benches tho, nor their coverage. |
|
Thank you both. Looking at the description of the runtime benchmarks, it doesn't seem like many of them would even be stressing the |
At the moment,
pick_test()always selects the first match pair of the first candidate. This could lead to suboptimal match trees when many arms share a value for a field that is not listed first in the pattern. For example, inmatch p { P { x: 1, y: 0 }, P { x: 2, y: 0 }, .. }, the compiler would switch onxfirst and then redundantly checky == 0in every branch.In this PR,
pick_test()will iterate over the first candidate's match pairs and picks the one whose testable value is shared across the largest number of consecutive candidates. In the example above,y: 0appears in all arms (consecutive count = 3), so we testyfirst, check it once, and then switch onx– producing a smaller match tree.This is just an experiment, I expect the extra scan to have at least some impact on compile time for most programs. I suppose there’s a tension here between letting the user write
matchexpressions with more control over what order of condition branches they compile to versus treating them as declarative queries where it’s the compiler’s job to find the best ordering of branches to fulfil it.