skip codegen for intrinsics with big fallback bodies if backend does not need them #150605
/// The names of intrinsics that the current codegen backend replaces
/// with its own implementations.
pub replaced_intrinsics: Vec<Symbol>,
It seems there is no way to get the current codegen backend from a tcx. I wasn't sure what the best way is to make this list of symbols available to monomorphization, and went for a new field in Session -- does that make sense?
I don't know enough about how all this should be structured to know what the best option is here.
This seems at least plausible, since at worst it stays empty and that doesn't hurt anything (other than perf).
@bjorn3 do you have any suggestions for how to deal with this?
I am not the biggest fan of another Session field, but don't have any other suggestions either.
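For readers following this thread, here is a minimal, self-contained sketch of the idea being discussed: the session carries a list of intrinsic names the backend replaces, and monomorphization consults it before collecting a fallback body. This is a toy model, not the actual rustc code. The `Session` struct here is a stand-in for the real one, `IntrinsicCall` and `needs_fallback_body` are hypothetical names, and plain `&'static str` is used in place of `Symbol`.

```rust
/// Toy stand-in for the compiler session; NOT the real rustc `Session`.
pub struct Session {
    /// The names of intrinsics that the current codegen backend replaces
    /// with its own implementations. (`&'static str` stands in for `Symbol`.)
    pub replaced_intrinsics: Vec<&'static str>,
}

/// Hypothetical description of an intrinsic call seen during monomorphization.
pub struct IntrinsicCall {
    pub name: &'static str,
    /// Whether the intrinsic has a fallback body written in Rust.
    pub has_fallback_body: bool,
}

/// Returns `true` if the fallback body should be collected for codegen.
/// A fallback body is skipped only when the backend has declared that it
/// replaces the intrinsic with its own implementation.
pub fn needs_fallback_body(sess: &Session, call: &IntrinsicCall) -> bool {
    call.has_fallback_body && !sess.replaced_intrinsics.contains(&call.name)
}
```

With an empty `replaced_intrinsics` list, `needs_fallback_body` returns `true` for every intrinsic that has a fallback body, which matches the observation above that an empty list costs only performance.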
@bors try
Force-pushed from 4ca06da to a170604.
Finished benchmarking commit (4763a83): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never

Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage)
Results (primary -1.5%, secondary 3.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles
Results (primary -3.9%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size
Results (primary 0.2%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Bootstrap: 473.485s -> 474.195s (0.15%)
Force-pushed from a170604 to 57e44f5.
@bors try
Finished benchmarking commit (c75310a): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never

Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage)
Results (primary -4.1%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles
Results (primary -3.9%, secondary 15.2%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size
Results (primary 0.2%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Bootstrap: 471.287s -> 473.923s (0.56%)
@rustbot reroll
Cool idea! I'll wait a few days to give @scottmcm time to respond, as the much more knowledgeable person. Do you know if there is a list of similarly optimised intrinsics somewhere?
In principle one could go over all the intrinsics that have fallback bodies, and then check whether the LLVM backend has implementations for them. But most fallback bodies are small so the cost of monomorphizing them is tiny. Not sure if it's worth going through the entire list. I think I got all the ones that have big fallback bodies where we really don't want to pay the monomorphization cost.
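The exhaustive audit described above could be sketched roughly as follows. This is only an illustration of the selection logic, not something the PR does: the `IntrinsicInfo` struct, its fields, the `worth_listing` function, and the `min_size` threshold are all hypothetical, and the PR itself picked the big-bodied intrinsics by hand.

```rust
/// Hypothetical summary of one intrinsic; the fields are illustrative only.
pub struct IntrinsicInfo {
    pub name: &'static str,
    /// Rough size of the fallback body (for example a statement count),
    /// or `None` if the intrinsic has no fallback body.
    pub fallback_size: Option<usize>,
    /// Whether the backend provides its own implementation.
    pub backend_implements: bool,
}

/// Selects the intrinsics worth adding to the "replaced" list: those the
/// backend implements itself AND whose fallback body is at least `min_size`.
/// Small fallback bodies are left out because monomorphizing them is cheap.
pub fn worth_listing(intrinsics: &[IntrinsicInfo], min_size: usize) -> Vec<&'static str> {
    intrinsics
        .iter()
        .filter(|i| i.backend_implements)
        .filter(|i| i.fallback_size.map_or(false, |size| size >= min_size))
        .map(|i| i.name)
        .collect()
}
```

The threshold makes the trade-off explicit: listing every backend-implemented intrinsic would be more complete, but the payoff is concentrated in the few with large fallback bodies.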
This hopefully fixes the perf regression from #148478. I only added the intrinsics with big fallback bodies to the list; it doesn't seem worth the effort of going through the entire list.
Fixes #149945
Cc @scottmcm @bjorn3