Implement scopes independent of LLVM basic blocks #7636

Closed · dotdash wants to merge 1 commit into rust-lang:master

Conversation
Currently, scopes are tied to LLVM basic blocks. For each scope, there are two new basic blocks, which means two extra jumps in the unoptimized IR. These blocks aren't actually required, but only used to act as the boundary for cleanups.

By keeping track of the current scope within a single basic block, we can avoid those extra blocks and jumps, shrinking the pre-optimization IR quite considerably. For example, the IR for trans_intrinsic goes from ~22k lines to ~16k lines, almost 30% less.

The impact on the build times of optimized builds is rather small (about 1%), but unoptimized builds are about 11% faster. The testsuite for unoptimized builds runs between 15% (CPU time) and 7.5% (wallclock time on my i7) faster.

Also, in some situations this helps LLVM to generate better code by inlining functions that it previously considered to be too large, likely because of the pointless blocks/jumps that were still present at the time the inlining pass runs.

Refs rust-lang#7462
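For readers unfamiliar with what these "cleanups" are at the source level: they are the destructor calls that must run exactly at a scope's closing brace, which is the boundary the extra basic blocks used to mark. The sketch below is purely illustrative (the `Cleanup` type and log are invented for this example, not part of the compiler); it just demonstrates the scope-boundary semantics that the translated IR has to preserve.

```rust
// Illustrative only: each lexical scope has cleanups (destructors)
// that run at its closing brace. The compiler must emit those
// cleanups at the scope boundary, with or without dedicated blocks.
use std::cell::RefCell;

thread_local! {
    // Records the order in which cleanups actually run.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct Cleanup(&'static str);

impl Drop for Cleanup {
    fn drop(&mut self) {
        LOG.with(|l| l.borrow_mut().push(self.0));
    }
}

fn scopes() -> Vec<&'static str> {
    let _outer = Cleanup("outer");
    {
        // Inner scope: its cleanup must run at the closing brace,
        // i.e. at exactly the boundary the extra blocks used to mark.
        let _inner = Cleanup("inner");
    }
    // At this point "inner" has dropped, "outer" has not.
    LOG.with(|l| l.borrow().clone())
}

fn main() {
    assert_eq!(scopes(), vec!["inner"]);
}
```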
bors added a commit that referenced this pull request on Jul 7, 2013
bors pushed a commit to rust-lang-ci/rust that referenced this pull request on Jul 25, 2022
This change attempts to resolve issue rust-lang#7636: Extract into Function does not create a generic function with constraints when extracting generic code.

In `FunctionBody::analyze_container`, we now traverse the `ancestors` in search of `AnyHasGenericParams`, and attach any `GenericParamList`s and `WhereClause`s we find to the `ContainerInfo`. Later, in `format_function`, we collect all the `GenericParam`s and `WherePred`s from the container, and filter them to keep only types matching `TypeParam`s used within the newly extracted function body or param list. We can then include the new `GenericParamList` and `WhereClause` in the new function definition.

This change only impacts `TypeParam`s. `LifetimeParam`s and `ConstParam`s are out of scope for this change.
bors added a commit to rust-lang-ci/rust that referenced this pull request on Jul 25, 2022
…t, r=Veykril fix: Support generics in extract_function assist

This change attempts to resolve issue rust-lang#7636: Extract into Function does not create a generic function with constraints when extracting generic code. This change only impacts `TypeParam`s; `LifetimeParam`s and `ConstParam`s are out of scope.

I've never contributed to this project before, but I did try to follow the style guide. I believe that this change represents an improvement over the status quo, but I think it's also fair to argue that it doesn't fully "fix" the linked issue. I'm totally open to merging this as is, or going further to try to make a more complete solution. Also: if there are other unit or integration tests I should add, please let me know where to look!
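To make the intended behavior of that assist concrete, here is a hand-written sketch of the kind of output it should now produce (the names `describe` and `fun_name` are hypothetical examples, with `fun_name` standing in for the assist's placeholder): extracting code that uses a type parameter `T` yields a new function that carries over both `T` and its `where` constraint.

```rust
// Original function: generic over T, with a `where` bound.
fn describe<T>(value: T) -> String
where
    T: std::fmt::Debug,
{
    // Extracted portion: since `T` is used here, the assist must emit
    // `fun_name<T>` with the same `where T: Debug` constraint attached.
    fun_name(value)
}

// What the assist should generate: the GenericParamList `<T>` and the
// WhereClause are copied over because the body references `T`.
fn fun_name<T>(value: T) -> String
where
    T: std::fmt::Debug,
{
    format!("{:?}", value)
}

fn main() {
    assert_eq!(describe(42), "42");
    assert_eq!(describe("hi"), "\"hi\"");
}
```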