Description
The tuple-stress benchmark appears to be ridiculously slow with NLL. Profiling suggests that the majority of costs come from the liveness constraint generation code:
rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs
Lines 36 to 42 in 860d169
pub(super) fn generate<'gcx, 'tcx>(
    cx: &mut TypeChecker<'_, 'gcx, 'tcx>,
    mir: &Mir<'tcx>,
    liveness: &LivenessResults,
    flow_inits: &mut FlowAtLocation<MaybeInitializedPlaces<'_, 'gcx, 'tcx>>,
    move_data: &MoveData<'tcx>,
) {
Specifically, about half of the samples (50%) occur in the push_type_live_constraint function:
rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs
Lines 158 to 163 in 860d169
fn push_type_live_constraint<T>(
    cx: &mut TypeChecker<'_, 'gcx, 'tcx>,
    value: T,
    location: Location,
) where
    T: TypeFoldable<'tcx>,
This function primarily consists of a walk over all the free regions within a type:
rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs
Lines 170 to 172 in 860d169
cx.tcx().for_each_free_region(&value, |live_region| {
    cx.constraints.liveness_set.push((live_region, location));
});
However, the types in question don't really involve regions (they are things like (u32, f64, u32) etc). It turns out that we have a "flags" mechanism that tracks the content of types, designed for just such a purpose. This should allow us to quickly skip such types. The flags are defined here, using the bitflags! macro:
Lines 418 to 419 in 860d169
bitflags! {
    pub struct TypeFlags: u32 {
The flag we are interested in is HAS_FREE_REGIONS:
Lines 432 to 434 in 860d169
/// Does this have any region that "appears free" in the type?
/// Basically anything but `ReLateBound` and `ReErased`.
const HAS_FREE_REGIONS = 1 << 6;
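To illustrate why such a flag makes the check cheap, here is a minimal, self-contained sketch of a flags summary. It is not rustc's code: `ToyFlags`, `ToyTy`, and `flags_of` are hypothetical names invented for this example, and the bitmask is hand-rolled instead of using the bitflags! macro. Only the bit position of HAS_FREE_REGIONS is taken from the excerpt above. The sketch also recomputes the flags on every call for simplicity, whereas the point of the real mechanism is that the answer is already available on the type.

```rust
/// Toy stand-in for a flags type: a plain bitmask over `u32`.
/// (Hypothetical; hand-rolled so the example needs no external crate.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ToyFlags(u32);

impl ToyFlags {
    pub const EMPTY: ToyFlags = ToyFlags(0);
    /// Same bit position as the `HAS_FREE_REGIONS` flag quoted above.
    pub const HAS_FREE_REGIONS: ToyFlags = ToyFlags(1 << 6);

    /// True if `self` and `other` share at least one set bit.
    pub fn intersects(self, other: ToyFlags) -> bool {
        (self.0 & other.0) != 0
    }

    /// Bitwise union of two flag sets.
    pub fn union(self, other: ToyFlags) -> ToyFlags {
        ToyFlags(self.0 | other.0)
    }
}

/// Hypothetical miniature type representation.
pub enum ToyTy {
    U32,
    F64,
    /// A reference carrying a named (free) region, e.g. `&'a T`.
    Ref(&'static str, Box<ToyTy>),
    Tuple(Vec<ToyTy>),
}

/// Compute flags bottom-up: a type's flags are the union of its parts' flags.
pub fn flags_of(ty: &ToyTy) -> ToyFlags {
    match ty {
        ToyTy::U32 | ToyTy::F64 => ToyFlags::EMPTY,
        ToyTy::Ref(_, inner) => ToyFlags::HAS_FREE_REGIONS.union(flags_of(inner)),
        ToyTy::Tuple(elems) => elems
            .iter()
            .fold(ToyFlags::EMPTY, |acc, e| acc.union(flags_of(e))),
    }
}
```

For a tuple like (u32, f64, u32) the resulting flags do not intersect HAS_FREE_REGIONS, so a single bit test is enough to know that no region walk is needed.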
We should be able to optimize for_each_free_region to consult this flag and quickly skip past types that do not contain any free regions. for_each_free_region is defined here:
Lines 256 to 260 in 860d169
pub fn for_each_free_region<T,F>(self,
                                 value: &T,
                                 callback: F)
    where F: FnMut(ty::Region<'tcx>),
          T: TypeFoldable<'tcx>,
It uses a "type visitor" to do its work:
Lines 289 to 290 in 860d169
impl<'tcx, F> TypeVisitor<'tcx> for RegionVisitor<F>
    where F : FnMut(ty::Region<'tcx>)
We want to add a callback for the case of visiting types, which will check this flag. Something like the following ought to do it:
fn visit_ty(&mut self, ty: Ty<'tcx>) -> bool {
    // Only descend into types that may contain free regions.
    if ty.flags.intersects(TypeFlags::HAS_FREE_REGIONS) {
        ty.super_visit_with(self)
    } else {
        false // keep visiting
    }
}
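The effect of this short-circuit can be shown with a small standalone model. This is only a sketch under stated assumptions, not the compiler's visitor: `Node`, `RegionCollector`, and the `use_flag` switch are hypothetical names made up for the illustration, and a cached `bool` stands in for the HAS_FREE_REGIONS bit.

```rust
/// Hypothetical miniature type tree. `has_free_regions` plays the role of
/// the cached flag and is derived once, when the node is built.
pub struct Node {
    pub region: Option<&'static str>,
    pub has_free_regions: bool,
    pub children: Vec<Node>,
}

impl Node {
    /// Build a node, deriving the cached flag from the node and its children.
    pub fn new(region: Option<&'static str>, children: Vec<Node>) -> Node {
        let has_free_regions =
            region.is_some() || children.iter().any(|c| c.has_free_regions);
        Node { region, has_free_regions, children }
    }
}

/// Collects regions and counts how many nodes were actually walked.
pub struct RegionCollector {
    pub regions: Vec<&'static str>,
    pub visited: usize,
    /// When true, consult the cached flag before descending.
    pub use_flag: bool,
}

impl RegionCollector {
    pub fn new(use_flag: bool) -> RegionCollector {
        RegionCollector { regions: Vec::new(), visited: 0, use_flag }
    }

    pub fn visit(&mut self, node: &Node) {
        if self.use_flag && !node.has_free_regions {
            return; // no regions anywhere below this node, so skip the subtree
        }
        self.visited += 1;
        if let Some(r) = node.region {
            self.regions.push(r);
        }
        for child in &node.children {
            self.visit(child);
        }
    }
}
```

In this model, a region-free tuple with three leaf fields costs four node visits without the flag check and none with it, while a type that does mention a region still yields the same set of regions either way.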