Change HirVec<P<T>> to HirVec<T> in hir::Expr. #37642
Conversation
r? @pnkfelix (rust_highfive has picked a reviewer for you, use r? to override)

Interesting. Thanks for posting, @nnethercote.
Vec's FromIterator: since both of those lines use the slice iterator and then just `.map()`, they should already be "perfect" (which should never be taken for granted, I guess).
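For reference, a minimal check of that claim outside the compiler (toy values; the exact capacity is an implementation detail of the standard library, but a slice iterator plus `.map()` reports an exact size hint, so `Vec`'s `FromIterator` can allocate once with no excess):

```rust
fn main() {
    let elts = [1u32, 2, 3];
    // Same shape as the lowering code: iterate a slice, map, collect.
    let lowered: Vec<u64> = elts.iter().map(|&x| u64::from(x) * 10).collect();
    // With an exact size hint there is no doubling; on current rustc the
    // capacity matches the length exactly.
    println!("len = {}, capacity = {}", lowered.len(), lowered.capacity());
    assert_eq!(lowered.capacity(), lowered.len());
}
```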
☔ The latest upstream changes (presumably #37678) made this pull request unmergeable. Please resolve the merge conflicts.

IMO there isn't a lot we can do that's not a compromise (due to heap pressure dynamics) without moving to arena/SoA models for storing the HIR in the first place (replacing

What's "SoA"?
"Struct of Arrays", where each type has its own vector for all instances of that type, in a central struct, although it's usually used for fields, not nodes. |
9346889 to 532867a
Some heap allocation numbers from DHAT (old vs. new table omitted). This PR also speeds up compilation of that workload by about 4%. It also speeds up a couple of the rustc-benchmarks by 1-2%.
☔ The latest upstream changes (presumably #37824) made this pull request unmergeable. Please resolve the merge conflicts.

Looks good to me. Happy to approve it after a rebase.

If it's measurably better I have no real concerns - I imagine it'd eventually be
This changes structures like this:
```
[ ExprArray | 8 | P ]
|
v
[ P | P | P | P | P | P | P | P ]
|
v
[ ExprTup | 2 | P ]
|
v
[ P | P ]
|
v
[ Expr ]
```
to this:
```
[ ExprArray | 8 | P ]
|
v
[ [ ExprTup | 2 | P ] | ... ]
|
v
[ Expr | Expr ]
```
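In type terms, the change is roughly of this shape (a simplified sketch with stand-in definitions, not the real `hir` types):

```rust
// Stand-ins: `P` is an owned pointer and `HirVec<T>` is an owned slice,
// mirroring the typedefs described in the PR description below.
type P<T> = Box<T>;
type HirVec<T> = P<[T]>;

// Before: every element of an array or tuple literal sits behind its own
// heap allocation.
enum ExprBefore {
    Array(HirVec<P<ExprBefore>>),
    Tup(HirVec<P<ExprBefore>>),
    Lit(u64),
}

// After: the elements are stored inline in the slice, one allocation per list.
enum ExprAfter {
    Array(HirVec<ExprAfter>),
    Tup(HirVec<ExprAfter>),
    Lit(u64),
}

fn main() {
    // `(1, 2)` after the change: one allocation holding both elements inline.
    let tup = ExprAfter::Tup(vec![ExprAfter::Lit(1), ExprAfter::Lit(2)].into_boxed_slice());
    if let ExprAfter::Tup(elts) = &tup {
        assert_eq!(elts.len(), 2);
    }
    // Before the change, the same tuple needed a boxed slice of boxes.
    let _old = ExprBefore::Tup(
        vec![Box::new(ExprBefore::Lit(1)), Box::new(ExprBefore::Lit(2))].into_boxed_slice(),
    );
}
```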
532867a to 382d3b0
I rebased. r? @michaelwoerister

@bors r+
📌 Commit 382d3b0 has been approved by michaelwoerister
Change HirVec<P<T>> to HirVec<T> in hir::Expr.
This PR changes data structures like this:
```
[ ExprArray | 8 | P ]
|
v
[ P | P | P | P | P | P | P | P ]
|
v
[ ExprTup | 2 | P ]
|
v
[ P | P ]
|
v
[ Expr ]
```
to this:
```
[ ExprArray | 8 | P ]
|
v
[ [ ExprTup | 2 | P ] | ... ]
|
v
[ Expr | Expr ]
```
I thought this would be a win for #36799, and on a cut-down version of that workload this reduces the peak heap size (as measured by Massif) from 885 MiB to 875 MiB. However, the peak RSS as measured by `-Ztime-passes` and by `/usr/bin/time` increases by about 30 MiB.
I'm not sure why. Just look at the picture above -- the second data structure clearly takes up less space than the first. My best idea relates to unused elements in the slices. `HirVec<Expr>` is a typedef for `P<[Expr]>`. If there were any unused excess elements then I could see that memory usage would increase, because those excess elements are larger in `HirVec<Expr>` than in `HirVec<P<Expr>>`. But AIUI there are no such excess elements, and Massif's measurements corroborate that.
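For scale, a rough sketch of why any unused slot would now cost more (the `Expr` stand-in below is hypothetical; the real `hir::Expr` is a large struct, far bigger than a pointer):

```rust
use std::mem::size_of;

// Hypothetical stand-ins: `P<Expr>` is one pointer, while an inline `Expr`
// slot holds the whole struct.
struct Expr {
    _payload: [u64; 12],
}
type P<T> = Box<T>;

fn main() {
    // A slot in HirVec<P<Expr>> is pointer-sized...
    println!("P<Expr> slot: {} bytes", size_of::<P<Expr>>());
    // ...while a slot in HirVec<Expr> is the full struct, so every excess
    // (unused) element is correspondingly more expensive.
    println!("Expr slot:    {} bytes", size_of::<Expr>());
}
```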
However, the two main creation points for these data structures are these lines from `lower_expr`:
```rust
ExprKind::Vec(ref exprs) => {
hir::ExprArray(exprs.iter().map(|x| self.lower_expr(x)).collect())
}
ExprKind::Tup(ref elts) => {
hir::ExprTup(elts.iter().map(|x| self.lower_expr(x)).collect())
}
```
I suspect what is happening is that temporary excess elements are created within the `collect` calls. The workload from #36799 has many 2-tuples and 3-tuples and when `Vec` gets doubled it goes from a capacity of 1 to 4, which would lead to excess elements. Though, having said that, `Vec::with_capacity` doesn't create excess AFAICT. So I'm not sure. What goes on inside `collect` is complex.
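One way to probe that suspicion outside the compiler (a toy sketch; the exact growth sequence is an implementation detail of `Vec` and has changed over time):

```rust
fn main() {
    let elts = [10u32, 20];

    // Collecting from an exactly-sized iterator: the size hint lets Vec
    // allocate the final capacity up front, so a 2-tuple needs no doubling.
    let collected: Vec<u32> = elts.iter().map(|&x| x + 1).collect();
    println!("collect: len = {}, cap = {}", collected.len(), collected.capacity());

    // Pushing one element at a time instead goes through the growth policy,
    // which can leave excess capacity behind for small lengths.
    let mut pushed = Vec::new();
    for &x in &elts {
        pushed.push(x + 1);
        println!("push:    len = {}, cap = {}", pushed.len(), pushed.capacity());
    }
}
```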
Anyway, in its current form this PR is a memory consumption regression and so not worth landing but I figured I'd post it in case anyone has additional insight.