High memory usage compiling keccak benchmark #54208
Closed
Labels
A-NLL: Area: Non-lexical lifetimes (NLL)
I-compilemem: Issue: Problems and improvements with respect to memory usage during compilation.
I-slow: Issue: Problems and improvements with respect to performance of generated code.
NLL-performant: Working towards the "performance is good" goal
P-medium: Medium priority
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
According to perf.rust-lang.org, a "Clean" build of `keccak-check` has a `max-rss` of 637 MB. Here's a Massif profile of the heap memory usage.

The spike is due to a single allocation of 500,363,244 bytes here:
rust/src/librustc/middle/liveness.rs
Line 601 in 28bcffe
Each vector element is a `Users`, which is a three-field struct taking up 12 bytes. `num_live_nodes` is 16,371 and `num_vars` is 2,547, and 12 * 16,371 * 2,547 = 500,363,244.

I have one idea to improve this:
`Users` is a triple containing two `u32`s and a `bool`, which means that it is 96 bits even though it only contains 65 bits of data. If we split it up so we have 3 vectors instead of a vector of triples, we'd end up with 4 * 16,371 * 2,547 + 4 * 16,371 * 2,547 + 1 * 16,371 * 2,547 = 375,272,433 bytes, which is a reduction of 125,090,811 bytes. This would get `max-rss` down from 637 MB to 512 MB, a reduction of 20%.

Alternatively, if we packed the `bool`s into a bitset we could get it down to 338,787,613 bytes, which is a reduction of 161,575,631 bytes. This would get `max-rss` down from 637 MB to 476 MB, a reduction of 25%. But it might slow things down... it depends on whether the improved locality is outweighed by the extra instructions needed for bit manipulation.

@nikomatsakis: do you have any ideas for improving this on the algorithmic side? Is this dense `num_live_nodes * num_vars` representation avoidable?
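
To make the size arithmetic above easy to re-check, here is a rough, standalone sketch. It is not the code in `liveness.rs`: the `Users` field names, the `SplitUsers` type, and all function names are hypothetical, chosen only to mirror the "two `u32`s and a `bool`" description. It computes the three footprints discussed above and shows one possible way to pack the `bool`s into 64-bit words.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the struct described in the issue:
/// two u32 fields and a bool. The field names are assumptions.
#[allow(dead_code)]
struct Users {
    reader: u32,
    writer: u32,
    used: bool,
}

/// Bytes for the current layout: one `Users` per (live node, variable) pair.
fn array_of_structs_bytes(num_live_nodes: usize, num_vars: usize) -> usize {
    size_of::<Users>() * num_live_nodes * num_vars
}

/// Bytes if the three fields are split into three separate vectors.
fn three_vectors_bytes(num_live_nodes: usize, num_vars: usize) -> usize {
    let n = num_live_nodes * num_vars;
    size_of::<u32>() * n + size_of::<u32>() * n + size_of::<bool>() * n
}

/// Bytes if the bools are packed one bit each, rounded up to whole bytes.
fn packed_bools_bytes(num_live_nodes: usize, num_vars: usize) -> usize {
    let n = num_live_nodes * num_vars;
    2 * size_of::<u32>() * n + (n + 7) / 8
}

/// Sketch of split storage with the `used` flags packed into u64 words.
#[allow(dead_code)]
struct SplitUsers {
    readers: Vec<u32>,
    writers: Vec<u32>,
    used_bits: Vec<u64>,
}

#[allow(dead_code)]
impl SplitUsers {
    fn new(num_live_nodes: usize, num_vars: usize) -> Self {
        let n = num_live_nodes * num_vars;
        SplitUsers {
            readers: vec![0; n],
            writers: vec![0; n],
            // One bit per entry, rounded up to whole 64-bit words.
            used_bits: vec![0; (n + 63) / 64],
        }
    }

    /// Set or clear the `used` bit for the flat index `idx`.
    fn set_used(&mut self, idx: usize, used: bool) {
        let (word, bit) = (idx / 64, idx % 64);
        if used {
            self.used_bits[word] |= 1u64 << bit;
        } else {
            self.used_bits[word] &= !(1u64 << bit);
        }
    }

    /// Read the `used` bit for the flat index `idx`.
    fn is_used(&self, idx: usize) -> bool {
        (self.used_bits[idx / 64] >> (idx % 64)) & 1 == 1
    }
}
```

Calling `array_of_structs_bytes(16_371, 2_547)` and `three_vectors_bytes(16_371, 2_547)` reproduces the 500,363,244 and 375,272,433 figures. `packed_bools_bytes` rounds the bit count up to a whole byte, so it comes out one byte above the 338,787,613 quoted above. The shift and mask in `set_used` and `is_used` are the extra instructions the bitset variant would pay for; whether that cost matters would need measuring.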