Do not deduplicate captured args while expanding format_args! #149926

Open
ShoyuVanilla wants to merge 1 commit into rust-lang:main from ShoyuVanilla:no-dedup-fmt

@ShoyuVanilla
Member

@ShoyuVanilla ShoyuVanilla commented Dec 12, 2025

Resolves #145739

I ran crater with #149291.
While there are still a few seemingly flaky, spurious results, no crates appear to be affected by this breaking change.

The only hit from the lint was
https://github.com/multiversx/mx-sdk-rs/blob/813927c03a7b512a3c6ef9a15690eaf87872cc5c/framework/meta-lib/src/tools/rustc_version_warning.rs#L19-L30,
which performs formatting on consts of type ::semver::Version. These constants contain a nested ::semver::Identifier (Version.pre.identifier) that has a custom destructor. However, this case is not impacted by the change, so no breakage is expected.

@rustbot
Collaborator

rustbot commented Dec 12, 2025

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

Some changes occurred in compiler/rustc_ast_lowering/src/format.rs

cc @m-ou-se

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 12, 2025
@rustbot
Collaborator

rustbot commented Dec 12, 2025

r? @spastorino

rustbot has assigned @spastorino.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@ShoyuVanilla
Member Author

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 13, 2025
@ShoyuVanilla
Member Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 13, 2025
@theemathas theemathas added the I-lang-nominated Nominated for discussion during a lang team meeting. label Dec 14, 2025
@theemathas
Contributor

Nominating as per #145739 (comment)

@traviscross traviscross added P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang T-lang Relevant to the language team needs-fcp This change is insta-stable, or significant enough to need a team FCP to proceed. labels Dec 14, 2025
@traviscross
Contributor

traviscross commented Dec 14, 2025

It'd be worth adding a test for the drop behavior.
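
One possible shape for such a test, as a minimal sketch rather than the test the PR would actually add: `Noisy`, `DROPS`, and `drops_during` are hypothetical names. The sketch uses explicit arguments only, because their behavior does not depend on the compiler version: naming a const twice creates two temporaries, while naming it once and referring to it twice creates one. As I understand the thread, the implicit form `{X} {X}` currently behaves like the second case and would behave like the first under this PR.

```rust
use std::fmt;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter: incremented every time a `Noisy` value is dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type with a custom destructor.
struct Noisy;

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

impl fmt::Display for Noisy {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("noisy")
    }
}

// Each mention of a const in expression position creates a fresh temporary.
const X: Noisy = Noisy;

/// Runs `f` and returns how many `Noisy` values were dropped while it ran.
fn drops_during(f: impl FnOnce() -> String) -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let _ = f();
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    // Two explicit arguments: `X` is evaluated twice, so two temporaries are dropped.
    let twice = drops_during(|| format!("{} {}", X, X));
    // One explicit argument referenced twice: `X` is evaluated once.
    let once = drops_during(|| format!("{0} {0}", X));
    println!("two args: {twice} drops, one arg: {once} drops");
}
```

A real test would additionally pin down what `format!("{X} {X}")` does, which is the behavior this PR changes.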

@traviscross
Contributor

traviscross commented Dec 14, 2025

Given that this makes more sense for the language, along with the clean crater results and the intuition that it'd be surprising if anything actually leaned on this, I propose:

@rfcbot fcp merge lang

@rust-rfcbot
Collaborator

rust-rfcbot commented Dec 14, 2025

Team member @traviscross has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rust-rfcbot rust-rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Dec 14, 2025
@m-ou-se m-ou-se assigned m-ou-se and unassigned spastorino Dec 17, 2025
@m-ou-se
Member

m-ou-se commented Dec 24, 2025

I don't think we should do this. It will make the generated code for println!("{x} {x}"); less efficient, as it will get two separate arguments instead of one.

I don't want to end up in a situation where it would make sense for Clippy to suggest something like:

warning: using the same placeholder multiple times is inefficient as of Rust 1.94.0
 --> src/main.rs:3:5
  |
3 |     println!("{x} {x}");
  |     ^^^^^^^^^^^^^^^^^^^
  |
help: change this to
  |
3 -     println!("{x} {x}");
3 +     println!("{x} {x}", x = x);
  |

Adding , x = x shouldn't make a difference. If adding that makes the resulting code more efficient, I strongly feel like we've done something wrong.

@rust-rfcbot concern equivalence

@nikomatsakis
Contributor

nikomatsakis commented Mar 11, 2026

Thanks Dianne. Let me see if I can dash off a comment explaining where I stand here.

The TL;DR is that my proposal would be:

  • If you do {identifier}, we deduplicate -- or, my preference actually, expand to all syntactically equal place expressions;
  • If you do {...} anything else, we do not.

This does cost some consistency, but I think in areas where consistency isn't necessarily expected. For example, it's already the case that foo(X, X) is not the same as format!("{X} {X}") -- e.g., format captures references, it doesn't move (which is something people have gotten confused about every time I ran a Rust tutorial, side note).
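
To illustrate the existing difference between a call and a format string, here is a small sketch. The function `foo` is a hypothetical stand-in that takes its arguments by value; it is not something from the PR.

```rust
// Hypothetical helper standing in for `foo(X, X)`: it takes its arguments by value.
fn foo(a: String, b: String) -> String {
    format!("{a} {b}")
}

fn main() {
    let x = String::from("hi");

    // `format!` only borrows `x`, so the same variable can appear twice
    // and is still usable afterwards.
    let formatted = format!("{x} {x}");
    assert_eq!(formatted, "hi hi");

    // A by-value call needs two separate values; `foo(x, x)` would not compile
    // because the first argument moves `x`.
    let called = foo(x.clone(), x);
    assert_eq!(called, formatted);
}
```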

The alternative gains in consistency but loses some optimization and backwards compatibility. It's not clear to me how much the optimization matters but I think that it may, particularly since we use format-args all over the place in code. I know that people often find the machinery is heavyweight in embedded land.

I don't think the vast majority of users will care whichever way we decide here, but there will be some that are surprised by const-drops running or not running, and some who are surprised that their structs are bigger than they should be. I tend to think the former will only matter to Rust supermavens, and they can learn the way that the desugaring works, it's straightforward enough. The latter is an invisible tax across Rust codebases that may impact every user.

Expanded version:

I think there's definite tension between several good Rust design principles:

Efficient by default -- the idiomatic, obvious Rust code should generate efficient things ~the same as what you would get if you did it by hand, or perhaps more efficient. To that end, if you desugared format!("{x} {x}"), it's unlikely you would store the reference to x twice. It'd be nice to have that property.

No need for a 'sufficiently smart compiler' -- we should not be leaning on super whiz-bang optimizations to get that efficient by default, just the "obvious" ones that compilers typically do, such as inlining, copy prop, CSE. The kind of thing you would do by hand automatically. (We kinda cheat on this one, sometimes, leaning on fancy alias analysis, I think that the work on minirust etc may let us out of that trap.)

Stability without stagnation -- we should try to avoid changing behavior without a strong reason.

Compiler and stdlib aren't special -- we try to expose primitives users could build themselves (or at least have a plan that they can eventually do so...).

Context-free programming -- this is a tricky one, but obviously we aim to reduce the context needed for people to understand what some code will do when it executes. Probably need to either expand or refine this to be more specific.

This one is somewhat aspirational but:

Define through desugaring -- there should be a convenient syntax and an explicit syntax; the convenient one should desugar to the explicit one in a straightforward way. That is then used to resolve non-obvious edge cases around the convenient syntax.

Looking forward, I think we want

  • format!("{...}") to support arbitrary expressions
  • f"{...}" to support arbitrary expressions

Reading over the proposed optimization, it seems to violate "compiler isn't special", in that I don't think we would ever expect to expose that kind of test to a user-defined macro. That's not the end of the world, but it seems unfortunate.

Doing no optimization violates efficient by default -- this may not matter, it's only a small thing, on its own I might say "whatever" but it also changes behavior. My inclination is to try and preserve wins when we can.

I think my proposal wins on stability and perf by default; I think it is neutral towards "define through desugaring", it's a bit more of a complex desugaring, but not wildly so, and it increases consistency with some other things (e.g., format!("{X} {X}", X = X)).

The optimization loses big on stdlib isn't special and I think that's kinda worse.

@Jules-Bertholet
Contributor

FWIW, Niko's "compiler and stdlib aren't special" argument has mostly won me over. If format_args! is "just a macro you could write yourself, it operates on the token stream like every other macro", then having a few weird edge cases is understandable (as long as they are documented).

However, this is in tension with the desire to support f"{...}" syntax. That would be a clear signal that formatting is a core part of the language, and we should not be cutting corners with the semantics of the core language. (For similar reasons, I would expect macro_rules! metavars to work inside f-strings, e.g. f"{$metavar}".)

@nikomatsakis
Contributor

nikomatsakis commented Mar 18, 2026

I've given this some thought, I've also talked to @traviscross and @joshtriplett.

I'm finding that I have a hard time convincing myself one way or the other on this!

I suspect that BOTH of these are, to a first approximation, true:

  • Nobody will notice the semantic difference of duplicating vs de-duplication (i.e., evaluating places tends to be side-effect free);
  • And nobody will notice the performance difference of duplicating vs de-duplication (i.e., the occasional duplicated field will be in the noise).

If you could convince me that one of those was not true -- that somebody would notice -- that'd push me one way or the other more firmly. But I'd need some data. I'm going to try and see how often repeated variables occur in practice.

Assuming my assumptions are valid -- that neither is all that big a deal -- then you have two competing, but largely abstract, principles

  • overall simplicity -- basically that it's nice to say that {xxx} just desugars to "xxx"
  • backwards compatibility, efficient by default -- it's nice that an example like format!("index.crates.io/{pkg}/{pkg}-{version}") doesn't duplicate the field pkg, and of course the semantics around constants are relevant
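
As a rough illustration of the two hand-written desugarings being weighed, the sketch below builds the same string three ways. The function name `spellings` and the sample values are made up. The output is identical in every case; the spellings differ only in how many arguments are passed, which is what the size discussion is about.

```rust
/// Returns the same URL built three ways.
fn spellings(pkg: &str, version: &str) -> [String; 3] {
    [
        // Implicit capture, as written in the comment above.
        format!("index.crates.io/{pkg}/{pkg}-{version}"),
        // Hand-written "deduplicated" form: two arguments, `pkg` referenced twice.
        format!("index.crates.io/{0}/{0}-{1}", pkg, version),
        // Hand-written "duplicated" form: three arguments, `pkg` passed twice.
        format!("index.crates.io/{}/{}-{}", pkg, pkg, version),
    ]
}

fn main() {
    // Illustrative values, not taken from the thread.
    let [implicit, deduplicated, duplicated] = spellings("serde", "1.0.0");
    assert_eq!(implicit, deduplicated);
    assert_eq!(implicit, duplicated);
    println!("{implicit}");
}
```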

I think both are important. I go back and forth on which I think is more important.

When it comes to the "fancy compiler optimization", yeah, it kind of lets you have both, but I find it overengineered for the problem, and it expands our "scope" of what it takes to achieve efficiency. If efficiency matters that much, I might rather do it the simple way of saying "we deduplicate at the string level". It is, in a way, less surprising to me.

@RalfJung
Member

I'm finding that I have a hard time convincing myself one way or the other on this!

FWIW, I sympathize with that. I have also gone through phases of preferring either approach to this.^^ Though IMO the compiler optimizations actually provide a nice way out of this, I find them an elegant solution (have our semantic cake and eat the perf benefits, too) -- except that @m-ou-se doesn't like them, which gives me pause.

@nikomatsakis
Contributor

So I wrote a little script (gist) to find all string literals, count the number with ANY interpolation variables (well, braces anyhow) and then count the number with repeats.

I ran it across the rust repo and got 2.5%, though that number includes tests.

I'd like to run it across crates.io.

Do with that what you will.
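
For readers who want a feel for what such a count involves, here is a minimal sketch of the per-string check. It is not the linked gist, and the function names are hypothetical. It is deliberately simplified: it handles `{{` and `}}` escapes and strips format specs after `:`, but it does not validate format strings the way rustc does, and it treats positional indices such as `{0}` like names.

```rust
use std::collections::HashSet;

/// Extracts the argument part of each `{...}` placeholder in a format string.
/// Simplified: skips `{{` escapes and drops anything after `:`.
fn placeholder_names(s: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '{' {
            continue;
        }
        if chars.peek() == Some(&'{') {
            chars.next(); // escaped `{{`, not a placeholder
            continue;
        }
        // Collect everything up to the closing brace.
        let mut inner = String::new();
        for d in chars.by_ref() {
            if d == '}' {
                break;
            }
            inner.push(d);
        }
        // Keep only the argument name, dropping a format spec such as `:?`.
        let name = inner.split(':').next().unwrap_or("").trim();
        if !name.is_empty() {
            names.push(name.to_string());
        }
    }
    names
}

/// True if any placeholder name occurs more than once in `s`.
fn has_repeated_placeholder(s: &str) -> bool {
    let mut seen = HashSet::new();
    placeholder_names(s).into_iter().any(|name| !seen.insert(name))
}

fn main() {
    // Made-up sample strings; a real run would walk string literals in source files.
    let samples = ["{x} {x}", "{a}-{b}", "no placeholders", "{pkg}/{pkg}-{version}"];
    let with_vars = samples.iter().filter(|s| !placeholder_names(s).is_empty()).count();
    let with_repeats = samples.iter().filter(|s| has_repeated_placeholder(s)).count();
    println!("{with_repeats} of {with_vars} interpolating strings repeat a variable");
}
```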

@nikomatsakis
Contributor

nikomatsakis commented Mar 18, 2026

I ran the script across the top 222 crates from crates.io. I found that about 5% of strings have repeated variables...

--- Summary ---
Mode: all string literals
Strings with interpolation vars: 2417
With repeated variables: 121
Percentage: 5.0%

...that's actually more than I expected! It pushes me to think I am right to hold this concern.

@nikomatsakis
Contributor

One more piece of data:

I found exactly ZERO instances of repeated "capital" identifiers. e.g., format!("{TAB} {TAB}") in those 222 crates. So I think we can assume that duplicating constants doesn't happen very often, much less constants with a side-effecting destructor of some kind.

@nikomatsakis
Contributor

If we assume that 5% is common enough that we DO want to avoid an extra field for local variables, then I think it's reasonable to assume we also want to avoid an extra field for fields. I don't see why {x} would be more common than {self.field}.

Deref impls can, technically, have side-effects. So foo.bar is going to require some sensitive types/traits reasoning to deduplicate, if you want to be precise about it. This implies that the "semantics-preserving" optimization will either get very complicated or not be able to handle this case.
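
The `Deref` point can be observed with explicit arguments, whose behavior does not depend on this PR. The sketch below uses hypothetical types (`Inner`, `Counting`) whose `deref` counts its own calls. Passing `foo.bar` as two arguments goes through `deref` twice, while passing it once and referring to it twice goes through `deref` once, so deduplicating a field access is observable in principle.

```rust
use std::cell::Cell;
use std::ops::Deref;

struct Inner {
    bar: u32,
}

// Hypothetical wrapper whose `Deref` impl has an observable side effect.
struct Counting {
    inner: Inner,
    derefs: Cell<usize>,
}

impl Deref for Counting {
    type Target = Inner;

    fn deref(&self) -> &Inner {
        self.derefs.set(self.derefs.get() + 1);
        &self.inner
    }
}

fn new_foo() -> Counting {
    Counting { inner: Inner { bar: 7 }, derefs: Cell::new(0) }
}

/// Number of `deref` calls when `foo.bar` is passed as two separate arguments.
fn derefs_two_args() -> usize {
    let foo = new_foo();
    let _ = format!("{} {}", foo.bar, foo.bar);
    foo.derefs.get()
}

/// Number of `deref` calls when `foo.bar` is passed once and referenced twice.
fn derefs_one_arg() -> usize {
    let foo = new_foo();
    let _ = format!("{0} {0}", foo.bar);
    foo.derefs.get()
}

fn main() {
    println!("two args: {} deref calls", derefs_two_args());
    println!("one arg:  {} deref calls", derefs_one_arg());
}
```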

This further pushes me to the conclusion that the most appealing options are

  • Deduplicate place expressions syntactically: more efficient, more complex underlying mechanism to explain desugaring.
  • Never deduplicate: less efficient, simpler desugaring, preserves the intuition that, in a format-string, you can just "drop the string stuff" and you get some expressions that execute.

I think a key variable may be how much you think it matters whether format!("{X} {X}") should be "like" foo(X, X) or if it's ok that the desugaring is a bit more complex than that. Myself, I think it's ok, I see no real evidence that saying "we first deduplicate place expressions" is really going to matter to anyone in practice or that it will be particularly hard to understand once you learn it. And I see some evidence that the perf implication is real (5% of format strings that include some duplicate variable).

@iago-lito
Contributor

iago-lito commented Mar 19, 2026

@nikomatsakis I don't see why {x} would be more common than {self.field}.

I do. Because although I would very likely write

format!("{x} {x}")

then also if I ever stumble across this code:

format!("{self.field} {self.field}")

I would very likely, and spontaneously, deduplicate it myself into the following to improve readability:

format!("{x} {x}", x=self.field)

Which I think is, if I'm not alone, an argument in favour of syntactic deduplication.

@traviscross
Contributor

traviscross commented Mar 22, 2026

@nikomatsakis: I don't know. Presumably we still wouldn't want to deduplicate value expressions. I just struggle with the idea that f"{x.f} {x.f}" and f"{x.f()} {x.f()}" would have different dynamic semantics. Or -- let's get really speculative here -- assume that we someday made it possible for function calls to be place expressions. Then the dynamic semantics of f"{x.f()} {x.f()}" would depend on the signature of f. Or (again extremely speculatively), assume we later added computed fields. Then the dynamic semantics of f"{x.f} {x.f}" might depend on whether f was a real field or a computed one. Given that in no case is x.f (where a Deref::deref call is possible) a pure operation, this just seems unnecessarily strange to me.

Let's also dig into the axiom that the compiler and stdlib aren't special. In a world where functions return places, the lowering would need type information to deduplicate only place expressions. That seems more complicated to me than @dianne's AST → HIR lowering optimization. But if the desugaring doesn't deduplicate place expressions, then this isn't a problem.

@RalfJung
Member

We already can't really syntactically distinguish place expressions from value expressions -- we need name resolution at least, which arguably is a semantic analysis (user-defined macros cannot do it). x could be a local variable (place expression) or a constant (value expression).

@traviscross
Contributor

traviscross commented Mar 22, 2026

So I wrote a little script (gist) to find all string literals...

I'm working on a somewhat more precise analysis. I've instrumented rustc to capture details about format strings. I'll be running this through crater.

It's built on top of @dianne's branch (in #152480) so that we can see the effect of that optimization and how much it matters whether we do it at all.

So far, I've run this on a stage 2 build of the compiler and standard library (including all dependencies). In this sample, of format strings that use interpolation at all, only 0.5% have any duplicates. Of these, almost all are places, and @dianne's optimization recovers 96.1% of the size cost (i.e., all but 48 bytes total).

The cost of not doing deduplication at all (i.e., not doing dianne's optimization) on the args: &[rt::Argument<'_>] arrays referenced by the Argument structs totals to +0.41%. Note that this is calculated only over the arrays; it'd be watered down by considering the overhead of the rest of the struct. Calculated over the compiler as a whole, it'd be in the noise -- the absolute difference is only 1.2KB.

See below for the full results.


format_args! Deduplication Analysis
====================================

  Source: 19,405 invocations

BASELINE
────────
Total invocations                        19,405  (100.0%)
  Interpolating (ph > 0)                 12,610  ( 65.0%)
  Fixed strings (ph = 0)                  6,795  ( 35.0%)

  Arguments per invocation:
      0       6,795  ( 35.0%)  ████████████████████████
      1       8,633  ( 44.5%)  ██████████████████████████████
      2       2,683  ( 13.8%)  █████████
      3         825  (  4.3%)  ███
      4         294  (  1.5%)  █
      5          94  (  0.5%)  █
      6          24  (  0.1%)  █
      7          32  (  0.2%)  █
      8          14  (  0.1%)  █
     9+          11  (  0.1%)  █
    mean: 1.0  median: 1  p95: 3  max: 23

CAPTURE MECHANICS
─────────────────
  Of 12,610 interpolating invocations:
    Implicit capture                      4,973  ( 39.4%)
    Explicit named                          625  (  5.0%)
    Positional                            7,493  ( 59.4%)
    Width/precision arg                      64  (  0.5%)

DUPLICATION
───────────
  Invocations with duplicate captures:  63  (  0.5% of interpolating)

  Extra arg slots from duplication:  76 total
    mean: 1.2 per affected invocation

  Extra slots per invocation:
      0           0  (  0.0%)
      1          56  ( 88.9%)  ██████████████████████████████
      2           2  (  3.2%)  █
      3           4  (  6.3%)  ██
      4           1  (  1.6%)  █

  Multiplicity of duplicated names (65 total):
    2x                                       59  ( 90.8%)
    3x                                        1  (  1.5%)
    4+x                                       5  (  7.7%)

  Resolution of duplicated names:
    Places (recovered):         62
    Constants:                   3
    Const parameters:            0
    Other/unknown:               0

OPTIMIZATION
────────────
  Of 63 invocations with duplicates:
    Fully recovered                          60  ( 95.2%)
    Partially recovered                       0  (  0.0%)
    Not recovered                             3  (  4.8%)

  Arg slots: 73 recovered of 76 total  ( 96.1%)

  Size (argument arrays only):
    Old world (current):     298,528 bytes
    No dedup:                299,744 bytes  (+1,216  (+0.41%))
    Optimized:               298,576 bytes  (+48  (+0.02%))

  Optimization recovers 96.1% of the size cost.
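
The byte totals in the report above are consistent with each extra argument slot costing 16 bytes, which would match two pointers on a 64-bit target. That slot size is an inference from the numbers, not something the report states, and the helper names below are hypothetical. Under that assumption the arithmetic can be rechecked as follows:

```rust
// Assumption: one argument slot occupies 16 bytes (inferred from the report, not stated in it).
const SLOT_BYTES: u64 = 16;

/// Size cost in bytes of a given number of extra argument slots.
fn cost_bytes(extra_slots: u64) -> u64 {
    extra_slots * SLOT_BYTES
}

/// Relative increase of `extra` over `baseline`, in percent.
fn percent_increase(baseline: u64, extra: u64) -> f64 {
    extra as f64 / baseline as f64 * 100.0
}

fn main() {
    let baseline = 298_528; // "Old world (current)" from the report
    let total_slots = 76; // extra slots from duplication
    let recovered_slots = 73; // slots recovered by the optimization

    let no_dedup = cost_bytes(total_slots);
    let residual = cost_bytes(total_slots - recovered_slots);

    println!("no dedup:  +{no_dedup} B (+{:.2}%)", percent_increase(baseline, no_dedup));
    println!("optimized: +{residual} B (+{:.2}%)", percent_increase(baseline, residual));
    println!(
        "recovered: {:.1}%",
        recovered_slots as f64 / total_slots as f64 * 100.0
    );
}
```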

@traviscross
Contributor

traviscross commented Mar 27, 2026

Based on an instrumented top-10k crater run (in #154205, which includes 10,885 crates), here's what I found. Excluding 22 outlier crates, except where mentioned:

Only 1.8% of crates and 0.1% of format_args! invocations have any duplicates at all. Of invocations that use implicit capturing, 0.8% have duplicates.

The median per-crate cost of not deduplicating at all is zero bytes. Considering only the 194 affected crates, the median cost is 32 bytes (mean: 46 bytes). The total cost across all these crates, summed together, is under 9KB.

Including the 22 outliers, the total cost (summed across all 216 affected crates) is under 28KB.

With dianne's optimization, only 27 non-outlier and 37 total crates are affected, with a total cost (summed across all affected crates) of under 1.4KB and 7.4KB, respectively.

The outlier crate most affected by turning off deduplication is hddsgen at 7KB. Every single duplicate for this crate, though, is a place, so dianne's optimization would drop this cost to zero. This crate was first published 27 days ago.

The next-largest outlier crate is pikpaktui, first published 6 weeks ago, at 2.8KB. This one doesn't benefit at all from the optimization.

I don't mean any judgment by saying this, but these two and many earlier outlier crates I looked at seem heavily AI-generated. There's something about the way the models write format strings (and the number of them they write), at least in these outlier cases, that seems different to me than what humans do.

Anyway, the full report is below. For my part, I judge the practical cost of not deduplicating as ε-zero.


format_args! Deduplication Analysis
====================================

  Source: 356,205 unique invocations across 10,885 crates
  (721,552 raw lines; dedup ratio: 2.03x)
  Dedup method: loc-based (file:line:col)

BASELINE
────────
Total invocations                       356,205  (100.0%)
  Interpolating (ph > 0)                251,591  ( 70.6%)
  Fixed strings (ph = 0)                104,614  ( 29.4%)

  Arguments per invocation:
      0     104,614  ( 29.4%)  █████████████████
      1     185,180  ( 52.0%)  ██████████████████████████████
      2      45,765  ( 12.8%)  ███████
      3      12,532  (  3.5%)  ██
      4       4,579  (  1.3%)  █
      5       1,675  (  0.5%)  █
      6         822  (  0.2%)  █
      7         405  (  0.1%)  █
      8         206  (  0.1%)  █
     9+         427  (  0.1%)  █
    mean: 1.0  median: 1  p95: 3  max: 109

CAPTURE MECHANICS
─────────────────
  Of 251,591 interpolating invocations:
    Implicit capture                     58,571  ( 23.3%)
    Explicit named                        6,364  (  2.5%)
    Positional                          192,187  ( 76.4%)
    Width/precision arg                     662  (  0.3%)

DUPLICATION
───────────
  22 outlier crates excluded; [brackets] include all.

  Invocations with duplicate captures:
    of all                          404 [   837]  (  0.1% [  0.2%])
    of interpolating                404 [   837]  (  0.2% [  0.3%])
    of implicit-capture             404 [   837]  (  0.8% [  1.4%])

  Extra arg slots from duplication: 557 [1,736] total
    mean: 1.4 [2.1] per affected invocation

  Extra slots per invocation (all crates):
      0           0  (  0.0%)
      1         531  ( 63.4%)  ██████████████████████████████
      2         155  ( 18.5%)  █████████
      3          43  (  5.1%)  ██
      4          40  (  4.8%)  ██
      5          17  (  2.0%)  █
      6           8  (  1.0%)  █
      7           9  (  1.1%)  █
      8          10  (  1.2%)  █
     9+          24  (  2.9%)  █

  Multiplicity of duplicated names (434 [982] total):
    2x                              354 [   665]  ( 81.6% [ 67.7%])
    3x                               50 [   193]  ( 11.5% [ 19.7%])
    4+x                              30 [   124]  (  6.9% [ 12.6%])

  Resolution of duplicated names:
    Places (recovered):        365 [733]
    Constants:                  69 [249]
    Const parameters:            0 [0]
    Other/unknown:               0 [0]

OPTIMIZATION
────────────
  22 outlier crates excluded; [brackets] include all.

  Of 404 [837] invocations with duplicates:
    Fully recovered                 338 [   660]  ( 83.7% [ 78.9%])
    Partially recovered               3 [     3]  (  0.7% [  0.4%])
    Not recovered                    63 [   174]  ( 15.6% [ 20.8%])

  Arg slots: 471 [1,276] recovered of 557 [1,736] total  ( 84.6% [ 73.5%])

  Size (argument arrays only):
                                          Excl. outliers            All crates
    Old world (current stable):              5,365,040 B           5,667,232 B
    No dedup (PR #149926):             +8,912 B (+0.17%)    +27,776 B (+0.49%)
    Optimized (PR #152480):            +1,376 B (+0.03%)     +7,360 B (+0.13%)

  Optimization recovers 84.6% [73.5%] of the size cost.

PER-CRATE
─────────
  22 outlier crates excluded; [brackets] include all.

  Crates w/ duplication             194 [   216]  (  1.8% [  2.0%])
    Fully covered by opt            167 [   179]  ( 86.1% [ 82.9%])
    Partially covered                 8 [    14]  (  4.1% [  6.5%])
    Not covered                      19 [    23]  (  9.8% [ 10.6%])

  Dup format strings per affected crate (all crates):
      0           0  (  0.0%)
      1         113  ( 52.3%)  ██████████████████████████████
      2          43  ( 19.9%)  ███████████
      3           9  (  4.2%)  ██
      4           8  (  3.7%)  ██
      5           8  (  3.7%)  ██
      6           9  (  4.2%)  ██
      7           3  (  1.4%)  █
      8           8  (  3.7%)  ██
     9+          15  (  6.9%)  ████
    mean: 3.9  median: 1  p95: 12  max: 136

  Per-crate size increase (excl. outliers):
    Without optimization (PR #149926 alone):
      Crates:  194
      Mean:    46 B
      Median:  32 B
      p95:     144 B
      Max:     176 B
      Total:   8,912 B
    With dianne's optimization (PR #152480):
      Crates:  27
      Mean:    51 B
      Median:  32 B
      p95:     139 B
      Max:     176 B
      Total:   1,376 B

  Per-crate size increase (all crates):
    Without optimization (PR #149926 alone):
      Crates:  216
      Mean:    129 B
      Median:  32 B
      p95:     264 B
      Max:     7,184 B
      Total:   27,776 B
    With dianne's optimization (PR #152480):
      Crates:  37
      Mean:    199 B
      Median:  48 B
      p95:     797 B
      Max:     2,784 B
      Total:   7,360 B

  Top crates by residual cost (optimized vs. old):
  Crate                                     Total   w/Dup   Residual       Cost
  ──────────────────────────────────────── ──────  ──────  ─────────  ─────────
  scripty                                      90      11       176B       176B
  media_controller                             29       9       144B       144B
  refine                                      151       1       128B       176B
  claude_statusline_config                    158       6        96B        96B
  qserve                                       24       1        80B       128B
  dog                                          28       4        64B        64B
  llmnop                                       38       4        64B        64B
  mevlog                                      323       2        64B        64B
  sec                                          76       2        64B        64B
  secry                                        76       2        64B        64B
  yarsi                                        10       4        64B        64B
  bridge_echo                                  66       2        48B        64B
  libgo                                        23       3        48B        48B
  chectarine                                   48       1        32B        32B
  find_sqlite                                  17       2        32B        32B
  technique                                   134       1        32B        32B
  EZDB                                        335       1        16B        16B
  cargo_playdate                              237       1        16B        16B
  cargo_vstyle                                269       2        16B        32B
  crosslink                                 2,766       4        16B        64B

  Top crates by total cost (no dedup vs. old):
  Crate                                     Total   w/Dup       Cost   Residual
  ──────────────────────────────────────── ──────  ──────  ─────────  ─────────
  ask_bayes                                    52       5       176B         0B
  cadar                                       400       6       176B         0B
  refine                                      151       1       176B       128B
  scripty                                      90      11       176B       176B
  dataviz                                     103      10       160B         0B
  function_grep                                 8       5       160B         0B
  mdbook_mermaid_animate                       53       2       160B         0B
  crowbook                                    298       6       144B         0B
  media_controller                             29       9       144B       144B
  mollendorff_forge                         1,275       6       144B         0B
  tissue                                        5       2       144B         0B
  agnix_lsp                                   109       8       128B         0B
  ati                                         748       6       128B         0B
  baibot                                      525       8       128B         0B
  colorgen_nvim                                26       7       128B         0B
  gonidium                                    224       8       128B         0B
  lumen_sqlite_mcp                            204       5       128B         0B
  qserve                                       24       1       128B        80B
  subxt                                       165       5       128B         0B
  visualize_boost_pads                          3       2       128B         0B

OUTLIER ANALYSIS
────────────────
  Detection: IQR method on per-crate total cost (no dedup vs. old)
  Q1: 16 B  Q3: 84 B  IQR: 68 B  Threshold: Q3 + 1.5*IQR = 186 B

  22 outlier crates excluded from figures above:
    command                                  cost:    192 B  residual:    192 B  (8 dup fmt strings)
    actr_cli                                 cost:    208 B  residual:     16 B  (2 dup fmt strings)
    party                                    cost:    208 B  residual:      0 B  (7 dup fmt strings)
    ralph_workflow                           cost:    208 B  residual:      0 B  (8 dup fmt strings)
    michelson_ast                            cost:    224 B  residual:      0 B  (6 dup fmt strings)
    neuralnyx                                cost:    224 B  residual:      0 B  (6 dup fmt strings)
    ureeves_wasmtime                         cost:    224 B  residual:      0 B  (8 dup fmt strings)
    aws_mock                                 cost:    240 B  residual:      0 B  (8 dup fmt strings)
    fb2epub                                  cost:    240 B  residual:     80 B  (11 dup fmt strings)
    interthread                              cost:    256 B  residual:      0 B  (8 dup fmt strings)
    oxker                                    cost:    256 B  residual:     48 B  (15 dup fmt strings)
    build_script_build                       cost:    288 B  residual:      0 B  (13 dup fmt strings)
    nectar                                   cost:    320 B  residual:      0 B  (20 dup fmt strings)
    rift_lint                                cost:    352 B  residual:    352 B  (12 dup fmt strings)
    ci_config_tests                          cost:    512 B  residual:     16 B  (25 dup fmt strings)
    beast                                    cost:    688 B  residual:    480 B  (17 dup fmt strings)
    leak                                     cost:    704 B  residual:    656 B  (23 dup fmt strings)
    superttt                                 cost:  1,056 B  residual:      0 B  (2 dup fmt strings)
    lib_flutter_rust_bridge_codegen          cost:  1,136 B  residual:      0 B  (39 dup fmt strings)
    incodoc_ssg                              cost:  1,360 B  residual:  1,360 B  (25 dup fmt strings)
    pikpaktui                                cost:  2,784 B  residual:  2,784 B  (34 dup fmt strings)
    hddsgen                                  cost:  7,184 B  residual:      0 B  (136 dup fmt strings)

SUMMARY
───────
  Across 10,885 crates with 356,205 unique format_args! invocations
  (22 outlier crates excluded; [brackets] include all):

  -   0.1% [  0.2%] of format strings have duplicated captures (404 [837]).
  -   0.8% [  1.4%] of implicit-capture format strings have duplicated captures.
  - 194 [216] crates (  1.8% [  2.0%]) have at least one format string with duplication.

  - dianne's optimization recovers  84.6% [ 73.5%] of all duplicate arg slots (471 [1,276] of 557 [1,736]).
  - Residual cost: 1,376 [7,360] bytes across the ecosystem.

  - 167 [179] of 194 [216] affected crates ( 86.1% [ 82.9%]) are fully covered by the optimization.
  - Constants (not recoverable): 69 [249] unique duplicated constant names.
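
For reference, the outlier cutoff in the report above is the usual upper Tukey fence, Q3 + 1.5*IQR. A minimal sketch of that arithmetic follows; the function names are made up for illustration, and the strict `>` comparison is an assumption, since the report does not say how a cost exactly equal to the threshold is classified.

```rust
/// Upper Tukey fence: costs above `q3 + 1.5 * (q3 - q1)` count as outliers.
fn iqr_upper_threshold(q1: f64, q3: f64) -> f64 {
    let iqr = q3 - q1;
    q3 + 1.5 * iqr
}

/// Assumption: a cost exactly equal to the threshold is NOT an outlier.
fn is_outlier(cost_bytes: f64, q1: f64, q3: f64) -> bool {
    cost_bytes > iqr_upper_threshold(q1, q3)
}

fn main() {
    // Quartiles as reported for the 10,885-crate run.
    let (q1, q3) = (16.0, 84.0);
    println!("threshold = {} B", iqr_upper_threshold(q1, q3));
    // 192 B is the smallest excluded cost in the list; 128 B crates are kept.
    println!("192 B is outlier: {}", is_outlier(192.0, q1, q3));
    println!("128 B is outlier: {}", is_outlier(128.0, q1, q3));
}
```

With the reported quartiles this gives 84 + 1.5 * 68 = 186 B, which matches the threshold line in the report.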

@Skgland
Contributor

Skgland commented Mar 27, 2026

Based on an instrumented top-10k crater run (in #154205, which includes 10,885 crates), here's what I found.

For awareness there is a crater issue regarding top-{n} not actually testing the top-n crates:
rust-lang/crater#813

@theemathas
Contributor

checks

Yup, it doesn't even compile majorly-used crates like serde or libc. The data is invalid. 🫠

@traviscross
Contributor

OK. I'll schedule a full crater run then.

@traviscross
Contributor

Looking at the data from an instrumented full crater run, the numbers (which now have better confidence intervals) aren't meaningfully different from what was found running it against an arbitrary set of 10k crates. Most importantly:

  • Without any deduplication:
    • 98.32% of crates aren't affected at all.
    • 99.80% of crates are affected by 128 or fewer bytes.
    • 99.98% of crates are affected by 1024 or fewer bytes.
  • With @dianne's optimization:
    • 99.66% of crates aren't affected at all.
    • 99.97% of crates are affected by 128 or fewer bytes.
    • 99.997% of crates are affected by 1024 or fewer bytes.

I.e., the effect of not deduplicating at all remains ε-zero.

Full report
format_args! Deduplication Analysis
====================================

  Source: 21,663,853 unique invocations across 876,069 crates
  (45,189,980 raw lines; dedup ratio: 2.09x)
  Dedup method: loc-based (file:line:col)
  (3 lines skipped due to parse errors)

BASELINE
────────
Total invocations                    21,663,853  (100.0%)
  Interpolating (ph > 0)             15,175,497  ( 70.0%)
  Fixed strings (ph = 0)              6,488,356  ( 30.0%)

  Arguments per invocation:
      0   6,488,286  ( 29.9%)  ██████████████████
      1  11,091,158  ( 51.2%)  ██████████████████████████████
      2   2,828,732  ( 13.1%)  ████████
      3     785,860  (  3.6%)  ██
      4     261,722  (  1.2%)  █
      5      99,540  (  0.5%)  █
      6      47,761  (  0.2%)  █
      7      21,658  (  0.1%)  █
      8      13,847  (  0.1%)  █
     9+      25,289  (  0.1%)  █
    mean: 1.0  median: 1  p95: 3  max: 838

CAPTURE MECHANICS
─────────────────
  Of 15,175,497 interpolating invocations:
    Implicit capture                  3,490,512  ( 23.0%)
    Explicit named                      367,723  (  2.4%)
    Positional                       11,646,198  ( 76.7%)
    Width/precision arg                  43,529  (  0.3%)

DUPLICATION
───────────
  1765 outlier crates excluded; [brackets] include all.

  Invocations with duplicate captures:
    of all                       21,280 [42,873]  (  0.1% [  0.2%])
    of interpolating             21,280 [42,873]  (  0.1% [  0.3%])
    of implicit-capture          21,280 [42,873]  (  0.7% [  1.2%])

  Extra arg slots from duplication: 27,279 [87,732] total
    mean: 1.3 [2.0] per affected invocation

  Extra slots per invocation (all crates):
      0           0  (  0.0%)  
      1      30,105  ( 70.2%)  ██████████████████████████████
      2       6,443  ( 15.0%)  ██████
      3       2,595  (  6.1%)  ███
      4       1,269  (  3.0%)  █
      5         614  (  1.4%)  █
      6         446  (  1.0%)  █
      7         237  (  0.6%)  █
      8         173  (  0.4%)  █
     9+         991  (  2.3%)  █

  Multiplicity of duplicated names (23,179 [52,456] total):
    2x                           20,462 [40,925]  ( 88.3% [ 78.0%])
    3x                            1,866 [ 6,012]  (  8.1% [ 11.5%])
    4+x                             851 [ 5,519]  (  3.7% [ 10.5%])

  Resolution of duplicated names:
    Places (recovered):     19,804 [44,983]
    Constants:               3,344 [7,411]
    Const parameters:           23 [27]
    Other/unknown:               8 [35]

OPTIMIZATION
────────────
  1765 outlier crates excluded; [brackets] include all.

  Of 21,280 [42,873] invocations with duplicates:
    Fully recovered              17,860 [36,284]  ( 83.9% [ 84.6%])
    Partially recovered              67 [   334]  (  0.3% [  0.8%])
    Not recovered                 3,353 [ 6,255]  ( 15.8% [ 14.6%])

  Arg slots: 23,184 [74,431] recovered of 27,279 [87,732] total  ( 85.0% [ 84.8%])

  Size (argument arrays only):
                                          Excl. outliers            All crates
    Old world (current stable):            329,776,400 B         343,376,416 B
    No dedup (PR #149926):           +436,464 B (+0.13%)  +1,403,712 B (+0.41%)
    Optimized (PR #152480):           +65,520 B (+0.02%)   +212,816 B (+0.06%)

  Optimization recovers 85.0% [84.8%] of the size cost.

PER-CRATE
─────────
  1765 outlier crates excluded; [brackets] include all.

  Crates w/ duplication          12,984 [14,749]  (  1.5% [  1.7%])
    Fully covered by opt         10,411 [11,807]  ( 80.2% [ 80.1%])
    Partially covered               270 [   478]  (  2.1% [  3.2%])
    Not covered                   2,303 [ 2,464]  ( 17.7% [ 16.7%])

  Dup format strings per affected crate (all crates):
      0           0  (  0.0%)  
      1       8,722  ( 59.1%)  ██████████████████████████████
      2       2,506  ( 17.0%)  █████████
      3       1,026  (  7.0%)  ████
      4         615  (  4.2%)  ██
      5         354  (  2.4%)  █
      6         286  (  1.9%)  █
      7         190  (  1.3%)  █
      8         233  (  1.6%)  █
     9+         817  (  5.5%)  ███
    mean: 2.9  median: 1  p95: 9  max: 426

  Per-crate size increase (excl. outliers):
    Without optimization (PR #149926 alone):
      Crates:  12,984
      Mean:    34 B
      Median:  16 B
      p95:     96 B
      Max:     128 B
      Total:   436,464 B
    With dianne's optimization (PR #152480):
      Crates:  2,573
      Mean:    25 B
      Median:  16 B
      p95:     80 B
      Max:     128 B
      Total:   65,520 B

  Per-crate size increase (all crates):
    Without optimization (PR #149926 alone):
      Crates:  14,749
      Mean:    95 B
      Median:  32 B
      p95:     288 B
      Max:     82,912 B
      Total:   1,403,712 B
    With dianne's optimization (PR #152480):
      Crates:  2,942
      Mean:    72 B
      Median:  16 B
      p95:     240 B
      Max:     8,752 B
      Total:   212,816 B

  Cumulative per-crate cost (all crates, incl. unaffected):
    Threshold          No dedup (#149926)     Optimized (#152480)
    ────────────── ──────────────────────  ──────────────────────
    = 0 B               861,320 ( 98.32%)       873,127 ( 99.66%)
    ≤ 16 B              868,403 ( 99.12%)       875,088 ( 99.89%)
    ≤ 32 B              871,024 ( 99.42%)       875,394 ( 99.92%)
    ≤ 64 B              872,967 ( 99.65%)       875,625 ( 99.95%)
    ≤ 128 B             874,304 ( 99.80%)       875,807 ( 99.97%)
    ≤ 256 B             875,265 ( 99.91%)       875,933 ( 99.98%)
    ≤ 512 B             875,726 ( 99.96%)       875,998 ( 99.99%)
    ≤ 1,024 B           875,918 ( 99.98%)       876,039 (100.00%)

  Top crates by residual cost (optimized vs. old):
  Crate                                     Total   w/Dup   Residual       Cost
  ──────────────────────────────────────── ──────  ──────  ─────────  ─────────
  Dicklesworthstone/surface-dial-rust::...    211       8       128B       128B
  SantiagoLopezDeharo/nestrs-cli::base_...     45       8       128B       128B
  Sys-Redux/stringr-rust-text-editor::s...    110       8       128B       128B
  ar-reshaper@1.5.0::reshaping_02              23       6       128B       128B
  banga/craytracer::craytracer                 90       4       128B       128B
  bin_file@0.1.4::bin_file                    125       3       128B       128B
  canban@0.1.1::canban                         61       6       128B       128B
  globau/git-auto-commit::git_auto_commit     123       3       128B       128B
  michellviu/HULK-Compiler::parser            124       6       128B       128B
  musli-macros@0.1.4::musli_macros             60       8       128B       128B
  nelowth/noer22::noer22                      109       1       128B       128B
  odysa/mini-code::tui                         34       8       128B       128B
  opertifelipe/chatgpt-cli::chatgpt            39       4       128B       128B
  thearnavrustagi/canban::canban               63       6       128B       128B
  ts-rust-helper@0.11.0::ts_rust_helper        17       4       128B       128B
  JerryW35/bitx::bitx                          47       3       112B       112B
  Jreeves0908/imessage::imessage_database     200       8       112B       128B
  Possseidon/ficsit-math::ficsit_math          26       5       112B       112B
  Rimbick01/GPU-rustnoob::interp                3       1       112B       112B
  TheMagitian/xoxo::xoxo_client                13       3       112B       112B

  Top crates by total cost (no dedup vs. old):
  Crate                                     Total   w/Dup       Cost   Residual
  ──────────────────────────────────────── ──────  ──────  ─────────  ─────────
  0xAEQI/sigil::sigil                         529       3       128B         0B
  1776686596/soft_management::softmgr         283       2       128B         0B
  ArturKovacs/jsrs::jsrs                       50       4       128B         0B
  BadMannersXYZ/htmx-ssh-games::htmx_ss...     64       6       128B         0B
  Be-Infinitum/monetizaai-cli::monetizaai     173       4       128B         0B
  BlockBlazeDev/Rustup::test_bonanza          256       7       128B        32B
  Broken-Deer/Amethyst-Launcher-Core::c...    102       6       128B         0B
  BuzzZ80/general_relativity::general_r...     12       1       128B         0B
  C0D3-M4513R/ux3::ux3_macros                  22       6       128B         0B
  Calcoph/rs-hexpyt::rs_hexpyt                112       8       128B         0B
  Codykilpatrick/webway::spear_gen            122       8       128B         0B
  ConaryLabs/Mira::mira                     2,219       6       128B         0B
  Dicklesworthstone/frankenscipy::evide...     16       8       128B         0B
  Dicklesworthstone/frankenscipy::fsci_...    231       5       128B         0B
  Dicklesworthstone/surface-dial-rust::...    211       8       128B       128B
  EpicEric/multipaint_by_numbers::multi...     55       6       128B         0B
  Fish-o/whily::whily                          58       8       128B         0B
  Fundevoge/aoc2023::d12                        8       2       128B         0B
  HiggRn/resql::cli                             5       4       128B         0B
  IvanLi-CN/dockrev::dockrev                1,202       5       128B         0B

OUTLIER ANALYSIS
────────────────
  Detection: IQR method on per-crate total cost (no dedup vs. old)
  Q1: 16 B  Q3: 64 B  IQR: 48 B  Threshold: Q3 + 1.5*IQR = 136 B

  1765 outlier crates excluded from figures above:
    7wdigistruct/wclaw::wclaw                cost:    144 B  residual:      0 B  (3 dup fmt strings)
    AffazHussain/vendors_analysis::zeroclaw  cost:    144 B  residual:      0 B  (7 dup fmt strings)
    Boreas618/teac::teac                     cost:    144 B  residual:      0 B  (9 dup fmt strings)
    Bowarc/rosu::rosu                        cost:    144 B  residual:    144 B  (2 dup fmt strings)
    Damonpnl/Arbitrage-Event-Engine::perf... cost:    144 B  residual:      0 B  (1 dup fmt strings)
    DanConwayDev/ngit-cli::ngit              cost:    144 B  residual:      0 B  (9 dup fmt strings)
    Event-Horizon/rustytasks::rusty_tasks    cost:    144 B  residual:      0 B  (3 dup fmt strings)
    HasSak-47/generator::genlib              cost:    144 B  residual:      0 B  (9 dup fmt strings)
    Hendler/threebody::threebody_discover    cost:    144 B  residual:      0 B  (7 dup fmt strings)
    JO3ALT/kPascal::kpascal                  cost:    144 B  residual:      0 B  (5 dup fmt strings)
    LatencyIndex/usda_parser::usda_parser    cost:    144 B  residual:      0 B  (1 dup fmt strings)
    LinuxDicasPro/ALPack::ALPack             cost:    144 B  residual:      0 B  (4 dup fmt strings)
    Michcioperz/aoc2023::build_script_build  cost:    144 B  residual:      0 B  (3 dup fmt strings)
    Minigugus/skrull::skrull                 cost:    144 B  residual:      0 B  (9 dup fmt strings)
    Nazariglez/anvyl::test_runner            cost:    144 B  residual:    144 B  (5 dup fmt strings)
    PotLock/zerobuild::zerobuild             cost:    144 B  residual:      0 B  (7 dup fmt strings)
    Qiacoriander/circomspect-privacy::cir... cost:    144 B  residual:      0 B  (9 dup fmt strings)
    Schreiry/Flust::flust                    cost:    144 B  residual:      0 B  (8 dup fmt strings)
    Stefanuk12/jackbox_megapicker_patcher... cost:    144 B  residual:      0 B  (2 dup fmt strings)
    StevenBtw/graphos::query_bench           cost:    144 B  residual:      0 B  (4 dup fmt strings)
    // ...many lines omitted...
    Idan3011/vigilo::vigilo                  cost:  1,264 B  residual:  1,072 B  (40 dup fmt strings)
    wcpopup@0.9.3::wcpopup                   cost:  1,264 B  residual:  1,136 B  (11 dup fmt strings)
    uika-codegen@0.1.0::uika_codegen         cost:  1,280 B  residual:      0 B  (48 dup fmt strings)
    outlines-core@0.2.14::outlines_core      cost:  1,296 B  residual:      0 B  (15 dup fmt strings)
    aryamurray/harbour::harbour              cost:  1,328 B  residual:      0 B  (21 dup fmt strings)
    Vanaras-AI/a2g-cli::a2g                  cost:  1,344 B  residual:      0 B  (20 dup fmt strings)
    andrewthecodertx/rust-6502-emulator::... cost:  1,360 B  residual:  1,360 B  (17 dup fmt strings)
    incodoc-ssg@0.2.0::incodoc_ssg           cost:  1,360 B  residual:  1,360 B  (25 dup fmt strings)
    davnavr/wasm2rs::wasm2rs                 cost:  1,376 B  residual:      0 B  (86 dup fmt strings)
    aprender@0.27.5::aprender                cost:  1,392 B  residual:      0 B  (18 dup fmt strings)
    clap_builder@4.6.0::clap_builder         cost:  1,408 B  residual:      0 B  (67 dup fmt strings)
    marcelocantos/rustuml::rustuml_render    cost:  1,408 B  residual:     16 B  (51 dup fmt strings)
    trexio@2.5.0::build_script_build         cost:  1,408 B  residual:      0 B  (13 dup fmt strings)
    wasmtime-internal-wit-bindgen@43.0.0:... cost:  1,408 B  residual:      0 B  (25 dup fmt strings)
    dustsoftware/wit-bindgen::wit_bindgen... cost:  1,424 B  residual:      0 B  (17 dup fmt strings)
    getforma-dev/kmd::kmd                    cost:  1,424 B  residual:      0 B  (43 dup fmt strings)
    wit-bindgen-rust@0.54.0::wit_bindgen_... cost:  1,424 B  residual:      0 B  (17 dup fmt strings)
    dojo-cairo-macros@1.7.0::dojo_cairo_m... cost:  1,440 B  residual:      0 B  (10 dup fmt strings)
    prep@0.2.0::prep                         cost:  1,440 B  residual:      0 B  (13 dup fmt strings)
    ylow/SFrameRust::sframe_query            cost:  1,440 B  residual:      0 B  (7 dup fmt strings)
    pokety/rust-gestor::rust_gestor          cost:  1,456 B  residual:    208 B  (30 dup fmt strings)
    dustsoftware/clap::clap_builder          cost:  1,472 B  residual:      0 B  (68 dup fmt strings)
    lean-ctx@2.1.1::lean_ctx                 cost:  1,472 B  residual:  1,360 B  (48 dup fmt strings)
    mbid/rumpelpod::cli                      cost:  1,472 B  residual:  1,456 B  (31 dup fmt strings)
    mwillsey/egg-smol::egglog                cost:  1,472 B  residual:      0 B  (19 dup fmt strings)
    stonerfish/clap::clap_builder            cost:  1,472 B  residual:      0 B  (68 dup fmt strings)
    weavefoundry/weaveffi::weaveffi_gen_node cost:  1,472 B  residual:      0 B  (51 dup fmt strings)
    reat@0.1.0::reat                         cost:  1,488 B  residual:  1,488 B  (38 dup fmt strings)
    cairo-lang-starknet@2.17.0-rc.4::cair... cost:  1,504 B  residual:    800 B  (25 dup fmt strings)
    Dicklesworthstone/fastapi_rust::fasta... cost:  1,536 B  residual:  1,328 B  (62 dup fmt strings)
    fastapi-output@0.2.1::fastapi_output     cost:  1,536 B  residual:  1,328 B  (62 dup fmt strings)
    hedl-core@2.0.0::property_tests          cost:  1,552 B  residual:      0 B  (57 dup fmt strings)
    clear-crab/package-manager::cargo        cost:  1,568 B  residual:      0 B  (83 dup fmt strings)
    headless-test-customer/cargo::cargo      cost:  1,568 B  residual:     16 B  (77 dup fmt strings)
    rust-lagit1/cargo::cargo                 cost:  1,568 B  residual:     16 B  (77 dup fmt strings)
    n0madic/py2rust::py2rust                 cost:  1,584 B  residual:      0 B  (37 dup fmt strings)
    pilota-build@0.13.5::pilota_build        cost:  1,600 B  residual:      0 B  (31 dup fmt strings)
    soroban-wasmi@0.36.1-soroban.22.0.0::... cost:  1,600 B  residual:      0 B  (64 dup fmt strings)
    eulumdat@0.6.0::eulumdat                 cost:  1,616 B  residual:      0 B  (98 dup fmt strings)
    forgejo-cli@0.4.1::fj                    cost:  1,616 B  residual:      0 B  (46 dup fmt strings)
    holochain_scaffolding_cli@0.600.3-rc.... cost:  1,616 B  residual:      0 B  (22 dup fmt strings)
    Loreaxe/XenonRecomp_Rust::xenon_recomp   cost:  1,664 B  residual:      0 B  (40 dup fmt strings)
    kenken64/clawmacdo::clawmacdo_provision  cost:  1,680 B  residual:      0 B  (24 dup fmt strings)
    IliasElQ/Atlas::atlas                    cost:  1,696 B  residual:  1,696 B  (42 dup fmt strings)
    wasmi_ir@2.0.0-beta.2::build_script_b... cost:  1,696 B  residual:      0 B  (13 dup fmt strings)
    pgmold-sqlparser@0.60.3::sqlparser       cost:  1,728 B  residual:      0 B  (21 dup fmt strings)
    spacetimedb-codegen@1.3.0::spacetimed... cost:  1,728 B  residual:     64 B  (24 dup fmt strings)
    sqlparser@0.61.0::sqlparser              cost:  1,728 B  residual:      0 B  (21 dup fmt strings)
    magicnight/chaos-engine::chaos           cost:  1,760 B  residual:  1,664 B  (43 dup fmt strings)
    martinmares/postgres-explorer::postgr... cost:  1,760 B  residual:      0 B  (25 dup fmt strings)
    omkar806/agentic_terminal::agterm        cost:  1,760 B  residual:  1,712 B  (57 dup fmt strings)
    sqltk-parser@0.56.0-cipherstash.2::sq... cost:  1,760 B  residual:      0 B  (23 dup fmt strings)
    wansatya/akaldb::akaldb                  cost:  1,760 B  residual:      0 B  (22 dup fmt strings)
    Nevermore/prep::prep                     cost:  1,792 B  residual:      0 B  (15 dup fmt strings)
    svelte-compiler@0.1.4::svelte_compiler   cost:  1,824 B  residual:      0 B  (61 dup fmt strings)
    Saaquin/IronTrack::irontrack             cost:  1,840 B  residual:      0 B  (2 dup fmt strings)
    cargo@0.95.0::cargo                      cost:  1,888 B  residual:     16 B  (97 dup fmt strings)
    edict-cli@0.20.2::edict                  cost:  1,920 B  residual:      0 B  (4 dup fmt strings)
    rust-lang-nursery/rustup.rs::rustup      cost:  1,920 B  residual:  1,680 B  (34 dup fmt strings)
    dioxus-maplibre@0.0.7::dioxus_maplibre   cost:  2,016 B  residual:      0 B  (32 dup fmt strings)
    penta2himajin/oxidtr::oxidtr             cost:  2,032 B  residual:      0 B  (104 dup fmt strings)
    dustsoftware/wit-bindgen::wit_bindgen_go cost:  2,080 B  residual:     16 B  (19 dup fmt strings)
    wit-bindgen-go@0.54.0::wit_bindgen_go    cost:  2,080 B  residual:     16 B  (19 dup fmt strings)
    HBcao233/grammers-python::grammers_tl... cost:  2,096 B  residual:      0 B  (27 dup fmt strings)
    ccusage-rs@0.2.1::ccusage_rs             cost:  2,096 B  residual:      0 B  (8 dup fmt strings)
    actions-marketplace-validations/houge... cost:  2,128 B  residual:  2,048 B  (16 dup fmt strings)
    auto-gitmoji@0.1.2::amoji                cost:  2,176 B  residual:  2,176 B  (2 dup fmt strings)
    dustsoftware/wit-bindgen::wit_bindgen... cost:  2,192 B  residual:      0 B  (45 dup fmt strings)
    uniffi-bindgen-dart@0.1.3::uniffi_bin... cost:  2,208 B  residual:      0 B  (104 dup fmt strings)
    dustsoftware/wit-bindgen::wit_bindgen... cost:  2,256 B  residual:      0 B  (13 dup fmt strings)
    legalis-de@0.1.4::legalis_de             cost:  2,256 B  residual:      0 B  (105 dup fmt strings)
    wit-bindgen-wrpc-rust@0.10.0::wit_bin... cost:  2,256 B  residual:      0 B  (28 dup fmt strings)
    isographlabs/isograph::artifact_content  cost:  2,272 B  residual:      0 B  (32 dup fmt strings)
    agcp@1.3.0::agcp                         cost:  2,288 B  residual:  2,288 B  (3 dup fmt strings)
    tamaroning/sydbox::syd_test              cost:  2,288 B  residual:      0 B  (74 dup fmt strings)
    Strawberry-Foundations/somgr::somgr      cost:  2,304 B  residual:  2,304 B  (14 dup fmt strings)
    BetterCrusader/Volta::volta              cost:  2,320 B  residual:      0 B  (83 dup fmt strings)
    ssv@0.1.0::ssv                           cost:  2,320 B  residual:  2,320 B  (58 dup fmt strings)
    1jehuang/mermaid-rs-renderer::mermaid... cost:  2,384 B  residual:      0 B  (46 dup fmt strings)
    mermaid-rs-renderer@0.2.1::mermaid_rs... cost:  2,384 B  residual:      0 B  (46 dup fmt strings)
    wit-bindgen-moonbit@0.54.0::wit_bindg... cost:  2,432 B  residual:      0 B  (15 dup fmt strings)
    rdpe@0.1.0::rdpe                         cost:  2,544 B  residual:      0 B  (79 dup fmt strings)
    ascii-fmt@0.1.2::integration_test        cost:  2,592 B  residual:  2,512 B  (31 dup fmt strings)
    aofctl@0.4.0-beta::aofctl                cost:  2,704 B  residual:  2,576 B  (47 dup fmt strings)
    macroforge_ts@0.1.77::macroforge_ts      cost:  2,704 B  residual:      0 B  (61 dup fmt strings)
    wit-bindgen-csharp@0.54.0::wit_bindge... cost:  2,752 B  residual:      0 B  (50 dup fmt strings)
    pikpaktui@0.0.55::pikpaktui              cost:  2,784 B  residual:  2,784 B  (34 dup fmt strings)
    legalis-sg@0.1.4::legalis_sg             cost:  2,800 B  residual:      0 B  (88 dup fmt strings)
    rpu@0.3.0::rpu                           cost:  2,976 B  residual:      0 B  (6 dup fmt strings)
    dbn@0.52.0::dbn                          cost:  3,008 B  residual:    160 B  (13 dup fmt strings)
    kenken64/clawmacdo::clawmacdo            cost:  3,056 B  residual:    144 B  (51 dup fmt strings)
    panzi/progress-pride-bar::progress_pr... cost:  3,360 B  residual:      0 B  (18 dup fmt strings)
    proof-engine@0.1.1::proof_engine         cost:  3,424 B  residual:      0 B  (99 dup fmt strings)
    rusty-bind-parser@0.3.7::rusty_bind_p... cost:  3,552 B  residual:    800 B  (49 dup fmt strings)
    vest@0.1.5::vest                         cost:  3,568 B  residual:      0 B  (46 dup fmt strings)
    ros2-msg-gen@0.2.7::ros2_msg_gen         cost:  3,920 B  residual:      0 B  (11 dup fmt strings)
    safe_drive_msg@0.2.6::safe_drive_msg     cost:  3,952 B  residual:      0 B  (13 dup fmt strings)
    ferro-cli@0.1.88::ferro                  cost:  4,256 B  residual:      0 B  (31 dup fmt strings)
    dustsoftware/wit-bindgen::wit_bindgen_c  cost:  4,336 B  residual:      0 B  (29 dup fmt strings)
    herabit/swario::codegen                  cost:  4,416 B  residual:      0 B  (13 dup fmt strings)
    twestura/RMS-Preprocessor::rms_prepro... cost:  5,120 B  residual:      0 B  (165 dup fmt strings)
    hddsgen@1.0.12::hddsgen                  cost:  7,184 B  residual:      0 B  (136 dup fmt strings)
    js-component-bindgen@1.16.4::js_compo... cost:  9,728 B  residual:      0 B  (170 dup fmt strings)
    RasmusBruhn/termite-dmg::termite_dmg     cost: 10,848 B  residual:  8,752 B  (43 dup fmt strings)
    termite-dmg@0.6.0::termite_dmg           cost: 10,848 B  residual:  8,752 B  (43 dup fmt strings)
    legalis-la@0.1.4::legalis_la             cost: 11,040 B  residual:      0 B  (377 dup fmt strings)
    neofetch@0.2.0::neofetch                 cost: 46,976 B  residual:      0 B  (107 dup fmt strings)
    ahaoboy/neofetch::neofetch               cost: 46,992 B  residual:      0 B  (108 dup fmt strings)
    imazen/archmage::xtask                   cost: 82,912 B  residual:  1,904 B  (426 dup fmt strings)


SUMMARY
───────
  Across 876,069 crates with 21,663,853 unique format_args! invocations
  (1765 outlier crates excluded; [brackets] include all):

  -   0.1% [  0.2%] of format strings have duplicated captures (21,280 [42,873]).
  -   0.7% [  1.2%] of implicit-capture format strings have duplicated captures.
  - 12,984 [14,749] crates (  1.5% [  1.7%]) have at least one format string with duplication.

  - dianne's optimization recovers  85.0% [ 84.8%] of all duplicate arg slots (23,184 [74,431] of 27,279 [87,732]).
  - Residual cost: 65,520 [212,816] bytes across the ecosystem.

  - 10,411 [11,807] of 12,984 [14,749] affected crates ( 80.2% [ 80.1%]) are fully covered by the optimization.
  - Constants (not recoverable): 3,344 [7,411] unique duplicated constant names.
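
As a cross-check on the size figures in the report, the byte totals are consistent with each extra argument slot costing 16 bytes. That slot size is an assumption here (two pointer-sized words on a 64-bit target); the report does not state it, but the arithmetic below reproduces its totals. The constant and function names are made up for illustration.

```rust
/// Assumed size of one argument slot: two pointer-sized words on a 64-bit target.
const BYTES_PER_SLOT: u64 = 16;

/// Size cost, in bytes, of a given number of extra argument slots.
fn cost_bytes(slots: u64) -> u64 {
    slots * BYTES_PER_SLOT
}

fn main() {
    // (label, extra slots from duplication, slots recovered by the optimization),
    // copied from the DUPLICATION and OPTIMIZATION sections of the report.
    let rows = [
        ("excl. outliers", 27_279u64, 23_184u64),
        ("all crates", 87_732, 74_431),
    ];
    for (label, extra, recovered) in rows {
        let no_dedup = cost_bytes(extra);
        let residual = cost_bytes(extra - recovered);
        println!("{label}: no dedup +{no_dedup} B, optimized +{residual} B");
    }
}
```

Under that assumption, 27,279 slots give 436,464 B and the 4,095 unrecovered slots give 65,520 B, matching the "No dedup" and "Optimized" rows; the all-crates figures (1,403,712 B and 212,816 B) work out the same way.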

@m-ou-se
Member

m-ou-se commented Apr 7, 2026

The vast majority of the projects where this is important are small embedded projects where binary size is important. You won't find many of those on crates.io. There are plenty of library crates for embedded on crates.io, but those often don't use much string formatting themselves; that's something that usually happens mostly in the actual binary projects, which are often not open source.

There are many changes we could make to Rust that don't meaningfully impact most crates, but that doesn't mean we should purposely cause regressions.

I will dread the day that I have to explain to someone that adding a 'meaningless' , X=X to their code will magically improve binary size. People (should) have higher expectations of Rust.

The cost of not doing deduplication at all (i.e., not doing dianne's optimization) on the args: &[rt::Argument<'_>] arrays referenced by the Argument structs totals to +0.41%. Note that this is calculated only over the arrays; it'd be watered down by considering the overhead of the rest of the struct. Calculated over the compiler as a whole, it'd be in the noise -- the absolute difference is only 1.2KB.

The representation of fmt::Arguments is in flux and will still undergo many more improvements. See #99012

Once we no longer use wide pointers for the arguments, no longer storing the fmt function pointer in the args array, the impact will be bigger. This gets much more significant if we add the {x.a} {x.b} feature and allow ourselves to store that as a single pointer to x rather than as two separate pointers. (Which isn't possible with the current representation, but that should change in the future.)
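
To make the `, X=X` workaround mentioned above concrete, here is a minimal sketch, assuming a constant named `X` (the name comes from the comment; the value and the function names are made up). Both calls produce the same text. What this thread is debating is only how many argument slots the expansion emits for the first form, which depends on the compiler version and on the outcome of this PR, so the sketch asserts nothing about sizes.

```rust
const X: u32 = 7;

/// Implicit capture: the name `X` appears twice in the format string.
fn captured() -> String {
    format!("{X} {X}")
}

/// Explicit named argument: `X` is passed once and referenced twice.
fn explicit() -> String {
    format!("{X} {X}", X = X)
}

fn main() {
    // The two spellings are indistinguishable in their output.
    assert_eq!(captured(), explicit());
    println!("{}", captured());
}
```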

@nikomatsakis
Contributor

@m-ou-se Do you have any ways we could measure that impact? Maybe we can bring in voices from the embedded working group? I don't know who the right people to cc there are, but it'd be cool if people could run their own measurements and report in.

@traviscross is it easy to identify those values as a percentage of total format args as well, and as a percentage of total binary size?

@nikomatsakis
Contributor

I want to see if people agree with the way I'm thinking about this.

I don't like the idea of approving this contingent on the optimization. I think the optimization is too complex to be in Rust's reference; if the impact is that significant, I'd like it to be something that we guarantee. This isn't to say I don't like the optimization, I think it's "ok", I just don't like it being a key part of the logic we stand on. I'd rather we evaluate the "de-duplicated via syntactic criteria" or "not de-duplicated", which are kind of the two extremes.

The argument in favor of "don't deduplicate" is basically "this complexity isn't worth it, the impact in real life is negligible, and it cleans up the semantics corner cases".

The argument in favor of "deduplicate" is basically "Sure, it's clean in one perspective, but it's not what people will actually want most of the time, we should do what they want, particularly since the semantics are very unlikely to be surprising in practice."

I find both of these persuasive:

I don't think the impact on semantics is all that meaningful, but I also think it's easier to explain the simpler desugaring.

I don't think the impact on binary sizes is really all that significant, but I also think it's more measurable than you might think, and I'd be curious if indeed it has outsized impact on a particular population. <-- This is why I'm asking @m-ou-se about how to loop in a few more embedded devs.

(There is also the aspect of preserving existing behavior, but this doesn't strike me as clear cut, it's not a regression in the usual sense of the term, so I'm not especially persuaded by that.)

@nikomatsakis
Contributor

nikomatsakis commented Apr 8, 2026

To follow up on something I half said in the meeting, the other half of this is: I'd like to hear @traviscross, perhaps in another forum or perhaps here, or maybe a link to it, your take on how this impacts @joshtriplett's RFC. What is the alternate syntax you would want?

To put it in terms of process, I'd kind of like the lang-team as a whole to give vibes both on the question at hand here and on the question of "suppose we decided to keep things as they are here and josh changed the RFC, what would be the concern then". Because I think we are premising somewhat on "we should do this to unblock that RFC" and I'm not entirely convinced that RFC ought to be considered blocked.

@traviscross
Contributor

traviscross commented Apr 8, 2026

To follow up on something I half said in the meeting, the other half of this is: I'd like to hear @traviscross, perhaps in another forum or perhaps here, your take on how this impacts @joshtriplett's RFC.

If we were to not clean up the opsem on this such that interpolated expressions under the current syntax follow our expected language semantics for expressions, then I would propose, when extending the interpolation syntax to support more expressions, that we pick a distinct syntax for it (and, for that new syntax, follow our opsem for expressions).

(The further we go into supporting arbitrary expressions, the more any divergence from our generally-intended opsem for expressions would matter in practice.)
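
For readers following along, here is a sketch of the kind of expression semantics at stake, under the assumption (from the PR description) that the observable difference involves constants with custom destructors. `Noisy`, `N`, and `DROPS` are made-up names. The sketch only exercises the explicit positional form, where each mention of the constant is its own expression and its own temporary; whether the captured form `"{N} {N}"` should behave the same way is what this PR is about, so it is deliberately left out.

```rust
use std::fmt;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times a `Noisy` value has been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy;

impl fmt::Display for Noisy {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("noisy")
    }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Every use of a constant in an expression creates a fresh temporary.
const N: Noisy = Noisy;

/// Formats `N` twice via explicit positional arguments and returns the
/// output together with the number of destructor runs this caused.
fn positional_twice() -> (String, usize) {
    let before = DROPS.load(Ordering::SeqCst);
    let s = format!("{} {}", N, N);
    (s, DROPS.load(Ordering::SeqCst) - before)
}

fn main() {
    let (s, drops) = positional_twice();
    println!("{s}: {drops} destructor run(s)");
}
```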

@traviscross
Contributor

What is the alternate syntax you would want?

At the moment, I don't have any particular proposals in mind for the specific other syntax — just that it be distinct.

@steffahn
Member

steffahn commented Apr 8, 2026

Given that deduplication is a tool the current internal representation offers (and it sounds like what @m-ou-se is saying implies that it might get even more beneficial to use deduplication in the future?) I feel it’s suboptimal if our “best”/“nicest” (and newest) syntax for creating fmt::Arguments doesn’t support it at all anymore.

So with the benefit of the newer syntax, arguably people should usually be encouraged to re-write their code from something like

let bar = 123;

format!("{} {} {0}", long_complicated_expression(), bar);

or like

let bar = 123;

format!("{foo} {} {foo}", bar, foo=long_complicated_expression());

over to the "new" style of something like:

let foo = &long_complicated_expression();
let bar = 123;

format!("{foo} {bar} {foo}");

If that new form loses deduplication, that might be a problem for code that has not yet been re-written in this way.
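For reference, here is a minimal sketch of the three spellings side by side. `long_complicated_expression` is a hypothetical stand-in (it just returns a fixed string), and the function names are only for illustration. All three produce the same text and evaluate the expression once; how many argument slots end up in the resulting fmt::Arguments is the part under discussion and is not shown here.

```rust
// Hypothetical stand-in for the expensive expression in the examples above.
fn long_complicated_expression() -> String {
    String::from("expensive")
}

// Positional style: `{0}` refers back to the first positional argument.
fn positional(bar: i32) -> String {
    format!("{} {} {0}", long_complicated_expression(), bar)
}

// Named style: `{}` takes the first positional argument (`bar`),
// `{foo}` refers to the named argument both times.
fn named(bar: i32) -> String {
    format!("{foo} {} {foo}", bar, foo = long_complicated_expression())
}

// Captured style: both `foo` and `bar` are picked up from the local scope.
fn captured(bar: i32) -> String {
    let foo = &long_complicated_expression();
    format!("{foo} {bar} {foo}")
}

fn main() {
    let bar = 123;
    println!("{}", positional(bar));
    println!("{}", named(bar));
    println!("{}", captured(bar));
}
```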

I wonder @traviscross if you could - for a bigger picture (and admittedly still limited to only the current benefits available from the deduplicated case of internal representation) - somehow manage to adjust your crater test to see what effects "disabling all deduplication" may have (i.e. including in cases where currently "{123}" syntax refers back to a positional argument, or "{xyz}" syntax refers to , xyz = … arguments rather than local variables).

@RalfJung
Member

RalfJung commented Apr 8, 2026

over to the "new" style of something like:

I would not agree with that. Introducing new let-bindings just for a format string does not strike me as something we should encourage. Personally I would recommend writing this as

format!("{foo} {bar} {foo}", foo=long_complicated_expression());

@m-ou-se
Member

m-ou-se commented Apr 8, 2026

Some of the discussion above frames this as 'consistency vs efficiency' or 'field expressions vs binary size', but I think none of that framing is correct. We can just do both.

rust-lang/rfcs#3626 proposes this works:

println!("{z.field1} {z.field2} {z.field3}", z = SomeStruct::new());

In the future, I'd like that fmt::Arguments object to only store a single pointer to that z object [1], not separate pointers to each field independently. That feels consistent with this syntax: we only see one argument here.

Given that, it feels perfectly consistent to me to have these two be equivalent:

println!("{z.field1} {z.field2} {z.field3}", z=z);
println!("{z.field1} {z.field2} {z.field3}");

So if we want rust-lang/rfcs#3626, it is entirely consistent to not make the change proposed in this PR. Which is also the thing we need for optimal efficiency.

Footnotes

  1. This requires some more work on fmt::Arguments, but that is planned.

@RalfJung
Member

RalfJung commented Apr 8, 2026 via email

@traviscross
Contributor

is it easy to identify those values as a percentage of total format args, as well?

Expressed per invocation (across 21,663,853 unique invocations of format_args! in 876,069 crates):

  • Without any deduplication:

    • 99.80% of invocations aren't affected at all.
    • 99.94% are affected by 16 or fewer bytes (one arg slot).
    • 99.995% are affected by 128 or fewer bytes.
  • With @dianne's optimization:

    • 99.97% of invocations aren't affected at all.
    • 99.99% are affected by 16 or fewer bytes.
    • 99.9993% are affected by 128 or fewer bytes.

Cumulative per-invocation cost (21,663,853 invocations total):
  Threshold          No dedup (#149926)     Optimized (#152480)
  ────────────── ──────────────────────  ──────────────────────
  = 0 B            21,620,980 ( 99.80%)    21,657,264 ( 99.97%)
  ≤ 16 B           21,651,085 ( 99.94%)    21,661,904 ( 99.99%)
  ≤ 32 B           21,657,528 ( 99.97%)    21,662,718 ( 99.99%)
  ≤ 64 B           21,661,392 ( 99.99%)    21,663,444 (100.00%)
  ≤ 128 B          21,662,862 (100.00%)    21,663,712 (100.00%)
  ≤ 256 B          21,663,412 (100.00%)    21,663,796 (100.00%)
  ≤ 512 B          21,663,664 (100.00%)    21,663,829 (100.00%)
  ≤ 1,024 B        21,663,802 (100.00%)    21,663,848 (100.00%)
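
The percentages can be re-derived from the raw counts. The sketch below does that for three rows copied from the table; the helper names `percent` and `affected` are mine, not part of the crater tooling. It also gives the absolute number of affected invocations, which is 42,873 without any deduplication and 6,589 with the optimization.

```rust
/// Share of `count` in `total`, in percent.
fn percent(count: u64, total: u64) -> f64 {
    count as f64 * 100.0 / total as f64
}

/// Number of invocations that are not in the "= 0 B" bucket.
fn affected(total: u64, unaffected: u64) -> u64 {
    total - unaffected
}

fn main() {
    const TOTAL: u64 = 21_663_853;

    // (threshold, no-dedup count, optimized count), copied from the table.
    let rows: [(&str, u64, u64); 3] = [
        ("= 0 B", 21_620_980, 21_657_264),
        ("<= 16 B", 21_651_085, 21_661_904),
        ("<= 128 B", 21_662_862, 21_663_712),
    ];

    for (label, no_dedup, optimized) in rows {
        println!(
            "{label:<9} {:>9.4}% {:>9.4}%",
            percent(no_dedup, TOTAL),
            percent(optimized, TOTAL)
        );
    }

    println!("affected without dedup: {}", affected(TOTAL, 21_620_980));
    println!("affected with optimization: {}", affected(TOTAL, 21_657_264));
}
```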

@m-ou-se
Member

m-ou-se commented Apr 9, 2026

I'd like that fmt::Arguments object to only store a single pointer to that z object

How is that supposed to work, especially when the fields are behind deref coercions?

One option would be for the static part of fmt::Arguments to store fn pointers to wrapped fmt functions that take &Z rather than &Field.

Whether that's a good idea or not, I don't know yet. There are plenty of arguments for and against that approach. There might be other options.
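
To make the fn pointer option a bit more concrete, here is a rough sketch in ordinary Rust. Everything in it (`Z`, `StaticPart`, `Args`, `TEMPLATE`, `render`) is hypothetical and written by hand; it is not how fmt::Arguments is implemented today and not a proposed design. It only shows that the per-placeholder field access can live in static wrapper functions taking &Z, so the dynamic part needs a single pointer to z no matter how many fields are printed.

```rust
use std::fmt;

// Hypothetical argument type with two fields to print.
struct Z {
    field1: u32,
    field2: &'static str,
}

// A wrapped fmt function: takes the whole `&Z` and formats one field of it.
type ZFmtFn = fn(&Z, &mut fmt::Formatter<'_>) -> fmt::Result;

fn fmt_field1(z: &Z, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    fmt::Display::fmt(&z.field1, f)
}

fn fmt_field2(z: &Z, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    fmt::Display::fmt(&z.field2, f)
}

// Stand-in for the static part: literal pieces plus one wrapper per placeholder.
struct StaticPart {
    pieces: &'static [&'static str],
    placeholders: &'static [ZFmtFn],
}

// Stand-in for the dynamic part: one pointer to the template, one to `z`.
struct Args<'a> {
    template: &'static StaticPart,
    z: &'a Z,
}

impl fmt::Display for Args<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, piece) in self.template.pieces.iter().enumerate() {
            f.write_str(piece)?;
            if let Some(fmt_field) = self.template.placeholders.get(i) {
                fmt_field(self.z, f)?;
            }
        }
        Ok(())
    }
}

// Hand-written equivalent of the template "{z.field1} {z.field2}".
static TEMPLATE: StaticPart = StaticPart {
    pieces: &["", " "],
    placeholders: &[fmt_field1, fmt_field2],
};

fn render(z: &Z) -> String {
    Args { template: &TEMPLATE, z }.to_string()
}

fn main() {
    let z = Z { field1: 7, field2: "seven" };
    println!("{}", render(&z));
    println!("dynamic part: {} bytes", std::mem::size_of::<Args<'static>>());
}
```

The sketch does not address fields behind Deref, where the wrapper would have to decide how often to call deref.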

The point is that the framing of "we need this change for rust-lang/rfcs#3626" is just wrong. We don't know yet what we need for that. That's a discussion that still needs to happen. Given that that RFC proposes that format!("{z.field1} {z.field2}", z=f()) works, we need to figure out how that will work.

This PR is about fixing #145739, about a highly unusual "internal mutability in const in format" situation (that Clippy already warns about), which is quite possibly not a bug at all, and isn't worth regressing any performance over.
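
For readers who haven't opened the issue, a small sketch of the kind of code involved, using a destructor rather than interior mutability. `Noisy`, `DROPS`, and `drops_during` are made up for this example. With explicit arguments each mention of the const is its own expression, so two temporaries are created and dropped. With captured arguments the count depends on whether the compiler deduplicates them, so the sketch prints it instead of assuming a value.

```rust
use std::fmt;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Noisy` values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy;

impl fmt::Display for Noisy {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("noisy")
    }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::Relaxed);
    }
}

// Every use of `C` as an expression creates a fresh `Noisy` value.
const C: Noisy = Noisy;

// Runs `make` and reports how many drops happened while it ran.
fn drops_during(make: impl FnOnce() -> String) -> (String, usize) {
    let before = DROPS.load(Ordering::Relaxed);
    let text = make();
    (text, DROPS.load(Ordering::Relaxed) - before)
}

fn main() {
    // Explicit arguments: two separate expressions, two temporaries.
    let (text, explicit) = drops_during(|| format!("{} {}", C, C));
    println!("{text}: {explicit} drops with explicit arguments");

    // Captured arguments: one or two drops, depending on deduplication.
    let (text, captured) = drops_during(|| format!("{C} {C}"));
    println!("{text}: {captured} drop(s) with captured arguments");
}
```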

But the discussion has diverged and is now about how we would need this change for rust-lang/rfcs#3626 or what would be most consistent in a potential future version of Rust that includes that new feature. But we haven't even established how that RFC would work and whether this change is needed for that feature or not.

What's the expression in English? Cart before the horse?

🛒🐴

I think we should:

  • Close #145739 as "won't fix".
  • Close this PR, since we aren't fixing that "won't fix" issue by itself (in today's Rust where we don't have field access in format placeholders).
  • Refine the design of RFC 3626 to answer questions about (number of) Deref invocations, borrowing entire objects vs fields, fields of named arguments, etc. (And have that discussion on the thread of that RFC instead of mixed with #145739.)
  • And once we actually know those answers, then possibly open a PR with this change with RFC 3626 as the motivation.

Right now it feels like we're trying to get this PR through based on a gut feeling that it might help with an RFC that we haven't worked out in enough detail yet.


Labels

  • disposition-merge: This issue / PR is in PFCP or FCP with a disposition to merge it.
  • I-lang-nominated: Nominated for discussion during a lang team meeting.
  • I-lang-radar: Items that are on lang's radar and will need eventual work or consideration.
  • I-libs-api-nominated: Nominated for discussion during a libs-api team meeting.
  • needs-fcp: This change is insta-stable, or significant enough to need a team FCP to proceed.
  • P-lang-drag-1: Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang
  • proposed-final-comment-period: Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off.
  • S-waiting-on-review: Status: Awaiting review from the assignee but also interested parties.
  • T-clippy: Relevant to the Clippy team.
  • T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
  • T-lang: Relevant to the language team
  • T-libs-api: Relevant to the library API team, which will review and decide on the PR/issue.

Development

Successfully merging this pull request may close these issues.

format_args deduplicates consts with interior mutability or destructor