[WIP] Relative VTables for Rust #144973

Draft
PiJoules wants to merge 1 commit into rust-lang:main from PiJoules:WIP-relative-vtables
@PiJoules
Contributor

@PiJoules PiJoules commented Aug 5, 2025

This is a WIP patch for implementing rust-lang/compiler-team#903. It adds a new unstable flag -Zexperimental-relative-rust-abi-vtables that makes vtables PIC-friendly. This is only supported for LLVM codegen and not supported for other backends.

Early feedback on this is welcome. I'm not sure my implementation is the best approach, since most of the actual vtable emission happens during LLVM codegen. That is, to MIR the vtable still looks like a normal table of pointers and byte arrays, and I only make the vtables relative at the codegen level.

Locally, I can build the stage 1 compiler and runtimes with relative vtables, but I couldn't figure out how to tell the build system to only build stage 1 binaries with this flag, so I work around this by unconditionally enabling relative vtables in rustc. The end goal I think we'd like is either something akin to multilibs in clang, where the compiler chooses which runtimes to use based on compilation flags, or binding this ABI to the target and making it part of the default ABI for that target (just like how relative vtables are the default for Fuchsia in C++ with Clang). I think the latter is what target modifiers do (#136966).

Action Items:

  • I'm still experimenting with building Fuchsia with this to confirm it works end to end, and I still need to take some measurements to see whether this is worth pursuing.
  • More work is needed to ensure the correct relative intrinsics are emitted with CFI and LTO. Right now I'm experimenting on a normal build.
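For readers unfamiliar with the scheme, here is a minimal simulation of what a relative vtable lookup does, modelled on the documented behaviour of LLVM's `llvm.load.relative` intrinsic (load a 32-bit value at `ptr + offset`, then add it to `ptr`). It operates on a plain byte buffer rather than real code addresses, and the helper names `load_relative` and `build_relative_vtable` are made up for illustration; this is not the code in the patch.

```rust
/// Simulates `llvm.load.relative(base, offset)` on a toy memory image:
/// read a little-endian i32 stored at `base + offset` and add it to `base`.
fn load_relative(image: &[u8], base: usize, offset: usize) -> usize {
    let at = base + offset;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&image[at..at + 4]);
    let rel = i32::from_le_bytes(bytes);
    (base as i64 + rel as i64) as usize
}

/// Writes a relative vtable at `base`: slot `i` holds `targets[i] - base`
/// as a 32-bit signed offset (hypothetical helper for this sketch).
fn build_relative_vtable(image: &mut [u8], base: usize, targets: &[usize]) {
    for (i, &target) in targets.iter().enumerate() {
        let rel = (target as i64 - base as i64) as i32;
        let slot = base + 4 * i;
        image[slot..slot + 4].copy_from_slice(&rel.to_le_bytes());
    }
}

fn main() {
    let mut image = vec![0u8; 256];
    let base = 128; // where the toy vtable lives
    let targets = [16, 200, 64]; // pretend function addresses
    build_relative_vtable(&mut image, base, &targets);

    // Every slot resolves back to its target, with no absolute address stored.
    for (i, &target) in targets.iter().enumerate() {
        assert_eq!(load_relative(&image, base, 4 * i), target);
    }
    println!("all slots resolved");
}
```

Because each slot stores only a distance from the table itself, the table contents do not depend on where the image is loaded, which is what makes the vtable PIC-friendly.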

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 5, 2025
@PiJoules PiJoules force-pushed the WIP-relative-vtables branch from 5217fd7 to 0ace3e7 Compare August 5, 2025 22:31
@bjorn3
Member

bjorn3 commented Aug 6, 2025

I wonder how hard it would be to store true 32bit pointers in the const eval allocation for the vtable. That would avoid all hacks elsewhere around the size mismatch between const eval and runtime.

@PiJoules PiJoules force-pushed the WIP-relative-vtables branch from 0ace3e7 to d58809f Compare August 7, 2025 21:48
@bors
Collaborator

bors commented Oct 8, 2025

☔ The latest upstream changes (presumably #147475) made this pull request unmergeable. Please resolve the merge conflicts.

@PiJoules PiJoules force-pushed the WIP-relative-vtables branch from d58809f to 6ff8b5f Compare October 30, 2025 22:36
@rust-log-analyzer
Collaborator

The job tidy failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
All checks passed!
checking python file formatting
27 files already formatted
checking C++ file formatting
/checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp:600:64: error: code should be clang-formatted [-Wclang-format-violations]
extern "C" LLVMValueRef LLVMBuildLoadRelative(LLVMBuilderRef B, LLVMValueRef Ptr,
                                                               ^
/checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp:603:44: error: code should be clang-formatted [-Wclang-format-violations]
  Value *call = unwrap(B)->CreateIntrinsic(
                                           ^
/checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp:604:43: error: code should be clang-formatted [-Wclang-format-violations]
      Intrinsic::load_relative, {Int32Ty}, {unwrap(Ptr), unwrap(ByteOffset)});
                                          ^

clang-format linting failed! Printing diff suggestions:
--- /checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp (actual)
+++ /checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp (formatted)
@@ -596,13 +596,14 @@
     I->setHasAllowReassoc(true);
   }
 }
 
-extern "C" LLVMValueRef LLVMBuildLoadRelative(LLVMBuilderRef B, LLVMValueRef Ptr,
+extern "C" LLVMValueRef LLVMBuildLoadRelative(LLVMBuilderRef B,
+                                              LLVMValueRef Ptr,
                                               LLVMValueRef ByteOffset) {
   Type *Int32Ty = Type::getInt32Ty(unwrap(B)->getContext());
-  Value *call = unwrap(B)->CreateIntrinsic(
-      Intrinsic::load_relative, {Int32Ty}, {unwrap(Ptr), unwrap(ByteOffset)});
+  Value *call = unwrap(B)->CreateIntrinsic(Intrinsic::load_relative, {Int32Ty},
+                                           {unwrap(Ptr), unwrap(ByteOffset)});
   return wrap(call);
 }
 
 extern "C" uint64_t LLVMRustGetArrayNumElements(LLVMTypeRef Ty) {

rerun tidy with `--extra-checks=cpp:fmt --bless` to reformat C++ code
tidy [extra_checks]: checks with external tool 'clang-format' failed
tidy [extra_checks]: FAIL
tidy: The following check failed: extra_checks
Bootstrap failed while executing `test src/tools/tidy tidyselftest --extra-checks=py,cpp,js,spellcheck`
Command `/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools-bin/rust-tidy /checkout /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo /checkout/obj/build 4 /node/bin/npm --extra-checks=py,cpp,js,spellcheck` failed with exit code 1
Created at: src/bootstrap/src/core/build_steps/tool.rs:1549:23
Executed at: src/bootstrap/src/core/build_steps/test.rs:1280:29

Command has failed. Rerun with -v to see more details.
Build completed unsuccessfully in 0:01:18
  local time: Thu Oct 30 22:41:38 UTC 2025
  network time: Thu, 30 Oct 2025 22:41:38 GMT
##[error]Process completed with exit code 1.
##[group]Run echo "disk usage:"

@rust-bors
Contributor

rust-bors bot commented Feb 21, 2026

☔ The latest upstream changes (presumably #152934) made this pull request unmergeable. Please resolve the merge conflicts.

@oxalica
Contributor

oxalica commented Mar 27, 2026

I'm testing this patch on a handful of my crates, including some vtable-heavy ones. It reduces binary size by 1% to 5%, mainly by cutting down dynamic relocations (.rela.dyn).

However, I got some SEGFAULT at runtime due to vtable layout mismatch between const-eval and runtime (as mentioned above). That is,

fn main() {
    const X: &dyn std::fmt::Display = &42i32; // absolute fnptr vtable
    println!("{X}"); // assume it is relative, oops
}
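To make the failure mode concrete, here is a small simulation of the mismatch, assuming the producer writes 64-bit absolute slots while the consumer decodes 32-bit relative slots. The addresses and the function names (`write_absolute`, `read_absolute`, `read_relative`) are invented for the sketch and are not rustc internals.

```rust
/// Producer side: absolute layout, one 64-bit address per slot.
fn write_absolute(targets: &[u64]) -> Vec<u8> {
    targets.iter().flat_map(|t| t.to_le_bytes()).collect()
}

/// Consumer that matches the absolute layout.
fn read_absolute(table: &[u8], slot: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&table[8 * slot..8 * slot + 8]);
    u64::from_le_bytes(b)
}

/// Consumer that assumes the relative layout: 32-bit offsets from `base`.
fn read_relative(table: &[u8], base: u64, slot: usize) -> u64 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&table[4 * slot..4 * slot + 4]);
    base.wrapping_add(i32::from_le_bytes(b) as i64 as u64)
}

fn main() {
    // Pretend function addresses and vtable address (made-up values).
    let targets = [0x5555_0000_1000u64, 0x5555_0000_2000];
    let base = 0x5555_0010_0000u64;
    let table = write_absolute(&targets);

    // Matching reader: slot 1 is the second function.
    assert_eq!(read_absolute(&table, 1), targets[1]);

    // Mismatched reader: "slot 1" is really the upper half of slot 0,
    // so the call target is garbage and the call would likely crash.
    let bogus = read_relative(&table, base, 1);
    assert_ne!(bogus, targets[1]);
    println!("expected {:#x}, decoded {:#x}", targets[1], bogus);
}
```

Both the slot stride (4 vs. 8 bytes) and the decoding (offset vs. address) differ, so any vtable emitted under one layout and called through the other lands on the wrong address.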

I also got a weird compile error with no further information when compiling rust-analyzer 2026-03-23, not sure if it is also caused by the layout mismatch.

error: failed to parse bitcode for LTO module: Invalid cast (Producer: 'LLVM22.1.0-rust-1.95.0-nightly' Reader: 'LLVM 22.1.0-rust-1.95.0-nightly')

error: could not compile `hir-def` (lib) due to 1 previous error

@PiJoules
Contributor Author

> I'm testing this patch on a handful of my crates, including some vtable-heavy ones. It reduces binary size by 1% to 5%, mainly by cutting down dynamic relocations (.rela.dyn).
>
> However, I got some SEGFAULT at runtime due to vtable layout mismatch between const-eval and runtime (as mentioned above). That is,
>
> fn main() {
>     const X: &dyn std::fmt::Display = &42i32; // absolute fnptr vtable
>     println!("{X}"); // assume it is relative, oops
> }
>
> I also got a weird compile error with no further information when compiling rust-analyzer 2026-03-23, not sure if it is also caused by the layout mismatch.
>
> error: failed to parse bitcode for LTO module: Invalid cast (Producer: 'LLVM22.1.0-rust-1.95.0-nightly' Reader: 'LLVM 22.1.0-rust-1.95.0-nightly')
>
> error: could not compile `hir-def` (lib) due to 1 previous error

My colleague Erick has a more up-to-date version of this at main...erickt:rust:relative-vtables, which should include support for building the runtimes and (hopefully) has fixes for the merge conflicts I didn't have time to address here, so you might have more luck trying that out. (Fair warning: some of the updates there were vibe-coded, but they do seem to get rustc and runtime tests passing, and we can build a bunch of downstream Rust projects with it.) I'll eventually come back and clean this PR up, but we're still trying to collect some numbers on the side.

It could be that we missed a few cases, though. If there are any runtime assumptions about the vtable ABI, those will need to be changed as well. The same goes for const-eval, which I don't remember tackling in my initial patch.

@erickt
Contributor

erickt commented Mar 28, 2026

@oxalica - thanks for trying it out! Which crates are you testing it with? As @PiJoules said, we've got this patch passing the Rust test suite and working with servo, tokio, ripgrep, and chrome, and showing savings between 0.25% and roughly 4%. I just need to get some performance numbers before resuming talks with the compiler team. I'd be happy to see if I can reproduce the segfaults.

@oxalica
Contributor

oxalica commented Mar 28, 2026

Thanks both of you for the work!
@erickt

> @oxalica - thanks for trying it out! Which crates are you testing it with?

A public one is palc, which uses &dyn heavily for field dispatching; this patch gives a 3-4% size reduction in benchmarks that only do parsing. I also want to test rust-analyzer because of its extensive use of &dyn Database, but I got the compiler error above.

> As @PiJoules said, we've got this patch passing the Rust test suite and working with servo, tokio, ripgrep, and chrome, and showing savings between 0.25% and roughly 4%.
> I'd be happy to see if I can reproduce the segfaults.

I'm testing this PR rebased onto 99246f4, which is the latest non-conflicting commit. It may be a bit out of date. If there are more updates (ones that fix the merge conflicts), it would be good to push them to this PR to make testing easier.

The crash happens on reqwest::get("https://[..]").await.unwrap(), during rustls initialization. I minimized it to the code snippet above, which is a mismatch between const-time vtable and runtime dyn method call.

> I just need to get some performance numbers before resuming talks with the compiler team.

To me the runtime cost of vtable calls does not matter much. A vtable call is already assumed to be slow because it cannot be inlined and may be mispredicted, so one more add instruction costing under a cycle is nothing. The main goal for my use case is to reduce code size and startup cost. Absolute relocations increase the work the dynamic linker must do before main, and they reduce memory sharing (more data ends up in process-private .data.rel.ro instead of the system-wide shared .rodata). These costs are not usually measured by performance tools.
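As a rough back-of-envelope model of that startup and size argument, the sketch below estimates table and relocation bytes for both layouts. It assumes a 64-bit ELF target using RELA relocations (a 24-byte `Elf64_Rela` entry per relocated pointer), no RELR packing, that only function-pointer slots need relocations, and that a relative table shrinks every slot to 4 bytes; the slot counts are illustrative inputs, not measurements from this PR.

```rust
const RELA_ENTRY_BYTES: u64 = 24; // sizeof(Elf64_Rela): r_offset + r_info + r_addend
const ABS_SLOT_BYTES: u64 = 8; // pointer-sized slot on a 64-bit target
const REL_SLOT_BYTES: u64 = 4; // assumed 32-bit slot in the relative layout

#[derive(Debug, PartialEq)]
struct Footprint {
    table_bytes: u64,    // bytes occupied by the vtables themselves
    relocations: u64,    // dynamic relocations processed before main
    rela_dyn_bytes: u64, // bytes those relocations occupy in .rela.dyn
}

/// Absolute layout: each function slot needs one dynamic relocation;
/// metadata slots (e.g. size and align) are plain integers and need none.
fn absolute(vtables: u64, fn_slots: u64, meta_slots: u64) -> Footprint {
    let relocations = vtables * fn_slots;
    Footprint {
        table_bytes: vtables * (fn_slots + meta_slots) * ABS_SLOT_BYTES,
        relocations,
        rela_dyn_bytes: relocations * RELA_ENTRY_BYTES,
    }
}

/// Relative layout: offsets are link-time constants, so the table can sit in
/// read-only, shareable memory and needs no dynamic relocations.
fn relative(vtables: u64, fn_slots: u64, meta_slots: u64) -> Footprint {
    Footprint {
        table_bytes: vtables * (fn_slots + meta_slots) * REL_SLOT_BYTES,
        relocations: 0,
        rela_dyn_bytes: 0,
    }
}

fn main() {
    // Hypothetical program: 1000 vtables, 4 function slots and 2 metadata slots each.
    println!("absolute: {:?}", absolute(1000, 4, 2));
    println!("relative: {:?}", relative(1000, 4, 2));
}
```

Under these assumptions the relocation entries cost more bytes than the pointers they fix up, which fits the observation that most of the savings come from .rela.dyn.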

@erickt
Contributor

erickt commented Apr 15, 2026

@oxalica - I just pushed a new version to https://github.com/erickt/rust/tree/relative-vtables that's rebased on top of rust. The big change is that it tries to address the fact that relative vtables are a breaking ABI change between the compiler and the standard library. So now you need to:

  1. Change your bootstrap.toml to add:
...
[rust]
experimental-relative-vtables = true
...
  2. Do a stage2 build (since stage1 uses the stage0 standard library to build host tools). I'd love to remove that requirement, though.

I haven't had a chance to test it against palc and rust-analyzer yet, I'll see if I can get that done tomorrow.

@erickt
Contributor

erickt commented Apr 15, 2026

Got some benchmark numbers with the help of Gemini.

The stage2-with-rel is a stage2 compiled with relative vtables enabled, stage2-no-rel is without. I also benchmarked the compilation times as a rough estimate of the performance impact, which seems roughly the same. This is on a virtual machine though, with possibly other things running at the same time, so it's not a full validation of the performance impact.

1. rust Components

rustc stage2 library: librustc_driver-*.so

| Section | stage2-with-rel | stage2-no-rel | Delta |
|---|---|---|---|
| Total Size | 147,812,784 bytes | 146,520,056 bytes | +1,292,728 bytes (+0.88%) |
| .data.rel.ro | 1.8 MB | 2.8 MB | -977 KB (-35%) |
| .rela.dyn | 2.3 MB | 3.3 MB | -959 KB (-28%) |
| .rodata | 5.44 MB | 4.93 MB | +510 KB (+10.3%) |
| .text | 67.5 MB | 66.6 MB | +906 KB (+1.3%) |

rustc stage2 tools binary: rust-analyzer

| Section | stage2-with-rel | stage2-no-rel | Delta |
|---|---|---|---|
| Total Size | 55,765,360 bytes | 55,901,680 bytes | -136,320 bytes (-0.24%) |
| .data.rel.ro | 765 KB | 1,273 KB | -508 KB (-39%) |
| .rela.dyn | 1,207 KB | 2,132 KB | -925 KB (-43%) |
| .rodata | 2,811 KB | 2,541 KB | +270 KB (+10.6%) |
| .text | 27,601 KB | 27,337 KB | +264 KB (+0.9%) |

rustc stage2 tools binary: rust-analyzer-proc-macro-srv

| Section | stage2-with-rel | stage2-no-rel | Delta |
|---|---|---|---|
| Total Size | 1,949,040 bytes | 1,950,336 bytes | -1,296 bytes (-0.07%) |
| .data.rel.ro | 17 KB | 26 KB | -9 KB (-34%) |
| .rela.dyn | 45 KB | 57 KB | -12 KB (-21%) |
| .rodata | 59 KB | 54 KB | +5 KB (+9.2%) |
| .text | 985 KB | 985 KB | No change |

2. External Projects (Vendored Offline Builds)

ripgrep

  • Build Time:

    • stage2-with-rel: 15.27s
    • stage2-no-rel: 14.70s
  • Section Sizes:

| Section | stage2-with-rel | stage2-no-rel | Delta |
|---|---|---|---|
| Total Size | 25,416,024 bytes | 25,595,336 bytes | -179,312 bytes (-0.70%) |
| .data.rel.ro | 134,840 bytes | 173,624 bytes | -38,784 bytes (-22.34%) |
| .rela.dyn | 43,440 bytes | 62,232 bytes | -18,792 bytes (-30.20%) |
| .rodata | 852,884 bytes | 812,756 bytes | +40,128 bytes (+4.94%) |
| .text | 2,467,737 bytes | 2,460,761 bytes | +6,976 bytes (+0.28%) |

servo (servoshell)

  • Build Time:

    • stage2-with-rel: 5m 26.40s
    • stage2-no-rel: 5m 27.26s
  • Section Sizes:

| Section | stage2-with-rel | stage2-no-rel | Delta |
|---|---|---|---|
| Total Size | 223,778,568 bytes | 224,778,112 bytes | -999,544 bytes (-0.44%) |
| .data.rel.ro | 36,051,840 bytes | 45,565,304 bytes | -9,513,464 bytes (-20.88%) |
| .rela.dyn | 15,221,688 bytes | 18,552,240 bytes | -3,330,552 bytes (-17.95%) |
| .rodata | 32,986,680 bytes | 32,627,504 bytes | +359,176 bytes (+1.10%) |
| .text | 111,263,946 bytes | 111,399,786 bytes | -135,840 bytes (-0.12%) |

palc (Unittest Binary)

  • Build Time (Tests, No-Run):

    • stage2-with-rel: 11.80s
    • stage2-no-rel: 11.50s
  • Section Sizes:

| Section | stage2-with-rel | stage2-no-rel | Delta |
|---|---|---|---|
| Total Size | 1,042,808 bytes | 1,041,304 bytes | +1,504 bytes (+0.14%) |
| .data.rel.ro | 27,120 bytes | 31,304 bytes | -4,184 bytes (-13.37%) |
| .rela.dyn | 47,400 bytes | 54,264 bytes | -6,864 bytes (-12.65%) |
| .rodata | 57,892 bytes | 55,564 bytes | +2,328 bytes (+4.19%) |
| .text | 601,554 bytes | 600,098 bytes | +1,456 bytes (+0.24%) |
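
For anyone re-deriving the Delta columns, the figures appear to be the with-rel size minus the no-rel size, expressed as a percentage of the no-rel size. A small sketch of that arithmetic, assuming that reading is right (the `delta` helper is hypothetical and not part of the benchmark tooling):

```rust
/// Returns (byte difference, percentage change relative to the no-rel baseline).
fn delta(with_rel: u64, no_rel: u64) -> (i64, f64) {
    let diff = with_rel as i64 - no_rel as i64;
    (diff, diff as f64 * 100.0 / no_rel as f64)
}

fn main() {
    // ripgrep "Total Size" row from the table above.
    let (bytes, pct) = delta(25_416_024, 25_595_336);
    println!("{bytes:+} bytes ({pct:+.2}%)");

    // ripgrep ".data.rel.ro" row.
    let (bytes, pct) = delta(134_840, 173_624);
    println!("{bytes:+} bytes ({pct:+.2}%)");
}
```

The same function can be pointed at the output of a section-size tool for other binaries to compare them on the same basis.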
