@wesleywiser
Member

Currently, aarch64-unknown-none generates static position dependent binaries which must be loaded at a fixed address in memory. The openVMM project uses the *-unknown-none targets for an embedded-like environment but needs the binary to be loadable to any memory address. Currently, the x86_64-unknown-none target is configured to allow this by default but the aarch64-unknown-none target is not.

This commit changes the defaults for the aarch64-unknown-none target to enable static-PIE binaries which aligns the target more closely with the corresponding x86_64-unknown-none target. If users prefer the prior behavior, they can request that via -Crelocation-model=static.

cc @BartMassey as lead maintainer for this target in REWG

@rustbot
Collaborator

rustbot commented Dec 4, 2025

These commits modify compiler targets.
(See the Target Tier Policy.)

@rustbot added the S-waiting-on-review and T-compiler labels Dec 4, 2025

@rustbot
Collaborator

rustbot commented Dec 4, 2025

r? @chenyukang

rustbot has assigned @chenyukang.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@jieyouxu
Member

jieyouxu left a comment

Seems sensible to me if target maintainers are on board. Feel free to r=me if so.

@jieyouxu
Member

jieyouxu commented Dec 4, 2025

(Holding off on tagging this with relnotes in case target maintainers don't want to make this change.)

@jieyouxu assigned jieyouxu and unassigned chenyukang Dec 4, 2025

@smmalis37
Contributor

Should the same change be made to aarch64-unknown-none-softfloat too?

@BartMassey
Contributor

If this is to be done, it should be done to aarch64-unknown-none-softfloat also.

I assume the goal is compiling libcore PIE by default? (Otherwise for user programs I guess -C relocation-model=pie would solve the problem?) I sure can't wait for build-std to stabilize!

I'm not sure who this might affect: let me add it to the agenda for the REWG Meeting next Tuesday if that's OK and we'll take it from there once I've discussed it with folks. I think it shouldn't hurt users who are currently using static executables, apart from maybe minor code size impact (one way or the other)?

@jieyouxu
Member

jieyouxu commented Dec 4, 2025

@rustbot blocked (on target maintainers having a discussion)

@rustbot added the S-blocked label and removed the S-waiting-on-review label Dec 4, 2025

@BartMassey
Contributor

On the REWG agenda for next week.

@jonathanpallant
Contributor

I have concerns here about who is going to set up/adjust the Global Offset Table that PIE will introduce.

Let me go and ping the RustedFirmware-A maintainers, who use this target, and the Ferrocene people, who qualify (and test) this target for safety-critical use-cases.

@bjorn3
Member

bjorn3 commented Dec 4, 2025

> I have concerns here about who is going to set up/adjust the Global Offset Table that PIE will introduce.

I think you can still tell the linker to produce a static executable, in which case the GOT would be statically initialized. Linker relaxation may even get rid of some of the GOT accesses entirely. If you don't change anything and let it produce a static-PIE executable, I believe the GOT is statically initialized to the values it would need if you load the executable at the address indicated in the ELF file (that is, a relocation slide of 0).

If you actually take advantage of the ability to relocate the executable at runtime, then you will need to write a loader that applies the GOT relocations. This could be the bootloader, or inline assembly linked into the executable itself and used as the entry point. Or, if you are very careful and don't mind the UB, you can write the relocation code in Rust too: https://github.com/sunfishcode/origin/blob/main/src/relocate.rs
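
As a rough illustration of the loader step described here, the following is a minimal sketch under stated assumptions, not how rustc or any particular bootloader does it. It models the loaded image as a byte slice rather than real memory, handles only `R_AARCH64_RELATIVE` entries, and assumes little-endian 64-bit slots. The names `Rela`, `RelocError`, `apply_relative_relocs`, `read_slot`, `link_base` and `slide` are hypothetical.

```rust
use std::convert::TryFrom;

/// ELF relocation type meaning "load bias + addend" on AArch64.
const R_AARCH64_RELATIVE: u64 = 1027;

/// One entry of a `.rela.dyn` table (same fields as `Elf64_Rela`).
#[derive(Clone, Copy, Debug)]
struct Rela {
    offset: u64, // link-time address of the slot to patch (e.g. a GOT entry)
    info: u64,   // relocation type lives in the low 32 bits
    addend: i64, // link-time address the slot should point at
}

#[derive(Debug, PartialEq)]
enum RelocError {
    UnsupportedType(u64),
    OutOfBounds(u64),
}

/// Patch every relative relocation for an image linked at `link_base`
/// and loaded at `link_base + slide`. `image` stands in for the loaded
/// bytes (assumption: a real loader would write to memory directly).
fn apply_relative_relocs(
    image: &mut [u8],
    link_base: u64,
    slide: u64,
    relocs: &[Rela],
) -> Result<usize, RelocError> {
    for r in relocs {
        let ty = r.info & 0xffff_ffff;
        if ty != R_AARCH64_RELATIVE {
            return Err(RelocError::UnsupportedType(ty));
        }
        // Translate the link-time slot address into an index into `image`.
        let start = r
            .offset
            .checked_sub(link_base)
            .and_then(|o| usize::try_from(o).ok())
            .ok_or(RelocError::OutOfBounds(r.offset))?;
        let end = start
            .checked_add(8)
            .ok_or(RelocError::OutOfBounds(r.offset))?;
        let slot = image
            .get_mut(start..end)
            .ok_or(RelocError::OutOfBounds(r.offset))?;
        // With a slide of 0 this stores the plain link-time address.
        let value = (r.addend as u64).wrapping_add(slide);
        slot.copy_from_slice(&value.to_le_bytes());
    }
    Ok(relocs.len())
}

/// Read back one 64-bit little-endian slot, for inspection.
fn read_slot(image: &[u8], index: usize) -> u64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&image[index..index + 8]);
    u64::from_le_bytes(bytes)
}
```

For a slot at link address 0x1010 with addend 0x1008, a slide of 0 writes 0x1008, which matches what the slot already holds if the linker pre-applied the relocation. A slide of 0x4000 writes 0x5008.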

@thejpster
Contributor

So, it looks like this change would add a new .got section, which if you don't tell your linker script to do anything special with it, seems to end up in RAM along with .data and .bss, and yes, it appears to be full of valid memory addresses.

So worst case, programs get larger. But to be honest, AArch64 machines often have quite a lot of memory.

@CJKay

CJKay commented Dec 4, 2025

A few questions to help me understand the impact on RF-A and projects like it. Assuming we control code generation for the entire program, and that we do not explicitly enable PIC for any part of it:

  1. Before this change, can we safely discard the GOT/PLT in the linker script?
  2. Before this change, can we safely ignore the GOT/PLT in start-up code?
  3. After this change, can we safely discard the GOT/PLT in the linker script?
  4. After this change, can we safely ignore the GOT/PLT in start-up code?
  5. After this change, can we safely discard the GOT/PLT in the linker script when passing -C relocation-model=static?
  6. After this change, can we safely ignore the GOT/PLT in start-up code when passing -C relocation-model=static?

> But to be honest, AArch64 machines often have quite a lot of memory.

@thejpster Ehh... they tend to have quite a lot of DRAM, but SRAM is a different story.

@Darksonn
Contributor

Darksonn commented Dec 4, 2025

It looks like Rust for Linux already passes -Crelocation-model=static.

cc @ojeda

@jonathanpallant
Contributor

Does passing that option require build-std?

@ojeda
Contributor

ojeda commented Dec 4, 2025

> It looks like Rust for Linux already passes -Crelocation-model=static.
>
> cc @ojeda

+1, we do, unconditionally.

@bjorn3
Member

bjorn3 commented Dec 4, 2025

@CJKay AFAIK:

  1. yes
  2. yes
  3. no
  4. yes
  5. only when using -Zbuild-std or equivalent to recompile the standard library with -Crelocation-model=static
  6. yes

For the PLT in particular however, the linker should be able to replace all PLT calls with direct calls for symbols that are neither exported nor imported as would probably be the case for your average kernel. Replacing PLT calls with direct calls doesn't require rewriting instructions, only changing the target address of PLT call relocations. Replacing GOT accesses however does require rewriting instructions. In a bunch of cases linker relaxations should be possible, but I don't know if it covers all GOT accesses. And it is probably going to depend on the exact linker version and compiler version. If you discard the GOT or PLT section while there are references to it I think you get a linker error, but it may also be just a warning (which rustc discards).

@CJKay

CJKay commented Dec 4, 2025

Thanks, @bjorn3. I'm curious about (3) and (4):

> 3. After this change, can we safely discard the GOT/PLT in the linker script?
>
> no

> 4. After this change, can we safely ignore the GOT/PLT in start-up code?
>
> yes

How is it that we can ignore the GOT in this case? Wouldn't that leave it with relative addresses? Doesn't its very presence indicate that there are symbol references which need relocating?

@bjorn3
Member

bjorn3 commented Dec 4, 2025

> How is it that we can ignore the GOT in this case? Wouldn't that leave it with relative addresses? Doesn't its very presence indicate that there are symbol references which need relocating?

If you load the ELF file with an ASLR slide of 0 like you would have needed to do currently due to the static relocation model, then you don't need to apply relocations as the linker already put the correct values everywhere. Only when you use a non-zero ASLR slide do you need to actually apply relocations. But before this PR you wouldn't be able to use a non-zero ASLR slide anyway.

@wesleywiser force-pushed the aarch64-unknown-none_static-pie branch from a4cd181 to e5212e6 December 4, 2025 20:45

@wesleywiser
Member Author

wesleywiser commented Dec 4, 2025

@BartMassey and @jonathanpallant, I appreciate you pinging the various maintainers!

I agree that build-std would help but I think there's also a consistency argument to be made and users probably expect similar defaults across the various *-unknown-none targets. Since you can still get a position dependent binary out of one with libcore built as PIC, I think this default makes more sense than the current status quo.

Thanks @bjorn3 for answering those questions!

I've updated this PR to make the same change to aarch64-unknown-none-softfloat as well.

@CJKay

CJKay commented Dec 5, 2025

> If you load the ELF file with an ASLR slide of 0 like you would have needed to do currently due to the static relocation model, then you don't need to apply relocations as the linker already put the correct values everywhere.

Understood, thanks. I have a couple of concerns with this:

  1. Backwards-compatibility relies on the assumption that the linker statically applies dynamic relocations by default. Is this guaranteed to be the case? It was historically not true for LLD, as we discovered with TF-A (RF-A's C-based predecessor).

  2. The GOT incurs both a size and performance penalty; if we introduce one where it isn't necessary then IMO that contravenes "don't pay for what you don't use", as well as the precedent set by the major AArch64 C toolchain distributions (PIE disabled by default). I'm actually a bit surprised that x86_64-unknown-none enables it by default... I guess it was configured with kernels in mind, rather than firmware?

I don't think there's a satisfactory answer to this right now; either everybody is limited to static relocation, or everybody pays the penalty of a GOT. The firmware ecosystem can maybe stomach the overhead until -Z build-std is stabilised, but at a minimum, to avoid breakage, there needs to be a guarantee the linker will apply dynamic relocations.

@jieyouxu added the O-AArch64 label Dec 5, 2025

@bjorn3
Member

bjorn3 commented Dec 5, 2025

> It was historically not true for LLD

I wasn't aware of that.

> The GOT incurs both a size and performance penalty

If you use -Crelocation-model=static without -Zbuild-std, only libcore and, if you use it, liballoc pay this price. It is possible that enabling LTO is enough to get rid of the penalty entirely, though I haven't checked. If you care about size and/or performance, you would have enabled LTO anyway.

@CJKay

CJKay commented Dec 5, 2025

> If you use -Crelocation-model=static without -Zbuild-std, only libcore and, if you use it, liballoc pay this price.

It isn't particularly uncommon to discard the GOT altogether. That may be a case of "holding it wrong", but it's worth knowing that there are projects out there that will break after this change - they will need a heads-up.

@thejpster
Contributor

It appears the GOT will be inserted after whatever section comes last in your linker script.

That final section might be a block of CCRAM you were using for your stack. You would be surprised to discover there's now 16 bytes of GOT at the end of it. If you did indeed discover it. The worrying thing is that it might all compile, but your stuff just moves. Or you use the GOT as your stack.

@thejpster
Contributor

If anyone wants an example to play with, try https://github.com/ferrous-systems/rust-training/tree/main/example-code/qemu-aarch64v8a. Just delete the rust-toolchain.toml file and it'll build with regular Rust. Add -Crelocation-model=pie to the rust flags in .cargo/config.toml and observe how global_uart::UART gets a reference in the GOT, which is located after .bss.

@CJKay

CJKay commented Dec 10, 2025

> It appears the GOT will be inserted after whatever section comes last in your linker script.

In BFD (and presumably LLD as well) the rules for orphan allocation are a bit complicated. IMO the most predictable way to deal with orphans is to just forbid them completely (with --orphan-handling=error).
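
One way a single project could opt into this, without any change to the target spec, is from a Cargo build script. The sketch below is only an illustration under assumptions: it assumes the linker is invoked directly (as with `rust-lld`), so the flag needs no `-Wl,` prefix. `orphan_handling_directive` is a hypothetical helper, while `cargo:rustc-link-arg` is an existing Cargo build-script directive. The helper deliberately accepts only the `place`, `warn` and `error` modes.

```rust
/// Build the line a `build.rs` would print to pass `--orphan-handling=<mode>`
/// to the linker. Returns `None` for a mode this sketch does not recognise.
fn orphan_handling_directive(mode: &str) -> Option<String> {
    match mode {
        // "place" is the linker's usual behaviour, "warn" reports orphans,
        // and "error" turns any unplaced input section into a link failure.
        "place" | "warn" | "error" => Some(format!(
            "cargo:rustc-link-arg=--orphan-handling={}",
            mode
        )),
        _ => None,
    }
}

/// Print the directive on stdout, which is where Cargo reads build-script
/// instructions from. Intended to be called from `fn main()` in `build.rs`.
fn emit_orphan_handling(mode: &str) {
    if let Some(line) = orphan_handling_directive(mode) {
        println!("{}", line);
    }
}
```

In a real `build.rs`, `fn main()` would call `emit_orphan_handling("error")`. With `error`, the linker script has to place every input section explicitly, debug sections included, or the link fails.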

@jonathanpallant
Contributor

jonathanpallant commented Dec 10, 2025

Ooooh, so if we add that linker option to the target config, will that mean that failing to place the .got input section into an output section would be a hard error? Hard errors are much better than weird run-time issues and I could live with this target change if users got a hard error.

Also, I want to note the results of the Embedded Devices Working Group meeting last night, which were:

  • silent stable-to-stable breakage of a target is to be avoided (as in, it still builds but it doesn't run)
  • we broke targets before:
    • switching from BFD to rust-lld as the default linker caused pain, but it was tolerable because it was very obvious your build was broken and people could then work out how to fix it
    • bumping glibc requirements at Tier 1 also broke some users, but again, it's usually very obviously a build failure
  • in PIE mode you get an extra eight bytes of RAM used in the GOT per static variable
  • we wondered if adding two new targets was an option? (maybe aarch64-unknown-none-pie and aarch64-unknown-none-softfloat-pie)
  • we're excited about build-std making all this go away :)

@jonathanpallant
Contributor

I tried --orphan-handling=error and it failed the link because my linker script did not place .comment, or .debug_str or a dozen other debug sections. So adding that is definitely going to break everyone.

@bjorn3
Member

bjorn3 commented Dec 10, 2025

Has anyone checked if -Crelocation-model=pie + -Zdefault-visibility=protected or -Zdefault-visibility=hidden still produces a GOT? Protected visibility as default should be fine on all targets that don't support dynamic linking. We only don't use it by default due to an ld.bfd bug that has been fixed recently around exporting statics with protected visibility from dylibs. -Zdefault-visibility should only apply to non-no_mangle items. There is also a target spec option for hidden visibility that I believe applies to all symbols.

@thejpster
Contributor

I still get a GOT even with -Zdefault-visibility=protected

Labels

O-AArch64 Armv8-A or later processors in AArch64 mode S-blocked Status: Blocked on something else such as an RFC or other implementation work. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.