-
-
Notifications
You must be signed in to change notification settings - Fork 14.1k
Fix d32 usage in Arm target specs #149512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix d32 usage in Arm target specs #149512
Conversation
|
These commits modify compiler targets. |
|
r? @davidtwco rustbot has assigned @davidtwco. Use |
|
r? compiler Rerolling my boss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also compiler/rustc_target/src/spec/targets/armv7a_nuttx_eabihf.rs which doesn't have a comment like this but maybe should be consistent in terms of target features?
Cc @no1wudi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then there are thumbv7em_none_eabihf and thumbv7em_nuttx_eabihf which currently use the vfp4d16sp target feature which doesn't even exist as a Rust feature... no idea what to do with those, should that just become vfp4 since d16 is the default? Does the sp part mean there's something extra special going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vfp4d16sp in thumbv7em_nuttx_eabihf is derived from thumbv7em_none_eabihf , I guess this is from clang?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It refers somehow to this multiclass feature thing. "_D16_SP" is "with only 16 d-registers and no double precision". 🤷
Does the Rust vfp4 feature imply double-precision? Who actually understands this zoo of target features? (Or is it maybe documented somewhere? The LLVM source code is not readable unfortunately.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the Rust vfp4 feature imply double-precision
Ah, yes, yes it does. I didn't spot that vfp3/4 are defined with a defm rather than a def so they use the def "" record of the multiclass. I've check with an LLVM engineer and I think we should use vfp3d16 instead of +vfp3,-d32 - the -d32 disables vfp3 (but leaves vfp2 up)
The LLVM dependencies expanded should be (maybe best read bottom to top):
vfp4 -> vfp3 + fp16 + vfp4d16 + vfp4sp
vfp4d16 -> vfp3d16 + fp16 + fp64 + vfp4d16sp
vfp4sp -> vfp3sp + fp16 + d32 + vfp4d16sp
vfp4d16sp -> vfp3d16sp + fp16
vfp3 -> vfp2 + vfp3d16 + vfp3sp
vfp3d16 -> vfp2 + fp64 + vfp3d16sp
vfp3sp -> vfp2 + d32 + vfp3d16sp
vfp3d16sp -> vfp2sp
vfp2 -> vfp2sp + fp64
vfp2sp -> fpregs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we want all that combinatorial explosion in Rust. It would make much more sense to say that vfp4 (etc) just means only the d16 part, and then if people want the d32 part they can separately enable that. That is the status quo as well.
IOW, your vfp3 implies d32 and that's exactly what we don't want any more, I think. (Also your vfp2 does not imply d32 which seems inconsistent.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list above is just the LLVM features expanded - yes it's inconsistent but it's the way it is for backwards compatibility reasons. I agree we should probably expose something simpler.
I think there's a bunch of tradeoffs with how we could model this in Rust - your solution would trip up many engineers from C who expect vfp3 to behave the same in both. Arm are target maintainers of a few 32 bit arm targets now so we should update this set of features in the future and look at getting it stabilised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's the usual trade-off when the C situation is a mess -- do we copy the same mess for Rust so all Rust people also have to deal with it, or do we do something cleaner in Rust and risk confusing people that deal with both.
|
r? @RalfJung |
60d77bc to
4ed68f0
Compare
Fixes #149399 - after checking with an LLVM engineer Rust's feature implications do correctly map to LLVM's. The target specs originally had +vfp3,+d16, but were mistakenly fixed to +vfp3,-d32 which disables vfp3 again.
Some targets specify +vfp2,-d32, and since vfp2 shouldn't imply d32 the -d32 is unneeded.
The list of Arm features is quite old and since Arm is now a target maintainer of many of them we'll go in and update them. We should probably add vfp3d16 and similar as rust has no way to express these right now after d16 was removed.
The LLVM features expand like this:
-neonmight be unnecessary too in many of these cases, but some default CPUs that Rust specifies will turn Neon on so that needs a bit more care. I can't see any LLVM cpus that enable D32.Old description: