Add checks for gpu-kernel calling conv #149991

Flakebi · 2025-12-14T14:30:37Z

The gpu-kernel calling convention has several restrictions that were not enforced by the compiler until now.
This PR adds checks for the following restrictions:

1. Cannot be async
2. Cannot be called
3. Cannot return values; the return type must be `()` or `!`
4. Arguments should be simple, i.e. passed by value. More complicated types can work when you know what you are doing, but this is rather unintuitive and requires knowledge of ABI/compiler internals.
5. The export name should be unmangled, either through `no_mangle` or `export_name`. Kernels are looked up by name on the CPU side, so a mangled name makes them hard to find and is almost always unintentional.
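
As a rough illustration of the list above, here is a minimal sketch of the signature-level rules as a stand-alone checker in stable Rust. It is not the compiler's implementation: `KernelSig`, `RetTy`, `ArgTy`, `Violation`, and `check` are hypothetical names invented for this example, and the "cannot be called" rule is left out because it applies to call sites rather than to the signature.

```rust
// Hypothetical model of a `gpu-kernel` function signature (not a rustc type).
#[derive(Debug, Clone, PartialEq)]
enum RetTy {
    Unit,  // `()`
    Never, // `!`
    Other, // anything that returns a value
}

#[derive(Debug, Clone, PartialEq)]
enum ArgTy {
    Simple, // assumed to be passed by value
    Other,  // anything more complicated
}

struct KernelSig {
    is_async: bool,
    ret: RetTy,
    args: Vec<ArgTy>,
    // `Some(name)` when `no_mangle` or `export_name` fixes the symbol name.
    unmangled_name: Option<String>,
}

#[derive(Debug, PartialEq)]
enum Violation {
    Async,
    ReturnsValue,
    NonSimpleArg(usize), // index of the offending argument
    MangledName,
}

// Collects every rule from the list that the signature breaks.
fn check(sig: &KernelSig) -> Vec<Violation> {
    let mut found = Vec::new();
    if sig.is_async {
        found.push(Violation::Async);
    }
    if sig.ret == RetTy::Other {
        found.push(Violation::ReturnsValue);
    }
    for (index, arg) in sig.args.iter().enumerate() {
        if *arg != ArgTy::Simple {
            found.push(Violation::NonSimpleArg(index));
        }
    }
    if sig.unmangled_name.is_none() {
        found.push(Violation::MangledName);
    }
    found
}

fn main() {
    let diverging = KernelSig {
        is_async: false,
        ret: RetTy::Never,
        args: vec![ArgTy::Simple, ArgTy::Other],
        unmangled_name: Some("my_kernel".to_string()),
    };
    // Only the second argument is reported for this signature.
    println!("{:?}", check(&diverging));
}
```

In this sketch a return type of `!` is accepted just like `()`, matching rule 3.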

Tracking issue: #135467
amdgpu target tracking issue: #135024

@workingjubilee, these should be all the restrictions we talked about a year ago.

cc @RDambrosio016 @kjetilkjeka for nvptx

rustbot · 2025-12-14T14:30:41Z

r? @WaffleLapkin

rustbot has assigned @WaffleLapkin.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-12-14T15:09:40Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

WaffleLapkin · 2025-12-15T13:20:57Z

r? workingjubilee

As I'm completely missing context

rustbot · 2025-12-15T13:21:02Z

workingjubilee is currently at their maximum review capacity.
They may take a while to respond.

workingjubilee

The AST-level code looks good.

Some details on messaging here. I'm not committed to a precise message on these, which is why it's a bit "multiple choice" here, just wondering if these could be improved. In one or two cases it is a must-change.

Should we be enforcing a maximum number of arguments, also? Probably not if there's no cross-driver consensus on that, but maybe?

compiler/rustc_hir_typeck/messages.ftl

compiler/rustc_lint/src/gpukernel_abi.rs

workingjubilee · 2025-12-16T16:09:21Z

compiler/rustc_lint/src/gpukernel_abi.rs

+    /// This lint is issued when it detects a probable mistake in a signature.
+    IMPROPER_GPU_KERNEL_ARG,
+    Warn,
+    "simple arguments of gpu-kernel functions"


This line should capture the reason for the lint, not what it is checking for, so something like

Suggested change

-    "simple arguments of gpu-kernel functions"
+    "GPU kernel entry points have a limited calling convention"

Flakebi · 2025-12-17T17:06:14Z

👍 I’ll call it ABI instead of calling convention; it seems like Rust calls it ABI in most places. (Unless there’s a difference I’m missing.)

workingjubilee · Dec 30, 2025

Technically, ABI is notionally a superset of calling convention.

I'm cool with people abusing the term a bit when it's clear what it means from context and is more concise, such as in diagnostic messaging here.

rustbot · 2025-12-16T16:14:20Z

Reminder, once the PR becomes ready for a review, use @rustbot ready.

Flakebi

Thanks for the review!

Some context on the internal workings:

1. On the CPU side, a program passes arguments to a kernel
2. The “API” takes these arguments and writes them into GPU memory
3. The kernel on the GPU gets a pointer to this memory
4. When the kernel accesses arguments, it reads from this memory

(I think nvidia and amd work the same here. I’m not too familiar with nvidia, but this seems to suggest so: https://github.com/Rust-GPU/rust-cuda/blob/44c44baf6fb738d5ffec25aac5db8af02514e890/crates/rustc_codegen_nvvm/src/abi.rs#L60)

So the number or size of arguments doesn’t really matter; it’s all memory anyway.
We could also make struct arguments work (maybe, I didn’t look into the details), but Rust would need to pass them by value, whereas currently it changes them to pass by pointer.
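
The flow described above (arguments are written into a memory block, and the kernel reads them back through a pointer) can be sketched on the CPU alone. This is a toy model under stated assumptions: the helper names, the little-endian encoding, and the alignment rule are made up for illustration, and real argument layouts are decided by the target ABI and the driver.

```rust
// Writes `bytes` into `buf` at the next offset that is a multiple of `align`
// and returns that offset. Stands in for the "API" filling the argument block.
fn push_arg(buf: &mut Vec<u8>, bytes: &[u8], align: usize) -> usize {
    let offset = (buf.len() + align - 1) / align * align;
    buf.resize(offset, 0); // zero padding up to the aligned offset
    buf.extend_from_slice(bytes);
    offset
}

// Stand-ins for the kernel side: read a value from the argument block.
fn read_u32(args: &[u8], offset: usize) -> u32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&args[offset..offset + 4]);
    u32::from_le_bytes(raw)
}

fn read_u64(args: &[u8], offset: usize) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&args[offset..offset + 8]);
    u64::from_le_bytes(raw)
}

fn read_f32(args: &[u8], offset: usize) -> f32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&args[offset..offset + 4]);
    f32::from_le_bytes(raw)
}

// Packs three example arguments and returns the block plus their offsets.
fn pack_example() -> (Vec<u8>, [usize; 3]) {
    let mut block = Vec::new();
    let len_off = push_arg(&mut block, &1024u32.to_le_bytes(), 4);
    // A made-up device address, aligned to 8 bytes, so padding is inserted.
    let ptr_off = push_arg(&mut block, &0xdead_beef_u64.to_le_bytes(), 8);
    let scale_off = push_arg(&mut block, &0.5f32.to_le_bytes(), 4);
    (block, [len_off, ptr_off, scale_off])
}

fn main() {
    let (block, [len_off, ptr_off, scale_off]) = pack_example();
    // The "kernel" only sees the block and reads each argument from memory.
    println!(
        "len={} ptr={:#x} scale={}",
        read_u32(&block, len_off),
        read_u64(&block, ptr_off),
        read_f32(&block, scale_off)
    );
}
```

Since every argument ends up as bytes at some offset, neither the count nor the size of the arguments changes how the block is read, which is the point made above.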

compiler/rustc_hir_typeck/messages.ftl

compiler/rustc_lint/src/gpukernel_abi.rs

workingjubilee

> Some context on the internal workings:
>
> 1. On the CPU side, a program passes arguments to a kernel
> 2. The “API” takes these arguments and writes them into GPU memory
> 3. The kernel on the GPU gets a pointer to this memory
> 4. When the kernel accesses arguments, it reads from this memory

Ah, yeah. I know of the general idea here, though I am foggy on specifics in many cases.

And even in some hypothetical where the driver isn't writing them into the GPU memory, it would have to invoke some dark magic on the GPU to put it directly into registers anyways, which is just saying "did you know: some computer hardware has memory-mapped registers?"

(I am only making that comment because I vaguely remember one GPU driver involving something similar.)

At some point we are left with doing codegen to accept arguments and we can expect that to have some target-specific nuances, even if it's just stack alignment.

For structs, I'd rather we avoid thinking too hard about the struct question for cases where they aren't just repr(transparent) primitives until we have tinkered with things a bit more. If repr(C) structs are widely and fully intentionally supported, that could be worth tackling.
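
To make the distinction above concrete, here is a small stable-Rust sketch. `ElementCount` and `Dims` are hypothetical types invented for this example. It only demonstrates layout facts that Rust guarantees for these attributes; whether the lint accepts either type as a kernel argument is not stated in this thread.

```rust
use std::mem::{align_of, size_of};

// A `repr(transparent)` wrapper has the same layout and ABI as its single field.
#[repr(transparent)]
#[derive(Clone, Copy, Debug)]
struct ElementCount(u32);

// A `repr(C)` struct has a defined field order and offsets, but that alone
// does not say how it would be handed to a kernel (by value or by pointer).
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct Dims {
    width: u32,
    height: u32,
}

fn main() {
    let count = ElementCount(7);
    let dims = Dims { width: 640, height: 480 };

    // The wrapper is indistinguishable from a plain `u32` in size and alignment.
    assert_eq!(size_of::<ElementCount>(), size_of::<u32>());
    assert_eq!(align_of::<ElementCount>(), align_of::<u32>());

    // Two `u32` fields laid out back to back, with no padding between them.
    assert_eq!(size_of::<Dims>(), 2 * size_of::<u32>());

    println!("{:?} {:?} {}x{}", count, dims, dims.width, dims.height);
}
```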

compiler/rustc_lint/src/gpukernel_abi.rs

workingjubilee · 2025-12-30T22:40:06Z

compiler/rustc_lint/src/gpukernel_abi.rs

+                | ty::Slice(_)
+                | ty::Str
+                | ty::Tuple(_)
+                | ty::UnsafeBinder(_) => false,


I believe the answer for ty::UnsafeBinder should be the recursive answer to this question by default (I think the implied result is quite useless given the answer to other things, but that's fine).

Thus this should... probably use an impl of TypeFolder to apply the recursive traversal, I think? I do not consider that required, but I thought I should note it because this seems like a sorta-classic ..super_fold.. case, where we want to only consider the "true" type instead of binders and such: https://rustc-dev-guide.rust-lang.org/ty-fold.html

workingjubilee · 2025-12-30T23:01:24Z

compiler/rustc_lint/src/gpukernel_abi.rs

+                | ty::CoroutineClosure(_, _)
+                | ty::CoroutineWitness(..)
+                | ty::Dynamic(_, _)
+                | ty::Error(_)


We should either return or continue if the type is so erroneous.
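
The two review comments above (look through binders to the "true" type, and bail out on erroneous types) can be sketched with a toy type tree. This is plain recursion over a hypothetical `Ty` enum, not rustc's `TypeFolder` machinery, and the choice of which kinds count as simple is an assumption made for the example.

```rust
// Hypothetical, heavily reduced stand-in for rustc's type kinds.
#[derive(Debug)]
enum Ty {
    Int,
    Float,
    Slice(Box<Ty>),
    Tuple(Vec<Ty>),
    UnsafeBinder(Box<Ty>),
    Error,
}

// Returns `Some(true)` for an argument type assumed to be simple,
// `Some(false)` for one that should trigger the lint, and `None` when the
// type is erroneous and the check should be skipped instead of reported.
fn is_simple_arg(ty: &Ty) -> Option<bool> {
    match ty {
        Ty::Int | Ty::Float => Some(true),
        Ty::Slice(_) | Ty::Tuple(_) => Some(false),
        // A binder defers to the answer for the type it wraps.
        Ty::UnsafeBinder(inner) => is_simple_arg(inner),
        // An earlier error was already reported; do not pile a lint on top.
        Ty::Error => None,
    }
}

fn main() {
    let args = vec![
        Ty::Float,
        Ty::UnsafeBinder(Box::new(Ty::Int)),
        Ty::UnsafeBinder(Box::new(Ty::Slice(Box::new(Ty::Int)))),
        Ty::Tuple(vec![Ty::Int, Ty::Float]),
        Ty::Error,
    ];
    for arg in &args {
        match is_simple_arg(arg) {
            Some(true) => println!("{:?}: ok", arg),
            Some(false) => println!("{:?}: would warn", arg),
            None => println!("{:?}: skipped", arg),
        }
    }
}
```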

workingjubilee · 2025-12-30T23:05:29Z

compiler/rustc_lint/src/gpukernel_abi.rs

+                | ty::Infer(_)
+                | ty::Never
+                | ty::Param(_)


normalizing should have handled Infer and Param cases I think?

The `gpu-kernel` calling convention has several restrictions that were not enforced by the compiler until now. Add the following restrictions:

1. Cannot be async
2. Cannot be called
3. Cannot return values, return type must be `()` or `!`
4. Arguments should be primitives, i.e. passed by value. More complicated types can work when you know what you are doing, but it is rather unintuitive, one needs to know ABI/compiler internals.
5. Export name should be unmangled, either through `no_mangle` or `export_name`. Kernels are searched by name on the CPU side, having a mangled name makes it hard to find and probably almost always unintentional.

Flakebi · 2026-01-01T17:36:20Z

I tried rewriting the lint pass using TypeFolder (plus fixing the other comments).

rustbot assigned WaffleLapkin Dec 14, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 14, 2025

This was referenced Dec 14, 2025

Tracking Issue for the gpu-kernel ABI #135467

Open

Tracking Issue for amdgcn target #135024

Open

Add inline asm support for amdgpu #149793

Open

Flakebi force-pushed the gpu-kernel-cc branch from b6cb0c2 to 0a9373c Compare December 14, 2025 15:09

Flakebi force-pushed the gpu-kernel-cc branch from 0a9373c to 1e9b1dc Compare December 14, 2025 15:10

Flakebi force-pushed the gpu-kernel-cc branch from 1e9b1dc to 10b32d6 Compare December 14, 2025 20:41

Flakebi force-pushed the gpu-kernel-cc branch from 10b32d6 to 88ad16c Compare December 15, 2025 13:17

rustbot assigned workingjubilee and unassigned WaffleLapkin Dec 15, 2025

workingjubilee requested changes Dec 16, 2025

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 16, 2025

Flakebi force-pushed the gpu-kernel-cc branch from 88ad16c to 01f1e1b Compare December 17, 2025 17:21

Flakebi commented Dec 17, 2025

Flakebi force-pushed the gpu-kernel-cc branch 2 times, most recently from 6e7c9a0 to 6868f66 Compare December 20, 2025 15:06

workingjubilee reviewed Dec 30, 2025

Flakebi force-pushed the gpu-kernel-cc branch from 6868f66 to 33add36 Compare January 1, 2026 17:35