Enhancing spatial safety: Better array-bounds checking in C (and Linux)
The C language has historically suffered from a lack of proper bounds-checking on all kinds of arrays. The Linux Kernel Self-Protection Project has been addressing this issue for several years. In this presentation, we will learn about the most recent hardening efforts to resolve the problem of bounds-checking, particularly for fixed-size and flexible arrays.
We will explore the different mechanisms being used to harden key APIs like memcpy() against buffer overflows, which includes the use of some interesting built-in compiler functions. We will also talk about a couple of recent compiler options like -fstrict-flex-arrays and -Wflex-array-member-not-at-end, as well as the new counted_by attribute released in Clang 18 and GCC 15, which helps us gain run-time bounds-checking coverage on flexible arrays.
Overall, we will discuss how various challenges have been overcome, and highlight the innovations developed to solve the problem of array bounds-checking in both C and the Linux kernel once and for all.
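As a rough illustration of the kind of compiler built-in involved, here is a minimal userspace sketch of a bounds-checked copy. It is not the kernel's fortified memcpy(); checked_memcpy(), struct record and fill_name() are hypothetical names made up for this example, and it assumes a GCC- or Clang-compatible compiler that provides __builtin_object_size().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical checked copy (not a kernel API). It is a macro so that the
 * builtin sees the destination expression itself rather than a plain
 * pointer parameter. Evaluates to 0 on success, or to -1 when the copy
 * would overflow the destination.
 */
#define checked_memcpy(dst, src, len)                       \
    (__builtin_object_size((dst), 1) < (size_t)(len)        \
        ? -1                                                \
        : (memcpy((dst), (src), (len)), 0))

/* Hypothetical structure with a fixed-size array member. */
struct record {
    char name[8];
    int id;
};

/* Copies len bytes of src into the name field of a local record. */
static int fill_name(const char *src, size_t len)
{
    struct record r = { .id = 1 };

    /* Mode 1 asks for the size of the closest enclosing subobject: name[8]. */
    return checked_memcpy(r.name, src, len);
}
```

Mode 1 is what lets the check catch a write that stays inside struct record but runs past the end of name.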
I’ll go back to Japan for Open Source Summit Japan and the Linux Plumbers Conference this December. In the meantime, the slides are below if you’d like to check them out. Thanks! 🙂
I presented at the Open Source Summit Europe in Amsterdam this afternoon. 🇳🇱🐧🛡⚔️
Upstream Kernel Hardening: Progress on enabling -Wflex-array-member-not-at-end
The -Wflex-array-member-not-at-end compiler option was introduced in GCC 14. At the time, it revealed around 60,000 warnings in the upstream Linux kernel. While many of these were duplicates, about 650 are unique and require individual auditing and attention. These issues span different categories and vary in complexity, which adds to the challenge of globally enabling this compiler option in the upstream Linux kernel.
In this presentation, we’ll share the progress we’ve made on this work as part of the Kernel Self-Protection Project (KSPP) over the past few months. We’ll go over the challenges we’ve encountered, show concrete code examples, and demonstrate how to fix these kinds of problems. We’ll also discuss why enabling this option is important for the kernel, and how we plan to complete this work in the near future.
Whether you’re a seasoned kernel developer or someone looking to start contributing upstream, this presentation will introduce useful helpers and strategies you can use to fix existing code or implement new functionality, and in doing so, help us harden the upstream Linux kernel for the benefit of everyone.
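For readers new to this warning, here is a minimal sketch of the pattern it reports. struct flex_msg and struct composite are hypothetical types invented for this example, and embedding a flexible structure like this relies on a GNU extension that GCC and Clang accept.

```c
#include <stddef.h>

/* Hypothetical flexible structure: fixed header plus a flexible array. */
struct flex_msg {
    unsigned int len;
    unsigned char data[];   /* flexible-array member */
};

/*
 * Hypothetical composite structure. The embedded struct flex_msg is not
 * the last member, so its flexible array ends up in the middle. This is
 * the situation -Wflex-array-member-not-at-end reports.
 */
struct composite {
    struct flex_msg msg;
    unsigned int status;
};

/* Offset at which msg.data[] begins inside struct composite. */
static size_t data_offset(void)
{
    return offsetof(struct composite, msg) + offsetof(struct flex_msg, data);
}

/* Offset of the member that follows the embedded flexible structure. */
static size_t status_offset(void)
{
    return offsetof(struct composite, status);
}
```

Both offsets are the same here, so any write through msg.data[] lands on status.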
Originally, I opted for the __struct_group() approach to fix these issues.
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index 73516e263627..34c11644d5d7 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -227,12 +227,15 @@ enum ars_masks {
  */
 struct nd_cmd_pkg {
-	__u64 nd_family; /* family of commands */
-	__u64 nd_command;
-	__u32 nd_size_in; /* INPUT: size of input args */
-	__u32 nd_size_out; /* INPUT: size of payload */
-	__u32 nd_reserved2[9]; /* reserved must be zero */
-	__u32 nd_fw_size; /* OUTPUT: size fw wants to return */
+	/* New members MUST be added within the __struct_group() macro below. */
+	__struct_group(nd_cmd_pkg_hdr, __hdr, /* no attrs */,
+		__u64 nd_family; /* family of commands */
+		__u64 nd_command;
+		__u32 nd_size_in; /* INPUT: size of input args */
+		__u32 nd_size_out; /* INPUT: size of payload */
+		__u32 nd_reserved2[9]; /* reserved must be zero */
+		__u32 nd_fw_size; /* OUTPUT: size fw wants to return */
+	);
 	unsigned char nd_payload[]; /* Contents of call */
 };
The idea behind this approach is to _separate_ the flexible-array member from the rest of the members in the flexible structure, while at the same time a new type for the “header part” of the flexible structure is created. We then proceed to use the newly created type to replace the type of the objects causing trouble –the ones causing the -Wflex-array-member-not-at-end warnings– in the composite structures.
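The idea can be sketched in userspace roughly as follows. The sketch writes out by hand approximately what __struct_group() generates; struct pkt, struct pkt_hdr, struct request and hdr_view_of_size() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical flexible structure, laid out the way __struct_group() would. */
struct pkt {
    union {
        /* Anonymous struct: members stay reachable as pkt.family, pkt.size. */
        struct {
            unsigned long long family;
            unsigned int size;
        };
        /* Tagged struct with identical members: the new "header" type. */
        struct pkt_hdr {
            unsigned long long family;
            unsigned int size;
        } hdr;
    };
    unsigned char payload[];   /* flexible-array member, kept apart */
};

/*
 * Hypothetical composite structure. It embeds only the header type, so no
 * flexible-array member ends up in the middle.
 */
struct request {
    struct pkt_hdr pkt;
    unsigned int flags;
};

/* Both views alias the same storage: a write via .size is seen via .hdr.size. */
static unsigned int hdr_view_of_size(void)
{
    struct pkt p = { .size = 7 };

    return p.hdr.size;
}
```

Because the header type has exactly the same members as the anonymous struct, its size matches the offset of the flexible-array member, which is what makes the replacement safe in the composite structure.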
However, the approach briefly described above is usually better suited for fixing instances of structures with a flexible-array member in the middle that are intended to live on the heap –I’ll go into more detail about this approach in a future post.
So, after some time, I realized that using the DEFINE_RAW_FLEX() helper was probably a better approach in this particular case, since basically all the instances triggering the -Wflex-array-member-not-at-end warnings were on-stack objects.
The DEFINE_FLEX() and DEFINE_RAW_FLEX() macros were specifically designed to define automatic (on-stack) objects of a flexible structure type, where the size of the flexible-array member is known at compile time.
So, I submitted a new version of the patch above, this time using DEFINE_RAW_FLEX(), and without any changes in the UAPI header because we didn’t need to change struct nd_cmd_pkg anymore:
Can this keep the C99 init-style with something like (untested):
_DEFINE_FLEX(struct nd_cmd_pkg, nd_cmd, nd_payload,
		sizeof(struct nd_intel_get_security_state), {
	.pkg = {
		.nd_command = NVDIMM_INTEL_GET_SECURITY_STATE,
		.nd_family = NVDIMM_FAMILY_INTEL,
		.nd_size_out =
			sizeof(struct nd_intel_get_security_state),
		.nd_fw_size =
			sizeof(struct nd_intel_get_security_state),
	},
});
?
At the time, the short answer was: no –the helper was not able to do that “cleanly.”
However, Dan’s question, along with another issue I had recently been trying to address, led me to take another look at the internals of _DEFINE_FLEX():
391 /**
392 * _DEFINE_FLEX() - helper macro for DEFINE_FLEX() family.
393 * Enables caller macro to pass (different) initializer.
394 *
395 * @type: structure type name, including "struct" keyword.
396 * @name: Name for a variable to define.
397 * @member: Name of the array member.
398 * @count: Number of elements in the array; must be compile-time const.
399 * @initializer: initializer expression (could be empty for no init).
400 */
401 #define _DEFINE_FLEX(type, name, member, count, initializer...) \
402 _Static_assert(__builtin_constant_p(count), \
403 "onstack flex array members require compile-time..."); \
404 union { \
405 u8 bytes[struct_size_t(type, member, count)]; \
406 type obj; \
407 } name##_u initializer; \
408 type *name = (type *)&name##_u
The “macro wizardry” behind this helper allows for the allocation of the total space needed for an instance of a flexible structure along with its flexible-array member on the stack. As mentioned above, there are situations where the size of the flexible-array member is known at compile time. In these cases, we can rely on the DEFINE_FLEX() family of helpers to declare the necessary objects, ensuring that any flexible-array member remains the last member of the composite structure –not placed in the middle.
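The union trick can be sketched in userspace roughly as follows. SKETCH_DEFINE_FLEX(), struct sample and sum_on_stack() are hypothetical stand-ins, not the kernel code: the sketch leaves out the _Static_assert on count and replaces struct_size_t() with a plain, unchecked size computation.

```c
#include <stddef.h>

/* Hypothetical flexible structure, used only for this illustration. */
struct sample {
    unsigned int count;
    unsigned short items[];
};

/*
 * Simplified stand-in for the kernel helper. The bytes[] member reserves
 * room for the structure plus count array elements, and obj overlays it.
 * The initializer is pasted right after the union object, so callers have
 * to spell out ".obj" themselves.
 */
#define SKETCH_DEFINE_FLEX(type, name, member, count, ...)              \
    union {                                                             \
        unsigned char bytes[sizeof(type) +                              \
                            (count) * sizeof(((type *)0)->member[0])];  \
        type obj;                                                       \
    } name##_u __VA_ARGS__;                                             \
    type *name = (type *)&name##_u

/* Builds a 4-element instance on the stack and returns the sum of its items. */
static unsigned int sum_on_stack(void)
{
    SKETCH_DEFINE_FLEX(struct sample, s, items, 4, = { .obj = { .count = 4 } });
    unsigned int i, sum = 0;

    for (i = 0; i < s->count; i++)
        s->items[i] = (unsigned short)(i + 1);
    for (i = 0; i < s->count; i++)
        sum += s->items[i];

    return sum;
}
```

The flexible-array member is still the last member of obj, and the extra stack space comes from bytes[] rather than from anything placed after the structure.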
Dan noticed that DEFINE_FLEX() and DEFINE_RAW_FLEX() are just wrappers for _DEFINE_FLEX(), and that these wrappers don’t allow for external static initialization:
However, _DEFINE_FLEX() _does_ have the infrastructure to initialize the members of the TYPE object created by this helper –with just a _small_ tweak: the obj member created by the helper at line 406 must be exposed to the call site, as shown in DEFINE_FLEX() at line 443 above when COUNTER is set to COUNT.
So I replied with the following:
The code below works - however, notice that in this case we should
go through 'obj', which is an object defined in _DEFINE_FLEX().
_DEFINE_FLEX(struct nd_cmd_pkg, nd_cmd, nd_payload,
		sizeof(struct nd_intel_get_security_state), = {
	.obj = {
		.nd_command = NVDIMM_INTEL_GET_SECURITY_STATE,
		.nd_family = NVDIMM_FAMILY_INTEL,
		.nd_size_out =
			sizeof(struct nd_intel_get_security_state),
		.nd_fw_size =
			sizeof(struct nd_intel_get_security_state),
	},
});
Then, in a subsequent e-mail I commented:
Now, I can modify the helper like this:
diff --git a/include/linux/overflow.h b/include/linux/overflow.h
index 69533e703be5..170d3cfe7ecc 100644
--- a/include/linux/overflow.h
+++ b/include/linux/overflow.h
@@ -404,7 +404,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
union { \
u8 bytes[struct_size_t(type, member, count)]; \
type obj; \
- } name##_u initializer; \
+ } name##_u = { .obj initializer }; \
type *name = (type *)&name##_u
/**
and then we can use the helper as follows:
_DEFINE_FLEX(struct nd_cmd_pkg, nd_cmd, nd_payload,
		sizeof(struct nd_intel_get_security_state), = {
	.nd_command = NVDIMM_INTEL_GET_SECURITY_STATE,
	.nd_family = NVDIMM_FAMILY_INTEL,
	.nd_size_out =
		sizeof(struct nd_intel_get_security_state),
	.nd_fw_size =
		sizeof(struct nd_intel_get_security_state),
});
OK, I'll go and update the helper.
-Gustavo
And that’s exactly what I did in the patch below. With that change, the _DEFINE_FLEX() helper now works correctly and can statically initialize struct members without exposing any internals. 🙂
diff --git a/include/linux/overflow.h b/include/linux/overflow.h
index f33d74dac06f2b..7b7be27ca11318 100644
--- a/include/linux/overflow.h
+++ b/include/linux/overflow.h
@@ -396,7 +396,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
* @name: Name for a variable to define.
* @member: Name of the array member.
* @count: Number of elements in the array; must be compile-time const.
- * @initializer: initializer expression (could be empty for no init).
+ * @initializer: Initializer expression (e.g., pass `= { }` at minimum).
*/
#define _DEFINE_FLEX(type, name, member, count, initializer...) \
_Static_assert(__builtin_constant_p(count), \
@@ -404,7 +404,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
union { \
u8 bytes[struct_size_t(type, member, count)]; \
type obj; \
- } name##_u initializer; \
+ } name##_u = { .obj initializer }; \
type *name = (type *)&name##_u
/**
@@ -444,7 +444,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
* elements in array @member.
*/
#define DEFINE_FLEX(TYPE, NAME, MEMBER, COUNTER, COUNT) \
- _DEFINE_FLEX(TYPE, NAME, MEMBER, COUNT, = { .obj.COUNTER = COUNT, })
+ _DEFINE_FLEX(TYPE, NAME, MEMBER, COUNT, = { .COUNTER = COUNT, })
As seen in the patch above, I also fixed the DEFINE_FLEX() wrapper.
So, here is a short example of how to use this helper to statically initialize members in a structure –let’s assume FIXED_SIZE == 1:
If you don’t need to initialize any members to specific values, just pass = {} as the argument or, probably better, use DEFINE_RAW_FLEX() or DEFINE_FLEX() instead.
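For illustration only, here is a rough userspace sketch of how the updated initializer form behaves. SKETCH_DEFINE_FLEX() is a simplified stand-in for the helper, and struct foo and init_and_sum() are hypothetical; none of this is the kernel code, and FIXED_SIZE is simply assumed to be 1.

```c
#include <stddef.h>

#define FIXED_SIZE 1   /* assumed element count for this sketch */

/* Hypothetical flexible structure; not a real kernel type. */
struct foo {
    int a;
    int b;
    size_t count;
    unsigned char array[];
};

/*
 * Simplified stand-in mirroring the updated helper: the initializer is
 * applied to .obj inside the macro, so callers never name that member.
 */
#define SKETCH_DEFINE_FLEX(type, name, member, count, ...)              \
    union {                                                             \
        unsigned char bytes[sizeof(type) +                              \
                            (count) * sizeof(((type *)0)->member[0])];  \
        type obj;                                                       \
    } name##_u = { .obj __VA_ARGS__ };                                  \
    type *name = (type *)&name##_u

/* Statically initializes a, b and count, then fills the single array element. */
static int init_and_sum(void)
{
    SKETCH_DEFINE_FLEX(struct foo, f, array, FIXED_SIZE, = {
        .a = 1,
        .b = 2,
        .count = FIXED_SIZE,
    });

    f->array[0] = 3;
    return f->a + f->b + (int)f->count + f->array[0];
}
```

The call site only lists the members of struct foo; the union and its obj member stay an internal detail of the macro.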
While in Australia 🇦🇺, I had the honor of being invited to give a guest talk to graduate and master’s students at The University of Adelaide. It was a truly special experience because it was my first time presenting at a university, and one I deeply value as a meaningful milestone in my career. 🙂🙏🏼
Enhancing spatial safety: Better array-bounds checking in C (and Linux) (University of Adelaide –Guest talk)
The C language has historically suffered from a lack of proper bounds-checking on all kinds of arrays. The Kernel Self-Protection Project has been addressing this issue for several years. In this presentation, we will learn about the most recent hardening efforts to resolve the problem of bounds-checking, particularly for fixed-size and flexible arrays.
We will explore the different mechanisms being used to harden key APIs like memcpy() against buffer overflows, which includes the use of some interesting built-in compiler functions. We will also talk about a couple of recent compiler options like -fstrict-flex-arrays and -Wflex-array-member-not-at-end, as well as the new counted_by attribute released in Clang-18 a year ago, which helps us gain run-time bounds-checking coverage on flexible arrays.
Overall, we will discuss how various challenges have been overcome, and highlight the innovations developed to solve the problem of array bounds-checking in both C and the Linux kernel once and for all.
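As a small sketch of what annotating a flexible array looks like, the example below ties a flexible-array member to its counter. COUNTED_BY(), struct samples, samples_alloc() and samples_set() are hypothetical names for this illustration; the macro expands to the counted_by attribute only when the compiler reports support for it, and to nothing otherwise. The explicit check in samples_set() is ordinary C and works either way; the attribute is what additionally lets the compiler's own run-time checks know the array's bounds.

```c
#include <stddef.h>
#include <stdlib.h>

/* Expands to the attribute only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical structure: count tracks how many elements values[] holds. */
struct samples {
    size_t count;
    int values[] COUNTED_BY(count);
};

/* Allocates room for count elements; returns NULL on allocation failure. */
static struct samples *samples_alloc(size_t count)
{
    struct samples *s = calloc(1, sizeof(*s) + count * sizeof(s->values[0]));

    if (s)
        s->count = count;   /* set the counter before touching values[] */
    return s;
}

/* Bounds-checked store: returns 0 on success, -1 when idx is out of range. */
static int samples_set(struct samples *s, size_t idx, int value)
{
    if (idx >= s->count)
        return -1;
    s->values[idx] = value;
    return 0;
}
```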
Earlier this year, I traveled to Australia 🇦🇺 to present for the second consecutive year at the Everything Open conference in Adelaide. I was so happy to be back in Australia – it was a great experience to travel to the other side of the world once again to speak about upstream Linux kernel hardening and share the work we do in the Kernel Self-Protection Project. ⚔️🛡️🐧
Huge thanks to the organizers for inviting me to present! 🙂🙌🏽
Enhancing spatial safety in the Linux kernel: Fixing thousands of -Wfamnae warnings
The introduction of the new -Wflex-array-member-not-at-end compiler option, released in GCC-14, has revealed approximately 60,000 warnings in the Linux kernel. Among them, some legitimate bugs have been uncovered.
In this presentation, we will explore in detail the different strategies we are employing to resolve all these warnings. These methods have already helped us resolve about 30% of them. Our ultimate goal in the Kernel Self-Protection Project is to globally enable this option in mainline, further enhancing the security of the upstream Linux kernel in the spatial safety domain.
Additionally, we will briefly review the recent history of hardening efforts that have led to the unveiling of these tens of thousands of warnings. This process illustrates the extensive and gradual nature of hardening the kernel, highlighting the challenges and persistence required to enhance its security. Looking ahead, after enabling this compiler option in mainline, I will briefly discuss the next challenge the Kernel Self-Protection Project will likely focus on.
This year I had the amazing experience of traveling to Sweden 🇸🇪 (and Denmark 🇩🇰 ) for the first time to present at the Lund Linux Conference (https://lundlinuxcon.org/?page=current). 🐧 🗣️ 🎙️
It’s a small but very neat conference. Both the audience and the organizers were awesome; they made me feel right at home, and I even made some cool new friends! I really hope to return next year. 😃
The videos of the presentations were uploaded a few days ago, so here’s my talk about upstream Linux kernel hardening. 🐧 ⚔️ 🛡️ Thanks!
The maintainer has already taken this patch, and it will soon land in mainline and a couple of stable trees. 😃🐧
Here’s a link to the slides and video from my latest presentation at Linux Plumbers Conference, where I discuss the ongoing efforts to globally enable the -Wflex-array-member-not-at-end compiler option in the upstream Linux kernel:
I’m really excited to share that my first presentation of 2025 will be in Adelaide, Australia. 🙌🏽😃
This will be my second time speaking at the Everything Open conference, and I’m really happy to be invited back. As always, feel free to say hi if you see me around!
I’ll be speaking about some of the recent progress on upstream Linux kernel hardening. ⚔️🛡️🐧
Here’s a link to the slides and video from my latest presentation on the ongoing efforts to globally enable the -Wflex-array-member-not-at-end compiler option in the upstream Linux kernel: