I received the following request this evening and it absolutely made my day. 🙂
▫️ “I’m a mechatronics student and I discovered the project from this block post: “https://embeddedor.com/blog/2024/09/28/one-simple-and-rewarding-way-to-contribute-to-the-linux-kernel-fix-coverity-issues/” and as such I’m instersted in contributing as well”
More than a year ago, I wrote a blog post to guide people to resources on how to start contributing to the Linux kernel by fixing Coverity issues. Every few weeks, I receive a request like the one above for access to the Coverity scans, and it makes my day every single time. It makes me really happy that more and more newcomers are finding inspiration in that blog post. 🙂
Someone might ask ‘Why Coverity in particular?’ The answer is quite simple: that’s how I started my career in the Linux kernel almost ten years ago. (That’s not to say this is the best possible way to start contributing to the kernel. This often depends on people’s previous experience and background. 😉)
I can therefore attest that it’s an effective way to gain experience contributing across the entire kernel tree, because it exposes people to all sorts of technical and social challenges, both of which are essential for learning to navigate the, at times, wild waters of the Linux kernel community.
You can check out the blog post via the following links. And if you think someone else might be interested or get inspired by it, please feel free to share it with your network. ✌🏼:
Now, for anyone asking ‘Why do people even have to ask for access to the Coverity scans for the kernel?’, the answer is straightforward: Black Duck manages the Coverity project directly through a request-for-access scheme, and others and I are only admins for the -rc and linux-next scans. See this for more information: https://scan.coverity.com/about, or contact Black Duck directly. 🙂
First time in South Korea. Three talks in two days. Over 200 minutes of public speaking. Two packed rooms. Made new connections. (My luggage arrived four days after me. 😅)
This week was very intense, and I’ll never forget this first visit to Seoul. I’m a bit exhausted right now, but really grateful.
Thanks, Korea! 🙏🏼🇰🇷♥️
See the abstracts and slides from my presentations below.
Enhancing spatial safety: Better array-bounds checking in C (and Linux)
The C language has historically suffered from a lack of proper bounds-checking on all types of arrays. The Linux Kernel Self-Protection Project has been addressing this issue for several years. In this presentation, we’ll learn about the most recent hardening efforts to resolve the problem of bounds-checking, particularly for fixed-size and flexible arrays.
We’ll explore the different mechanisms being used to harden key APIs like memcpy() against buffer overflows, which includes the use of some interesting built-in compiler functions. We’ll also talk about a couple of recent compiler options like -fstrict-flex-arrays and -Wflex-array-member-not-at-end, as well as the new counted_by attribute introduced in Clang 18 and GCC 15, which helps us gain run-time bounds-checking coverage on flexible arrays.
Overall, we’ll discuss how various challenges have been overcome, and highlight the innovations developed to solve the problem of array bounds-checking in both C and the Linux kernel once and for all.
I delivered the above (90-minute) presentation on November 4 and 5. The slides are basically the same for both sessions.
Below is the video of the presentation I gave on Nov 5. They haven’t uploaded the recording of the Nov 4 presentation yet, but as soon as it’s up, I’ll add it to my Presentations page. I personally liked the Nov 4 session better because the room was packed, and people asked a lot of questions and made some comments.
Upstream Kernel Hardening: Progress on enabling -Wflex-array-member-not-at-end
The -Wflex-array-member-not-at-end compiler option was introduced in GCC 14. It warns about flexible-array members in the middle of composite structures. At the time, it revealed around 60,000 warnings in the upstream Linux kernel. While the vast majority of these are duplicates, about 650 are unique and require individual auditing and resolution. These issues fall into various categories and differ in complexity, which adds to the challenge of globally enabling this flag upstream.
In this presentation, we’ll share the progress we’ve made on this work as part of the Linux Kernel Self-Protection Project (KSPP) over the last year. We’ll go over the challenges we’ve encountered, show concrete code examples, and demonstrate how to fix these kinds of problems. We’ll also discuss why enabling this option is important for the kernel, and how we plan to complete this work in the near future.
Whether you’re a seasoned kernel developer or someone looking to start contributing upstream, this presentation will introduce useful helpers and strategies you can use to fix existing code or implement new functionality, and in doing so, help us harden the Linux kernel for the benefit of everyone.
Enhancing spatial safety: Better array-bounds checking in C (and Linux)
The C language has historically suffered from a lack of proper bounds-checking on all kinds of arrays. The Linux Kernel Self-Protection Project has been addressing this issue for several years. In this presentation, we will learn about the most recent hardening efforts to resolve the problem of bounds-checking, particularly for fixed-size and flexible arrays.
We will explore the different mechanisms being used to harden key APIs like memcpy() against buffer overflows, which includes the use of some interesting built-in compiler functions. We will also talk about a couple of recent compiler options like -fstrict-flex-arrays and -Wflex-array-member-not-at-end, as well as the new counted_by attribute released in Clang 18 and GCC 15, which helps us gain run-time bounds-checking coverage on flexible arrays.
Overall, we will discuss how various challenges have been overcome, and highlight the innovations developed to solve the problem of array bounds-checking in both C and the Linux kernel once and for all.
I’ll go back to Japan for Open Source Summit Japan and the Linux Plumbers Conference this December. In the meantime, the slides are below if you’d like to check them out. Thanks! 🙂
I presented at the Open Source Summit Europe in Amsterdam this afternoon. 🇳🇱🐧🛡⚔️
Upstream Kernel Hardening: Progress on enabling -Wflex-array-member-not-at-end
The -Wflex-array-member-not-at-end compiler option was introduced in GCC 14. At the time, it revealed around 60,000 warnings in the upstream Linux kernel. While many of these are duplicates, about 650 are unique and require individual auditing and attention. These issues span different categories and vary in complexity, which adds to the challenge of globally enabling this compiler option in the upstream Linux kernel.
In this presentation, we’ll share the progress we’ve made on this work as part of the Kernel Self-Protection Project (KSPP) over the past few months. We’ll go over the challenges we’ve encountered, show concrete code examples, and demonstrate how to fix these kinds of problems. We’ll also discuss why enabling this option is important for the kernel, and how we plan to complete this work in the near future.
Whether you’re a seasoned kernel developer or someone looking to start contributing upstream, this presentation will introduce useful helpers and strategies you can use to fix existing code or implement new functionality, and in doing so, help us harden the upstream Linux kernel for the benefit of everyone.
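To make the warning concrete, here is a minimal sketch of the pattern it flags. The types are hypothetical (struct flex and struct composite are not kernel structures), and the sizes assume a common ABI with 4-byte int. Because a flexible-array member has no storage of its own, embedding it in the middle of another structure means its first element would share memory with whatever member comes next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flexible structure: a counter followed by a flexible array. */
struct flex {
	int count;
	int data[];		/* flexible-array member: expected to be last */
};

/*
 * Hypothetical composite structure that embeds the flexible structure
 * somewhere other than at the end. GCC accepts this as an extension;
 * -Wflex-array-member-not-at-end is the option that reports 'hdr'.
 */
struct composite {
	struct flex hdr;	/* flexible-array member now sits in the middle */
	int tail;
};

/* No room is reserved for data[] elements inside the composite (common ABIs). */
static_assert(sizeof(struct composite) == sizeof(struct flex) + sizeof(int),
	      "unexpected layout for this sketch");

/* Returns 1 when hdr.data[0] would occupy the same bytes as 'tail'. */
static int fam_overlaps_tail(void)
{
	size_t fam_start = offsetof(struct composite, hdr) +
			   offsetof(struct flex, data);

	return fam_start == offsetof(struct composite, tail);
}
```

Any write through hdr.data[] in such a layout silently lands on tail, which is the kind of bug the option is meant to surface.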
Originally, I opted for the __struct_group() approach to fix these issues.
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index 73516e263627..34c11644d5d7 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -227,12 +227,15 @@ enum ars_masks {
  */
 struct nd_cmd_pkg {
-	__u64 nd_family; /* family of commands */
-	__u64 nd_command;
-	__u32 nd_size_in; /* INPUT: size of input args */
-	__u32 nd_size_out; /* INPUT: size of payload */
-	__u32 nd_reserved2[9]; /* reserved must be zero */
-	__u32 nd_fw_size; /* OUTPUT: size fw wants to return */
+	/* New members MUST be added within the __struct_group() macro below. */
+	__struct_group(nd_cmd_pkg_hdr, __hdr, /* no attrs */,
+		__u64 nd_family; /* family of commands */
+		__u64 nd_command;
+		__u32 nd_size_in; /* INPUT: size of input args */
+		__u32 nd_size_out; /* INPUT: size of payload */
+		__u32 nd_reserved2[9]; /* reserved must be zero */
+		__u32 nd_fw_size; /* OUTPUT: size fw wants to return */
+	);
 	unsigned char nd_payload[]; /* Contents of call */
 };
The idea behind this approach is to _separate_ the flexible-array member from the rest of the members in the flexible structure, while at the same time a new type for the “header part” of the flexible structure is created. We then proceed to use the newly created type to replace the type of the objects causing trouble –the ones causing the -Wflex-array-member-not-at-end warnings– in the composite structures.
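The idea can be sketched in user space as follows. This is a simplified stand-in for __struct_group(): it takes no attributes argument, STRUCT_GROUP_SKETCH() is a hypothetical name, and struct msg and struct request are made-up types rather than the ones from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's __struct_group(): the members are
 * declared twice inside an anonymous union, once as an anonymous struct
 * (so existing code keeps accessing them directly) and once as a named
 * member of a new tagged "header" type.
 */
#define STRUCT_GROUP_SKETCH(TAG, NAME, MEMBERS...)	\
	union {						\
		struct { MEMBERS };			\
		struct TAG { MEMBERS } NAME;		\
	}

/* Hypothetical flexible structure, loosely modeled on struct nd_cmd_pkg. */
struct msg {
	STRUCT_GROUP_SKETCH(msg_hdr, hdr,
		uint64_t command;
		uint32_t size_in;
		uint32_t size_out;
	);
	unsigned char payload[];	/* flexible-array member stays last */
};

/* The header type must cover everything that precedes the flexible array. */
static_assert(sizeof(struct msg_hdr) == offsetof(struct msg, payload),
	      "struct msg_hdr must match the fixed part of struct msg");

/*
 * The composite structure embeds only the header type, so no
 * flexible-array member ends up in the middle.
 */
struct request {
	struct msg_hdr pkg;
	uint32_t state;
};

/* Returns 1 when both views of 'command' refer to the same storage. */
static int header_views_alias(void)
{
	struct msg m = { .command = 42 };

	return m.hdr.command == 42 &&
	       (void *)&m.command == (void *)&m.hdr.command;
}
```

Code that only needs the fixed part can now declare a struct msg_hdr, while code that needs the payload keeps using struct msg.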
However, the approach briefly described above is usually better suited for fixing instances of structures with a flexible-array member in the middle that are intended to live on the heap –I’ll go into more detail about this approach in a future post.
So, after some time, I realized that using the DEFINE_RAW_FLEX() helper was probably a better approach in this particular case, since basically all the instances triggering the -Wflex-array-member-not-at-end warnings were on-stack objects.
The DEFINE_FLEX() and DEFINE_RAW_FLEX() macros were specifically designed to define automatic (on-stack) objects of a flexible structure type, where the size of the flexible-array member is known at compile time.
So, I submitted a new version of the patch above, this time using DEFINE_RAW_FLEX(), and without any changes in the UAPI header because we didn’t need to change struct nd_cmd_pkg anymore:
Can this keep the C99 init-style with something like (untested):
_DEFINE_FLEX(struct nd_cmd_pkg, nd_cmd, nd_payload,
		sizeof(struct nd_intel_get_security_state), {
	.pkg = {
		.nd_command = NVDIMM_INTEL_GET_SECURITY_STATE,
		.nd_family = NVDIMM_FAMILY_INTEL,
		.nd_size_out =
			sizeof(struct nd_intel_get_security_state),
		.nd_fw_size =
			sizeof(struct nd_intel_get_security_state),
	},
});
?
At the time, the short answer was: no –the helper was not able to do that “cleanly.”
However, Dan’s question, along with another issue I was recently trying to address, led me to take another look at the internals of _DEFINE_FLEX():
391 /**
392  * _DEFINE_FLEX() - helper macro for DEFINE_FLEX() family.
393  * Enables caller macro to pass (different) initializer.
394  *
395  * @type: structure type name, including "struct" keyword.
396  * @name: Name for a variable to define.
397  * @member: Name of the array member.
398  * @count: Number of elements in the array; must be compile-time const.
399  * @initializer: initializer expression (could be empty for no init).
400  */
401 #define _DEFINE_FLEX(type, name, member, count, initializer...) \
402         _Static_assert(__builtin_constant_p(count), \
403                        "onstack flex array members require compile-time..."); \
404         union { \
405                 u8 bytes[struct_size_t(type, member, count)]; \
406                 type obj; \
407         } name##_u initializer; \
408         type *name = (type *)&name##_u
The “macro wizardry” behind this helper allows for the allocation of the total space needed for an instance of a flexible structure along with its flexible-array member on the stack. As mentioned above, there are situations where the size of the flexible-array member is known at compile time. In these cases, we can rely on the DEFINE_FLEX() family of helpers to declare the necessary objects, ensuring that any flexible-array member remains the last member of the composite structure –not placed in the middle.
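A user-space approximation of that trick might look like the sketch below. FLEX_BYTES() stands in for struct_size_t(), and ONSTACK_FLEX_SKETCH() and struct sample are hypothetical names. The real helper lives in include/linux/overflow.h and differs in detail. The expected size assumes a common ABI with no padding in struct sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flexible structure used only for this illustration. */
struct sample {
	uint32_t count;
	uint16_t items[];
};

/* Rough stand-in for struct_size_t(): fixed part plus 'count' elements. */
#define FLEX_BYTES(type, member, count) \
	(sizeof(type) + sizeof(((type *)0)->member[0]) * (count))

/*
 * Sketch of the union trick: 'bytes' reserves the full size on the stack,
 * while 'obj' gives that storage the right type and alignment. 'name' is
 * then a pointer to the flexible structure inside the union.
 */
#define ONSTACK_FLEX_SKETCH(type, name, member, count)			\
	static_assert(__builtin_constant_p(count),			\
		      "count must be a compile-time constant");	\
	union {								\
		uint8_t bytes[FLEX_BYTES(type, member, count)];		\
		type obj;						\
	} name##_u;							\
	type *name = (type *)&name##_u

/* Defines an on-stack object with room for 4 items and returns its size. */
static size_t onstack_flex_demo(void)
{
	ONSTACK_FLEX_SKETCH(struct sample, s, items, 4);

	s->count = 4;
	for (uint32_t i = 0; i < s->count; i++)
		s->items[i] = (uint16_t)(i * 10);

	/* The union is as large as the header plus the 4 array elements. */
	return sizeof(s_u);
}
```

Since the flexible structure is the only typed member of the union, its flexible-array member is still the last thing in the object, which is what keeps the warning quiet.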
Dan noticed that DEFINE_FLEX() and DEFINE_RAW_FLEX() are just wrappers for _DEFINE_FLEX(), and that these wrappers don’t allow for external static initialization:
However, _DEFINE_FLEX() _does_ have the infrastructure to initialize the members of the TYPE object created by this helper –with just a _small_ tweak: the obj member created by the helper at line 406 must be exposed to the call site, as shown in DEFINE_FLEX() at line 443 above when COUNTER is set to COUNT.
So I replied with the following:
The code below works - however, notice that in this case we should
go through 'obj', which is an object defined in _DEFINE_FLEX().
_DEFINE_FLEX(struct nd_cmd_pkg, nd_cmd, nd_payload,
		sizeof(struct nd_intel_get_security_state), = {
	.obj = {
		.nd_command = NVDIMM_INTEL_GET_SECURITY_STATE,
		.nd_family = NVDIMM_FAMILY_INTEL,
		.nd_size_out =
			sizeof(struct nd_intel_get_security_state),
		.nd_fw_size =
			sizeof(struct nd_intel_get_security_state),
	},
});
Then, in a subsequent e-mail I commented:
Now, I can modify the helper like this:
diff --git a/include/linux/overflow.h b/include/linux/overflow.h
index 69533e703be5..170d3cfe7ecc 100644
--- a/include/linux/overflow.h
+++ b/include/linux/overflow.h
@@ -404,7 +404,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
 	union { \
 		u8 bytes[struct_size_t(type, member, count)]; \
 		type obj; \
-	} name##_u initializer; \
+	} name##_u = { .obj initializer }; \
 	type *name = (type *)&name##_u
 /**
and then we can use the helper as follows:
_DEFINE_FLEX(struct nd_cmd_pkg, nd_cmd, nd_payload,
		sizeof(struct nd_intel_get_security_state), = {
	.nd_command = NVDIMM_INTEL_GET_SECURITY_STATE,
	.nd_family = NVDIMM_FAMILY_INTEL,
	.nd_size_out =
		sizeof(struct nd_intel_get_security_state),
	.nd_fw_size =
		sizeof(struct nd_intel_get_security_state),
});
OK, I'll go and update the helper.
-Gustavo
And that’s exactly what I did in the patch below. With that change, the _DEFINE_FLEX() helper now works correctly and can statically initialize struct members without exposing any internals. 🙂
diff --git a/include/linux/overflow.h b/include/linux/overflow.h
index f33d74dac06f2b..7b7be27ca11318 100644
--- a/include/linux/overflow.h
+++ b/include/linux/overflow.h
@@ -396,7 +396,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
  * @name: Name for a variable to define.
  * @member: Name of the array member.
  * @count: Number of elements in the array; must be compile-time const.
- * @initializer: initializer expression (could be empty for no init).
+ * @initializer: Initializer expression (e.g., pass `= { }` at minimum).
  */
 #define _DEFINE_FLEX(type, name, member, count, initializer...) \
 	_Static_assert(__builtin_constant_p(count), \
@@ -404,7 +404,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
 	union { \
 		u8 bytes[struct_size_t(type, member, count)]; \
 		type obj; \
-	} name##_u initializer; \
+	} name##_u = { .obj initializer }; \
 	type *name = (type *)&name##_u
 /**
@@ -444,7 +444,7 @@ static inline size_t __must_check size_sub(size_t minuend, size_t subtrahend)
  * elements in array @member.
  */
 #define DEFINE_FLEX(TYPE, NAME, MEMBER, COUNTER, COUNT) \
-	_DEFINE_FLEX(TYPE, NAME, MEMBER, COUNT, = { .obj.COUNTER = COUNT, })
+	_DEFINE_FLEX(TYPE, NAME, MEMBER, COUNT, = { .COUNTER = COUNT, })
As seen in the patch above, I also fixed the DEFINE_FLEX() wrapper.
So, here is a short example of how to use this helper to statically initialize members in a structure –let’s assume FIXED_SIZE == 1:
If you don’t need to initialize any members to specific values, just pass = {} as the initializer argument, or, probably better, use DEFINE_RAW_FLEX() or DEFINE_FLEX() instead.
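As a rough, user-space sketch of that usage: the block below assumes a hypothetical struct flex and a simplified DEFINE_FLEX_SKETCH() that mirrors the shape of the updated _DEFINE_FLEX() from the patch, with FLEX_BYTES() standing in for struct_size_t(). It is not the kernel helper itself. The members are initialized directly, without naming obj.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_SIZE 1

/* Hypothetical flexible structure used only for this illustration. */
struct flex {
	int a;
	int b;
	uint32_t fam[];
};

/* Rough stand-in for struct_size_t(): fixed part plus 'count' elements. */
#define FLEX_BYTES(type, member, count) \
	(sizeof(type) + sizeof(((type *)0)->member[0]) * (count))

/*
 * Sketch of the updated helper: the initializer is applied to 'obj'
 * inside the macro, so callers never have to name it.
 */
#define DEFINE_FLEX_SKETCH(type, name, member, count, initializer...)	\
	static_assert(__builtin_constant_p(count),			\
		      "count must be a compile-time constant");	\
	union {								\
		uint8_t bytes[FLEX_BYTES(type, member, count)];		\
		type obj;						\
	} name##_u = { .obj initializer };				\
	type *name = (type *)&name##_u

/* Defines an on-stack instance with FIXED_SIZE elements and sums its fields. */
static int init_demo(void)
{
	DEFINE_FLEX_SKETCH(struct flex, instance, fam, FIXED_SIZE,
			   = { .a = 3, .b = 4, });

	/*
	 * Written explicitly: this sketch does not rely on the array
	 * storage being zero-initialized.
	 */
	instance->fam[0] = 5;

	return instance->a + instance->b + (int)instance->fam[0];
}
```

Passing = {} as the last argument expands to an empty initializer for obj (a GNU C extension, standardized in C23).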
While in Australia 🇦🇺, I had the honor of being invited to give a guest talk to graduate and master’s students at The University of Adelaide. It was a truly special experience because it was my first time presenting at a university, and one I deeply value as a meaningful milestone in my career. 🙂🙏🏼
Enhancing spatial safety: Better array-bounds checking in C (and Linux) (University of Adelaide –Guest talk)
The C language has historically suffered from a lack of proper bounds-checking on all kinds of arrays. The Kernel Self-Protection Project has been addressing this issue for several years. In this presentation, we will learn about the most recent hardening efforts to resolve the problem of bounds-checking, particularly for fixed-size and flexible arrays.
We will explore the different mechanisms being used to harden key APIs like memcpy() against buffer overflows, which includes the use of some interesting built-in compiler functions. We will also talk about a couple of recent compiler options like -fstrict-flex-arrays and -Wflex-array-member-not-at-end, as well as the new counted_by attribute released in Clang-18 a year ago, which helps us gain run-time bounds-checking coverage on flexible arrays.
Overall, we will discuss how various challenges have been overcome, and highlight the innovations developed to solve the problem of array bounds-checking in both C and the Linux kernel once and for all.
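As a rough illustration of the built-in compiler functions mentioned in the abstract, the sketch below uses __builtin_object_size() to refuse a copy that would overflow a fixed-size array. It is a user-space approximation and not the kernel's fortified memcpy(). The names struct record, checked_copy() and the helper functions are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure: a fixed-size array followed by another member. */
struct record {
	char name[8];
	int id;
};

/*
 * Hypothetical bounds-checked copy. Mode 1 of __builtin_object_size()
 * asks for the size of the closest surrounding subobject (here the
 * 'name' array rather than the whole struct). The builtin yields
 * (size_t)-1 when the compiler cannot determine the size, in which case
 * the comparison is never true and no check is performed.
 */
#define checked_copy(dst, src, len)					\
	((size_t)(len) > __builtin_object_size(dst, 1)			\
		? -1 : (memcpy(dst, src, len), 0))

/* What the compiler reports as the capacity of 'name'. */
static size_t name_capacity(void)
{
	struct record r;

	return __builtin_object_size(r.name, 1);
}

/* 7 bytes (including the NUL) fit into the 8-byte array. */
static int copy_that_fits(void)
{
	struct record r = { .id = 7 };
	int rc = checked_copy(r.name, "kernel", sizeof("kernel"));

	/* On success the destination holds the NUL-terminated source. */
	assert(rc != 0 || strcmp(r.name, "kernel") == 0);
	return rc;
}

/* 10 bytes do not fit; the copy is refused when the size is known. */
static int copy_that_overflows(void)
{
	struct record r = { .id = 7 };
	static const char too_long[] = "123456789";

	return checked_copy(r.name, too_long, sizeof(too_long));
}
```

Whether the size is known depends on the compiler and optimization level, which is why the sketch treats an unknown size as "cannot check" rather than as an error.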
Earlier this year, I traveled to Australia 🇦🇺 to present for the second consecutive year at the Everything Open conference in Adelaide. I was so happy to be back in Australia – it was a great experience to travel to the other side of the world once again to speak about upstream Linux kernel hardening and share the work we do in the Kernel Self-Protection Project. ⚔️🛡️🐧
Huge thanks to the organizers for inviting me to present! 🙂🙌🏽
Enhancing spatial safety in the Linux kernel: Fixing thousands of -Wfamnae warnings
The introduction of the new -Wflex-array-member-not-at-end compiler option, released in GCC-14, has revealed approximately 60,000 warnings in the Linux kernel. Among them, some legitimate bugs have been uncovered.
In this presentation, we will explore in detail the different strategies we are employing to resolve all these warnings. These methods have already helped us resolve about 30% of them. Our ultimate goal in the Kernel Self-Protection Project is to globally enable this option in mainline, further enhancing the security of the upstream Linux kernel in the spatial safety domain.
Additionally, we will briefly review the recent history of hardening efforts that have led to the unveiling of these tens of thousands of warnings. This process illustrates the extensive and gradual nature of hardening the kernel, highlighting the challenges and persistence required to enhance its security. Looking ahead, after enabling this compiler option in mainline, I will briefly discuss the next challenge the Kernel Self-Protection Project will likely focus on.