bpo-39689: _struct: Avoid undefined behavior when loading native _Bool#18925
bpo-39689: _struct: Avoid undefined behavior when loading native _Bool#18925encukou wants to merge 3 commits intopython:masterfrom
Conversation
To avoid undefined behavior in the C code, we assume that in every mode, the size is 1, '\x00' is false, and `\x01` is true.
| case '?': UNPACK_SINGLE(ld, ptr, _Bool); goto convert_bool; | ||
| // memcpy-ing values other than 0 or 1 to a _Bool variable triggers | ||
| // undefined behavior, so cast from char instead. See bpo-39689. | ||
| case '?': ld = (_Bool)*ptr; goto convert_bool; |
There was a problem hiding this comment.
In theory this can trigger an unaligned access if sizeof(_Bool) > 1. Unaligned accesses that result in bus errors are the main reason for the memcpy().
Also I wonder if UB checkers can't get smart enough to figure out that e.g. (_Bool)*ptr == 42 and then complain.
There was a problem hiding this comment.
In theory this can trigger an unaligned access if sizeof(_Bool) > 1. Unaligned accesses that result in bus errors are the main reason for the memcpy().
Ah! You are right, thank you. This PR won't work as is. I won't have time to work on this soon, so I'll close it in the mean time.
Also I wonder if UB checkers can't get smart enough to figure out that e.g. (_Bool)*ptr == 42 and then complain.
I don't have access to the C standard, but from what I've read, casting to _Bool will convert non-zero values to 1. So this should be fine.
There was a problem hiding this comment.
So this should be fine.
Yes, I misread the construct as *(_Bool *)ptr), I now see that you are deliberately dereferencing the char pointer.
Hmm, the rank of _Bool is the smallest, but sizeof(char) can still be larger than sizeof(_Bool). So I wouldn't want this construct either.
There was a problem hiding this comment.
sizeof(char) is guaranteed to be 1. It can't be larger than sizeof(_Bool).
Alignment may be a different matter (or may not; I don't know).
To avoid undefined behavior in the C code, we assume that in
every mode, the size is 1,
\x00is false, and\x01is true.With that assumption we can cast from
charto_Bool, and avoid UB.https://bugs.python.org/issue39689