unix: work around glibc semaphore race condition#1795
unix: work around glibc semaphore race condition#1795addaleax wants to merge 8 commits intolibuv:v1.xfrom
Conversation
Hack around https://sourceware.org/bugzilla/show_bug.cgi?id=12674 by providing a custom implementation for glibc < 2.21 in terms of other concurrency primitives. The glibc implementation on these versions is inherently unsafe. So, while libuv and Node.js support those versions, it seems to make sense for libuv in its functionality as a platform abstraction library to provide a working version. Fixes: nodejs/node#19903
bnoordhuis
left a comment
There was a problem hiding this comment.
Nice work, Anna. Left some comments.
| } | ||
|
|
||
| #else /* !(defined(__APPLE__) && defined(__MACH__)) */ | ||
| #elif defined(__GLIBC__) && __GLIBC_MINOR__ < 21 |
There was a problem hiding this comment.
This needs to be either unconditional on the version or detect the version at runtime (could use gnu_get_libc_version() for that) because the build time and runtime versions can be different.
There was a problem hiding this comment.
@bnoordhuis I’ve removed the version check for now.
How would checking at runtime work best? I guess we don’t want to parse the version every time one of these functions is called, so maybe using some __attribute((constructor)) function that sets some local variable which keeps track of whether we need this is a good idea?
There was a problem hiding this comment.
Yep, that could work, or a uv_once_t in uv_sem_init().
There was a problem hiding this comment.
@bnoordhuis I went with that … could you take another look at the most recent commit?
| int err; | ||
| uv_semaphore_t* sem; | ||
|
|
||
| sem = malloc(sizeof(uv_semaphore_t)); |
There was a problem hiding this comment.
Can you use uv__malloc() and uv__free() (and sizeof(*sem))?
| return err; | ||
| } | ||
|
|
||
| assert(sizeof(uv_sem_t) >= sizeof(uv_semaphore_t*)); |
There was a problem hiding this comment.
Can you turn this into a STATIC_ASSERT()?
|
|
||
| assert(sizeof(uv_sem_t) >= sizeof(uv_semaphore_t*)); | ||
| sem->value = value; | ||
| *(uv_semaphore_t**) sem_ = sem; |
There was a problem hiding this comment.
You're welcome to turn uv_sem_t into an union if you think that's neater. It wouldn't be an API change (opaque type) and not an ABI break either.
There was a problem hiding this comment.
I can do that if you like, but tbh I think this is fine as-is for now.
There was a problem hiding this comment.
LGTM, thanks. CI: https://ci.nodejs.org/job/libuv-test-commit/790/
| static int glibc_needs_custom_semaphore = 0; | ||
|
|
||
| void glibc_version_check(void) { | ||
| const char* version = gnu_get_libc_version(); |
There was a problem hiding this comment.
Can you make this static?
(edit: for some reason, github puts the comment on the wrong line. I mean the function on line 522.)
There was a problem hiding this comment.
@bnoordhuis thanks for pointing that out – done!
| typedef struct uv_semaphore_s { | ||
| uv_mutex_t mutex; | ||
| uv_cond_t cond; | ||
| unsigned int value; |
There was a problem hiding this comment.
@gireeshpunathil It’s protected by a mutex, so I think we’re good here?
There was a problem hiding this comment.
@addaleax - I don't think that alone will help, there is nothing preventing the word being cached in an L2 D-cache of the thread's CPU - that leads to other threads getting stale values. The cache lines are immediately flushed only if the word is a volatile. If you doubt, I can check and confirm - in platforms like PPC that implement weakly consistent memory model, the issue will be pronounced, and we can easily test.
There was a problem hiding this comment.
@gireeshpunathil They generate the same machine code, byte-for-byte, for me locally… if you are sure that on PPC there’s a difference, I’d be curious about that.
Also, according to this SO answer, the mutex implementation already issues appropriate memory barriers on its own, which matches my expectations and goes beyond what would volatile would effect anyway?
Like, if you want, I can add volatile – there is probably no harm in doing so. But I’m more or less confident that this is not really it’s purpose.
There was a problem hiding this comment.
Yes, volatile isn't necessary; uv_mutex_lock() and uv_mutex_unlock() are compiler and memory barriers.
Aside, an architecture's memory model is really only relevant with multiple loads or stores to different locations. I'm not aware of any architecture where a single load or store on a naturally aligned word boundary isn't atomic.
(Doesn't mean the compiler can't reorder it with other loads and stores but that's a different issue.)
There was a problem hiding this comment.
I did few tests to convince myself that the mutex applies memory barriers on the data it protects (effect of executing mfence / msync) in the platforms I tested, though I don't see any documented support for mutex APIs on this behavior. It may be that the mutex's lock variable itself requires fencing to synchronize with other threads, and the fencing on the mutex variable also covers the data?
In terms of memory models: my point was not on atomicity (I agree most of the modern instrcutions are atomic for words) but on visibility: stores made by one CPU on cached data to be made available to other CPUs in the very next CPU cycle, volatile is the only option I believe, else it lives in cache, un-flushed. Unfortunately in my tests, I wasn't able to place my word in cache, so my theory stands un-proven. Volatile or non-volatile, code and output are different, but gcc never produced fencing instructions, so I think the compiler knows that this my word will never be cached, and will be written-through.
The below code is mere manifestation of compiler using memory vs. register for the stores, based on volatile or non-volatile, no caching related considerations in the picture at all.
#include <pthread.h>
#include <stdio.h>
int flag;
// word in question
int sem;
void* w(void* arg) {
//don't escape into OS primitives, stay in the user space.
while(1) {
if( sem > (1024 * 1024)) sem = 0;
sem++;
}
}
int main() {
pthread_t thread;
pthread_create(&thread, NULL, w, NULL);
while(1) {
if(flag++ > (1024 * 1024 * 1024)) {
flag = 0;
fprintf(stderr, "%d\n", sem);
}
}
pthread_join(thread, NULL);
return 0;
}$ gcc rec.c -lpthread -O3
$ ./a.out
0
0
0
^C
with
- int sem;
+ volatile int sem;$ gcc rec.c -lpthread -O3
$ ./a.out
475321
475321
1040652
^C
There was a problem hiding this comment.
It may be that the mutex's lock variable itself requires fencing to synchronize with other threads, and the fencing on the mutex variable also covers the data?
Yes, pretty much. You can't really implement a mutex without making it a full barrier; all effects from the critical section should be immediately visible to other threads when the mutex is unlocked.
volatileis the only option I believe, else it lives in cache, un-flushed.
No, volatile is exclusively a compiler construct. For cache coherency, you need memory barriers.
volatile tells the compiler that the pointed-to memory may change between loads and stores and that it's not safe to reorder with other volatile loads and stores. It's still allowed to reorder with non-volatile loads and stores, so you can't use it to delineate critical sections.
That said, a volatile int coupled with a __asm__ __volatile__ ("" ::: "memory") compiler barrier is sufficient to implement some synchronization primitives on architectures with strong memory models (like x86_64) but you'd still need to be careful of reordering at the hardware level.
(And if you really want to go off the deep end: you can avoid expensive memory barriers on x86 by simply doing something else until the store has propagated to other cores. The timing is micro-architecture dependent but it's usually > 200 cycles.)
|
LGTM |
| int err; | ||
| uv_semaphore_t* sem; | ||
|
|
||
| sem = malloc(sizeof(uv_semaphore_t)); |
There was a problem hiding this comment.
use uv__malloc()?
style: sizeof(*sem);
| return UV_ENOMEM; | ||
|
|
||
| if ((err = uv_mutex_init(&sem->mutex)) != 0) { | ||
| free(sem); |
| } | ||
|
|
||
| sem->value = value; | ||
| *(uv_semaphore_t**) sem_ = sem; |
There was a problem hiding this comment.
style: remove space before sem_ ?
There was a problem hiding this comment.
@santigimeno Done!
(I think the other comments were already addressed in earlier updates)
There was a problem hiding this comment.
(I think the other comments were already addressed in earlier updates)
Yes, sorry I didn't notice .
Hack around https://sourceware.org/bugzilla/show_bug.cgi?id=12674 by providing a custom implementation for glibc < 2.21 in terms of other concurrency primitives. The glibc implementation on these versions is inherently unsafe. So, while libuv and Node.js support those versions, it seems to make sense for libuv in its functionality as a platform abstraction library to provide a working version. Fixes: nodejs/node#19903 PR-URL: #1795 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
|
Landed in fbd6de3. Thanks! |
|
👌 beautiful work, thanks for tracking this one down @addaleax |
|
@rvagg thanks! this one was definitely fun to debug :) |
|
I'm personally a bit surprised from this approach – patching the bug inside libuv. (no offense meant) I'd imagine that affected systems want to patch their glibc instead/anyway, even if just because of the risk that some other SW might hit the same bug. I have hit some other glibc pthreads bug myself, months ago. In my case it wasn't possible to avoid the call in a similar way, but I still wouldn't take the approach if it was possible. |
|
@vcunat I mean, the point is more or less that libuv started using the broken code itself and ran into trouble on already-released versions. Yes, people should upgrade their systems, but doing so isn’t always feasible, and it’s not an option for Node.js or libuv to raise the glibc requirement in the middle of a release line. |
|
Hmm, I see that the fact of libuv starting to use glibc semaphores internally might be perceived as a regression, and it would be much harder to suggest to "users" (distributions) that they apply a patch to glibc instead. EDIT: so I think I understand the reasoning for this case and I don't much like it, but most of the cost is born by the libuv maintainers anyway... |
|
We support rhel/centos 6 and 7 and those ship with glibc 2.12 and 2.17 respectively. RedHat supports 7 until mid 2024 so our options basically are:
I like working around bugs in other people's software as much as the next guy (that is, I don't) but 1. seems like the least-bad option out of the three. |
|
Oh, and this PR also paved the way for a better semaphore implementation on z/os, see #1805. |
|
rhel/centos have roughly 200 patches on the vanilla glibc, most of them backported bugfixes, so I think those could easily be persuaded, but there most likely are other distributions where it will be a problem. (Red Hat also have maintainers working upstream and backport patches to other older branches.) |
|
@vcunat Is there any clear downside to addressing the problem in libuv? |
|
I see nothing significant. Tiny performance downgrade of semaphores even on new glibc. (And perhaps a noticeable downgrade on older glibc patched to fixed semaphores, but such distros can patch-out your libuv patch as well if they care enough, though that's a very hypothetical combination.) |
On 64-bit AIX `sizeof(uv_sem_t)` is 4 bytes which is not large enough to store a pointer. AIX doesn't use glibc so the work around introduced by libuv#1795 doesn't apply, so guard the STATIC_ASSERT with `#ifdef __GLIBC__`.
On 64-bit AIX `sizeof(uv_sem_t)` is 4 bytes which is not large enough to store a pointer. AIX doesn't use glibc so the work around introduced by libuv#1795 doesn't apply, so guard the STATIC_ASSERT so that it is only used when the custom semaphore implementation is used.
Pass -Dtarget_arch=ppc64 to gyp_uv.py to activate. Refs: nodejs/node#20129 Refs: #1795 PR-URL: #1807 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
On 64-bit AIX `sizeof(uv_sem_t)` is 4 bytes which is not large enough to store a pointer. AIX doesn't use glibc so the work around introduced by #1795 doesn't apply, so guard the STATIC_ASSERT so that it is only used when the custom semaphore implementation is used. Refs: nodejs/node#20129 Refs: #1795 PR-URL: #1808 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Hack around https://sourceware.org/bugzilla/show_bug.cgi?id=12674
by providing a custom implementation for glibc < 2.21 in terms of other
concurrency primitives.
The glibc implementation on these versions is inherently unsafe.
So, while libuv and Node.js support those versions, it seems to make
sense for libuv in its functionality as a platform abstraction library
to provide a working version.
Fixes: nodejs/node#19903
/cc @bnoordhuis