Description
Bug report
I've seen this in the free-threaded build, but I think the problem can theoretically occur in the default build as well.
The problem is that after a fork(), the ThreadHandle of a thread that has already finished may be deallocated before it is marked as not joinable. ThreadHandle_dealloc() then still sees the handle as joinable and can crash in PyThread_detach_thread():
cpython/Modules/_threadmodule.c, lines 66 to 70 in bcccf1f:

    ThreadHandle_dealloc(ThreadHandleObject *self)
    {
        PyObject *tp = (PyObject *) Py_TYPE(self);
        if (self->joinable) {
            int ret = PyThread_detach_thread(self->handle);

The steps leading to the crash are:
- A thread T2 starts and finishes, but is not joined. The ThreadHandle is not immediately deallocated, either because it's part of a larger reference cycle or due to biased reference counting (in the free-threaded build).
- The main thread calls fork().
- In the child process, during PyOS_AfterFork_Child(), the ThreadHandle is deallocated. I've seen this happen in the free-threaded build due to biased reference counting merging the thread states in PyThreadState_Clear(). I believe this can also happen in the default build if, for example, a GC is triggered early on during threading._after_fork() before we get to marking the ThreadHandle as not joinable.
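For anyone who wants to poke at this, the steps above can be set up roughly as follows. This is only a sketch of the preconditions, not a reliable reproducer: the crash depends on when the handle happens to be deallocated inside PyOS_AfterFork_Child(), which this script cannot control. The function name fork_with_unjoined_dead_thread and the cycle attribute are made up for illustration, and the script assumes a POSIX platform where os.fork() is available.

```python
import gc
import os
import threading
import time


def fork_with_unjoined_dead_thread():
    """Set up the steps from the report and return the child's exit code.

    A negative return value means the child was killed by a signal.
    """
    # Step 1: T2 starts and finishes, but join() is never called on it.
    t2 = threading.Thread(target=lambda: None, name="T2")
    t2.start()
    while t2.is_alive():  # wait for T2 to finish without joining it
        time.sleep(0.001)

    # Keep the Thread object (and whatever it owns) alive through a
    # reference cycle, so only the cyclic GC can free it later.
    t2.cycle = t2  # hypothetical attribute, used only to form the cycle
    del t2

    # Step 2: the main thread forks.
    pid = os.fork()
    if pid == 0:
        # Step 3: in the child, force a collection so the cycle is freed.
        # By this point the after-fork hooks have already run, so this
        # only covers the "freed in the child" half of the scenario.
        code = 1
        try:
            gc.collect()
            code = 0
        finally:
            os._exit(code)

    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print("child exit code:", fork_with_unjoined_dead_thread())
```

On an unaffected build the child should exit with code 0. Running it in a loop on a free-threaded build is probably the most likely way to hit the early deallocation, but that is a guess on my part.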
Proposed fix
Early on in PyOS_AfterFork_Child(), we should fix up all ThreadHandle objects from C (without executing Python code) -- we should mark the dead ones as not joinable and update the handle of the remaining active thread.
I think it's important to do this without executing Python code. Once we start executing Python code, almost anything can happen, such as GC collections, destructors, etc.
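To make the "almost anything can happen" point concrete, here is a small standalone illustration (plain Python, not CPython internals). It shows a destructor running in the middle of code that only allocates lists, with no explicit gc.collect() call at that point. Node and allocate_until_finalized are made-up names, and the low GC threshold is set only so the effect shows up quickly; the exact moment the collector runs is an implementation detail and differs between versions and builds.

```python
import gc

finalized = []  # records destructor calls so they can be observed


class Node:
    """Hypothetical stand-in for any object with a destructor."""

    def __init__(self):
        self.me = self  # self-reference: only the cyclic GC can free this

    def __del__(self):
        finalized.append("Node.__del__")


def allocate_until_finalized(limit=200_000):
    """Allocate plain lists; return how many were made before __del__ ran."""
    keep = []
    for _ in range(limit):
        keep.append([])  # ordinary allocation, no explicit collection here
        if finalized:
            break
    return len(keep)


old_threshold = gc.get_threshold()
was_enabled = gc.isenabled()
gc.collect()          # start from a clean slate
gc.enable()
gc.set_threshold(10)  # low threshold, only to trigger the GC sooner
try:
    Node()            # garbage right away, but kept alive by its own cycle
    allocations = allocate_until_finalized()
finally:
    # Restore the collector settings we changed above.
    gc.set_threshold(*old_threshold)
    if not was_enabled:
        gc.disable()
```

After this runs, finalized is non-empty even though the loop never asked for a collection. The same thing can happen to a ThreadHandle caught in a cycle if any Python code runs in the child before the handles are fixed up.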
cc @pitrou @gpshead @ericsnowcurrently