Closed
Description
There are a few remaining scaling bottlenecks in the free-threaded build that we should fix.
I have been using the following benchmark to detect bottlenecks that were previously issues in older versions of the nogil forks:
https://gist.github.com/colesbury/429fe9f90036d43ad43576c3d357a12e
Note that for reliable results the above benchmark requires some setup:
- Adjust `NTHREADS` if necessary on your system
- Disable turbo boost or equivalent on your system
- Avoid running on hyper-threading siblings (i.e., use `taskset -c 0-<N>` to choose separate physical cores)
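For anyone who prefers to pin from inside the script rather than with `taskset`, a rough equivalent can be sketched in Python. This is only an illustration: `pin_to_cpus` is a hypothetical helper, not part of the linked benchmark, and `os.sched_setaffinity` is only available on some platforms (notably Linux). Whether a given CPU id range maps to separate physical cores depends on the machine's topology, so check that separately.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the current process to the given CPU ids.

    Returns the resulting affinity set, or None on platforms that do not
    provide os.sched_setaffinity (hypothetical helper, for illustration).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    # pid 0 means "the calling process"; threads started later inherit the mask.
    os.sched_setaffinity(0, set(cpus))
    return os.sched_getaffinity(0)
```

Calling `pin_to_cpus(range(0, 4))` before starting the worker threads would restrict the run to CPUs 0 through 3, assuming those ids exist and are permitted for the process.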
Current bottlenecks
- cmodule_function
- load_string_const
- load_tuple_const
- create_closure
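To make the kind of measurement concrete, here is a minimal sketch of a scaling check for two of these cases. It is not the code from the gist: the function bodies, `run_threads`, `speedup`, and the default iteration count are assumptions chosen for illustration, and only the names `load_tuple_const`, `create_closure`, and `NTHREADS` echo the benchmark.

```python
import threading
import time

NTHREADS = 4  # assumption: adjust to the number of physical cores available


def load_tuple_const(iterations):
    # Repeatedly loads the same tuple constant from the code object.
    for _ in range(iterations):
        t = (1, 2, 3)


def create_closure(iterations):
    # Repeatedly creates a function object that closes over x.
    x = 0
    for _ in range(iterations):
        def inner():
            return x


def run_threads(func, nthreads, iterations):
    """Run func(iterations) in nthreads threads; return elapsed seconds."""
    barrier = threading.Barrier(nthreads + 1)

    def worker():
        barrier.wait()  # start all workers at roughly the same time
        func(iterations)

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    barrier.wait()
    start = time.perf_counter()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def speedup(func, nthreads=NTHREADS, iterations=200_000):
    """Throughput with nthreads threads relative to a single thread."""
    base = run_threads(func, 1, iterations)
    multi = run_threads(func, nthreads, iterations)
    return nthreads * base / multi
```

With perfect scaling `speedup(create_closure)` would approach `NTHREADS`; a value near or below 1 suggests contention. On a build with the GIL enabled a value around 1 is expected regardless.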
Underlying issues
- Reference count contention on non-string constants. We will want to immortalize most constants in `PyCodeObject`.
- Reference count contention on `func.__qualname__` or `code.co_qualname` (when creating closure)
- Reference count contention on module-level `PyCFunctionObject`s
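The common thread in these issues is that every thread touches the same object, so every thread updates the same reference count. The sketch below illustrates that sharing from Python. It relies on CPython implementation details (constant tuples stored in the code object, the qualified name being taken from the code object when a function is created), and the helper names are hypothetical.

```python
import math
import types


def make_tuple():
    return (1, 2, 3)  # stored once as a constant in make_tuple.__code__


def make_closure():
    x = 0

    def inner():
        return x

    return inner


def constant_is_shared():
    # Each call hands back the very same tuple object.
    return make_tuple() is make_tuple()


def qualname_is_shared():
    # Two distinct closures whose qualified name is one shared string object.
    a, b = make_closure(), make_closure()
    return a is not b and a.__qualname__ is b.__qualname__


def c_function_is_shared():
    # A module-level C function is a single object stored in the module dict.
    return (isinstance(math.sqrt, types.BuiltinFunctionType)
            and math.sqrt is vars(math)["sqrt"])
```

On CPython all three helpers are expected to return `True`, which is why code running these operations in many threads ends up contending on a handful of reference counts.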
Linked PRs
- gh-118527: Use `_Py_ID(__main__)` for main module name #118528
- gh-118527: Use deferred reference counting for C functions on modules #118529
- gh-118527: Intern filename, name, and qualname in code objects. #118558
- gh-118527: Intern code name and filename on default build #118576
- gh-118527: Intern code consts in free-threaded build #118667