gh-115999: Enable free-threaded specialization of LOAD_CONST #129365

Yhg1s · 2025-01-27T16:23:51Z

Enable free-threaded specialization of LOAD_CONST, which is an unconditional specialization without stats bookkeeping, so it's a simple opcode replace. (It could even be a relaxed store if we didn't have to worry about instrumentation.)
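As a rough illustration of the "simple opcode replace" described above, here is a minimal, self-contained sketch using C11 atomics rather than CPython's internal `_Py_atomic_*` helpers. The opcode numbers, the `OP_INSTRUMENTED` stand-in, and the function name `specialize_load_const` are hypothetical and exist only for this example. The compare-exchange swaps in the specialized opcode only when the slot still holds the generic one, which is the reason a plain relaxed store would not be enough once instrumentation can rewrite the same byte.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcode numbers, for illustration only; CPython's real
   values are generated and differ. */
enum {
    OP_LOAD_CONST = 1,
    OP_LOAD_CONST_IMMORTAL = 2,
    OP_LOAD_CONST_MORTAL = 3,
    OP_INSTRUMENTED = 4   /* stand-in for an instrumented opcode */
};

/* Replace the generic opcode with a specialized one, but only if the slot
   still holds the generic opcode. If another thread already specialized
   or instrumented the instruction, its value is left in place.
   Returns true if this call performed the replacement. */
static bool
specialize_load_const(_Atomic uint8_t *opcode, bool is_immortal)
{
    uint8_t expected = OP_LOAD_CONST;
    uint8_t desired = is_immortal ? OP_LOAD_CONST_IMMORTAL
                                  : OP_LOAD_CONST_MORTAL;
    return atomic_compare_exchange_strong(opcode, &expected, desired);
}
```

Because the specialization is unconditional and keeps no counters, a failed exchange needs no retry in this sketch: the instruction has simply been changed by someone else already.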

Issue: Make the specializing interpreter thread-safe in --disable-gil builds #115999

Yhg1s · 2025-01-27T16:24:57Z

I'm not sure why LOAD_CONST was already checked off on the issue page. The test fails for free-threaded builds without the accompanying bytecode change.

mpage · 2025-01-27T17:31:39Z

> I'm not sure why LOAD_CONST was already checked off on the issue page.

I think specializing LOAD_CONST during bytecode execution was a recent change: #128708. Previously, it happened when the bytecode was quickened during creation of the code object.

Yhg1s · 2025-01-27T19:31:02Z

Looks like about 1% gain from this, but given noise levels it could be a wash. Still, considering way more things are currently immortal in the free-threaded build it's probably a bigger win than in the regular build :D

https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20250127-3.14.0a4%2B-c92089c-NOGIL/bm-20250127-vultr-x86_64-Yhg1s-more_spec-3.14.0a4%2B-c92089c-vs-base.svg

Fidget-Spinner · 2025-01-28T02:44:33Z

Python/generated_cases.c.h

+            uint8_t expected = LOAD_CONST;
+            _Py_atomic_compare_exchange_uint8(
+                &this_instr->op.code, &expected,
+                _Py_IsImmortal(obj) ? LOAD_CONST_IMMORTAL : LOAD_CONST_MORTAL);


If you want a bigger win, perhaps you could allow deferred references as well. They behave the same as immortal objects if they're live on the stack.

Considering that many LOAD_CONST operands are deferred objects, that should be a nice win in perf.

That can come in another PR though.

Yeah, that's probably a good idea. Running some tests on the testsuite in the free-threaded build, about 75% of all LOAD_CONST invocations involved immortal objects, about 25% involved deferred objects, and the remainder was less than 1%. (A regular build has not nearly as many immortal objects.) I suspect we'll get fewer immortal objects over time, and most of them will probably become deferred instead, I guess?

Now the question is: do I add PyStackRef_FromPyObjectDeferred(), or do I change PyStackRef_FromPyObjectImmortal() to accept deferred objects as well? :)
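To make the trade-off in this exchange concrete, here is a toy model of the idea that deferred objects can be pushed like immortal ones. It assumes, following the review comment, that neither kind needs a reference count increment while it is live on the stack. `ToyObject`, `can_push_without_incref`, and `toy_push` are hypothetical names invented for this sketch; they are not CPython's object layout or its `PyStackRef` API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: not CPython's object or stack-reference types. */
typedef struct {
    long refcount;
    bool immortal;   /* never deallocated */
    bool deferred;   /* assumed here to need no incref while on the stack */
} ToyObject;

/* True when pushing the object needs no reference count increment. */
static bool
can_push_without_incref(const ToyObject *obj)
{
    return obj->immortal || obj->deferred;
}

/* Store the object in a toy stack slot.
   Returns true if the reference count had to be modified. */
static bool
toy_push(ToyObject *obj, ToyObject **slot)
{
    *slot = obj;
    if (can_push_without_incref(obj)) {
        return false;
    }
    obj->refcount++;
    return true;
}
```

In this model, widening a single predicate covers both cases, which corresponds to the second option in the question above; a separate deferred-only helper would correspond to the first.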

Python/bytecodes.c

mpage

LGTM!

Enable free-threaded specialization of LOAD_CONST.

c92089c

Yhg1s requested review from colesbury and mpage January 27, 2025 16:23

bedevere-app bot mentioned this pull request Jan 27, 2025

Make the specializing interpreter thread-safe in --disable-gil builds #115999

Closed

Yhg1s marked this pull request as ready for review January 27, 2025 19:30

Yhg1s requested a review from markshannon as a code owner January 27, 2025 19:30

bedevere-app bot added the awaiting core review label Jan 27, 2025

Fidget-Spinner reviewed Jan 28, 2025

mpage reviewed Jan 28, 2025

Fix incorrect use of compare-exchange.

a07d32f

Yhg1s added the skip news label Jan 28, 2025

mpage approved these changes Jan 28, 2025

bedevere-app bot added awaiting merge and removed awaiting core review labels Jan 28, 2025

Yhg1s merged commit 5c930a2 into python:main Jan 29, 2025
70 checks passed

bedevere-app bot removed the awaiting merge label Jan 29, 2025

Yhg1s deleted the more-spec branch January 29, 2025 00:09
