bpo-37562 Refactor _PyObject_Vectorcall to improve performance a bit. #14735
markshannon wants to merge 1 commit into python:master
Am I right that this is really just inlining |
jdemeyer
left a comment
This PR rewrites some code as completely equivalent code. It happens that this new variant is very slightly faster (about 2 clock cycles per call). Since the old and new code are equivalent, this is just because of certain choices made by the compiler (like register allocation) and not because of a fundamental reason.
In my opinion, the benefit of this PR is too uncertain (maybe other compilers optimize this differently) and the complexity added by this PR is not justified.
|
It is not completely equivalent. If it were, there would be no change in performance. The control flow graph of the two formulations differs. The CFG for While it is entirely possible that a future compiler will produce code for master that is as good as the code for this PR, I know of no mechanism that would allow a compiler to produce better code from master than from this PR. As for additional complexity, I really don't see any. This PR adds 6 lines, of which 2 are asserts. IMO, this PR rewrites |
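For readers following the disagreement, here is a minimal sketch of how two formulations can return the same results while having different source-level control flow. It is an illustration only, not the code from this PR or from master: `toy_callable`, `toy_get_fast`, `call_via_helper`, and `call_inlined` are hypothetical names, and the `has_fast` flag only loosely stands in for a "supports the fast call path" check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in CPython. */
typedef int (*toy_fastfunc)(const int *args, size_t nargs);

typedef struct {
    int has_fast;       /* flag: the object advertises a fast path */
    toy_fastfunc fast;  /* may still be NULL even when the flag is set */
} toy_callable;

/* Generic fallback: negated sum, so the two paths are distinguishable. */
static int toy_slow_call(const int *args, size_t nargs)
{
    int total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i];
    }
    return -total;
}

/* Sample fast function: plain sum of the arguments. */
int toy_sum(const int *args, size_t nargs)
{
    int total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i];
    }
    return total;
}

/* Helper: returns the fast function, or NULL if the flag is clear. */
static toy_fastfunc toy_get_fast(const toy_callable *c)
{
    if (!c->has_fast) {
        return NULL;
    }
    return c->fast;
}

/* Formulation A: a clear flag first becomes a NULL result, and the
 * caller then tests for NULL, so both failure cases share one test. */
int call_via_helper(const toy_callable *c, const int *args, size_t nargs)
{
    toy_fastfunc f = toy_get_fast(c);
    if (f == NULL) {
        return toy_slow_call(args, nargs);
    }
    return f(args, nargs);
}

/* Formulation B: the flag check is written out in the caller, so a
 * clear flag branches directly to the slow path without a NULL test. */
int call_inlined(const toy_callable *c, const int *args, size_t nargs)
{
    if (!c->has_fast) {
        return toy_slow_call(args, nargs);
    }
    toy_fastfunc f = c->fast;
    if (f == NULL) {
        return toy_slow_call(args, nargs);
    }
    return f(args, nargs);
}
```

Both functions return the same value for every input, yet as written the clear-flag case passes through an extra NULL test in formulation A. Whether a given compiler removes that difference after inlining depends on its optimization passes, which is the point under dispute here.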
|
I propose #15144 as simpler alternative. |
|
I don't see how #15144 is a simpler alternative. It is larger and it doesn't work on Windows. |
|
Since bpo-37562 is closed, and this code conflicts, I'm closing this PR. |
Refactors _PyObject_Vectorcall, which (on my machine) reduces the performance regression observed in bpo-37562.
https://bugs.python.org/issue37562