-
-
Notifications
You must be signed in to change notification settings - Fork 33.7k
bpo-32477: Move jumps optimization from the peepholer to the compiler. #5077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This doesn't look like it's really removing a peephole optimization. Instead, it moves a peephole optimization from the peepholer to the compiler, but the implementation looks similar. At least having the peepholer in a separate file means the compiler doesn't get more complicated by optimizations. |
a6a670a to
99cf21b
Compare
|
The changes to What are the changes? Are there branches that are now removed that weren't before? |
99cf21b to
c8d598e
Compare
|
@markshannon This is diff of importlib dis: http://paste.ubuntu.com/p/4yn278YDrV/ This PR removes unused codes after RETURN_VALUE. It merges two step unconditional JUMPs into one JUMP. |
|
What's the status of this PR? @serhiy-storchaka asked me to not fix compiler warnings with my PR #10652 before this PR is merged. @pitrou doesn't seem to like the overall idea:
I'm not sure if @rhettinger likes the idea (but his comment is not really on the PR itself): I understand that this change allows to remove more deadcode. Would it be possible to get the same effect by organizing the peephole optimizer differently? For example, would we also remove more deadcode by running the peephole optimizer twice? I'm trying to understand if removing more deadcode can only be done by moving the jump optimization inside the compiler. I have no opinion on this PR at this point :-) |
|
EDIT: _module_repr_from_spec() code was wrong, I fixed it. I tested on this Python code: def _module_repr_from_spec(spec):
"""Return the repr to use for the module."""
# We mostly replicate _module_repr() using the spec attributes.
name = '?' if spec.name is None else spec.name
if spec.origin is None:
if spec.loader is None:
return '<module {!r}>'.format(name)
else:
return '<module {!r} ({!r})>'.format(name, spec.loader)
else:
if spec.has_location:
return '<module {!r} from {!r}>'.format(name, spec.origin)
else:
return '<module {!r} ({})>'.format(spec.name, spec.origin)I ran the peephole optimizer twice: diff --git a/Python/compile.c b/Python/compile.c
index 45e78cb22c..7d361c6af0 100644
--- a/Python/compile.c
+++ b/Python/compile.c
@@ -5577,6 +5577,13 @@ makecode(struct compiler *c, struct assembler *a)
if (!bytecode)
goto error;
+ PyObject *bytecode2 = PyCode_Optimize(bytecode, consts, names, a->a_lnotab);
+ Py_CLEAR(bytecode);
+ bytecode = bytecode2;
+ if (!bytecode) {
+ goto error;
+ }
+
tmp = PyList_AsTuple(consts); /* PyCode_New requires a tuple */
if (!tmp)
goto error;The bytecode is different. Before: After: => the JUMP_FORWARD has been removed: it was deadcode. As a side effect, offsets and jump targets are different. Example: LOAD_FAST moves from offset 66 to 62, and "POP_JUMP_IF_FALSE 86" becomes "POP_JUMP_IF_FALSE 84". The question is now if we can modify the peephole to get the same result when it's only run once, or if we have to move the jump optimization to the compiler. |
|
I can be wrong, but it seems like the optimizer doesn't remove basic blocks which are deadcode. I implemented such optimization in my peepholer optimizer (for Python bytecode) written in pure Python: peephole.c uses an hash table: offset => block index. My bytecode project uses an additiona "next_block" information. It is used in the dead code removal to follow the chain of blocks and check which basic blocks are dead code. compile.c uses a different structure for basic blocks than peephole.c: Maybe this one can be used to detect unreachable basic blocks? I would suggest to leave peephole.c as it is, but try to remove unreachable basic blocks in compile.c. What do you think @serhiy-storchaka? |
|
Note: the pure Python peephole optimizer has a few doc at https://bytecode.readthedocs.io/en/latest/peephole.html#optimizations |
|
I discussed with @serhiy-storchaka on IRC, he told me:
For the first point, I think that it's fine. We can organize PyCode_Optimize() to first optimize jumps, and only later remove unreachable blocks. For the second point, maybe that's a reference to this comment in peephole.c? |
c8d598e to
7f7c575
Compare
|
I'm closing this, as it is obsolete. |
https://bugs.python.org/issue32477