Skip to content

Conversation

@markshannon
Copy link
Member

@markshannon markshannon commented Jun 2, 2023

Shows a surprising, but pleasing, 1.7% speedup.
These are common instructions, so the reduction in code size and memory reads is considerable.

In faster-cpython/ideas#585 I refer these to as "single instruction superinstructions", which is a rather clumsy name. I'm not sure if these need a name, but suggestions for a better name are welcome.

I've left the superinstructions involving LOAD_CONST alone, as we are looking at other approaches to speed up constants.

@markshannon markshannon changed the title Replace some superinstructions with single instruction equivalent. GH-105229: Replace some superinstructions with single instruction equivalent. Jun 2, 2023
Copy link
Member

@brandtbucher brandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if these need a name, but suggestions for a better name are welcome.

"Combined instructions", maybe?

def_op('DICT_UPDATE', 165)

def_op('LOAD_FAST_LOAD_FAST', 168)
def_op('STORE_FAST_LOAD_FAST', 169)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, how are the stats on this? Is it worth keeping?

I would expect most of these pairs to span multiple lines. Although maybe comprehensions and generator expressions make up for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, as you say it might be quite low.
I think we should keep it for this PR anyway, as we are replacing two-codeunit superinstructions with the one-codeunit equivalent. We can change the choice of superinstructions in another PR.

@@ -0,0 +1 @@
Replace some dynamic superinstructions will single instruction equivalents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Replace some dynamic superinstructions will single instruction equivalents.
Replace some dynamic superinstructions with single instruction equivalents.

}

static void
make_super_instruction(cfg_instr *inst1, cfg_instr *inst2, int super_op)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
make_super_instruction(cfg_instr *inst1, cfg_instr *inst2, int super_op)
make_combined_instruction(cfg_instr *inst1, cfg_instr *inst2, int combined_op)

@@ -0,0 +1 @@
Replace some dynamic superinstructions will single instruction equivalents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Replace some dynamic superinstructions will single instruction equivalents.
Replace some dynamic superinstructions with single instruction equivalents.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh. Crossed posts.

Comment on lines 1592 to 1594
/* Skip if instructions are on different lines */
if (inst1->i_loc.lineno != inst2->i_loc.lineno) {
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw a comment from @brandtbucher that this requirement eliminated most super-instructions (faster-cpython/ideas#585 (comment)). And yet your pyperformance results on this PR are quite good. I wonder how these two things are both true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instructions without observable side effects (e.g. LOAD_FAST, or maybe POP_TOP), can't we still do a super instruction and keep the NOP for the extra lineno?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw a comment

Oh, I realize now that this comment was specific to STORE_FAST__LOAD_FAST.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how these two things are both true.

Oh, I guess the most likely answer is in faster-cpython/ideas#585 (comment) :

it appears that the reduction in code size more than compensates.

Copy link
Member Author

@markshannon markshannon Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is just my hypothesis.

The goal is to remove superinstructions that cross block and line boundaries, as they are problematic.
I was just trying to not slow things down by removing them. A speedup was just a pleasant surprise.
Even if the speedup is mainly from layout changes, or cache effects, the main motivation of this PR remains valid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants