
Optimize set.pop() to advance a pointer instead of indexing. #10429

Merged
miss-islington merged 2 commits into python:master from rhettinger:set_pop_entry_logic on Nov 9, 2018


@rhettinger
Contributor

Gives approximately a 20% speed-up when built with clang, depending on the number of elements in the set (the sparser the set, the greater the speed-up).

Uses the same entry++ logic used elsewhere in the setobject.c code.

@rhettinger
Contributor Author

FWIW, here is the disassembly of the inner loop, which is now very tight:

LBB24_5:                                ## =>This Inner Loop Header: Depth=1
	movq	%rsi, %rax
	addq	$16, %rax
	movq	%r10, %rsi
	cmpq	%rcx, %rax
	jbe	LBB24_6
## %bb.7:                               ##   in Loop: Header=BB24_5 Depth=1
	movq	(%rsi), %rax
	testq	%rax, %rax
	je	LBB24_5
LBB24_8:                                ##   in Loop: Header=BB24_5 Depth=1
	cmpq	%r9, %rax
	je	LBB24_5

The set_pop() function entry and exit code is also tighter (formerly it saved and restored three registers, and now it skips that work).

Member

@serhiy-storchaka left a comment


Could you please provide microbenchmarks?

In most cases the loop ends after 1 or 2 iterations. Additional operations before and after the loop can eat up the benefit of optimizing such a short loop.
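To get a rough feel for how many iterations the scan takes per pop, here is a minimal Python model of a finger-based scan over an open-addressed table. It is only a sketch: `make_table` and `drain` are hypothetical helpers, keys are placed at random slots (real sets place keys by hash), and nothing here measures the C code itself.

```python
import random

EMPTY = None       # never-used slot
DUMMY = object()   # stand-in for the C-level marker left behind by a removed key


def make_table(size, used, seed=0):
    """Build a table of `size` slots (a power of two) with `used` keys at random slots."""
    rng = random.Random(seed)
    table = [EMPTY] * size
    for key, slot in enumerate(rng.sample(range(size), used)):
        table[slot] = key
    return table


def drain(table):
    """Pop every key, scanning forward from a finger; return the probe count of each pop."""
    mask = len(table) - 1
    remaining = sum(1 for slot in table if slot is not EMPTY and slot is not DUMMY)
    finger = 0
    probes_per_pop = []
    while remaining:
        i = finger & mask                # keep the finger in bounds
        probes = 1
        while table[i] is EMPTY or table[i] is DUMMY:
            i = (i + 1) & mask           # wrap to slot 0 past the end
            probes += 1
        table[i] = DUMMY                 # mark the popped slot as deleted
        finger = i + 1                   # next place to start
        remaining -= 1
        probes_per_pop.append(probes)
    return probes_per_pop


if __name__ == "__main__":
    for used in (64, 256, 600):
        counts = drain(make_table(1024, used))
        print(f"{used:4d} keys in 1024 slots: {sum(counts) / len(counts):.2f} probes per pop")
```

In this model the finger sweeps the table once while draining, so the average number of probes per pop is roughly the table size divided by the number of keys. Denser tables therefore leave the loop almost immediately, while sparse ones spend more time inside it.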

  entry->hash = -1;
  so->used--;
- so->finger = i + 1; /* next place to start */
+ so->finger = entry - so->table; /* next place to start */
Member


Is the + 1 missing?

Contributor Author


Yes. Thanks for noticing.
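The effect of the missing + 1 can be sketched with a small Python model of the finger logic. This is an illustration, not the C code: `pop_all` and `TABLE` are hypothetical names, and the table is a plain list with `None` for empty slots.

```python
EMPTY = None       # never-used slot
DUMMY = object()   # stand-in for the C-level marker left behind by a removed key


def pop_all(table, advance_finger):
    """Drain a copy of `table`; return (keys in pop order, total slots examined)."""
    table = list(table)
    mask = len(table) - 1            # table length is assumed to be a power of two
    live = sum(1 for slot in table if slot is not EMPTY)
    finger = 0
    keys, probes = [], 0
    for _ in range(live):
        i = finger & mask
        probes += 1
        while table[i] is EMPTY or table[i] is DUMMY:
            i = (i + 1) & mask       # wrap to slot 0 past the end
            probes += 1
        keys.append(table[i])
        table[i] = DUMMY
        # With the + 1 the next pop starts just past the slot that was emptied;
        # without it the next pop first re-examines that now-dummy slot.
        finger = i + 1 if advance_finger else i
    return keys, probes


TABLE = ["a", "b", "c", "d", EMPTY, EMPTY, EMPTY, EMPTY]

if __name__ == "__main__":
    print(pop_all(TABLE, advance_finger=True))    # (['a', 'b', 'c', 'd'], 4)
    print(pop_all(TABLE, advance_finger=False))   # (['a', 'b', 'c', 'd'], 7)
```

Both variants return the same keys in this model, so the omission is a performance issue rather than a correctness one: every pop after the first pays for one extra look at a slot known to be a dummy.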

@serhiy-storchaka removed their assignment Nov 9, 2018
@rhettinger
Contributor Author

I can't run new benchmarks right now (my build has been broken for a couple of days since the extensive include file changes went in). The benchmark looked like this:

$ python -m timeit -r11 -s 's=set(range(10_000))' -s 's_pop=s.pop' 'while s: s_pop()'
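For anyone who wants to rerun a comparable measurement from a script, the following is a sketch using the standard `timeit` module. It is not the exact command above: `time_drain` is a hypothetical helper, and `number=1` is a deliberate choice here, because `timeit` runs the setup once per timing and the statement empties the set on its first execution, so any further executions within the same timing would only test an empty set.

```python
import timeit

SETUP = "s = set(range(10_000)); s_pop = s.pop"
STMT = "while s: s_pop()"


def time_drain(repeat=11):
    """Return one timing (in seconds) per run for draining a fresh 10,000-element set."""
    # number=1: the setup is re-run before each timing, so every run starts with a full set.
    return timeit.repeat(STMT, setup=SETUP, repeat=repeat, number=1)


if __name__ == "__main__":
    timings = time_drain()
    print(f"best of {len(timings)}: {min(timings) * 1e6:.1f} usec per drain")
```

Running the script under an interpreter built before the change and one built after it, and comparing the best timings, gives a like-for-like view of the whole drain rather than of a single pop.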

@miss-islington
Contributor

@rhettinger: Status check is done, and it's a success ✅ .

@miss-islington merged commit cf5863f into python:master Nov 9, 2018

Labels

performance (Performance or resource usage), skip issue, skip news


5 participants