Skip to content

Conversation

@seberg
Copy link
Contributor

@seberg seberg commented Dec 12, 2019

All keywords should first be checked for pointer identity. Only
after that failed for all keywords (unlikely) should unicode
equality be used.
The original code would call unicode equality on any non-matching
keyword argument. Meaning calling it often e.g. when a function
has many kwargs but only the last one is provided.


Unless I am missing something big, this seems like it was the original intention, and simple typo. It may make sense to backport?

I have to admit, I have not actually recompiled and timed this, I may do that tomorrow (but I have never had the need to compile python, so need to look up how).

https://bugs.python.org/issue39028

@seberg seberg changed the title ENH: Fix performance issue in keyword extraction bpo-39028: Fix performance issue in keyword extraction Dec 12, 2019
@seberg
Copy link
Contributor Author

seberg commented Dec 12, 2019

Ah, I missed that this code is used for PyArg_* where it is not clear that strings are almost always interned, so the change would likely be useful for argument clinic functions, but probably not in general.

EDIT: Both can make sense I guess, but for the clinic it is clearly interned I think? So another option is to have two versions here...

Python/getargs.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is related to kwname == key.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change LGTM, but please move the comment before the pointers comparison or before the first loop.

Although this is almost dead code. It is used in just few sites generated by Argument Clinic and soon will not be used at all.

@serhiy-storchaka
Copy link
Member

Ah, sorry, the use in _PyArg_UnpackKeywords() is not dead, so this optimization has long term effect.

@seberg seberg force-pushed the kwarg-extract-performance branch from ba65512 to 17561c2 Compare December 12, 2019 17:31
@seberg
Copy link
Contributor Author

seberg commented Dec 12, 2019

Yeah, although it only will have remotely noticeable effects for >3 kwargs probably, which seems pretty rare in python. Anyway moved and tweaked the comment.

@seberg seberg changed the title bpo-39028: Fix performance issue in keyword extraction bpo-39028: Performance enhancement in keyword extraction Dec 12, 2019
Python/getargs.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep the second loop, why we need to have both of find_keyword_interned and find_keyword?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man, sorry, I do not think we do. I was trying around with it a bit, and apparently forgot to commit before pushing or something :/. fixed.

All keywords should first be checked for pointer identity. Only
after that failed for all keywords (unlikely) should unicode
equality be used.
The original code would call unicode equality on any non-matching
keyword argument. Meaning calling it often e.g. when a function
has many kwargs but only the last one is provided.
@seberg seberg force-pushed the kwarg-extract-performance branch from 17561c2 to 52562b9 Compare December 13, 2019 14:19
@methane
Copy link
Member

methane commented Dec 16, 2019

LGTM. Would you create an NEWS entry?
https://devguide.python.org/committing/#what-s-new-and-news-entries

@methane methane merged commit 75bb07e into python:master Dec 18, 2019
@seberg seberg deleted the kwarg-extract-performance branch January 21, 2020 19:11
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Jan 31, 2020
)

All keywords should first be checked for pointer identity. Only
after that failed for all keywords (unlikely) should unicode
equality be used.
The original code would call unicode equality on any non-matching
keyword argument. Meaning calling it often e.g. when a function
has many kwargs but only the last one is provided.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants