-
-
Notifications
You must be signed in to change notification settings - Fork 33.7k
bpo-32940: IDLE: Simplify StringTranslatePseudoMapping in pyparse #5862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
the class as well as the instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I exhausted my ideas for this issue. If CI passes, I expect to merge sometime tomorrow.
Lib/idlelib/pyparse.py
Outdated
| def get(self, key, default=None): | ||
| return self._get(key) | ||
| def __getitem__(self, key): | ||
| return self.get(key, 'x') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had done a version yesterday that inherited from dict and only implemented:
class Parse(dict):
def __missing__(self, key):
return 'x'
It didn't have get or __getitem__ functionality on missing values, but it worked for the translate. But then I read on SO that's it's generally not good to inherit directly from dict and to use Mapping or MutableMapping instead. I guess this doesn't matter here, but I did notice that inheriting from dict allows some attributes to change from read-only to not, such as _tran.update = {1: 2}. So when is it OK to inherit directly from dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it works. We are not going to do anything so nonsensical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defining __missing__ looks the most natural and efficient way. This is the case for which __missing__ was introduced.
| class StringTranslatePseudoMapping(Mapping): | ||
| r"""Utility class to be used with str.translate() | ||
| class ParseMap(dict): | ||
| r"""Dict subclass that maps anything not in dict to 'x'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought of this while I was sleeping. Should be class _ParseMap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. PEP434 declares nearly all of idlelib 'private', and it became such in practice nearly 3 years later, for in 3.6. There is no point to further privatization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
|
I added a commit to remove the Mapping import. |
|
|
||
| class StringTranslatePseudoMapping(Mapping): | ||
| r"""Utility class to be used with str.translate() | ||
| class ParseMap(dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this class equivalent to defaultdict(lambda:"x")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class doesn't add the keys to the dictionary when they haven't been found. It will return the default value without growing the dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, defaultdict calls the function to get 'x', instead of just returning it.
Lib/idlelib/pyparse.py
Outdated
| def get(self, key, default=None): | ||
| return self._get(key) | ||
| def __getitem__(self, key): | ||
| return self.get(key, 'x') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defining __missing__ looks the most natural and efficient way. This is the case for which __missing__ was introduced.
Lib/idlelib/pyparse.py
Outdated
| # Map1 maps uninteresting chars to 'x', open brackets to '(', | ||
| # close brackets to ')', and preserves quotes, backslashes, | ||
| # newlines and hashes. Used for str.translate() in _study1(). | ||
| map1 = ParseMap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason of renaming _tran to map1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests pass with def __missing__(self, key): return 'x', so I will switch if at least as fast, which I suspect it is. (Working on timeit test)
map1 reads better to me. The object is a map for _study1. 'tran' is a prefix for numerous words, often verbs, not a word in itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a matter of style, but to me, _tran looks more semantic. It refers to the purpose of it (a translation table). map1 refers only to its type. This is like naming the integer variable "int1" instead of "cnt" or "col".
| self._get = _get | ||
| # Calling this triples access time; see bpo-32940 | ||
| def __missing__(self, key): | ||
| return 120 # ord('x') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return just 'x'. It would be make sense to use b'x'[0] if integers have advantage, but I tested and didn't see a difference.
It is worth to remove ord() from values for other keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My tests, on the issue, show that ints slightly win. I see no need to change.
Lib/idlelib/pyparse.py
Outdated
| def get(self, key, default=None): | ||
| return self._get(key) | ||
| # Map all ascii to 120 to avoid __missing__ call, then replace some. | ||
| trans = ParseMap((i,120) for i in range(128)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP 8 requires a space after comma.
But wouldn't ParseMap.fromkeys(range(128), 'x') look clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
Lib/idlelib/pyparse.py
Outdated
| >>> mapping = StringTranslatePseudoMapping(preserve_dict, ord('x')) | ||
| >>> text = "a + b\tc\nd" | ||
| >>> text.translate(mapping) | ||
| >>> keepwhite = ParseMap((ord(c),ord(c)) for c in ' \t\n\r') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a space after comma. And the second ord() is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space added
|
Thanks @csabella for the PR, and @terryjreedy for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7. |
…thonGH-5862) The new code also runs faster. (cherry picked from commit f0daa88) Co-authored-by: Cheryl Sabella <[email protected]>
|
GH-5940 is a backport of this pull request to the 3.7 branch. |
|
GH-5941 is a backport of this pull request to the 3.6 branch. |
…thonGH-5862) The new code also runs faster. (cherry picked from commit f0daa88) Co-authored-by: Cheryl Sabella <[email protected]>
…-5862) The new code also runs faster. (cherry picked from commit f0daa88) Co-authored-by: Cheryl Sabella <[email protected]>
…-5862) The new code also runs faster. (cherry picked from commit f0daa88) Co-authored-by: Cheryl Sabella <[email protected]>
https://bugs.python.org/issue32940