bpo-34515: lib2to3: support non-ASCII identifiers #8950
benjaminp merged 4 commits into python:master
This will fix google/yapf#607.
Lib/lib2to3/tests/test_parser.py
It is probably a good idea to add a CJK-specific (or at least non-Latin-1) test, such as:
蟒 = 3
錦蛇 = 1
See also
https://github.com/python/cpython/blob/master/Lib/test/test_unicode_identifiers.py
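As a rough illustration of what such a test could check, here is a minimal sketch. It uses the standard-library `tokenize` module rather than `lib2to3` itself, and the helper name `identifier_names` is hypothetical, not something from the PR. The idea is simply that the CJK names above should come back as NAME tokens.

```python
import io
import tokenize


def identifier_names(source):
    """Return the NAME token strings found in source, in order.

    Hypothetical helper for illustration; it relies on the stdlib
    tokenizer, not on lib2to3's pgen2 tokenizer.
    """
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NAME
    ]


# The two CJK assignments suggested in the review comment.
SAMPLE = "蟒 = 3\n錦蛇 = 1\n"

names = identifier_names(SAMPLE)
print(names)
```

A real test in Lib/lib2to3/tests/test_parser.py would presumably feed the same source through the lib2to3 driver instead; the stdlib tokenizer is used here only to keep the sketch self-contained.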
Lib/lib2to3/pgen2/tokenize.py
Lib/tokenize.py appears to parse numbers before identifiers to avoid having a look-behind assertion here. Can we take that approach here, too?
@benjaminp Copied the code from Lib/tokenize.py.
In Lib/tokenize.py, I see:
Name = r'\w+'
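To make the ordering argument concrete, here is a simplified sketch. The patterns below are illustrative assumptions, not the actual definitions in Lib/tokenize.py or Lib/lib2to3/pgen2/tokenize.py. Because `\w` also matches digits, a bare `\w+` name pattern is only safe if the number alternative is tried first; no look-behind is needed in that case.

```python
import re

# Simplified, hypothetical patterns (the real tokenizers are more elaborate).
NUMBER = r"\d+(?:\.\d+)?"
NAME = r"\w+"  # in str patterns, \w matches Unicode word characters

# Number alternative tried first: digits are never swallowed by NAME.
NUMBER_FIRST = re.compile(rf"(?P<number>{NUMBER})|(?P<name>{NAME})")
# Name alternative tried first: NAME greedily consumes digits as well.
NAME_FIRST = re.compile(rf"(?P<name>{NAME})|(?P<number>{NUMBER})")


def classify(pattern, source):
    """Return (kind, text) pairs for every match of pattern in source."""
    return [(m.lastgroup, m.group()) for m in pattern.finditer(source)]


print(classify(NUMBER_FIRST, "蟒 = 3"))
print(classify(NAME_FIRST, "x1 = 42"))
```

With the number-first ordering, `蟒` is classified as a name and `3` as a number. With the name-first ordering, `42` is misclassified as a name, which is the situation a look-behind assertion would otherwise have to guard against.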
Thanks @holymonson for the PR, and @benjaminp for merging it 🌮🎉. I'm working now to backport this PR to: 3.7.
GH-9333 is a backport of this pull request to the 3.7 branch.
@benjaminp What we actually want is to merge the two pure-Python tokenizers. This pull request makes that harder.
See BPO-33338.
That does seem like a better solution. Do you want me to revert this?
I'm thinking. My change will only affect Python 3.8, so the backport PR (GH-9333) does make life better for YAPF users in the interim. I'll revert this on master only when I rebase my tokenizer-merge pull request.
IOW, let's leave it for now.
https://bugs.python.org/issue34515