bpo-34515: lib2to3: support non-ASCII identifiers #8950

holymonson · 2018-08-27T06:54:17Z

https://bugs.python.org/issue34515

holymonson · 2018-08-27T07:18:18Z

kamahen · 2018-08-27T11:43:40Z

Lib/lib2to3/tests/test_parser.py

Probably a good idea to add a CJK-specific test (or non-Latin-1), such as

蟒 = 3
錦蛇 = 1

See also
https://github.com/python/cpython/blob/master/Lib/test/test_unicode_identifiers.py
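As a rough illustration of what such a test could check, here is a minimal sketch that runs the two suggested lines through the standard-library `tokenize` module rather than lib2to3's own tokenizer (that substitution is an assumption made for the sake of a self-contained example; the real test would live in `Lib/lib2to3/tests/test_parser.py`). The variable names `source`, `names`, and `namespace` are illustrative only.

```python
import io
import tokenize

# The two CJK assignments suggested in the review comment.
source = "蟒 = 3\n錦蛇 = 1\n"

# Both names are valid identifiers under PEP 3131.
assert "蟒".isidentifier() and "錦蛇".isidentifier()

# Collect every NAME token the stdlib tokenizer produces for the source.
names = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.NAME
]

# The source also compiles and runs as ordinary Python 3 code.
namespace = {}
exec(compile(source, "<cjk>", "exec"), namespace)
```

A lib2to3 test would presumably assert the same thing one level up: that the parser accepts the source without raising a parse error.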

benjaminp · 2018-09-10T18:51:30Z

Lib/lib2to3/pgen2/tokenize.py

Lib/tokenize.py appears to parse numbers before identifiers to avoid having a look-behind assertion here. Can we take that approach here, too?

@benjaminp I copied the code from Lib/tokenize.py.

In Lib/tokenize.py, I see:

Name = r'\w+'
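To make the ordering point concrete, here is a small sketch of why trying the number pattern before a `\w+` name pattern removes the need for a look-behind. Only `Name = r'\w+'` comes from the thread; the `NUMBER` pattern, the `TOKEN` alternation, and the `classify` helper are simplified, hypothetical stand-ins and not the actual definitions in either tokenizer.

```python
import re

NAME = r"\w+"               # as quoted from Lib/tokenize.py
NUMBER = r"\d+(?:\.\d*)?"   # simplified stand-in, not the real Number pattern

# Alternation is tried left to right, so a leading digit is consumed as a
# number before the broader \w+ pattern gets a chance to claim it.
TOKEN = re.compile(rf"(?P<number>{NUMBER})|(?P<name>{NAME})")


def classify(text):
    """Return (kind, matched_text) for the token at the start of text."""
    match = TOKEN.match(text)
    return (match.lastgroup, match.group()) if match else None
```

With this ordering, `classify("錦蛇")` and `classify("蟒3")` both come back as names, while `classify("3")` comes back as a number, because `\w` in a Python 3 `str` pattern matches Unicode word characters by default.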

miss-islington · 2018-09-15T17:32:32Z

Thanks @holymonson for the PR, and @benjaminp for merging it 🌮🎉.. I'm working now to backport this PR to: 3.7.
🐍🍒⛏🤖

bedevere-bot · 2018-09-15T17:32:49Z

GH-9333 is a backport of this pull request to the 3.7 branch.

ambv · 2018-09-15T17:36:35Z

@benjaminp What we actually want is to merge the two pure Python tokenizers. This pull request makes this harder.

ambv · 2018-09-15T17:37:32Z

See BPO-33338.

benjaminp · 2018-09-15T17:40:34Z

That does seem like a better solution. Do you want me to revert this?

ambv · 2018-09-15T17:44:49Z

I'm thinking. My change will only affect Python 3.8, so the backport PR (GH-9333) does make life better for YAPF users in the interim. I'll revert on master only when I rebase my tokenizer merge pull request.

ambv · 2018-09-15T17:45:01Z

IOW, let's leave it for now.

the-knights-who-say-ni added the CLA signed label Aug 27, 2018

bedevere-bot added the awaiting review label Aug 27, 2018

holymonson mentioned this pull request Aug 27, 2018

PEP 3131 -- Supporting Non-ASCII Identifiers google/yapf#607

Closed

kamahen reviewed Aug 27, 2018

holymonson force-pushed the non_ascii_identifiers branch from 1b3072b to 50f189e Compare August 27, 2018 12:51

benjaminp reviewed Sep 10, 2018

holymonson added 4 commits September 15, 2018 10:33

lib2to3: support non-ASCII identifiers

6cf1258

add news

0774738

add more test

fb3a039

imitate tokenize.py

37d8770

holymonson force-pushed the non_ascii_identifiers branch from 50f189e to 37d8770 Compare September 15, 2018 02:53

benjaminp added the needs backport to 3.7 label Sep 15, 2018

benjaminp merged commit 10a428b into python:master Sep 15, 2018

bedevere-bot removed the awaiting review label Sep 15, 2018

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 15, 2018

closes bpo-34515: Support non-ASCII identifiers in lib2to3. (pythonGH…

a9f58c2

…-8950) (cherry picked from commit 10a428b) Co-authored-by: Monson Shao <holymonson@gmail.com>

bedevere-bot removed the needs backport to 3.7 label Sep 15, 2018

miss-islington added a commit that referenced this pull request Sep 15, 2018

closes bpo-34515: Support non-ASCII identifiers in lib2to3. (GH-8950)

51dbae8

(cherry picked from commit 10a428b) Co-authored-by: Monson Shao <holymonson@gmail.com>
