Skip to content

Conversation

@vstinner
Copy link
Member

@vstinner vstinner commented Jan 14, 2020

@vstinner
Copy link
Member Author

cc @malemburg

@vstinner vstinner changed the title bpo-37751: Document the change in What's New in Python 3.8 bpo-37751: Document the change in What's New in Python 3.9 Jan 14, 2020
@vstinner
Copy link
Member Author

Oops, the change was done in Python 3.9, not in Python 3.8! PR updated.

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

:data:`~errno.EBADF` error.
(Contributed by Victor Stinner in :issue:`39239`.)

* :func:`codecs.lookup` now normalizes the encoding name the same way than
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other differences. For example, normalize_encoding("КОИ-8") returns "кои_8", but codecs.lookup normalizes it to "8".

The comment in the sources is also not correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encodings.normalize_encoding() says "Note that encoding names should be ASCII only." You're correct: "КОИ-8" is normalized to "8" by codecs.lookup() because the C function _Py_normalize_encoding() ignores non-ASCII letters.

I don't know which behavior is correct. It sounds strange to me to have a non-ASCII encoding name. Which encoding is supposed to be used to encoding the encoding name?!? :-D Maybe encodings.normalize_encoding() should also ignore non-ASCII letters, be more strict.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created bpo-39337: codecs.lookup() ignores non-ASCII characters, whereas encodings.normalize_encoding() copies them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe encodings.normalize_encoding() should also ignore non-ASCII letters, be more strict.

Hm, the annotation of normalize_encoding have the words: Note that encoding names should be ASCII only.
+1 for encodings.normalize_encoding() should be similar as _Py_normalize_encoding().
And I created a PR: #22219

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other differences. For example, normalize_encoding("КОИ-8") returns "кои_8", but codecs.lookup normalizes it to "8".

After #22219 merged, this problem have been fixed(MAYBE enhanced will be more exact).

:data:`~errno.EBADF` error.
(Contributed by Victor Stinner in :issue:`39239`.)

* :func:`codecs.lookup` now normalizes the encoding name the same way than
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* :func:`codecs.lookup` now normalizes the encoding name the same way than
* :func:`codecs.lookup` now normalizes the encoding name the same way as

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I copied the NEWS entry from commit 20f59fe. If there is a typo, it should also be fixed in Misc/NEWS.d/next/Core and Builtins/2019-08-20-04-36-37.bpo-37751.CSFzUd.rst.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I create a new PR in #23096, and this word have been replaced.

@csabella
Copy link
Contributor

@vstinner is this one that needs to be merged soon?

@vstinner
Copy link
Member Author

vstinner commented Oct 1, 2020

@shihai1991: I failed to find time to finish this PR. Since you are wokring on bpo-39337, do you want to continue the work on this PR? You can steal it (copy/paste my change), and try to address previous comments.

@shihai1991
Copy link
Member

@shihai1991: I failed to find time to finish this PR. Since you are wokring on bpo-39337, do you want to continue the work on this PR? You can steal it (copy/paste my change), and try to address previous comments.

No problem, I will take a look :)

@shihai1991
Copy link
Member

A new PR in #23096 (copy from this PR).

@vstinner vstinner closed this Jan 19, 2021
@vstinner vstinner deleted the codecs_whatsnew38 branch January 19, 2021 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting core review docs Documentation in the Doc dir skip news

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants