@miss-islington miss-islington commented Aug 19, 2019

The documented definition was much broader than the real one:
there are tons of characters with general category "Other",
and we don't (and shouldn't) treat most of them as whitespace.
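
To make the gap concrete, here is a small sketch using the standard `unicodedata` module. The sample code points are chosen purely for illustration (they are not taken from the patch); each has a general category starting with "C" ("Other"), yet only the newline is reported as whitespace by `str.isspace()`.

```python
import unicodedata

# Illustrative sample (an assumption, not from the patch): NUL, ZERO WIDTH
# SPACE, BYTE ORDER MARK, and LINE FEED all fall under general category "Other".
samples = ["\x00", "\u200b", "\ufeff", "\n"]

# Each row: code point label, general category, result of str.isspace().
report = [
    (f"U+{ord(ch):04X}", unicodedata.category(ch), ch.isspace())
    for ch in samples
]

for label, category, is_space in report:
    print(label, category, is_space)
```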

Rewrite the definition to agree with the comment on
_PyUnicode_IsWhitespace, and with the logic in makeunicodedata.py,
which is what generates that function and so ultimately governs.
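
As a rough illustration of the rewritten definition (general category `Zs`, or bidirectional class `WS`, `B`, or `S`, as the `str.isspace()` documentation states it), the predicate below restates it in pure Python. The name `is_whitespace_by_definition` is hypothetical; the real check lives in the generated C function.

```python
import unicodedata


def is_whitespace_by_definition(ch: str) -> bool:
    """Hypothetical pure-Python restatement of the documented rule."""
    return (
        unicodedata.category(ch) == "Zs"
        or unicodedata.bidirectional(ch) in ("WS", "B", "S")
    )


# Compare the restated rule with the built-in for a few characters.
for ch in (" ", "\t", "\u00a0", "\u200b", "a"):
    print(f"U+{ord(ch):04X}", is_whitespace_by_definition(ch), ch.isspace())
```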

Add suitable breadcrumbs so that a reader who wants to pin down
exactly what this definition means (what's a "bidirectional class"
of "B"?) can do so. The unicodedata module documentation is an
appropriate central place for our references to Unicode's own copious
documentation, so point there.
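
For a reader wondering what a bidirectional class of "B" looks like in practice, `unicodedata.bidirectional()` answers directly. This sketch only scans the first 256 code points, an arbitrary cutoff chosen to keep the output short.

```python
import unicodedata

# Collect code points below U+0100 whose bidirectional class is "B"
# (paragraph separator). The 0x100 limit is an illustrative assumption.
paragraph_separators = [
    f"U+{cp:04X}"
    for cp in range(0x100)
    if unicodedata.bidirectional(chr(cp)) == "B"
]

print(paragraph_separators)
```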

Also add to the isspace() test a thorough check that the
implementation agrees with the intended definition.
(cherry picked from commit 8c1c426)
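
The kind of exhaustive check described above can be sketched as follows. This is a standalone approximation rather than the test from the patch; `find_mismatches` is a hypothetical helper, and the loop assumes an interpreter whose `str.isspace()` and `unicodedata` are built from the same Unicode data.

```python
import sys
import unicodedata


def find_mismatches():
    """Return code points where str.isspace() disagrees with the definition."""
    mismatches = []
    for codepoint in range(sys.maxunicode + 1):
        ch = chr(codepoint)
        expected = (
            unicodedata.category(ch) == "Zs"
            or unicodedata.bidirectional(ch) in ("WS", "B", "S")
        )
        if ch.isspace() != expected:
            mismatches.append(codepoint)
    return mismatches


mismatches = find_mismatches()
print(f"{len(mismatches)} mismatching code points")
```

Walking every code point takes a moment, but it leaves no room for the documentation and the implementation to drift apart unnoticed.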

Co-authored-by: Greg Price [email protected]

https://bugs.python.org/issue36502

…ythonGH-15296)


@vstinner vstinner left a comment


LGTM, good bot.

@miss-islington

@gnprice and @vstinner: Status check is done, and it's a success ✅ .

@miss-islington miss-islington merged commit 0fcdd8d into python:3.7 Aug 19, 2019
@miss-islington miss-islington deleted the backport-8c1c426-3.7 branch August 19, 2019 10:10


Labels: skip news, tests (Tests in the Lib/test dir)
