bpo-30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString(). #2599

serhiy-storchaka · 2017-07-06T07:56:28Z

They no longer cache the wchar_t* representation of string objects.

https://bugs.python.org/issue30863

mlouielu · 2017-07-13T10:08:52Z

Misc/NEWS

 C API
 -----

+- bpo-30863: PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() no longer


I think this should be changed to use blurb.

zhangyangyu · 2018-07-30T04:26:13Z

Objects/unicodeobject.c

+    Py_ssize_t res;
+
+    assert(unicode != NULL);
+    assert(PyUnicode_IS_READY(unicode));


Couldn't we get a string that is not ready here? IMHO the assertion should be moved after the following if block.

zhangyangyu · 2018-07-30T04:27:11Z

Objects/unicodeobject.c

-                        "embedded null character");
+
+    buflen = unicode_get_widechar_size(unicode);
+    if (buflen < 0) {


Why could it be < 0? It looks to me like assert(buflen >= 0) would be right.

zhangyangyu · 2018-07-30T04:27:26Z

Objects/unicodeobject.c

    }

-    buffer = PyMem_NEW(wchar_t, buflen + 1);
+    buffer = (wchar_t *) PyMem_MALLOC(sizeof(wchar_t) * (buflen + 1));


Do we need an overflow check here?
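For context on this question: the PyMem_NEW macro being removed refuses an oversized element count before multiplying, while a bare sizeof(wchar_t) * (buflen + 1) does not. A minimal sketch of such a guard, written against plain malloc so it stands alone; alloc_wchar_buffer is a hypothetical name and is not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not CPython code: allocate room for `len` wide
   characters plus a terminating null, refusing sizes that would overflow. */
static wchar_t *
alloc_wchar_buffer(size_t len)
{
    /* sizeof(wchar_t) is at least 1, so the division cannot yield 0. */
    assert(sizeof(wchar_t) >= 1);

    /* (len + 1) * sizeof(wchar_t) must fit in size_t. */
    if (len > SIZE_MAX / sizeof(wchar_t) - 1) {
        return NULL;
    }
    return (wchar_t *)malloc((len + 1) * sizeof(wchar_t));
}
```

The caller treats NULL as an allocation failure either way, so the guard adds no new error path.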

zhangyangyu · 2018-07-30T05:11:41Z

Objects/unicodeobject.c

+        memcpy(w, wstr, size * sizeof(wchar_t));
+        return;
+    }
+    assert(PyUnicode_KIND(unicode) != SIZEOF_WCHAR_T);


I'd suggest removing this assertion, since the two more specific assertions below are easier to read and understand.
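The memcpy fast path above applies when the string's internal character width matches wchar_t; otherwise the characters have to be converted one by one. A rough sketch of one such case, assuming a 4-byte source and a 2-byte target where non-BMP code points become surrogate pairs. The function names and fixed-width types are illustrative stand-ins, not the helpers from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: count the UTF-16 code units needed to hold
   `len` UCS-4 code points. */
static size_t
utf16_units_needed(const uint32_t *s, size_t len)
{
    size_t n = len;
    for (size_t i = 0; i < len; i++) {
        if (s[i] > 0xFFFF) {
            n++;  /* a non-BMP code point needs a surrogate pair */
        }
    }
    return n;
}

/* Hypothetical stand-in: copy UCS-4 code points into a UTF-16 buffer that
   holds at least utf16_units_needed(s, len) units.
   Returns the number of units written. */
static size_t
copy_as_utf16(const uint32_t *s, size_t len, uint16_t *out)
{
    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        uint32_t ch = s[i];
        assert(ch <= 0x10FFFF);
        if (ch > 0xFFFF) {
            ch -= 0x10000;
            out[j++] = (uint16_t)(0xD800 + (ch >> 10));    /* high surrogate */
            out[j++] = (uint16_t)(0xDC00 + (ch & 0x3FF));  /* low surrogate */
        }
        else {
            out[j++] = (uint16_t)ch;
        }
    }
    return j;
}
```

Computing the size in a separate pass lets the caller allocate the output buffer exactly once before copying.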

zhangyangyu

LGTM. My only question: does this patch aim to remove the cache, or to get rid of the dependency on PyUnicode_AsUnicodeAndSize so that we can deprecate it? I mean, should the NEWS entry reflect this?

serhiy-storchaka · 2018-07-30T14:39:38Z

Both. Deprecating PyUnicode_AsUnicodeAndSize is a primary goal. Do you prefer to merge these issues?

zhangyangyu · 2018-07-31T06:15:24Z

No. I just mean if that's also a goal, it's better to mention it in the NEWS. Currently the NEWS takes removing the cache as the only goal.

serhiy-storchaka · 2018-07-31T06:31:09Z

If deprecating PyUnicode_AsUnicodeAndSize is a different issue, the NEWS for this issue shouldn't mention it.

zhangyangyu · 2018-07-31T07:30:54Z

I mean something like

:c:func:`PyUnicode_AsWideChar` and :c:func:`PyUnicode_AsWideCharString` no
longer cache the wchar_t* representation of string objects and rely on PyUnicode_AsUnicodeAndSize.

Is it acceptable to you? If not, it doesn't matter; just merge it, since the code already LGTM. :-)

vstinner · 2018-10-20T01:03:58Z

@serhiy-storchaka and @zhangyangyu: ping

serhiy-storchaka · 2018-10-20T05:53:19Z

I was going to discuss this change on Python-Dev first.

vstinner · 2018-10-22T15:20:55Z

> I was going to discuss this change on Python-Dev first.

https://mail.python.org/pipermail/python-dev/2018-October/155530.html

zooba · 2018-10-23T13:21:53Z

I started a new Pipelines build at https://dev.azure.com/python/cpython/_build/results?buildId=32792&view=logs to get a first check on whether there are significant performance regressions in the test suite. If it comes out about equal (with, say, this recent build), then I have no concerns.

zooba · 2018-10-23T13:45:58Z

Looks good to me 👍

vstinner · 2018-10-23T21:59:40Z

Thanks @serhiy-storchaka and @zooba: I have wanted to make this change since Python 3.3 :-) I attempted it previously but failed, mostly because the legacy API was still very popular.

bpo-30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString(). They no longer cache the wchar_t* representation of string objects.

280c8cf

serhiy-storchaka added the type-feature label Jul 6, 2017

the-knights-who-say-ni added the CLA signed label Jul 6, 2017

mlouielu reviewed Jul 13, 2017

Mariatta added needs rebase and removed needs rebase labels Oct 9, 2017

serhiy-storchaka added 2 commits October 12, 2017 23:24

Move the NEWS entry to NEWS.d/.

5534fb8

Merge branch 'master' into wide-char

4add4a1

brettcannon added the awaiting review label Feb 2, 2018

methane approved these changes Jul 10, 2018

bedevere-bot added awaiting merge and removed awaiting review labels Jul 10, 2018

serhiy-storchaka added 2 commits July 10, 2018 11:33

Merge branch 'master' into wide-char

466e94a

Add references in the news entry.

3f81083

zhangyangyu self-requested a review July 24, 2018 06:00

zhangyangyu reviewed Jul 30, 2018

serhiy-storchaka added 2 commits July 30, 2018 10:10

Merge branch 'master' into wide-char

f010479

Address review comments.

6d69632

zhangyangyu approved these changes Jul 30, 2018

serhiy-storchaka merged commit c46db92 into python:master Oct 23, 2018

bedevere-bot removed the awaiting merge label Oct 23, 2018

serhiy-storchaka deleted the wide-char branch October 23, 2018 19:58
