Skip to content

Conversation

@serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Jul 6, 2017

They no longer cache the wchar_t* representation of string objects.

https://bugs.python.org/issue30863

…ing().

They no longer cache the wchar_t* representation of string objects.
@serhiy-storchaka serhiy-storchaka added the type-feature A feature request or enhancement label Jul 6, 2017
Misc/NEWS Outdated
C API
-----

- bpo-30863: PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() no longer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should change to use blurb.

@zhangyangyu zhangyangyu self-requested a review July 24, 2018 06:00
Py_ssize_t res;

assert(unicode != NULL);
assert(PyUnicode_IS_READY(unicode));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we get not ready string here? IMHO it should be put after the following if block.

"embedded null character");

buflen = unicode_get_widechar_size(unicode);
if (buflen < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why could it be < 0? It looks to me assert(buflen >= 0) is right.

}

buffer = PyMem_NEW(wchar_t, buflen + 1);
buffer = (wchar_t *) PyMem_MALLOC(sizeof(wchar_t) * (buflen + 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need overflow check here?

memcpy(w, wstr, size * sizeof(wchar_t));
return;
}
assert(PyUnicode_KIND(unicode) != SIZEOF_WCHAR_T);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest remove this assertion since we have two more specific assertions below and they are easier to read and understand.

Copy link
Member

@zhangyangyu zhangyangyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Only question is this patch aims to remove cache or get rid of dependency of PyUnicode_AsUnicodeAndSize so we can deprecate it? I mean should the NEWS entry reflect this.

@serhiy-storchaka
Copy link
Member Author

Both. Deprecating PyUnicode_AsUnicodeAndSize is a primary goal. Do you prefer to merge these issues?

@zhangyangyu
Copy link
Member

No. I just mean if that's also a goal, it's better to mention it in the NEWS. Currently the NEWS takes removing the cache as the only goal.

@serhiy-storchaka
Copy link
Member Author

If deprecating PyUnicode_AsUnicodeAndSize is a different issue, the NEWS for this issue shouldn't mention it.

@zhangyangyu
Copy link
Member

I mean something like

:c:func:PyUnicode_AsWideChar and :c:func:PyUnicode_AsWideCharString no
longer cache the wchar_t* representation of string objects and rely on PyUnicode_AsUnicodeAndSize.

Is it acceptable to you? If not, doesn't matter, just merge it since the code already LGTM. :-)

@vstinner
Copy link
Member

@serhiy-storchaka and @zhangyangyu: ping

@serhiy-storchaka
Copy link
Member Author

I was going to discuss this change on Python-Dev first.

@vstinner
Copy link
Member

I was going to discuss this change on Python-Dev first.

https://mail.python.org/pipermail/python-dev/2018-October/155530.html

@zooba
Copy link
Member

zooba commented Oct 23, 2018

I started a new Pipelines build at https://dev.azure.com/python/cpython/_build/results?buildId=32792&view=logs to get a first check on whether there are significant performance regressions in the test suite. If it comes out about equal (with, say, this recent build), then I have no concerns.

@zooba
Copy link
Member

zooba commented Oct 23, 2018

Looks good to me 👍

@serhiy-storchaka serhiy-storchaka merged commit c46db92 into python:master Oct 23, 2018
@serhiy-storchaka serhiy-storchaka deleted the wide-char branch October 23, 2018 19:58
@vstinner
Copy link
Member

Thanks @serhiy-storchaka and @zooba: I wanted to do this change since Python 3.3 :-) I already attempted to make this change previously, but I failed. Mostly, because the legacy API was still very popular.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type-feature A feature request or enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants