-
-
Notifications
You must be signed in to change notification settings - Fork 11.9k
BUG: disallow setting flag to writeable after fromstring, frombuffer #11739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
doc/release/1.16.0-notes.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based off readonly buffers? Seems harmless to allow them to be made readonly if the source buffer was writeable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarify - readonly buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we ought to rename this to frombuffer at some point, especially once we only support python 3 and fromstring(str) fails but fromstr(bytes) succeeds.
|
@eric-wieser ping |
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by the purpose of this test - why would a writeable array have a non-writeable base?
Also, on my machine type(vals.base) is always bytes - I don't see any point listing str and unicode here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed non-writable bases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the fix here is that we should be pickling bytearray objects, not bytes
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a comment - there's something special about 1000 here - using anything below 251 (dtype=np.int32 on my machine) does not set .base for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment. The condition is in array_setstate, to decide to use PyArray_SetBaseObject
f9cae4f to
b6c72b8
Compare
|
Squashed to a single commit and rebased |
b6c72b8 to
b2857c0
Compare
b2857c0 to
b87d4bd
Compare
|
rebased to fix merge conflict |
b87d4bd to
b17d772
Compare
|
Can this go in for 1.16? |
|
@eric-wieser, as you were the only one to look at this, can you give this a final review and approve/reject? The change itself is rather minor and only involves removing code. |
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add def test_writeable_from_readonly here to split the tests up a little?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And perhaps add def test_writeable_from_buffer as a test name here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
eric-wieser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good - but the test could do with being split up a little.
b17d772 to
a2202b9
Compare
|
|
||
| .. _`NEP 15` : http://www.numpy.org/neps/nep-0015-merge-multiarray-umath.html | ||
| .. _`NEP 18` : http://www.numpy.org/neps/nep-0018-array-function-protocol.html | ||
| Arrays based off readonly buffers cannot be set ``writeable`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whitespace, but this can be fixed in the 1.16 release
|
Thanks @mattip |
|
This is one is a bit of a bummer for us over at Nibabel. We are reading in large (sometimes very large) datasets, from disk. We're doing this in Python. Because we've done the reading, we know that the bytes object we read, can be modified safely without causing problems for anyone else. Thus we set the https://github.com/nipy/nibabel/blob/master/nibabel/volumeutils.py#L543 The change in this PR prevented us from doing that. In the spirit of Python's "let me shoot myself in the foot if I really want to, and I claim to know what I'm doing" - is there any way to keep our memory-saving behavior with current numpy? See: nipy/nibabel#697 |
|
We currently do not have a way to force the |
|
I don't think the goals really conflict. I think it boils down to |
|
This change broke my code as well. I have a use case where I download serialized 3D arrays that represent a big image of brain tissue. The images are split into chunks to make random access possible. If someone wants to do a "non-chunk aligned write", we must download the appropriate chunks, render them into a buffer, and shade the overlapping region. However, the decoder spits out numpy arrays set to "write=False". Since these buffers are going to be reuploaded and discarded, it's perfectly safe to write to them and saves memory. In this case, there is a 1:1 relationship between the underlying buffer and the numpy array. That can't be known ahead of time by numpy, it's something the programmer knows. As we are using numpy to processes large images, occasionally it is necessary to ensure that we have 1x memory usage without copies. Taking advantage of tricks like this has enabled us to render 62 GB images without blowing out memory in the past (we were limited to a small number of copies that could be held in memory on a large machine). My options going forward are to pin the numpy version or make my code less efficient. It would be desirable if this change were either reverted or another avenue was available for "dangerous" modifications. |
|
@mattip I think we are going to need a workaround here. |
If this is true, then your decoder has a bug in it. You should not be outputting readonly arrays if your intent is for them to not be readonly. |
|
In my case: |
|
you would get better performance by using |
|
As @mattip alludes to, the important line is Numpy used to let you ignore python there, but now it does not. The fix is to somehow obtain a |
GetArrayFromImage was refactored so the buffer is first passed to this GetArrayViewFromImage. In this case, it does not need to be writable. Remove the restriction to retain used in GetArrayViewFromImage. See also: numpy/numpy#11739
Fixes #9440. The comment in the code was incorrect,
array_setstateused in unpickling does not call_IsWriteable._IsWriteableis only used in setting flags.Note that new pickling tests use a large buffer to ensure it goes through
PyArray_SetBaseObjectinarray_setstate, but that the flag is still set towriteable.