CI: Add CI to test on free-threaded Python#753
Conversation
rgommers
left a comment
There was a problem hiding this comment.
Thanks @andfoy for getting the ball rolling on this! Since this is a single job, I'd like it to be as clean and simple as possible. We need numpy/scipy/cython, install, and run tests. A single TODO for adding matplotlib to the test dependencies once that is ready.
|
This is passing now on my fork: https://github.com/andfoy/pywt/actions/runs/9590437998 |
pywt/_extensions/meson.build
Outdated
| cython_args = ['-3', '--fast-fail', '--output-file', '@OUTPUT@', '--include-dir', '@BUILD_ROOT@', '@INPUT@'] | ||
|
|
||
| cython_gen = generator(cython, | ||
| arguments : cython_args, | ||
| output : '@BASENAME@.c', | ||
| depends : _cython_tree) |
There was a problem hiding this comment.
I copied this pattern over from SciPy, this enables granular control over the flags passed to Cython, in this case it helped me to debug the actual error by setting --gdb
There was a problem hiding this comment.
Keeping this unresolved, since it is useful context.
pywt/_extensions/_swt.pyx
Outdated
| strides_view = <signed long [:data.ndim]> <signed long *> cA.strides | ||
| strides_view = strides_view.copy() | ||
| output_info.strides = <pywt_index_t *> &strides_view[0] | ||
| output_info.shape = <size_t *> &output_shape[0] |
There was a problem hiding this comment.
A copy was required here, since cA would go out of scope after redefinition on any of the complex cases
There was a problem hiding this comment.
Good catch! That would explain a crash indeed. I'm not quite sure whether this isn't happening also further down, since there is another pattern like that where an ndarray or memory view is overwritten. It's probably protected only by this code that keeps a reference:
if not trim_approx:
ret.append((cA, cD))however, the default for trim_approx is False. It all looks a bit fragile.
There was a problem hiding this comment.
Did you audit the other .pyx files for this pattern?
There was a problem hiding this comment.
Did you audit the other
.pyxfiles for this pattern?
I did this audit, and didn't find a pattern like in this file with an ndarray being overwritten and then being reused in a for-loop.
|
I pushed a couple of tiny cleanups to make the diff even smaller. |
rgommers
left a comment
There was a problem hiding this comment.
The added CI job and the build file changes LGTM now. The issue in the Cython code is actually interesting. It's not clear to me whether:
- this code was always buggy, or
- it relied on the GIL and hence we only see the crash with free-threaded CPython, or
- the code is supposed to work even with free-threaded CPython, and this is a Cython bug at the moment.
I think this is exactly the kind of thing that we may want to use to better test Cython, so rather than fixing up the issue by making a copy (the code for which isn't too pretty), let's make sure to really understand what is going on here.
Could you look at the generated C code with and without the change to _swt.pyx and see what is different? And reduce the issue to the minimal code for a standalone reproducer, if applicable?
|
After a more detailed exploration, I compared the output C files produced by Cython under Python 3.12 and 3.13. In particular, the issue comes from the following trace:
__pyx_v_raw_strides = ((npy_intp *)__pyx_f_5numpy_7ndarray_7strides_strides(__pyx_v_cA));
Such redefinition implies the creation of an intermediate variable that contains the array, which is then assigned to the original __Pyx_DECREF_SET(__pyx_v_cA, ((PyArrayObject *)__pyx_t_15));On Python 3.12, calling The bug (or feature) that we are now experiencing on Python 3.13 is that, when the original array object is deleted (as its refcount hits zero), its attributes also are deleted. In this case, since The main question here is, should any access to the attributes of an object imply an increase of the refcount, or is this behaviour expected and it should be taken care of explicitly when writing Cython code? It is important to mention that this happens regardless of the setting of the |
|
I suspect this change is because the memory allocator got changed in Python 3.13 (if nothing else uses the memory that strides points to then you may get away with accessing it, even if it's technically not allowed). I think you need to account for the reference counting explicitly. As far as Cython is concerned strides and shape are just arbitrary int pointers and it knows nothing about the underlying ownership. |
|
Merged, nice work @andfoy. Looks like the next step is to start uploading Linux and macOS wheels - would you be able to open a new PR for that? |
This PR introduces a CI to run the test suite on Linux against the free-threaded distribution of Python 3.13. This effort follows the recently added testing and wheel distributions of NumPy and SciPy.