Support compute=False from DataTree.to_netcdf #10625
Conversation
```python
writer = ArrayWriter()
# TODO: figure out how to properly handle unlimited_dims
try:
```
It'd be nice to refactor this to a common function used in both the netCDF and the Zarr writer. Do you see a way to do that? At first glance the "validate region / encoding" bit seems to make this hard.
If there is no easy way to do that, can you please add a comment to both functions to remind future contributors to keep the logic in sync?
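A hedged sketch of what such a shared helper might look like: a function that walks the group mapping and pairs each group with its encoding, while leaving the backend-specific "validate region / encoding" step to each writer. The name `iter_writable_groups` and its signature are hypothetical, not xarray's actual internal API.

```python
# Hypothetical sketch of a helper that both the netCDF and Zarr DataTree
# writers could share. tree_dict maps group paths to dataset-like objects;
# encoding maps group paths to per-group encoding dicts. Backend-specific
# validation (e.g. Zarr region checks) is left to the caller.
def iter_writable_groups(tree_dict, encoding=None):
    """Yield (group_path, dataset, group_encoding) triples for writing."""
    encoding = encoding or {}
    unknown = set(encoding) - set(tree_dict)
    if unknown:
        raise ValueError(
            f"encoding given for unknown groups: {sorted(unknown)}"
        )
    for path, ds in tree_dict.items():
        yield path, ds, encoding.get(path, {})
```

Each backend would then loop over the yielded triples and do its own store setup, so only the tree-walking logic is shared.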
Yeah, I think this would be tricky.
I'm not sure a comment makes sense here -- there's no intrinsic reason why the implementations need to match, although hopefully this suggestion would be somewhat obvious? There are also unit tests, of course.
I think this separate datatree_io.py file has come to the end of its usefulness. In a follow-up I can just merge it into the respective backends.
Definitely agreed! In the long term, we might even implement Dataset IO in terms of DataTree IO. This would let us avoid redundant code paths, similar to how we currently implement many DataArray operations in terms of Dataset.
This is ready for a final review now that tests are passing.

Just a heads up, I am going to submit this shortly so I can start iterating on follow-ups.
* main: (46 commits)
  - use the new syntax of ignoring bots (pydata#10668)
  - modification methods on `Coordinates` (pydata#10318)
  - Silence warnings from test_tutorial.py (pydata#10661)
  - test: update write_empty test for zarr 3.1.2 (pydata#10665)
  - Bump actions/checkout from 4 to 5 in the actions group (pydata#10652)
  - Add load_datatree function (pydata#10649)
  - Support compute=False from DataTree.to_netcdf (pydata#10625)
  - Fix typos (pydata#10655)
  - In case of misconfiguration of dataset.encoding `unlimited_dims` warn instead of raise (pydata#10648)
  - fix ``auto_complex`` for ``open_datatree`` (pydata#10632)
  - Fix bug indexing with boolean scalars (pydata#10635)
  - Improve DataTree typing (pydata#10644)
  - Update Cartopy and Iris references (pydata#10645)
  - Empty release notes (pydata#10642)
  - release notes for v2025.08.0 (pydata#10641)
  - Fix `ds.merge` to prevent altering original object depending on join value (pydata#10596)
  - Add asynchronous load method (pydata#10327)
  - Add DataTree.prune() method … (pydata#10598)
  - Avoid refining parent dimensions in NetCDF files (pydata#10623)
  - clarify lazy behaviour and eager loading chunks=None in open_*-functions (pydata#10627)
  - ...
Writing to zarr with `mode='r+'` was broken by pydata#10625, due to a Zarr bug (zarr-developers/zarr-python#3428). This adds a work-around that works on older versions of Zarr. The PR that introduced this bug has not yet appeared in an xarray release, so there's no need for a release note.
Split out of #10624
This PR adds support for `compute=False` from `DataTree.to_netcdf` and `to_zarr`. To do so, I refactored the internals of these methods to use Xarray's lower-level data store interface directly, rather than calling `Dataset` methods.

- `whats-new.rst`