Introduction to cf_xarray¶
This notebook is a brief introduction to cf_xarray’s current capabilities.
import numpy as np
import xarray as xr
import cf_xarray as cfxr
# For this notebooks, it's nicer if we don't show the array values by default
xr.set_options(display_expand_data=False)
<xarray.core.options.set_options at 0x76164ce28290>
cf_xarray works best when xarray keeps attributes by default.
xr.set_options(keep_attrs=True)
<xarray.core.options.set_options at 0x76164c5b7a50>
Lets read two datasets.
ds = xr.tutorial.load_dataset("air_temperature")
ds.air.attrs["standard_name"] = "air_temperature"
ds
<xarray.Dataset> Size: 31MB
Dimensions: (time: 2920, lat: 25, lon: 53)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...This one is inspired by POP model output and illustrates how the coordinates
attribute is interpreted. It also illustrates one way of tagging curvilinear
grids for convenient use of cf_xarray
from cf_xarray.datasets import popds as pop
pop
<xarray.Dataset> Size: 29kB
Dimensions: (nlat: 20, nlon: 30)
Coordinates:
TLONG (nlat, nlon) float64 5kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
TLAT (nlat, nlon) float64 5kB 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0
ULONG (nlat, nlon) float64 5kB 0.5 0.5 0.5 0.5 0.5 ... 0.5 0.5 0.5 0.5
ULAT (nlat, nlon) float64 5kB 2.5 2.5 2.5 2.5 2.5 ... 2.5 2.5 2.5 2.5
* nlon (nlon) int64 240B 0 1 2 3 4 5 6 7 8 ... 21 22 23 24 25 26 27 28 29
* nlat (nlat) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Data variables:
UVEL (nlat, nlon) float64 5kB 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0
TEMP (nlat, nlon) float64 5kB 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0This synthetic dataset has multiple X and Y coords. An example would be
model output on a staggered grid.
from cf_xarray.datasets import multiple
multiple
<xarray.Dataset> Size: 6kB
Dimensions: (x1: 30, y1: 20, x2: 10, y2: 5)
Coordinates:
* x1 (x1) int64 240B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29
* y1 (y1) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
* x2 (x2) int64 80B 0 1 2 3 4 5 6 7 8 9
* y2 (y2) int64 40B 0 1 2 3 4
Data variables:
v1 (x1, y1) float64 5kB 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0 15.0
v2 (x2, y2) float64 400B 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0 15.0This dataset has ancillary variables
from cf_xarray.datasets import anc
anc
<xarray.Dataset> Size: 3kB
Dimensions: (x: 10, y: 20)
Dimensions without coordinates: x, y
Data variables:
q (x, y) float64 2kB -1.672 -0.4738 ... -0.4787 0.8472
q_error_limit (x, y) float64 2kB -0.4484 -0.1561 ... 0.1659 -1.121
q_detection_limit float64 8B 0.001What attributes have been discovered?¶
The criteria for identifying variables using CF attributes are listed here.
ds.lon
<xarray.DataArray 'lon' (lon: 53)> Size: 212B
200.0 202.5 205.0 207.5 210.0 212.5 ... 317.5 320.0 322.5 325.0 327.5 330.0
Coordinates:
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
Attributes:
standard_name: longitude
long_name: Longitude
units: degrees_east
axis: Xds.lon has attributes axis: X. This means that cf_xarray can identify the
'X' axis as being represented by the lon variable.
It can also use the standard_name and units attributes to infer that lon
is “Longitude”. To see variable names that cf_xarray can infer, use ds.cf
ds.cf
Coordinates:
CF Axes: * X: ['lon']
* Y: ['lat']
* T: ['time']
Z: n/a
CF Coordinates: * longitude: ['lon']
* latitude: ['lat']
* time: ['time']
vertical: n/a
Cell Measures: area, volume: n/a
Standard Names: * latitude: ['lat']
* longitude: ['lon']
* time: ['time']
Bounds: n/a
Grid Mappings: n/a
Data Variables:
Cell Measures: area, volume: n/a
Standard Names: air_temperature: ['air']
Bounds: n/a
Grid Mappings: n/a
For pop, only latitude and longitude are detected, not X or Y. Please
comment here: https://github.com/xarray-contrib/cf-xarray/issues/23 if you have
opinions about this behaviour.
pop.cf
Coordinates:
CF Axes: * X: ['nlon']
* Y: ['nlat']
Z, T: n/a
CF Coordinates: longitude: ['TLONG', 'ULONG']
latitude: ['TLAT', 'ULAT']
vertical, time: n/a
Cell Measures: area, volume: n/a
Standard Names: n/a
Bounds: n/a
Grid Mappings: n/a
Data Variables:
Cell Measures: area, volume: n/a
Standard Names: sea_water_potential_temperature: ['TEMP']
sea_water_x_velocity: ['UVEL']
Bounds: n/a
Grid Mappings: n/a
For multiple, multiple X and Y coordinates are detected
multiple.cf
Coordinates:
CF Axes: * X: ['x1', 'x2']
* Y: ['y1', 'y2']
Z, T: n/a
CF Coordinates: longitude, latitude, vertical, time: n/a
Cell Measures: area, volume: n/a
Standard Names: n/a
Bounds: n/a
Grid Mappings: n/a
Data Variables:
Cell Measures: area, volume: n/a
Standard Names: n/a
Bounds: n/a
Grid Mappings: n/a
Feature: Accessing coordinate variables¶
.cf implements __getitem__ to allow easy access to coordinate and axis
variables.
ds.cf["X"]
<xarray.DataArray 'lon' (lon: 53)> Size: 212B
200.0 202.5 205.0 207.5 210.0 212.5 ... 317.5 320.0 322.5 325.0 327.5 330.0
Coordinates:
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
Attributes:
standard_name: longitude
long_name: Longitude
units: degrees_east
axis: XIndexing with a scalar key raises an error if the key maps to multiple variables names
multiple.cf["X"]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[12], line 1
----> 1 multiple.cf["X"]
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:2650, in CFDatasetAccessor.__getitem__(self, key)
2618 def __getitem__(self, key: Hashable | Iterable[Hashable]) -> DataArray | Dataset:
2619 """
2620 Index into a Dataset making use of CF attributes.
2621
(...) 2648 Add additional keys by specifying "custom criteria". See :ref:`custom_criteria` for more.
2649 """
-> 2650 return _getitem(self, key)
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:1031, in _getitem(accessor, key, skip)
1029 names = _get_all(obj, k)
1030 names = drop_bounds(names)
-> 1031 check_results(names, k)
1032 successful[k] = bool(names)
1033 coords.extend(names)
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:1001, in _getitem.<locals>.check_results(names, key)
999 def check_results(names, key):
1000 if scalar_key and len(names) > 1:
-> 1001 raise KeyError(
1002 f"Receive multiple variables for key {key!r}: {names}. "
1003 f"Expected only one. Please pass a list [{key!r}] "
1004 f"instead to get all variables matching {key!r}."
1005 )
KeyError: "Receive multiple variables for key 'X': {'x2', 'x1'}. Expected only one. Please pass a list ['X'] instead to get all variables matching 'X'."
pop.cf["longitude"]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[13], line 1
----> 1 pop.cf["longitude"]
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:2650, in CFDatasetAccessor.__getitem__(self, key)
2618 def __getitem__(self, key: Hashable | Iterable[Hashable]) -> DataArray | Dataset:
2619 """
2620 Index into a Dataset making use of CF attributes.
2621
(...) 2648 Add additional keys by specifying "custom criteria". See :ref:`custom_criteria` for more.
2649 """
-> 2650 return _getitem(self, key)
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:1031, in _getitem(accessor, key, skip)
1029 names = _get_all(obj, k)
1030 names = drop_bounds(names)
-> 1031 check_results(names, k)
1032 successful[k] = bool(names)
1033 coords.extend(names)
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:1001, in _getitem.<locals>.check_results(names, key)
999 def check_results(names, key):
1000 if scalar_key and len(names) > 1:
-> 1001 raise KeyError(
1002 f"Receive multiple variables for key {key!r}: {names}. "
1003 f"Expected only one. Please pass a list [{key!r}] "
1004 f"instead to get all variables matching {key!r}."
1005 )
KeyError: "Receive multiple variables for key 'longitude': {'TLONG', 'ULONG'}. Expected only one. Please pass a list ['longitude'] instead to get all variables matching 'longitude'."
To get back all variables associated with that key, pass a single element list instead.
multiple.cf[["X"]]
<xarray.Dataset> Size: 320B
Dimensions: (x2: 10, x1: 30)
Coordinates:
* x2 (x2) int64 80B 0 1 2 3 4 5 6 7 8 9
* x1 (x1) int64 240B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29
Data variables:
*empty*pop.cf[["longitude"]]
<xarray.Dataset> Size: 10kB
Dimensions: (nlat: 20, nlon: 30)
Coordinates:
TLONG (nlat, nlon) float64 5kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
ULONG (nlat, nlon) float64 5kB 0.5 0.5 0.5 0.5 0.5 ... 0.5 0.5 0.5 0.5
* nlon (nlon) int64 240B 0 1 2 3 4 5 6 7 8 ... 21 22 23 24 25 26 27 28 29
* nlat (nlat) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Data variables:
*empty*DataArrays return DataArrays
pop.UVEL.cf["longitude"]
<xarray.DataArray 'ULONG' (nlat: 20, nlon: 30)> Size: 5kB
0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ... 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
Coordinates:
* nlon (nlon) int64 240B 0 1 2 3 4 5 6 7 8 ... 21 22 23 24 25 26 27 28 29
* nlat (nlat) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Attributes:
units: degrees_eastDataset.cf[...] returns a single DataArray, parsing the coordinates
attribute if present, so we correctly get the TLONG variable and not the
ULONG variable
pop.cf["TEMP"]
<xarray.DataArray 'TEMP' (nlat: 20, nlon: 30)> Size: 5kB
15.0 15.0 15.0 15.0 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0 15.0 15.0 15.0 15.0
Coordinates:
* nlon (nlon) int64 240B 0 1 2 3 4 5 6 7 8 ... 21 22 23 24 25 26 27 28 29
* nlat (nlat) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
TLONG (nlat, nlon) float64 5kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
TLAT (nlat, nlon) float64 5kB 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0
Attributes:
coordinates: TLONG TLAT
standard_name: sea_water_potential_temperatureDataset.cf[...] also interprets the ancillary_variables attribute. The
ancillary variables are returned as coordinates of a DataArray
anc.cf["q"]
<xarray.DataArray 'q' (x: 10, y: 20)> Size: 2kB
-1.672 -0.4738 -0.08992 0.4606 -0.9749 ... -1.222 -0.2186 0.175 -0.4787 0.8472
Coordinates:
q_error_limit (x, y) float64 2kB -0.4484 -0.1561 ... 0.1659 -1.121
q_detection_limit float64 8B 0.001
Dimensions without coordinates: x, y
Attributes:
standard_name: specific_humidity
units: g/g
ancillary_variables: q_error_limit q_detection_limitFeature: Accessing variables by standard names¶
pop.cf[["sea_water_potential_temperature", "UVEL"]]
<xarray.Dataset> Size: 29kB
Dimensions: (nlat: 20, nlon: 30)
Coordinates:
TLONG (nlat, nlon) float64 5kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
TLAT (nlat, nlon) float64 5kB 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0
ULONG (nlat, nlon) float64 5kB 0.5 0.5 0.5 0.5 0.5 ... 0.5 0.5 0.5 0.5
ULAT (nlat, nlon) float64 5kB 2.5 2.5 2.5 2.5 2.5 ... 2.5 2.5 2.5 2.5
* nlon (nlon) int64 240B 0 1 2 3 4 5 6 7 8 ... 21 22 23 24 25 26 27 28 29
* nlat (nlat) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Data variables:
TEMP (nlat, nlon) float64 5kB 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0
UVEL (nlat, nlon) float64 5kB 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0Note that ancillary variables are included as coordinate variables
anc.cf["specific_humidity"]
<xarray.DataArray 'q' (x: 10, y: 20)> Size: 2kB
-1.672 -0.4738 -0.08992 0.4606 -0.9749 ... -1.222 -0.2186 0.175 -0.4787 0.8472
Coordinates:
q_error_limit (x, y) float64 2kB -0.4484 -0.1561 ... 0.1659 -1.121
q_detection_limit float64 8B 0.001
Dimensions without coordinates: x, y
Attributes:
standard_name: specific_humidity
units: g/g
ancillary_variables: q_error_limit q_detection_limitFeature: Utility functions¶
There are some utility functions to allow use by downstream libraries
pop.cf.keys()
{'X',
'Y',
'latitude',
'longitude',
'sea_water_potential_temperature',
'sea_water_x_velocity'}
You can test for presence of these keys
"sea_water_x_velocity" in pop.cf
True
You can also get out the available Axis names
pop.cf.axes
{'X': ['nlon'], 'Y': ['nlat']}
or available Coordinate names. Same for cell measures (.cf.cell_measures) and
standard names (.cf.standard_names).
pop.cf.coordinates
{'longitude': ['TLONG', 'ULONG'], 'latitude': ['TLAT', 'ULAT']}
Note: Although it is possible to assign additional coordinates,
.cf.coordinates only returns a subset of
("longitude", "latitude", "vertical", "time").
Feature: Rewriting property dictionaries¶
cf_xarray will rewrite the .sizes and .chunks dictionaries so that one can
index by a special CF axis or coordinate name
ds.cf.sizes
{'T': 2920, 'time': 2920, 'latitude': 25, 'Y': 25, 'X': 53, 'longitude': 53}
Note the duplicate entries above:
One for
X,Y,Tand one for
longitude,latitudeandtime.
An error is raised if there are multiple 'X' variables (for example)
multiple.cf.sizes
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[26], line 1
----> 1 multiple.cf.sizes
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:1751, in CFAccessor.__getattr__(self, attr)
1750 def __getattr__(self, attr):
-> 1751 return _getattr(
1752 obj=self._obj,
1753 attr=attr,
1754 accessor=self,
1755 key_mappers=_DEFAULT_KEY_MAPPERS,
1756 wrap_classes=True,
1757 )
File ~/checkouts/readthedocs.org/user_builds/cf-xarray/checkouts/latest/cf_xarray/accessor.py:897, in _getattr(obj, attr, accessor, key_mappers, wrap_classes, extra_decorator)
895 for name in inverted[key]:
896 if name in newmap:
--> 897 raise AttributeError(
898 f"cf_xarray can't wrap attribute {attr!r} because there are multiple values for {name!r}. "
899 f"There is no unique mapping from {name!r} to a value in {attr!r}."
900 )
901 newmap.update(dict.fromkeys(inverted[key], value))
902 newmap.update({key: attribute[key] for key in unused_keys})
AttributeError: cf_xarray can't wrap attribute 'sizes' because there are multiple values for 'X'. There is no unique mapping from 'X' to a value in 'sizes'.
multiple.v1.cf.sizes
{'X': 30, 'Y': 20}
Feature: Renaming variables¶
cf_xarray lets you rewrite variables in one dataset to like variables in
another dataset.
In this example, a one-to-one mapping is not possible and the coordinate variables are not renamed.
da = pop.cf["TEMP"]
da.cf.rename_like(ds)
/tmp/ipykernel_1534/2327264871.py:2: UserWarning: Conflicting variables skipped:
['TLAT']: ['lat'] (latitude)
['TLONG']: ['lon'] (longitude)
['nlat']: ['lat'] (Y)
['nlon']: ['lon'] (X)
da.cf.rename_like(ds)
<xarray.DataArray 'TEMP' (nlat: 20, nlon: 30)> Size: 5kB
15.0 15.0 15.0 15.0 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0 15.0 15.0 15.0 15.0
Coordinates:
* nlon (nlon) int64 240B 0 1 2 3 4 5 6 7 8 ... 21 22 23 24 25 26 27 28 29
* nlat (nlat) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
TLONG (nlat, nlon) float64 5kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
TLAT (nlat, nlon) float64 5kB 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0
Attributes:
coordinates: TLONG TLAT
standard_name: sea_water_potential_temperatureIf we exclude all axes (variables with axis attribute), a one-to-one mapping
is possible. In this example, TLONG and TLAT are renamed to lon and lat
i.e. their counterparts in ds. Note the the coordinates attribute is
appropriately changed.
da.cf.rename_like(ds, skip="axes")
<xarray.DataArray 'TEMP' (nlat: 20, nlon: 30)> Size: 5kB
15.0 15.0 15.0 15.0 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0 15.0 15.0 15.0 15.0
Coordinates:
* nlon (nlon) int64 240B 0 1 2 3 4 5 6 7 8 ... 21 22 23 24 25 26 27 28 29
* nlat (nlat) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
lon (nlat, nlon) float64 5kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
lat (nlat, nlon) float64 5kB 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0
Attributes:
coordinates: lon lat
standard_name: sea_water_potential_temperatureFeature: Rewriting arguments¶
cf_xarray can rewrite arguments for a large number of xarray functions. By
this I mean that instead of specifying say dim="lon", you can pass dim="X"
or dim="longitude" and cf_xarray will rewrite that to dim="lon" based on
the attributes present in the dataset.
Here are a few examples
Slicing¶
ds.air.cf.isel(T=1)
<xarray.DataArray 'air' (lat: 25, lon: 53)> Size: 11kB
242.1 242.7 243.1 243.4 243.6 243.8 ... 297.5 297.1 296.9 296.4 296.4 296.6
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
time datetime64[ns] 8B 2013-01-01T06:00:00
Attributes:
long_name: 4xDaily Air temperature at sigma level 995
units: degK
precision: 2
GRIB_id: 11
GRIB_name: TMP
var_desc: Air temperature
dataset: NMC Reanalysis
level_desc: Surface
statistic: Individual Obs
parent_stat: Other
actual_range: [185.16 322.1 ]
standard_name: air_temperatureSlicing works will expand a single key like X to multiple dimensions if those
dimensions are tagged with axis: X
multiple.cf.isel(X=1, Y=1)
<xarray.Dataset> Size: 48B
Dimensions: ()
Coordinates:
x1 int64 8B 1
y1 int64 8B 1
x2 int64 8B 1
y2 int64 8B 1
Data variables:
v1 float64 8B 15.0
v2 float64 8B 15.0Reductions¶
ds.air.cf.mean("X")
<xarray.DataArray 'air' (time: 2920, lat: 25)> Size: 584kB
242.0 242.0 243.7 251.2 257.2 260.8 ... 292.3 294.4 295.8 297.0 297.9 298.8
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Attributes:
long_name: 4xDaily Air temperature at sigma level 995
units: degK
precision: 2
GRIB_id: 11
GRIB_name: TMP
var_desc: Air temperature
dataset: NMC Reanalysis
level_desc: Surface
statistic: Individual Obs
parent_stat: Other
actual_range: [185.16 322.1 ]
standard_name: air_temperatureExpanding to multiple dimensions is also supported
# takes the mean along ["x1", "x2"]
multiple.cf.mean("X")
<xarray.Dataset> Size: 400B
Dimensions: (y1: 20, y2: 5)
Coordinates:
* y1 (y1) int64 160B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
* y2 (y2) int64 40B 0 1 2 3 4
Data variables:
v1 (y1) float64 160B 15.0 15.0 15.0 15.0 15.0 ... 15.0 15.0 15.0 15.0
v2 (y2) float64 40B 15.0 15.0 15.0 15.0 15.0Plotting¶
ds.air.cf.isel(time=1).cf.plot(x="X", y="Y")
<matplotlib.collections.QuadMesh at 0x7616205a4b50>
ds.air.cf.isel(T=1, Y=[0, 1, 2]).cf.plot(x="longitude", hue="latitude")
[<matplotlib.lines.Line2D at 0x76162067ca50>,
<matplotlib.lines.Line2D at 0x7616204a16d0>,
<matplotlib.lines.Line2D at 0x7616204a1a10>]
cf_xarray can facet
seasonal = (
ds.air.groupby("time.season").mean().reindex(season=["DJF", "MAM", "JJA", "SON"])
)
seasonal.cf.plot(x="longitude", y="latitude", col="season")
<xarray.plot.facetgrid.FacetGrid at 0x76162063c050>
Resample & groupby¶
ds.cf.resample(T="D").mean()
<xarray.Dataset> Size: 8MB
Dimensions: (time: 730, lat: 25, lon: 53)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
* time (time) datetime64[ns] 6kB 2013-01-01 2013-01-02 ... 2014-12-31
Data variables:
air (time, lat, lon) float64 8MB 241.9 242.3 242.7 ... 295.9 295.5
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...cf_xarray also understands the “datetime accessor” syntax for groupby
ds.cf.groupby("T.month").mean("longitude")
<xarray.Dataset> Size: 607kB
Dimensions: (time: 2920, lat: 25)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat) float64 584kB 242.0 242.0 243.7 ... 297.0 297.9 298.8
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...Rolling & coarsen¶
ds.cf.rolling(X=5).mean()
<xarray.Dataset> Size: 31MB
Dimensions: (lat: 25, lon: 53, time: 2920)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat, lon) float64 31MB nan nan nan nan ... 297.6 297.0 296.6
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...coarsen works but everything later will break because of xarray bug
https://github.com/pydata/xarray/issues/4120
ds.isel(lon=slice(50)).cf.coarsen(Y=5, X=10).mean()
Feature: mix “special names” and variable names¶
ds.cf.groupby("T.month").mean(["lat", "X"])
<xarray.Dataset> Size: 47kB
Dimensions: (time: 2920)
Coordinates:
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time) float64 23kB 274.2 273.5 273.2 273.6 ... 273.0 273.0 273.4
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...Feature: Weight by Cell Measures¶
cf_xarray can weight by cell measure variables if the appropriate attribute is
set
# Lets make some weights (not sure if this is right)
ds.coords["cell_area"] = (
np.cos(ds.air.cf["latitude"] * np.pi / 180)
* xr.ones_like(ds.air.cf["longitude"])
* 105e3
* 110e3
)
# and set proper attributes
ds["cell_area"].attrs = dict(standard_name="cell_area", units="m2")
ds.air.attrs["cell_measures"] = "area: cell_area"
ds.air.cf.weighted("area").mean(["latitude", "time"]).cf.plot(x="longitude")
ds.air.mean(["lat", "time"]).cf.plot(x="longitude")
[<matplotlib.lines.Line2D at 0x76161c6c1a10>]
Feature: Cell boundaries and vertices¶
cf_xarray can infer cell boundaries (for rectilinear grids) and convert
CF-standard bounds variables to vertices.
ds_bnds = ds.cf.add_bounds(["lat", "lon"])
ds_bnds
<xarray.Dataset> Size: 31MB
Dimensions: (time: 2920, lat: 25, lon: 53, bounds: 2)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
cell_area (lat, lon) float32 5kB 2.989e+09 2.989e+09 ... 1.116e+10
lat_bounds (lat, bounds) float32 200B 76.25 73.75 73.75 ... 16.25 13.75
lon_bounds (lon, bounds) float32 424B 198.8 201.2 201.2 ... 328.8 331.2
Dimensions without coordinates: bounds
Data variables:
air (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...We can also convert each bounds variable independently with the top-level functions
lat_bounds = ds_bnds.cf.get_bounds("latitude")
lat_vertices = cfxr.bounds_to_vertices(lat_bounds, bounds_dim="bounds")
lat_vertices
<xarray.DataArray 'lat_bounds' (lat_vertices: 26)> Size: 104B 76.25 73.75 71.25 68.75 66.25 63.75 ... 26.25 23.75 21.25 18.75 16.25 13.75 Dimensions without coordinates: lat_vertices
# Or we can convert _all_ bounds variables on a dataset
ds_crns = ds_bnds.cf.bounds_to_vertices()
ds_crns
<xarray.Dataset> Size: 31MB
Dimensions: (time: 2920, lat: 25, lon: 53, bounds: 2, lat_vertices: 26,
lon_vertices: 54)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
cell_area (lat, lon) float32 5kB 2.989e+09 2.989e+09 ... 1.116e+10
lat_bounds (lat, bounds) float32 200B 76.25 73.75 73.75 ... 16.25 13.75
lon_bounds (lon, bounds) float32 424B 198.8 201.2 201.2 ... 328.8 331.2
* lat_vertices (lat_vertices) float32 104B 76.25 73.75 71.25 ... 16.25 13.75
* lon_vertices (lon_vertices) float32 216B 198.8 201.2 203.8 ... 328.8 331.2
Dimensions without coordinates: bounds
Data variables:
air (time, lat, lon) float64 31MB 241.2 242.5 ... 296.2 295.7
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...Feature: Add canonical CF attributes¶
cf_xarray can add missing canonical CF attributes consistent with the official
CF standard name table.
ds_canonical = ds.cf.add_canonical_attributes(verbose=True)
ds_canonical
CF Standard Name Table info:
- version_number: 92
- conventions: CF-StandardNameTable-92
- first_published: 2025-07-24T14:20:46Z
- last_modified: 2025-07-24T14:20:46Z
- institution: Centre for Environmental Data Analysis
- contact: support@ceda.ac.uk
Attributes added:
- air:
* description: Air temperature is the bulk temperature of the air, not the surface (skin) temperature. It is strongly recommended that a variable with this standard name should have a units_metadata attribute, with one of the values "on-scale" or "difference", whichever is appropriate for the data, because it is essential to know whether the temperature is on-scale (meaning relative to the origin of the scale indicated by the units) or refers to temperature differences (implying that the origin of the temperature scale is irrevelant), in order to convert the units correctly (cf. https://cfconventions.org/cf-conventions/cf-conventions.html#temperature-units).
- lat:
* description: Latitude is positive northward; its units of degree_north (or equivalent) indicate this explicitly. In a latitude-longitude system defined with respect to a rotated North Pole, the standard name of grid_latitude should be used instead of latitude. Grid latitude is positive in the grid-northward direction, but its units should be plain degree.
- lon:
* description: Longitude is positive eastward; its units of degree_east (or equivalent) indicate this explicitly. In a latitude-longitude system defined with respect to a rotated North Pole, the standard name of grid_longitude should be used instead of longitude. Grid longitude is positive in the grid-eastward direction, but its units should be plain degree.
- cell_area:
* description: "Cell_area" is the horizontal area of a gridcell.
<xarray.Dataset> Size: 31MB
Dimensions: (time: 2920, lat: 25, lon: 53)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
cell_area (lat, lon) float32 5kB 2.989e+09 2.989e+09 ... 1.116e+10
Data variables:
air (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
history: Thu Dec 11 19:21:58 2025: cf.add_canonical_attributes(overr...