Creating and saving objects with units

Attaching units

Usually, when loading data from disk we get a Dataset or DataArray with units in attributes:

In [1]: ds = xr.Dataset(
   ...:     {
   ...:         "a": (("lon", "lat"), [[11.84, 3.12, 9.7], [7.8, 9.3, 14.72]]),
   ...:         "b": (("lon", "lat"), [[13, 2, 7], [5, 4, 9]], {"units": "m"}),
   ...:     },
   ...:     coords={"lat": [10, 20, 30], "lon": [74, 76]},
   ...: )
   ...: ds
   ...: 
Out[1]: 
<xarray.Dataset> Size: 136B
Dimensions:  (lon: 2, lat: 3)
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76
Data variables:
    a        (lon, lat) float64 48B 11.84 3.12 9.7 7.8 9.3 14.72
    b        (lon, lat) int64 48B 13 2 7 5 4 9

In [2]: da = ds.b
   ...: da
   ...: 
Out[2]: 
<xarray.DataArray 'b' (lon: 2, lat: 3)> Size: 48B
array([[13,  2,  7],
       [ 5,  4,  9]])
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76
Attributes:
    units:    m

In order to get pint.Quantity instances, we can use the Dataset.pint.quantify() or DataArray.pint.quantify() methods:

In [3]: ds.pint.quantify()
Out[3]: 
<xarray.Dataset> Size: 136B
Dimensions:  (lon: 2, lat: 3)
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76
Data variables:
    a        (lon, lat) float64 48B 11.84 3.12 9.7 7.8 9.3 14.72
    b        (lon, lat) int64 48B [m] 13 2 7 5 4 9

We can also override the units of a variable:

In [4]: ds.pint.quantify(b="km")
Out[4]: 
<xarray.Dataset> Size: 136B
Dimensions:  (lon: 2, lat: 3)
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76
Data variables:
    a        (lon, lat) float64 48B 11.84 3.12 9.7 7.8 9.3 14.72
    b        (lon, lat) int64 48B [km] 13 2 7 5 4 9

In [5]: da.pint.quantify("degree")
Out[5]: 
<xarray.DataArray 'b' (lon: 2, lat: 3)> Size: 48B
<Quantity([[13  2  7]
 [ 5  4  9]], 'degree')>
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76

Overriding works even if there is no units attribute, so we could use this to attach units to a normal Dataset:

In [6]: temporary_ds = xr.Dataset({"a": ("x", [0, 5, 10])}, coords={"x": [1, 2, 3]})
   ...: temporary_ds.pint.quantify({"a": "m"})
   ...: 
Out[6]: 
<xarray.Dataset> Size: 48B
Dimensions:  (x: 3)
Coordinates:
  * x        (x) int64 24B 1 2 3
Data variables:
    a        (x) int64 24B [m] 0 5 10

Of course, we could use pint.Unit instances instead of strings to specify units, too.

Note

Unit objects tied to different registries cannot interact with each other. In order to avoid this, DataArray.pint.quantify() and Dataset.pint.quantify() will make sure only a single registry is used per xarray object.

If we wanted to change the units of the data of a DataArray, we could do so using the DataArray.name attribute:

In [7]: da.pint.quantify({da.name: "J", "lat": "degree", "lon": "degree"})
Out[7]: 
<xarray.DataArray 'b' (lon: 2, lat: 3)> Size: 48B
<Quantity([[13  2  7]
 [ 5  4  9]], 'joule')>
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76

However, xarray currently doesn’t support units in indexes, so the new units were set as attributes. To really observe the changes the quantify methods make, we have to first swap the dimensions:

In [8]: ds_with_units = ds.swap_dims({"lon": "x", "lat": "y"}).pint.quantify(
   ...:     {"lat": "degree", "lon": "degree"}
   ...: )
   ...: ds_with_units
   ...: 
Out[8]: 
<xarray.Dataset> Size: 136B
Dimensions:  (x: 2, y: 3)
Coordinates:
    lat      (y) int64 24B [deg] 10 20 30
    lon      (x) int64 16B [deg] 74 76
Dimensions without coordinates: x, y
Data variables:
    a        (x, y) float64 48B 11.84 3.12 9.7 7.8 9.3 14.72
    b        (x, y) int64 48B [m] 13 2 7 5 4 9

In [9]: da_with_units = da.swap_dims({"lon": "x", "lat": "y"}).pint.quantify(
   ...:     {"lat": "degree", "lon": "degree"}
   ...: )
   ...: da_with_units
   ...: 
Out[9]: 
<xarray.DataArray 'b' (x: 2, y: 3)> Size: 48B
<Quantity([[13  2  7]
 [ 5  4  9]], 'meter')>
Coordinates:
    lat      (y) int64 24B [deg] 10 20 30
    lon      (x) int64 16B [deg] 74 76
Dimensions without coordinates: x, y

By default, Dataset.pint.quantify() and DataArray.pint.quantify() will use the unit registry at pint_xarray.unit_registry (the application registry). If we want a different registry, we can either pass it as the unit_registry parameter:

In [10]: ureg = pint.UnitRegistry(force_ndarray_like=True)

In [11]: da.pint.quantify("degree", unit_registry=ureg)
Out[11]: 
<xarray.DataArray 'b' (lon: 2, lat: 3)> Size: 48B
<Quantity([[13  2  7]
 [ 5  4  9]], 'degree')>
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76

or overwrite the default registry:

In [12]: pint_xarray.unit_registry = ureg

In [13]: da.pint.quantify("degree")
Out[13]: 
<xarray.DataArray 'b' (lon: 2, lat: 3)> Size: 48B
<Quantity([[13  2  7]
 [ 5  4  9]], 'degree')>
Coordinates:
  * lat      (lat) int64 24B 10 20 30
  * lon      (lon) int64 16B 74 76

Note

To properly work with xarray, the force_ndarray_like or force_ndarray options have to be enabled on the custom registry.

Without it, python scalars wrapped by pint.Quantity may raise errors or have their units stripped.

Saving with units

In order to not lose the units when saving to disk, we first have to call the Dataset.pint.dequantify() and DataArray.pint.dequantify() methods:

In [14]: ds_with_units.pint.dequantify()
Out[14]: 
<xarray.Dataset> Size: 136B
Dimensions:  (x: 2, y: 3)
Coordinates:
    lat      (y) int64 24B 10 20 30
    lon      (x) int64 16B 74 76
Dimensions without coordinates: x, y
Data variables:
    a        (x, y) float64 48B 11.84 3.12 9.7 7.8 9.3 14.72
    b        (x, y) int64 48B 13 2 7 5 4 9

In [15]: da_with_units.pint.dequantify()
Out[15]: 
<xarray.DataArray 'b' (x: 2, y: 3)> Size: 48B
array([[13,  2,  7],
       [ 5,  4,  9]])
Coordinates:
    lat      (y) int64 24B 10 20 30
    lon      (x) int64 16B 74 76
Dimensions without coordinates: x, y
Attributes:
    units:    meter

This will get the string representation of a pint.Unit instance and attach it as a units attribute. The data of the variable will now be whatever pint wrapped.