Creating and saving objects with units
Attaching units
Usually, when loading data from disk we get a Dataset
or
DataArray
with units in attributes:
In [1]: ds = xr.Dataset(
...: {
...: "a": (("lon", "lat"), [[11.84, 3.12, 9.7], [7.8, 9.3, 14.72]]),
...: "b": (("lon", "lat"), [[13, 2, 7], [5, 4, 9]], {"units": "m"}),
...: },
...: coords={"lat": [10, 20, 30], "lon": [74, 76]},
...: )
...: ds
...:
Out[1]:
<xarray.Dataset>
Dimensions: (lon: 2, lat: 3)
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
Data variables:
a (lon, lat) float64 11.84 3.12 9.7 7.8 9.3 14.72
b (lon, lat) int64 13 2 7 5 4 9
In [2]: da = ds.b
...: da
...:
Out[2]:
<xarray.DataArray 'b' (lon: 2, lat: 3)>
array([[13, 2, 7],
[ 5, 4, 9]])
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
Attributes:
units: m
In order to get pint.Quantity
instances, we can use the
Dataset.pint.quantify()
or DataArray.pint.quantify()
methods:
In [3]: ds.pint.quantify()
Out[3]:
<xarray.Dataset>
Dimensions: (lon: 2, lat: 3)
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
Data variables:
a (lon, lat) float64 11.84 3.12 9.7 7.8 9.3 14.72
b (lon, lat) int64 [m] 13 2 7 5 4 9
We can also override the units of a variable:
In [4]: ds.pint.quantify(b="km")
Out[4]:
<xarray.Dataset>
Dimensions: (lon: 2, lat: 3)
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
Data variables:
a (lon, lat) float64 11.84 3.12 9.7 7.8 9.3 14.72
b (lon, lat) int64 [km] 13 2 7 5 4 9
In [5]: da.pint.quantify("degree")
Out[5]:
<xarray.DataArray 'b' (lon: 2, lat: 3)>
<Quantity([[13 2 7]
[ 5 4 9]], 'degree')>
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
Overriding works even if there is no units
attribute, so we could use this
to attach units to a normal Dataset
:
In [6]: temporary_ds = xr.Dataset({"a": ("x", [0, 5, 10])}, coords={"x": [1, 2, 3]})
...: temporary_ds.pint.quantify({"a": "m"})
...:
Out[6]:
<xarray.Dataset>
Dimensions: (x: 3)
Coordinates:
* x (x) int64 1 2 3
Data variables:
a (x) int64 [m] 0 5 10
Of course, we could use pint.Unit
instances instead of strings to
specify units, too.
Note
Unit objects tied to different registries cannot interact with each
other. In order to avoid this, DataArray.pint.quantify()
and
Dataset.pint.quantify()
will make sure only a single registry is
used per xarray
object.
If we wanted to change the units of the data of a DataArray
, we
could do so using the DataArray.name
attribute:
In [7]: da.pint.quantify({da.name: "J", "lat": "degree", "lon": "degree"})
Out[7]:
<xarray.DataArray 'b' (lon: 2, lat: 3)>
<Quantity([[13 2 7]
[ 5 4 9]], 'joule')>
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
However, xarray currently doesn’t support units in indexes, so the new units were set
as attributes. To really observe the changes the quantify
methods make, we
have to first swap the dimensions:
In [8]: ds_with_units = ds.swap_dims({"lon": "x", "lat": "y"}).pint.quantify(
...: {"lat": "degree", "lon": "degree"}
...: )
...: ds_with_units
...:
Out[8]:
<xarray.Dataset>
Dimensions: (x: 2, y: 3)
Coordinates:
lat (y) int64 [deg] 10 20 30
lon (x) int64 [deg] 74 76
Dimensions without coordinates: x, y
Data variables:
a (x, y) float64 11.84 3.12 9.7 7.8 9.3 14.72
b (x, y) int64 [m] 13 2 7 5 4 9
In [9]: da_with_units = da.swap_dims({"lon": "x", "lat": "y"}).pint.quantify(
...: {"lat": "degree", "lon": "degree"}
...: )
...: da_with_units
...:
Out[9]:
<xarray.DataArray 'b' (x: 2, y: 3)>
<Quantity([[13 2 7]
[ 5 4 9]], 'meter')>
Coordinates:
lat (y) int64 [deg] 10 20 30
lon (x) int64 [deg] 74 76
Dimensions without coordinates: x, y
By default, Dataset.pint.quantify()
and
DataArray.pint.quantify()
will use the unit registry at
pint_xarray.unit_registry
(the
application registry
). If we want a
different registry, we can either pass it as the unit_registry
parameter:
In [10]: ureg = pint.UnitRegistry(force_ndarray_like=True)
In [11]: da.pint.quantify("degree", unit_registry=ureg)
Out[11]:
<xarray.DataArray 'b' (lon: 2, lat: 3)>
<Quantity([[13 2 7]
[ 5 4 9]], 'degree')>
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
or overwrite the default registry:
In [12]: pint_xarray.unit_registry = ureg
In [13]: da.pint.quantify("degree")
Out[13]:
<xarray.DataArray 'b' (lon: 2, lat: 3)>
<Quantity([[13 2 7]
[ 5 4 9]], 'degree')>
Coordinates:
* lat (lat) int64 10 20 30
* lon (lon) int64 74 76
Note
To properly work with
xarray
, theforce_ndarray_like
orforce_ndarray
options have to be enabled on the custom registry.
Without it, python scalars wrapped by pint.Quantity
may raise errors or
have their units stripped.
Saving with units
In order to not lose the units when saving to disk, we first have to call the
Dataset.pint.dequantify()
and DataArray.pint.dequantify()
methods:
In [14]: ds_with_units.pint.dequantify()
Out[14]:
<xarray.Dataset>
Dimensions: (x: 2, y: 3)
Coordinates:
lat (y) int64 10 20 30
lon (x) int64 74 76
Dimensions without coordinates: x, y
Data variables:
a (x, y) float64 11.84 3.12 9.7 7.8 9.3 14.72
b (x, y) int64 13 2 7 5 4 9
In [15]: da_with_units.pint.dequantify()
Out[15]:
<xarray.DataArray 'b' (x: 2, y: 3)>
array([[13, 2, 7],
[ 5, 4, 9]])
Coordinates:
lat (y) int64 10 20 30
lon (x) int64 74 76
Dimensions without coordinates: x, y
Attributes:
units: meter
This will get the string representation of a pint.Unit
instance and
attach it as a units
attribute. The data of the variable will now be
whatever pint wrapped.