
Commit ac3b80c

eshort0401 and dcherian authored
Slightly Amend Zarr Encoding Specification Doc #8749 (#11013)
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
1 parent e095d9f commit ac3b80c

3 files changed

Lines changed: 13 additions & 11 deletions


doc/internals/zarr-encoding-spec.rst

Lines changed: 8 additions & 11 deletions
@@ -43,9 +43,9 @@ When accessing arrays with zarr-python, this information is available in the arr
 metadata but not in the attributes dictionary.
 
 When reading a Zarr group, Xarray looks for dimension information in the appropriate
-location based on the format version, raising an error if it can't be found. The
+location based on the inferred format version, raising an error if it can't be found. The
 dimension information is used to define the variable dimension names and then
-(for Zarr V2) removed from the attributes dictionary returned to the user.
+(for Zarr V2) is removed from the attributes dictionary returned to the user.
 
 CF Conventions
 --------------
@@ -59,17 +59,14 @@ used to describe metadata in NetCDF and Zarr.
 Compatibility and Reading
 -------------------------
 
-Because of these encoding choices, Xarray cannot read arbitrary Zarr arrays, but only
-Zarr data with valid dimension metadata. Xarray supports:
+Because of these encoding choices, Xarray cannot read arbitrary Zarr groups, but only
+Zarr groups containing arrays with valid dimension metadata. Xarray supports:
 
-- Zarr V2 arrays with ``_ARRAY_DIMENSIONS`` attributes
-- Zarr V3 arrays with ``dimension_names`` metadata
-- `NCZarr <https://docs.unidata.ucar.edu/nug/current/nczarr_head.html>`_ format
-  (dimension names are defined in the ``.zarray`` file)
+1. Zarr V3 arrays with ``dimension_names`` metadata
+2. Zarr V2 arrays with ``_ARRAY_DIMENSIONS`` attributes
+3. `NCZarr <https://docs.unidata.ucar.edu/nug/current/nczarr_head.html>`_ format (dimension names are defined in the ``dimrefs`` field in the custom ``.zarray`` file)
 
-After decoding the dimension information and assigning the variable dimensions,
-Xarray proceeds to [optionally] decode each variable using its standard CF decoding
-machinery used for NetCDF data.
+Xarray checks each of these three conventions, in the order given above, when looking for dimension name metadata. Note that while Xarray can read NCZarr groups, it currently does not write NCZarr groups. After decoding the dimension information and assigning the variable dimensions, Xarray proceeds to [optionally] decode each variable using its standard CF decoding machinery used for NetCDF data.
 
 Finally, it's worth noting that Xarray writes (and attempts to read)
 "consolidated metadata" by default (the ``.zmetadata`` file), which is another

doc/whats-new.rst

Lines changed: 2 additions & 0 deletions
@@ -26,6 +26,8 @@ Deprecations
 Bug Fixes
 ~~~~~~~~~
 
+- Slightly amend `Xarray's Zarr Encoding Specification doc <https://docs.xarray.dev/en/latest/internals/zarr-encoding-spec.html>`_ for clarity, and provide a code comment in ``xarray.backends.zarr._get_zarr_dims_and_attrs`` referencing the doc (:issue:`8749` :pull:`11013`).
+  By `Ewan Short <https://github.com/eshort0401>`_.
 - Fix silent data corruption when writing dask arrays to sharded Zarr stores.
   Dask chunk boundaries must now align with shard boundaries, not just internal
   Zarr chunk boundaries (:issue:`10831`).

xarray/backends/zarr.py

Lines changed: 3 additions & 0 deletions
@@ -360,6 +360,9 @@ def _determine_zarr_chunks(enc_chunks, var_chunks, ndim, name):
 
 
 def _get_zarr_dims_and_attrs(zarr_obj, dimension_key, try_nczarr):
+    # Check for attributes and dimension name metadata as discussed in the Zarr encoding
+    # specification https://docs.xarray.dev/en/stable/internals/zarr-encoding-spec.html
+
     # Zarr V3 explicitly stores the dimension names in the metadata
     try:
         # if this exists, we are looking at a Zarr V3 array
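
For readers who want the lookup order from the amended spec in one place, here is a minimal sketch that works on plain dictionaries instead of real Zarr objects. It is not the body of ``_get_zarr_dims_and_attrs``. The function name ``find_dimension_names``, its parameters, and the idea of passing the NCZarr ``dimrefs`` values in as a ready-made list are all assumptions made for illustration. Only the three conventions, their order, the removal of ``_ARRAY_DIMENSIONS`` from the returned attributes, and the error on failure come from the doc text in this commit.

```python
from typing import Any, Mapping, Optional, Sequence


def find_dimension_names(
    array_metadata: Mapping[str, Any],
    attributes: Mapping[str, Any],
    nczarr_dimrefs: Optional[Sequence[str]] = None,
) -> tuple[tuple[str, ...], dict[str, Any]]:
    """Return (dimension names, user-facing attributes).

    Hypothetical helper: the conventions are tried in the order the
    spec lists them (Zarr V3, then Zarr V2, then NCZarr).
    """
    # 1. Zarr V3: dimension names live in the array metadata,
    #    not in the attributes dictionary.
    names = array_metadata.get("dimension_names")
    if names is not None:
        return tuple(names), dict(attributes)

    # 2. Zarr V2: dimension names live in a special attribute, which is
    #    removed from the attributes returned to the user.
    if "_ARRAY_DIMENSIONS" in attributes:
        remaining = {
            key: value
            for key, value in attributes.items()
            if key != "_ARRAY_DIMENSIONS"
        }
        return tuple(attributes["_ARRAY_DIMENSIONS"]), remaining

    # 3. NCZarr: dimension names come from the ``dimrefs`` field of the
    #    custom ``.zarray`` file (assumed here to be parsed by the caller).
    if nczarr_dimrefs is not None:
        return tuple(nczarr_dimrefs), dict(attributes)

    # No convention matched: the spec says an error is raised.
    raise KeyError("no dimension name metadata found for this array")
```

Calling ``find_dimension_names({}, {"_ARRAY_DIMENSIONS": ["time"], "units": "K"})`` takes the Zarr V2 branch, so the returned attributes contain only ``units``. When both V3 metadata and a V2 attribute are present, the V3 branch wins because it is checked first.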
