From 32a4663a17c7c59dbebb5c8b066067b49e60813c Mon Sep 17 00:00:00 2001
From: BrianMichell
Date: Mon, 16 Mar 2026 19:44:58 +0000
Subject: [PATCH] Add documentation for open_as_void

---
 docs/index.rst                     |   6 +
 docs/open_as_void.rst              | 220 +++++++++++++++++++++++++++++
 tensorstore/driver/zarr/index.rst  |   7 +
 tensorstore/driver/zarr3/index.rst |   7 +
 4 files changed, 240 insertions(+)
 create mode 100644 docs/open_as_void.rst

diff --git a/docs/index.rst b/docs/index.rst
index 404ec7582..21e388d43 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -36,6 +36,12 @@ TensorStore
    driver/index
    kvstore/index
 
+.. toctree::
+   :hidden:
+   :caption: Advanced Features
+
+   open_as_void
+
 TensorStore is a library for efficiently reading and writing large multi-dimensional arrays.
 
 Highlights
diff --git a/docs/open_as_void.rst b/docs/open_as_void.rst
new file mode 100644
index 000000000..e695fc1f4
--- /dev/null
+++ b/docs/open_as_void.rst
@@ -0,0 +1,220 @@
+.. _open-as-void:
+
+Raw Byte Access (``open_as_void``)
+==================================
+
+The ``open_as_void`` option provides raw byte-level access to zarr arrays with
+structured data types, bypassing the normal field interpretation. This feature
+is available for both the :ref:`driver/zarr2` and :ref:`driver/zarr3` drivers.
+
+Supported Data Types
+--------------------
+
+The ``open_as_void`` option is only valid for structured data types:
+
+- **Zarr v2**: ``structured`` dtype (NumPy-style structured arrays)
+- **Zarr v3**: ``struct`` and ``structured`` dtypes
+
+Attempting to use ``open_as_void`` with non-structured data types will result
+in an error.
+
+Purpose
+-------
+
+When opening an array with :json:`"open_as_void": true`, TensorStore exposes
+the underlying byte representation of the array data rather than interpreting
+it according to the stored field structure.
+
+Behavior
+--------
+
+When ``open_as_void`` is enabled:
+
+1. **Data type becomes byte**: The resulting TensorStore has dtype
+   :json:schema:`~dtype.byte` regardless of the original structured data type.
+
+2. **Additional dimension added**: A new innermost dimension is appended to
+   represent the byte layout of each element. The size of this dimension
+   equals the number of bytes per element in the original structured type.
+
+3. **Codecs are preserved**: All encoding/decoding (including compression)
+   is still applied. The raw bytes exposed are the *decoded* element bytes,
+   not the raw compressed chunk data.
+
+Dimension Transformation
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+For an array with shape ``[D0, D1, ..., Dn]`` and a structured data type of
+size ``B`` bytes per element, opening with ``open_as_void`` produces a
+TensorStore with:
+
+- Shape: ``[D0, D1, ..., Dn, B]``
+- Rank: original rank + 1
+- Data type: ``byte``
+
+.. admonition:: Example: Zarr v2 structured dtype
+   :class: example
+
+   A zarr v2 array with structured dtype ``[("x", "|u1"), ("y", "` via the
+:json:schema:`~driver/zarr2.open_as_void` option, which exposes array data as
+raw bytes instead of interpreting it according to the data type.
+
 Compressors
 -----------
 
diff --git a/tensorstore/driver/zarr3/index.rst b/tensorstore/driver/zarr3/index.rst
index 0acfc92b1..d04b0092f 100644
--- a/tensorstore/driver/zarr3/index.rst
+++ b/tensorstore/driver/zarr3/index.rst
@@ -15,6 +15,13 @@ creating new arrays, and resizing arrays.
 
 .. json:schema:: driver/zarr3/Metadata
 
+Raw Byte Access
+---------------
+
+The zarr3 driver supports :ref:`raw byte access <open-as-void>` via the
+:json:schema:`~driver/zarr3.open_as_void` option, which exposes array data as
+raw bytes instead of interpreting it according to the data type.
+
 Codecs
 ------
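
Illustration of the "Dimension Transformation" rule from ``docs/open_as_void.rst`` in this patch: a minimal NumPy-only sketch of the shape change ``[D0, ..., Dn]`` to ``[D0, ..., Dn, B]``. It does not call TensorStore. The packed dtype ``[("x", "|u1"), ("y", "<i4")]``, the ``(100, 100)`` shape, and the helper name ``as_void_bytes`` are assumptions made for the example, and ``np.uint8`` stands in for the ``byte`` data type.

```python
import numpy as np

# Hypothetical packed structured dtype (an assumption for this sketch):
# one unsigned byte followed by a little-endian int32, so B = 1 + 4 = 5.
dtype = np.dtype([("x", "|u1"), ("y", "<i4")])

# Hypothetical array of shape [D0, D1] = [100, 100].
arr = np.zeros((100, 100), dtype=dtype)
arr["x"][0, 0] = 7
arr["y"][0, 0] = 1


def as_void_bytes(a):
    """Reinterpret a structured array as bytes with one extra innermost axis."""
    a = np.ascontiguousarray(a)
    # Viewing as uint8 multiplies the last axis by the item size; the reshape
    # then splits that axis into (last_dim, bytes_per_element).
    return a.view(np.uint8).reshape(a.shape + (a.dtype.itemsize,))


raw = as_void_bytes(arr)
print(raw.shape)   # (100, 100, 5): original rank + 1, innermost size B
print(raw[0, 0])   # bytes of "x" followed by the little-endian bytes of "y"
```

The innermost axis here has size ``dtype.itemsize``, which matches the documented "number of bytes per element" only under the assumption that the structured type is packed without padding.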