Chunk-Based Analysis to Reduce Memory Usage, using XArray, HDF5, and Dask

Author
Dr. Nicholas Del Grosso

Scientific datasets are increasingly too large to load into memory all at once. High-resolution imaging, large simulation outputs, and long experimental recordings can easily exceed the RAM available on a typical workstation. If analysis requires loading the entire dataset before computation begins, many otherwise simple operations become impossible.

This notebook introduces a different approach: chunk-based analysis. Instead of loading an entire dataset into memory, data can be processed in smaller pieces. Modern scientific Python tools, particularly XArray, NetCDF/HDF5, and Dask, make this possible while keeping the code readable and close to the mathematical operations scientists want to perform. The goal is to understand how these tools cooperate to reduce memory pressure and enable scalable analysis pipelines.

The notebook proceeds in stages. First, you will learn the basic structure of XArray objects, which add labeled dimensions and metadata to NumPy-style arrays. Next, you will see how these arrays can be saved to NetCDF/HDF5 files, enabling efficient on-disk storage with compression and encoding options. Finally, you will explore how lazy loading and chunked computation allow large analyses to run without loading the entire dataset into memory, and how Dask coordinates the computation behind the scenes.
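
Before bringing in any of these tools, the core idea can be sketched with plain NumPy. The snippet below is only an illustration, not the method used later in the notebook: the array `data`, the `chunk_size` value, and the running-sum variables are all hypothetical names chosen for this example, and the array is kept small so the cell runs quickly.

```python
import numpy as np

# Hypothetical stand-in for a dataset that would normally live on disk
rng = np.random.default_rng(0)
data = rng.random(1_000_000)

chunk_size = 100_000  # elements handled per step (arbitrary choice for this sketch)
running_sum = 0.0
running_count = 0

for start in range(0, data.size, chunk_size):
    # Only this slice has to be held in memory at any one time
    chunk = data[start:start + chunk_size]
    running_sum += chunk.sum()
    running_count += chunk.size

chunked_mean = running_sum / running_count

# The chunked result agrees with the all-at-once result up to floating-point rounding
print(np.isclose(chunked_mean, data.mean()))
```

A mean is easy to compute this way because it only needs a running sum and a count. Part of what Dask offers, as later sections discuss, is handling this kind of bookkeeping for many operations automatically.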

Setup

Import Packages

import dask.distributed
import netCDF4
import numpy as np
import xarray as xr

Utility Functions

from contextlib import contextmanager
import os
import sys

import matplotlib.pyplot as plt
from memory_profiler import memory_usage
import pandas as pd


def _generate_calcium_data_file(fname="calcium_imaging_session.nc", nx=256, ny=256, nt=4000, n_cells=12, baseline=0.15, noise_sd=0.05) -> None:
    import numpy as np
    import xarray as xr
    
    time = np.linspace(2, 20, nt)
    
    rng = np.random.default_rng(42)
    
    # coordinate grids
    x = np.arange(nx)
    y = np.arange(ny)
    X, Y = np.meshgrid(x, y, indexing="ij")
    
    # ----------------------------
    # Background signal
    # ----------------------------
    # Start with baseline + pixel noise
    data = baseline + rng.normal(0, noise_sd, size=(nx, ny, nt))
    
    # Add slow global drift over time
    slow_drift = (
        0.03 * np.sin(2 * np.pi * time / 9.0)
        + 0.02 * np.sin(2 * np.pi * time / 3.7 + 1.2)
    )
    data += slow_drift[None, None, :]
    
    # Add a weak spatial background gradient
    spatial_bg = 0.03 * ((X / nx) + (Y / ny))
    data += spatial_bg[:, :, None]
    
    # ----------------------------
    # Shared neuropil-like signal
    # ----------------------------
    shared_events = rng.poisson(0.003, size=nt).astype(float)
    
    kernel_len = 120
    tau = 18
    kernel = np.exp(-np.arange(kernel_len) / tau)
    shared_trace = np.convolve(shared_events, kernel, mode="full")[:nt]
    shared_trace /= shared_trace.max() + 1e-12
    shared_trace *= 0.05
    
    data += shared_trace[None, None, :]
    
    # Add synthetic cells
    cell_masks = []
    cell_traces = []
    
    for i in range(n_cells):
        # random cell center, avoiding borders
        cx = rng.integers(20, nx - 20)
        cy = rng.integers(20, ny - 20)
    
        # random elliptical Gaussian footprint
        sx = rng.uniform(3, 8)
        sy = rng.uniform(3, 8)
        amp = rng.uniform(0.2, 0.8)
    
        footprint = np.exp(-(((X - cx) ** 2) / (2 * sx**2) + ((Y - cy) ** 2) / (2 * sy**2)))
        footprint *= amp
    
        # sparse spike/event train
        event_rate = rng.uniform(0.002, 0.01)
        spikes = rng.poisson(event_rate, size=nt).astype(float)
    
        # calcium decay kernel
        tau_decay = rng.uniform(8, 30)
        kernel_len = 200
        decay_kernel = np.exp(-np.arange(kernel_len) / tau_decay)
    
        trace = np.convolve(spikes, decay_kernel, mode="full")[:nt]
    
        # normalize and scale
        if trace.max() > 0:
            trace = trace / trace.max()
        trace *= rng.uniform(0.3, 1.2)
    
        # add a tiny bit of within-cell temporal noise
        trace += rng.normal(0, 0.01, size=nt)
        trace = np.clip(trace, 0, None)
    
        # add cell contribution to movie
        data += footprint[:, :, None] * trace[None, None, :]
    
        cell_masks.append(footprint)
        cell_traces.append(trace)
    
    
    # Clamp to nonnegative values
    data = np.clip(data, 0, None)
    
    # Build xarray object
    movie = xr.DataArray(
        data,
        name="image",
        dims=["x", "y", "time"],
        coords={
            "x": x,
            "y": y,
            "time": time,
        },
        attrs={
            "description": "Synthetic calcium imaging session with noisy background, drift, shared activity, and localized active cells"
        },
    )
    movie.to_netcdf(fname)



def _format_duration(seconds: float, precision: int = 1) -> str:
    """
    Takes a time in seconds and returns a more human-readable string (e.g. "1.5 ms").

    Looking to do this in a real project?  Some alternatives:
      - `humanize`: https://humanize.readthedocs.io/en/latest/
    """


    if seconds < 0:
        raise ValueError("Duration must be non-negative")

    units = [("s", 1), ("ms", 1e-3), ("µs", 1e-6)]

    for unit, scale in units:
        if seconds >= scale:
            value = seconds / scale
            return f"{value:.{precision}f} {unit}"
    else:
        return f"{seconds / 1e-9:.{precision}f} ns"



@contextmanager
def _trace_lines_of(fun):
    "A (very) basic line tracer. Collects (lineno, timestamp) for each executed line inside a function."
    import time
    try:
        target_code = fun.__code__
    except AttributeError:
        yield []
        return
    target_frame = None
    records = []

    def tracer(frame, event, arg):
        nonlocal target_frame

        if event == "call" and frame.f_code is target_code:
            target_frame = frame
            return tracer

        elif frame is target_frame:
            if event == "line":
                records.append((frame.f_lineno, time.perf_counter()))
            elif event == "return":
                target_frame = None

            return tracer

    old_trace = sys.gettrace()
    sys.settrace(tracer)

    try:
        yield records
    finally:
        sys.settrace(old_trace)


def _sample_memory(fun, interval=.00005):
    
    # Collect memory traces and line number timings
    with _trace_lines_of(fun) as line_trace:
        memory_trace = memory_usage(fun, interval=interval, timestamps=True)

    # Make Comparable DataFrames out of the two datasets
    line_trace_df = pd.DataFrame(line_trace, columns=['Line', 'Time'])
    if len(line_trace_df) > 0:
        line_trace_df.Time -= line_trace_df.Time[0]
    
    memory_trace = memory_trace[1:]
    
    memory_trace_df = pd.DataFrame(memory_trace, columns=['Memory', 'Time'])
    memory_trace_df['Time'] -= memory_trace_df['Time'][0]
    memory_trace_df['Memory'] -= memory_trace_df['Memory'][0]

    return line_trace_df, memory_trace_df


def _plot_memory(data: pd.DataFrame, x='Time', y='Memory', ax=None):
    "Makes a line plot."

    peak_memory_mb = round(data[y].max(), 1)
    total_time = _format_duration(data[x].max())

    ax = ax if ax is not None else plt.gca()
    ax.plot(data[x], data[y])
    ax.fill_between(data[x], data[y], 0, alpha=0.3)
    ax.set(xlabel='Time (s)', ylabel='Memory (MB)', title=f"Total Time: {total_time} -- Peak Memory: {peak_memory_mb} MB")

    ax.margins(y=0)
    
    ylim_max = data[y].max() * 1.05 if data[y].max() > 1 else 1
    ax.set_ylim(0, ylim_max)
    
    
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["left"].set_visible(False)
    return ax


def _plot_line_numbers(data: pd.DataFrame, x='Time', text='Line', linestyle='--', color='gray', alpha=0.3, fontsize=6, ax=None):
    "Makes vertical lines with text above them."
    
    ax = ax if ax is not None else plt.gca()
    ymin, ymax = ax.get_ylim()
    y_text = ymax - 0.04 * (ymax - ymin)

    for _, row in data.iterrows():
        ax.axvline(row[x], linestyle=linestyle, alpha=alpha, color=color)
        ax.text(row[x], y_text, str(int(row[text])), rotation=90, ha="right", va="bottom", fontsize=fontsize)
        
    return ax

def _analyze_memory(*funs, interval=.00005, linestyle='--', color='gray', alpha=0.3, fontsize=6):
    "Convenient wrapper function: records memory traces of provided functions and makes the plot"

    if len(funs) == 1:
        fig, axes = plt.subplots()
        axes = [axes]
    else:
        fig, axes = plt.subplots(nrows=len(funs), sharex=True)

    for ax, fun in zip(axes, funs):
        lines, memory = _sample_memory(fun, interval=interval)
        _plot_memory(memory, ax=ax)
        _plot_line_numbers(lines, linestyle=linestyle, color=color, alpha=alpha, fontsize=fontsize, ax=ax)
    
    y_max = max([ax.get_ylim()[1] for ax in axes])
    for ax in axes:
        ax.set_ylim(0, y_max)

    plt.tight_layout()


def _format_bytes(bytes: float, precision: int = 2) -> str:
    """
    Takes a size in bytes and returns a more human-readable string (e.g. "1.50 MB").

    Looking to do this in a real project?  Some alternatives:
      - `humanfriendly`: https://pypi.org/project/humanfriendly/#getting-started
    """

    if bytes < 0:
        raise ValueError("bytes must be non-negative")

    units = [("KB", 1000), ("MB", 1_000_000), ("GB", 1_000_000_000), ("TB", 1_000_000_000_000)]

    for unit, scale in reversed(units):
        if bytes >= scale:
            value = bytes / scale
            return f"{value:.{precision}f} {unit}"
    else:
        return f"{bytes} B"

def _file_size(path: str) -> int:
    return os.path.getsize(path)

def _print_file_size(path: str, label='') -> None:
    text = _format_bytes(_file_size(path))
    if label:
        text = label + ': ' + text
    print(text)


class utils:
    analyze_memory = _analyze_memory
    generate_calcium_data_file = _generate_calcium_data_file
    print_file_size = _print_file_size
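
For a rough sense of scale, the default arguments of `_generate_calcium_data_file` above describe a 256 x 256 x 4000 movie of float64 values. The following back-of-the-envelope sketch estimates its in-memory size without allocating it. The variable names are chosen for this example only, and the actual peak usage while generating the file is likely higher because of temporary arrays.

```python
import math

import numpy as np

# Default shape from _generate_calcium_data_file: (nx, ny, nt)
shape = (256, 256, 4000)
itemsize = np.dtype("float64").itemsize  # 8 bytes per value

n_bytes = math.prod(shape) * itemsize
print(f"{n_bytes:,} bytes = {n_bytes / 1e9:.2f} GB")  # 2,097,152,000 bytes = 2.10 GB
```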

Section 1: Intro to Working with XArray

NumPy arrays are powerful but minimal: they store numerical data without any built-in information about what the axes represent. In scientific datasets, however, each axis usually corresponds to meaningful quantities such as space, time, wavelength, or experimental condition. When working with multidimensional data, it is often useful to label these dimensions explicitly.

XArray extends NumPy by adding labels and metadata to multidimensional arrays. Each array can have named dimensions, coordinate values, and descriptive metadata attached to it. This makes many operations clearer and safer, since calculations can reference dimensions by name rather than by numeric index. In addition, XArray integrates naturally with file formats such as NetCDF and with distributed computing tools like Dask.

Exercises

The exercises in this section introduce the core concepts of XArray: treating a DataArray like a NumPy array, labeling dimensions, attaching coordinates, and adding metadata to describe the data.

xr.DataArray() as a NumPy-Like Array

At its core, an XArray DataArray wraps a NumPy array. Most numerical operations that work on NumPy arrays also work on DataArray objects, including slicing, aggregation, and arithmetic. This means that existing NumPy-style code often requires very little modification to work with labeled arrays.
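
As a small illustration of this NumPy-like behavior (a sketch with the hypothetical names `arr` and `da_demo`, separate from the exercises that follow), arithmetic and NumPy functions can be applied directly to a DataArray, and the wrapped NumPy array remains accessible through `.values`:

```python
import numpy as np
import xarray as xr

arr = np.arange(24, dtype=float).reshape(2, 3, 4)
da_demo = xr.DataArray(arr)

# Arithmetic and NumPy ufuncs return DataArray objects
scaled = da_demo * 2 + 1
rooted = np.sqrt(da_demo)
print(type(scaled).__name__, type(rooted).__name__)

# The underlying NumPy array is available via .values
print(type(da_demo.values).__name__, da_demo.shape)
```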

Example: Create a three-dimensional DataArray from a NumPy array:

da = xr.DataArray(
    data=np.random.random(size=(10, 20, 30))
)
da
<xarray.DataArray (dim_0: 10, dim_1: 20, dim_2: 30)> Size: 48kB
array([[[0.04114747, 0.75038331, 0.13638907, ..., 0.38833257,
         0.28897767, 0.44763658],
        [0.05093347, 0.84213056, 0.73766516, ..., 0.01407856,
         0.76522943, 0.87498363],
        [0.84178601, 0.30799076, 0.2984225 , ..., 0.37395342,
         0.16885954, 0.44853357],
        ...,
        [0.85808347, 0.16014808, 0.04753766, ..., 0.8107473 ,
         0.93155405, 0.27736656],
        [0.05476814, 0.56464578, 0.40849327, ..., 0.41397046,
         0.40700989, 0.872461  ],
        [0.33342521, 0.57842884, 0.20717272, ..., 0.56226436,
         0.26117397, 0.78730901]],
       [[0.41578296, 0.97891988, 0.94396026, ..., 0.51148956,
         0.32645908, 0.49573464],
        [0.0894115 , 0.40364212, 0.90407579, ..., 0.7082974 ,
         0.59801165, 0.55842448],
        [0.32918622, 0.2603817 , 0.54499274, ..., 0.43422843,
         0.56081601, 0.7011575 ],
...
        [0.67786071, 0.45865382, 0.27930756, ..., 0.84763965,
         0.71848224, 0.12861828],
        [0.3438912 , 0.66496328, 0.962331  , ..., 0.73381751,
         0.38001691, 0.64951477],
        [0.51649295, 0.17889422, 0.9333006 , ..., 0.4241998 ,
         0.6520835 , 0.53247383]],
       [[0.35375472, 0.06168421, 0.79528549, ..., 0.73293053,
         0.54893905, 0.99901047],
        [0.45519009, 0.25676456, 0.70768052, ..., 0.64053184,
         0.83816718, 0.24689528],
        [0.07249795, 0.92133683, 0.38432126, ..., 0.21450932,
         0.98230673, 0.76157872],
        ...,
        [0.68634487, 0.97082859, 0.64525841, ..., 0.39506538,
         0.2014205 , 0.07881245],
        [0.94521576, 0.14213673, 0.01033459, ..., 0.9120943 ,
         0.68981696, 0.56170583],
        [0.29622479, 0.7068755 , 0.24623064, ..., 0.30720207,
         0.64914055, 0.66220053]]])
Dimensions without coordinates: dim_0, dim_1, dim_2

Exercise: Select the first 5 rows of da, using slicing syntax like x[:10, :, :]

Solution
da[:5, :, :]
      <xarray.DataArray (dim_0: 5, dim_1: 20, dim_2: 30)> Size: 24kB
      array([[[0.04114747, 0.75038331, 0.13638907, ..., 0.38833257,
               0.28897767, 0.44763658],
              [0.05093347, 0.84213056, 0.73766516, ..., 0.01407856,
               0.76522943, 0.87498363],
              [0.84178601, 0.30799076, 0.2984225 , ..., 0.37395342,
               0.16885954, 0.44853357],
              ...,
              [0.85808347, 0.16014808, 0.04753766, ..., 0.8107473 ,
               0.93155405, 0.27736656],
              [0.05476814, 0.56464578, 0.40849327, ..., 0.41397046,
               0.40700989, 0.872461  ],
              [0.33342521, 0.57842884, 0.20717272, ..., 0.56226436,
               0.26117397, 0.78730901]],
             [[0.41578296, 0.97891988, 0.94396026, ..., 0.51148956,
               0.32645908, 0.49573464],
              [0.0894115 , 0.40364212, 0.90407579, ..., 0.7082974 ,
               0.59801165, 0.55842448],
              [0.32918622, 0.2603817 , 0.54499274, ..., 0.43422843,
               0.56081601, 0.7011575 ],
      ...
              [0.69525271, 0.64018338, 0.65966751, ..., 0.34683598,
               0.00548341, 0.97329194],
              [0.80950062, 0.64463964, 0.27341711, ..., 0.68010487,
               0.58527527, 0.17991659],
              [0.72386847, 0.04767042, 0.4784509 , ..., 0.25828601,
               0.23017327, 0.83358268]],
             [[0.27685799, 0.68762621, 0.03612248, ..., 0.68830425,
               0.43521138, 0.48475464],
              [0.13150606, 0.64611598, 0.67140634, ..., 0.72932653,
               0.55523022, 0.58939696],
              [0.99564735, 0.67435163, 0.4850405 , ..., 0.00560686,
               0.48457899, 0.52615165],
              ...,
              [0.03060629, 0.05426987, 0.81598002, ..., 0.50067055,
               0.71105902, 0.8936419 ],
              [0.05694798, 0.54300376, 0.98750746, ..., 0.11035621,
               0.29567352, 0.82197459],
              [0.84906294, 0.14679959, 0.2824754 , ..., 0.85234413,
               0.89728047, 0.23293677]]])
      Dimensions without coordinates: dim_0, dim_1, dim_2

Exercise: Compute the mean, using either DataArray.mean() or np.mean()

Solution
da.mean()
<xarray.DataArray ()> Size: 8B
array(0.49582556)

np.mean(da)
<xarray.DataArray ()> Size: 8B
array(0.49582556)

Exercise: Compute the mean over the third axis, using da.mean(axis=2)

Solution
da.mean(axis=2)
                  <xarray.DataArray (dim_0: 10, dim_1: 20)> Size: 2kB
                  array([[0.36543086, 0.51930887, 0.50379079, 0.52659501, 0.52329708,
                          0.45206561, 0.44555284, 0.47871553, 0.51089349, 0.48598282,
                          0.52021286, 0.37824522, 0.59671092, 0.615205  , 0.52748188,
                          0.52641723, 0.67116061, 0.42539296, 0.49287799, 0.47272362],
                         [0.55609468, 0.4810388 , 0.5259652 , 0.59836652, 0.54134858,
                          0.48422569, 0.46709276, 0.52790002, 0.45980352, 0.48049169,
                          0.43466852, 0.55785632, 0.57304273, 0.55303505, 0.517658  ,
                          0.45762595, 0.47264817, 0.51178683, 0.48442642, 0.59459949],
                         [0.44657057, 0.56822554, 0.49829036, 0.54690908, 0.53317257,
                          0.52955974, 0.44061672, 0.47899669, 0.50972067, 0.51375136,
                          0.5035676 , 0.47461658, 0.57139615, 0.38634051, 0.49474923,
                          0.53198677, 0.50013701, 0.42897054, 0.50942051, 0.47081368],
                         [0.46629911, 0.41400256, 0.49320932, 0.49706875, 0.50986913,
                          0.47141487, 0.44684023, 0.54404585, 0.45654007, 0.52816059,
                          0.51185483, 0.41566652, 0.49721622, 0.52734675, 0.39633733,
                          0.46176637, 0.55182298, 0.44828229, 0.53683044, 0.49020325],
                         [0.53004363, 0.49924226, 0.46172736, 0.48735078, 0.45591984,
                          0.50744314, 0.51750811, 0.55884772, 0.5100851 , 0.43773424,
                          0.47074307, 0.55014578, 0.45866962, 0.46788254, 0.50701368,
                          0.56608109, 0.41439794, 0.45354272, 0.5338212 , 0.54764117],
                         [0.54622386, 0.50704904, 0.52853719, 0.48871144, 0.40561274,
                          0.39808459, 0.52884049, 0.56943325, 0.54578676, 0.51050244,
                          0.45564268, 0.51644497, 0.50357821, 0.51080586, 0.44878332,
                          0.43235936, 0.43539606, 0.51422296, 0.61260323, 0.47746059],
                         [0.57749112, 0.55907827, 0.48373934, 0.51663829, 0.48428476,
                          0.58290814, 0.50846196, 0.51487716, 0.48403421, 0.60375554,
                          0.50852149, 0.47286028, 0.44520898, 0.52491347, 0.53538122,
                          0.51516768, 0.50169217, 0.46943158, 0.37626364, 0.44376613],
                         [0.50950841, 0.47831953, 0.59670342, 0.3995389 , 0.52428618,
                          0.50096481, 0.53531568, 0.46965357, 0.50080524, 0.37695733,
                          0.45476231, 0.4805418 , 0.42229679, 0.47802709, 0.45059819,
                          0.37335441, 0.55677103, 0.46432355, 0.55098146, 0.55196238],
                         [0.51830114, 0.51199971, 0.47828723, 0.41539535, 0.47819357,
                          0.49026753, 0.47132953, 0.48441641, 0.51107812, 0.49218128,
                          0.41817107, 0.54548401, 0.45529535, 0.50994582, 0.38968902,
                          0.47973765, 0.42666402, 0.48926506, 0.60631024, 0.55265594],
                         [0.53724441, 0.55331959, 0.58556178, 0.38731547, 0.47589156,
                          0.53687486, 0.45980396, 0.43617056, 0.54166356, 0.49557229,
                          0.4859638 , 0.48989926, 0.44355795, 0.52572281, 0.4783072 ,
                          0.5768246 , 0.43944115, 0.4844711 , 0.49882065, 0.43162461]])
                  Dimensions without coordinates: dim_0, dim_1

Labeling the Data and the Dimensions: name= and dims=

One of XArray's most useful features is the ability to name dimensions explicitly. Instead of referring to axes by position, such as "axis 0" or "axis 2", operations can refer to dimensions using meaningful labels like "x", "y", or "time".

Once dimensions are named, many operations become easier to read and harder to misuse. For example, computing the mean across time can be expressed as mean(dim="time"), which clearly communicates the intent of the calculation.
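
To make the contrast concrete, here is a minimal sketch comparing the positional and the named form of the same reduction. The array `movie` and its small shape are assumptions made for this example, not the notebook's dataset.

```python
import numpy as np
import xarray as xr

movie = xr.DataArray(
    np.random.default_rng(0).random((4, 5, 6)),
    dims=["x", "y", "time"],
)

by_axis = movie.mean(axis=2)       # the reader has to remember that axis 2 is time
by_name = movie.mean(dim="time")   # the intent is stated directly

print(by_name.dims)  # ('x', 'y')
print(np.allclose(by_axis.values, by_name.values))

# After reordering the dimensions, the named form still averages over time,
# whereas axis=2 would now refer to a different dimension ("y").
transposed = movie.transpose("time", "x", "y")
print(np.allclose(transposed.mean(dim="time").values, by_name.values))
```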

Exercise: Make a new 3-dimensional array variable da using xr.DataArray(), this time additionally setting name="image" and dims=['x', 'y', 'time']

Solution
da = xr.DataArray(
    data=np.random.random(size=(10, 20, 30)),
    name='image',
    dims=['x', 'y', 'time']
)
da
                      <xarray.DataArray 'image' (x: 10, y: 20, time: 30)> Size: 48kB
                      array([[[0.67710939, 0.30888335, 0.45708531, ..., 0.19024812,
                               0.57919888, 0.4943471 ],
                              [0.6595412 , 0.05768168, 0.41122793, ..., 0.42941126,
                               0.02140642, 0.13401437],
                              [0.47115825, 0.03549168, 0.47427858, ..., 0.11232079,
                               0.15764563, 0.62205852],
                              ...,
                              [0.28629424, 0.28075838, 0.90142574, ..., 0.09790786,
                               0.66381097, 0.76691221],
                              [0.56623202, 0.6275929 , 0.5333757 , ..., 0.73606387,
                               0.67496555, 0.01273267],
                              [0.6557586 , 0.71014066, 0.5542154 , ..., 0.67508891,
                               0.21545972, 0.04647963]],
                             [[0.46266484, 0.27270618, 0.37350087, ..., 0.7071857 ,
                               0.89672212, 0.25277119],
                              [0.73110233, 0.39435732, 0.07033897, ..., 0.90960717,
                               0.91212683, 0.85858381],
                              [0.54258139, 0.11662828, 0.70256949, ..., 0.85428115,
                               0.03788034, 0.63837203],
                      ...
                              [0.36590867, 0.27510572, 0.46112712, ..., 0.33816704,
                               0.07879941, 0.38753586],
                              [0.29461655, 0.75531933, 0.07085249, ..., 0.52368795,
                               0.6893739 , 0.49329903],
                              [0.24075614, 0.3107854 , 0.24904419, ..., 0.00986273,
                               0.73754247, 0.11510364]],
                             [[0.65561021, 0.98253874, 0.70322808, ..., 0.43339667,
                               0.21292771, 0.95580234],
                              [0.71432444, 0.32570543, 0.89027762, ..., 0.16513184,
                               0.45325214, 0.84195001],
                              [0.89453391, 0.08796845, 0.97497481, ..., 0.1163913 ,
                               0.67460762, 0.55036986],
                              ...,
                              [0.79907291, 0.12435965, 0.4183625 , ..., 0.75998778,
                               0.54159547, 0.64966451],
                              [0.57047831, 0.65203531, 0.30976315, ..., 0.27428619,
                               0.9356512 , 0.63459768],
                              [0.87974006, 0.32322905, 0.91136299, ..., 0.35024879,
                               0.94698808, 0.50808183]]])
                      Dimensions without coordinates: x, y, time

                          Exercise: Select the time sample at index 4 (the fifth sample, since indexing starts at 0) using da.sel(time=4)

                          Solution
                          da.sel(time=4)
                          <xarray.DataArray 'image' (x: 10, y: 20)> Size: 2kB
                          array([[0.81690343, 0.7209225 , 0.37353862, 0.09224265, 0.03095172,
                                  0.1034145 , 0.63437993, 0.17861791, 0.2149326 , 0.11016639,
                                  0.11142093, 0.2261374 , 0.35131947, 0.95335097, 0.64888917,
                                  0.02271543, 0.26942174, 0.26820578, 0.49959026, 0.11380439],
                                 [0.35311137, 0.4148705 , 0.41419894, 0.60659794, 0.09539062,
                                  0.71539388, 0.95375252, 0.07784231, 0.84844934, 0.11753521,
                                  0.78966488, 0.75023638, 0.28176082, 0.79697694, 0.12724592,
                                  0.03527607, 0.99368476, 0.88356642, 0.76887476, 0.68415071],
                                 [0.42463613, 0.14438795, 0.40494972, 0.67949229, 0.53039794,
                                  0.81529368, 0.06234376, 0.99362028, 0.79408995, 0.38304623,
                                  0.09001437, 0.98109669, 0.24317204, 0.63199111, 0.12674025,
                                  0.50320134, 0.15266813, 0.53977947, 0.71372985, 0.24760575],
                                 [0.57205422, 0.95364895, 0.34793683, 0.23919657, 0.12988433,
                                  0.63331466, 0.52231607, 0.33278371, 0.62182073, 0.4274041 ,
                                  0.42886689, 0.34358668, 0.4653964 , 0.3645093 , 0.19066942,
                                  0.23865388, 0.42829744, 0.11948408, 0.03752981, 0.95159085],
                                 [0.41837782, 0.36533384, 0.25933047, 0.08754735, 0.49177214,
                                  0.58898798, 0.53283701, 0.43497849, 0.98585099, 0.62895794,
                                  0.52711377, 0.47866854, 0.67471869, 0.9185391 , 0.75206332,
                                  0.55107556, 0.73957786, 0.94516618, 0.42911865, 0.48761669],
                                 [0.04747765, 0.24085401, 0.16609059, 0.53680168, 0.79971536,
                                  0.95924525, 0.38889558, 0.62757553, 0.69035109, 0.88295853,
                                  0.67377996, 0.84171832, 0.82412218, 0.08195431, 0.79791916,
                                  0.08518784, 0.43338793, 0.52470145, 0.62609036, 0.8949754 ],
                                 [0.56147571, 0.37038801, 0.75341007, 0.39167142, 0.31499963,
                                  0.65011844, 0.72642505, 0.51515493, 0.30194453, 0.59054549,
                                  0.2370653 , 0.20692905, 0.13203398, 0.71813268, 0.69640391,
                                  0.04455144, 0.62728676, 0.82446057, 0.74412542, 0.37767237],
                                 [0.10371527, 0.94428154, 0.38452835, 0.64577214, 0.09205662,
                                  0.06029028, 0.72665883, 0.83662477, 0.26651286, 0.83489503,
                                  0.0988879 , 0.40095675, 0.32867554, 0.99772062, 0.6113875 ,
                                  0.93377601, 0.69657599, 0.6114565 , 0.07554179, 0.00437493],
                                 [0.30154039, 0.23765152, 0.68592597, 0.40720714, 0.41215613,
                                  0.57588725, 0.53873454, 0.09855938, 0.86329038, 0.06073814,
                                  0.18346947, 0.89362387, 0.66958095, 0.82418249, 0.76896116,
                                  0.63618435, 0.48571894, 0.96678574, 0.88264642, 0.73664438],
                                 [0.71095648, 0.63434897, 0.83616863, 0.99057802, 0.5831305 ,
                                  0.39098512, 0.79564113, 0.69978499, 0.7932254 , 0.11968389,
                                  0.18836944, 0.50066022, 0.67891818, 0.45608767, 0.36538924,
                                  0.42506311, 0.33150542, 0.19117472, 0.10269446, 0.37903476]])
                          Dimensions without coordinates: x, y
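The output above reports "Dimensions without coordinates", which is why an integer passed to `.sel()` is treated as a position. The following is a minimal sketch of that behavior, assuming a stand-in array named `da_demo` (a hypothetical name, filled with random values) with the same shape and dimension names as `da`:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the notebook's `da`: same shape and dimension names.
rng = np.random.default_rng(0)
da_demo = xr.DataArray(rng.random((10, 20, 30)), dims=("x", "y", "time"), name="image")

# No coordinate is attached to "time", so .sel() treats 4 as a position.
by_label = da_demo.sel(time=4)
by_position = da_demo.isel(time=4)

# Position 4 is the fifth sample, because counting starts at zero.
same = bool((by_label == by_position).all())
print(by_label.dims, same)
```

Once a `time` coordinate is attached, `.sel(time=4)` looks for the label 4 instead, so `.isel()` is the safer choice when a position is what you mean.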

                              Exercise: Select the first through fifth rows by name using da.sel(x=slice(0, 5))

                              Solution
                              da.sel(x=slice(0, 5))
                              <xarray.DataArray 'image' (x: 5, y: 20, time: 30)> Size: 24kB
                              array([[[0.67710939, 0.30888335, 0.45708531, ..., 0.19024812,
                                       0.57919888, 0.4943471 ],
                                      [0.6595412 , 0.05768168, 0.41122793, ..., 0.42941126,
                                       0.02140642, 0.13401437],
                                      [0.47115825, 0.03549168, 0.47427858, ..., 0.11232079,
                                       0.15764563, 0.62205852],
                                      ...,
                                      [0.28629424, 0.28075838, 0.90142574, ..., 0.09790786,
                                       0.66381097, 0.76691221],
                                      [0.56623202, 0.6275929 , 0.5333757 , ..., 0.73606387,
                                       0.67496555, 0.01273267],
                                      [0.6557586 , 0.71014066, 0.5542154 , ..., 0.67508891,
                                       0.21545972, 0.04647963]],
                                     [[0.46266484, 0.27270618, 0.37350087, ..., 0.7071857 ,
                                       0.89672212, 0.25277119],
                                      [0.73110233, 0.39435732, 0.07033897, ..., 0.90960717,
                                       0.91212683, 0.85858381],
                                      [0.54258139, 0.11662828, 0.70256949, ..., 0.85428115,
                                       0.03788034, 0.63837203],
                              ...
                                      [0.41128742, 0.76827003, 0.13690284, ..., 0.35153952,
                                       0.21000871, 0.89327571],
                                      [0.73643851, 0.28246673, 0.55631256, ..., 0.20599446,
                                       0.26434225, 0.99029251],
                                      [0.37146648, 0.43607167, 0.64789792, ..., 0.40202998,
                                       0.57516726, 0.17500694]],
                                     [[0.31257181, 0.61820674, 0.52070954, ..., 0.96581995,
                                       0.23534712, 0.41627359],
                                      [0.87671843, 0.20151553, 0.9917814 , ..., 0.44273185,
                                       0.20544541, 0.37368288],
                                      [0.26560331, 0.22749109, 0.94030219, ..., 0.24526497,
                                       0.78634043, 0.07908533],
                                      ...,
                                      [0.82781059, 0.29590796, 0.95242948, ..., 0.67263742,
                                       0.38092654, 0.89350527],
                                      [0.57307289, 0.73484916, 0.36874063, ..., 0.02862867,
                                       0.92761903, 0.49103933],
                                      [0.4093979 , 0.14409232, 0.98184898, ..., 0.38638863,
                                       0.67179747, 0.79713569]]])
                              Dimensions without coordinates: x, y, time
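The slice above returns five rows because `x` has no coordinate, so the slice is applied positionally. Label-based slices behave differently. This sketch contrasts the two cases, assuming a stand-in array `da_demo` and integer labels 0 through 9 for `x` (both are illustrative and not part of the notebook):

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the notebook's `da`.
rng = np.random.default_rng(0)
da_demo = xr.DataArray(rng.random((10, 20, 30)), dims=("x", "y", "time"), name="image")

# Without an x coordinate the slice is positional: the stop value is excluded.
positional = da_demo.sel(x=slice(0, 5))

# With an x coordinate (assumed labels 0..9) the slice is label-based:
# both endpoints are included.
labeled = da_demo.assign_coords(x=np.arange(10)).sel(x=slice(0, 5))

print(positional.sizes["x"], labeled.sizes["x"])  # 5 6
```

The same call can therefore return a different number of rows depending on whether coordinates are attached, which is worth checking before relying on slice endpoints.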

                                  Exercise: Compute the mean image over time by name using da.mean(dim='time'):

                                  Solution
                                  da.mean(dim='time')
                                  <xarray.DataArray 'image' (x: 10, y: 20)> Size: 2kB
                                  array([[0.47843092, 0.46969608, 0.40644396, 0.50741432, 0.50431421,
                                          0.48563248, 0.53645447, 0.45694158, 0.47792393, 0.49790484,
                                          0.47847525, 0.45882576, 0.50619661, 0.50320613, 0.50071791,
                                          0.41972101, 0.45554273, 0.48766826, 0.5085251 , 0.50531774],
                                         [0.49078497, 0.52379173, 0.48343256, 0.52503334, 0.45745951,
                                          0.54254514, 0.61334868, 0.40266028, 0.47112354, 0.46579182,
                                          0.54729357, 0.46197744, 0.47021517, 0.46994462, 0.47692702,
                                          0.51305033, 0.56584108, 0.42563489, 0.51674477, 0.58459544],
                                         [0.44949127, 0.60548923, 0.48180034, 0.45403246, 0.59818813,
                                          0.56062785, 0.45900812, 0.57637942, 0.49849704, 0.49103374,
                                          0.54039484, 0.42094949, 0.48788003, 0.47691933, 0.42459847,
                                          0.58556951, 0.53766951, 0.54855299, 0.48729667, 0.46217485],
                                         [0.37252296, 0.63062867, 0.504824  , 0.52336255, 0.55739432,
                                          0.49561938, 0.48572253, 0.48198813, 0.45012286, 0.59467673,
                                          0.55435092, 0.38695287, 0.53432442, 0.47676637, 0.50862611,
                                          0.41253874, 0.46182589, 0.48332374, 0.47428668, 0.54903787],
                                         [0.42500643, 0.54558215, 0.47353567, 0.48548281, 0.50231371,
                                          0.51240519, 0.46619329, 0.47241953, 0.45909955, 0.55562628,
                                          0.58132156, 0.45478005, 0.53310118, 0.54784863, 0.47495015,
                                          0.4817479 , 0.54321249, 0.58288687, 0.50915776, 0.47254236],
                                         [0.47258876, 0.48908605, 0.49056801, 0.55310662, 0.65789466,
                                          0.56912281, 0.59585348, 0.5372052 , 0.53167562, 0.45888481,
                                          0.39769294, 0.46414352, 0.49553688, 0.58052805, 0.39814518,
                                          0.52548734, 0.54697939, 0.50181692, 0.4965746 , 0.49904745],
                                         [0.50088195, 0.54168772, 0.57419232, 0.41886657, 0.38611848,
                                          0.47733109, 0.38949948, 0.49882494, 0.56745354, 0.48883225,
                                          0.45925509, 0.46427744, 0.46913585, 0.48387636, 0.4858951 ,
                                          0.49172214, 0.4274939 , 0.5149629 , 0.49961429, 0.5545395 ],
                                         [0.61660792, 0.49845409, 0.52420791, 0.62849252, 0.4095893 ,
                                          0.4342628 , 0.48062861, 0.60759781, 0.42381866, 0.47382528,
                                          0.3666136 , 0.60449872, 0.5064963 , 0.47201591, 0.5073689 ,
                                          0.46681923, 0.59064752, 0.56942485, 0.50850607, 0.51502411],
                                         [0.43756182, 0.52026911, 0.51650468, 0.4421631 , 0.57237277,
                                          0.4562016 , 0.50262569, 0.52268015, 0.50294336, 0.43493523,
                                          0.57542953, 0.49229615, 0.48417751, 0.51001346, 0.52059171,
                                          0.56349546, 0.4723652 , 0.47318401, 0.49233718, 0.48746748],
                                         [0.58153839, 0.57155364, 0.47683888, 0.61983997, 0.44372875,
                                          0.47244084, 0.44871863, 0.45402516, 0.50615253, 0.56414166,
                                          0.55305249, 0.43315097, 0.50376687, 0.44821274, 0.6061095 ,
                                          0.42601648, 0.55315635, 0.44721002, 0.55584759, 0.5059227 ]])
                                  Dimensions without coordinates: x, y
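Reducing by dimension name gives the same numbers as reducing by axis position in NumPy, but it does not depend on the order of the axes. A short sketch of that equivalence, assuming a hypothetical stand-in array `da_demo` with the same layout as `da`:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the notebook's `da`.
rng = np.random.default_rng(0)
da_demo = xr.DataArray(rng.random((10, 20, 30)), dims=("x", "y", "time"), name="image")

# Named reduction: no need to remember that "time" is axis 2.
mean_by_name = da_demo.mean(dim="time")

# Equivalent NumPy reduction, which relies on the axis position.
mean_by_axis = da_demo.values.mean(axis=2)

# After reordering the axes, the named reduction still works unchanged,
# whereas axis=2 would now average over "y".
mean_transposed = da_demo.transpose("time", "x", "y").mean(dim="time")

print(mean_by_name.dims, np.allclose(mean_by_name.values, mean_by_axis))
```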

                                      Exercise: The time points are stored in the NumPy array t below. Use mask = t > 40 followed by da.sel(time=mask) to select only the data at time points greater than 40:

                                      t = np.linspace(0, 100, 30)
                                      Solution
                                      mask = t > 40
                                      da.sel(time=mask)
                                      <xarray.DataArray 'image' (x: 10, y: 20, time: 18)> Size: 29kB
                                      array([[[0.3801502 , 0.04820498, 0.89874884, ..., 0.19024812,
                                               0.57919888, 0.4943471 ],
                                              [0.62321878, 0.10624042, 0.68024044, ..., 0.42941126,
                                               0.02140642, 0.13401437],
                                              [0.05602627, 0.02598415, 0.32258628, ..., 0.11232079,
                                               0.15764563, 0.62205852],
                                              ...,
                                              [0.73502167, 0.7191653 , 0.52243755, ..., 0.09790786,
                                               0.66381097, 0.76691221],
                                              [0.4117305 , 0.09230545, 0.02878121, ..., 0.73606387,
                                               0.67496555, 0.01273267],
                                              [0.60857306, 0.96885987, 0.1138432 , ..., 0.67508891,
                                               0.21545972, 0.04647963]],
                                             [[0.09413247, 0.00475213, 0.51825391, ..., 0.7071857 ,
                                               0.89672212, 0.25277119],
                                              [0.40472999, 0.01018857, 0.11182434, ..., 0.90960717,
                                               0.91212683, 0.85858381],
                                              [0.24924832, 0.14484967, 0.62068823, ..., 0.85428115,
                                               0.03788034, 0.63837203],
                                      ...
                                              [0.47594781, 0.40575236, 0.59922647, ..., 0.33816704,
                                               0.07879941, 0.38753586],
                                              [0.58649034, 0.83478326, 0.98797737, ..., 0.52368795,
                                               0.6893739 , 0.49329903],
                                              [0.54401046, 0.55875543, 0.65777507, ..., 0.00986273,
                                               0.73754247, 0.11510364]],
                                             [[0.12587492, 0.33950829, 0.85387088, ..., 0.43339667,
                                               0.21292771, 0.95580234],
                                              [0.41198532, 0.84913138, 0.17903529, ..., 0.16513184,
                                               0.45325214, 0.84195001],
                                              [0.92331714, 0.61395529, 0.19310102, ..., 0.1163913 ,
                                               0.67460762, 0.55036986],
                                              ...,
                                              [0.92368158, 0.66873584, 0.20778289, ..., 0.75998778,
                                               0.54159547, 0.64966451],
                                              [0.93139689, 0.3658875 , 0.44009   , ..., 0.27428619,
                                               0.9356512 , 0.63459768],
                                              [0.99162561, 0.70124652, 0.29062643, ..., 0.35024879,
                                               0.94698808, 0.50808183]]])
                                      Dimensions without coordinates: x, y, time
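The boolean mask works here because `time` has no coordinate, so the mask is applied by position. An alternative is to attach `t` to the array as a coordinate and then select by value. This is a sketch under that assumption, using a hypothetical stand-in array `da_demo`:

```python
import numpy as np
import xarray as xr

t = np.linspace(0, 100, 30)

# Hypothetical stand-in for the notebook's `da`.
rng = np.random.default_rng(0)
da_demo = xr.DataArray(rng.random((10, 20, 30)), dims=("x", "y", "time"), name="image")

# Approach from the exercise: a boolean mask applied by position.
masked = da_demo.sel(time=t > 40)

# Alternative: store the time points on the array, then slice by value.
# Note: a label slice includes the endpoint, so this is "time >= 40";
# the two agree here only because no sample falls exactly at 40.
timed = da_demo.assign_coords(time=t)
by_value = timed.sel(time=slice(40, None))

print(masked.sizes["time"], by_value.sizes["time"])
```

Keeping the time points on the array means they travel with the data through later selections and reductions, so the separate `t` array no longer needs to be kept in sync by hand.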

                                          Labeling each Axis using Coordinates and Attributes

                                          Beyond naming dimensions, XArray allows each axis to have coordinate values that describe the physical meaning of each index. For example, a time axis might correspond to timestamps, or a spatial axis might correspond to pixel positions.

                                          Coordinates make it possible to select data based on meaningful values rather than raw indices. For example, selecting frames after a particular time point can be done using coordinate values instead of computing index positions manually.
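To make the contrast concrete, here is a minimal sketch that selects the same frames two ways: first by working out index positions by hand, then by asking for coordinate values directly. The array `stack`, its shape, and its time axis are made up for illustration and are not part of the notebook's dataset.

```python
import numpy as np
import xarray as xr

# Hypothetical toy stack: 4 x 5 pixels, 11 frames sampled at t = 0, 10, ..., 100.
times = np.linspace(0, 100, 11)
stack = xr.DataArray(
    data=np.arange(4 * 5 * 11, dtype=float).reshape(4, 5, 11),
    dims=["x", "y", "time"],
    coords={"time": times},
)

# Index-based: first work out which positions satisfy the condition.
first_idx = int(np.searchsorted(times, 50))
by_index = stack.isel(time=slice(first_idx, None))

# Coordinate-based: state the condition in the units of the axis.
by_label = stack.sel(time=slice(50, None))

# Both routes return the same frames.
print(by_index.equals(by_label))
print(by_label.time.values)
```

The coordinate-based version keeps working if the sampling rate changes, whereas the index arithmetic would have to be redone.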

                                          Example: Run the code below to create a new DataArray with xr.DataArray(), this time passing coords= to map the time axis onto the actual time points:

                                          da = xr.DataArray(
                                              data=np.random.random(size=(10, 20, 30)),
                                              name='image',
                                              dims=['x', 'y', 'time'],
                                              coords = {
                                                  'time': np.linspace(0, 100, 30),
                                              }
                                          )
                                          da
                                          <xarray.DataArray 'image' (x: 10, y: 20, time: 30)> Size: 48kB
                                          array([[[0.46188057, 0.56766766, 0.35679791, ..., 0.05633519,
                                                   0.6566564 , 0.28389431],
                                                  [0.58015032, 0.64239196, 0.59039491, ..., 0.49590085,
                                                   0.02281968, 0.11329427],
                                                  [0.5039821 , 0.60760961, 0.83575356, ..., 0.37388575,
                                                   0.41638332, 0.27047329],
                                                  ...,
                                                  [0.66524982, 0.38093567, 0.52518851, ..., 0.09357031,
                                                   0.12206716, 0.54444632],
                                                  [0.84532887, 0.97704918, 0.91157941, ..., 0.37410448,
                                                   0.14586067, 0.74378509],
                                                  [0.32688785, 0.61309651, 0.95989322, ..., 0.83371582,
                                                   0.64154624, 0.77752146]],
                                                 [[0.09202725, 0.90281392, 0.49982754, ..., 0.73266458,
                                                   0.25561141, 0.48023462],
                                                  [0.02234091, 0.98852295, 0.62247615, ..., 0.63447814,
                                                   0.94441917, 0.09651057],
                                                  [0.01004742, 0.66161957, 0.50444871, ..., 0.02655767,
                                                   0.97403606, 0.16546788],
                                          ...
                                                  [0.85371778, 0.43043134, 0.76959364, ..., 0.71519278,
                                                   0.67391388, 0.76497901],
                                                  [0.22501216, 0.52742085, 0.1762034 , ..., 0.80517868,
                                                   0.93740406, 0.40259905],
                                                  [0.0798965 , 0.12092623, 0.00821162, ..., 0.9359866 ,
                                                   0.07810915, 0.77949279]],
                                                 [[0.34816838, 0.71699975, 0.24623201, ..., 0.97537935,
                                                   0.07112402, 0.79217533],
                                                  [0.21306198, 0.2743723 , 0.63966886, ..., 0.86183832,
                                                   0.36204873, 0.92305822],
                                                  [0.15225632, 0.27889629, 0.62959152, ..., 0.82866566,
                                                   0.23851943, 0.98939333],
                                                  ...,
                                                  [0.80892984, 0.67120648, 0.79454465, ..., 0.80222849,
                                                   0.22851522, 0.57351216],
                                                  [0.67077438, 0.14226303, 0.90061353, ..., 0.1229384 ,
                                                   0.90847576, 0.03312194],
                                                  [0.48467367, 0.77017557, 0.23214696, ..., 0.46349595,
                                                   0.64632291, 0.27783926]]])
                                          Coordinates:
                                            * time     (time) float64 240B 0.0 3.448 6.897 10.34 ... 93.1 96.55 100.0
                                          Dimensions without coordinates: x, y

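                                          Exercise: Use da.sel(time=slice(40, None)) to select only the data at time points of 40 or later, without first creating a mask. Label-based slices include both endpoints, so a frame at exactly 40 would be kept; in this array the first frame kept is at 41.38: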

                                          Solution
                                          da.sel(time = slice(40, None))
                                          <xarray.DataArray 'image' (x: 10, y: 20, time: 18)> Size: 29kB
                                          array([[[0.83686132, 0.42845474, 0.44557469, ..., 0.05633519,
                                                   0.6566564 , 0.28389431],
                                                  [0.57298279, 0.17370296, 0.69313108, ..., 0.49590085,
                                                   0.02281968, 0.11329427],
                                                  [0.69213285, 0.21450153, 0.99096409, ..., 0.37388575,
                                                   0.41638332, 0.27047329],
                                                  ...,
                                                  [0.07942635, 0.58554461, 0.01508784, ..., 0.09357031,
                                                   0.12206716, 0.54444632],
                                                  [0.17035337, 0.79151936, 0.58735636, ..., 0.37410448,
                                                   0.14586067, 0.74378509],
                                                  [0.4267523 , 0.78713569, 0.92095774, ..., 0.83371582,
                                                   0.64154624, 0.77752146]],
                                                 [[0.04377784, 0.11835502, 0.2540089 , ..., 0.73266458,
                                                   0.25561141, 0.48023462],
                                                  [0.84141178, 0.65626098, 0.20320923, ..., 0.63447814,
                                                   0.94441917, 0.09651057],
                                                  [0.31242241, 0.40528523, 0.19022224, ..., 0.02655767,
                                                   0.97403606, 0.16546788],
                                          ...
                                                  [0.33485026, 0.17928519, 0.06393263, ..., 0.71519278,
                                                   0.67391388, 0.76497901],
                                                  [0.57569317, 0.8408589 , 0.28079959, ..., 0.80517868,
                                                   0.93740406, 0.40259905],
                                                  [0.48696678, 0.27770608, 0.90097863, ..., 0.9359866 ,
                                                   0.07810915, 0.77949279]],
                                                 [[0.57542   , 0.86748088, 0.39047068, ..., 0.97537935,
                                                   0.07112402, 0.79217533],
                                                  [0.48733402, 0.14941463, 0.75924524, ..., 0.86183832,
                                                   0.36204873, 0.92305822],
                                                  [0.13135676, 0.8255285 , 0.28076253, ..., 0.82866566,
                                                   0.23851943, 0.98939333],
                                                  ...,
                                                  [0.48887299, 0.48989821, 0.35578766, ..., 0.80222849,
                                                   0.22851522, 0.57351216],
                                                  [0.63862842, 0.18332819, 0.56574068, ..., 0.1229384 ,
                                                   0.90847576, 0.03312194],
                                                  [0.02250793, 0.41551062, 0.52261051, ..., 0.46349595,
                                                   0.64632291, 0.27783926]]])
                                          Coordinates:
                                            * time     (time) float64 144B 41.38 44.83 48.28 51.72 ... 93.1 96.55 100.0
                                          Dimensions without coordinates: x, y
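Two details of label-based selection are easy to miss: slice endpoints are inclusive, and a single value must match a coordinate exactly unless a lookup method is given. The sketch below rebuilds an array with the same shape and time axis as above under the hypothetical name `frames`, so it can be run on its own.

```python
import numpy as np
import xarray as xr

# Same layout as the example array: 30 frames evenly spaced over 0..100.
frames = xr.DataArray(
    data=np.random.random(size=(10, 20, 30)),
    name="image",
    dims=["x", "y", "time"],
    coords={"time": np.linspace(0, 100, 30)},
)

# Both endpoints of a label slice are included, so this keeps every frame.
window = frames.sel(time=slice(0, 100))
print(window.sizes["time"])

# Open-ended slice: everything from t = 40 onwards.
late = frames.sel(time=slice(40, None))
print(late.sizes["time"], float(late.time[0]))

# No frame sits at exactly t = 40, so an exact lookup would raise a KeyError.
# method="nearest" returns the closest frame instead.
nearest = frames.sel(time=40, method="nearest")
print(float(nearest.time))
```

Nearest-neighbour lookup is convenient for picking single frames, but it silently returns a frame at a different time than requested, so it is worth checking the returned coordinate.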

                                          Adding Descriptions to the Data

                                          XArray supports rich descriptions of the data through attrs, a dictionary attached to each xarray object. It can hold units, long names for plot labels, processing history, and free-text descriptions of each part of the data. Some keys are recognized by other tooling (e.g. units, description, long_name), but otherwise any key-value pair can be stored as metadata.

                                          Example: Run the code below to create a new DataArray using xr.DataArray(), this time with extra attributes describing both the time coordinate and the image data.

                                          time = xr.DataArray(
                                              data = np.linspace(0, 100, 30),
                                              name = 'time',
                                              dims=['time'],
                                              attrs = {
                                                  'units': 's',
                                                  'description': 'time samples for each image frame'
                                              }
                                          )
                                          
                                          da = xr.DataArray(
                                              data=np.random.random(size=(10, 20, 30)),
                                              name='image',
                                              dims=['x', 'y', 'time'],
                                              coords = {
                                                  'time': time,
                                              },
                                              attrs = {
                                                  'units': 'brightness',
                                                  'description': 'A generated random image stack',
                                                  'long_name': 'calcium image pixel brightness',
                                              }
                                          )
                                          da
                                          <xarray.DataArray 'image' (x: 10, y: 20, time: 30)> Size: 48kB
                                          array([[[4.35496650e-01, 1.39860289e-01, 5.76075894e-01, ...,
                                                   1.35828490e-01, 3.23799397e-01, 1.07731768e-01],
                                                  [4.15392354e-01, 6.89231091e-01, 1.62120190e-01, ...,
                                                   9.73108830e-01, 1.85465688e-01, 9.39971562e-01],
                                                  [6.75804059e-01, 6.30469961e-01, 6.44439396e-01, ...,
                                                   2.10046370e-01, 7.32814223e-01, 8.73014448e-01],
                                                  ...,
                                                  [8.47559592e-01, 8.73858054e-01, 8.35539489e-01, ...,
                                                   1.89592622e-01, 9.04254151e-01, 2.20712074e-01],
                                                  [9.35658273e-01, 4.81834729e-01, 1.27097967e-02, ...,
                                                   5.67628947e-01, 6.67647998e-01, 7.29321847e-01],
                                                  [6.13921318e-01, 1.64942405e-01, 2.86663115e-01, ...,
                                                   8.63679855e-01, 4.42827205e-01, 7.77789488e-01]],
                                                 [[1.51234474e-01, 6.54155400e-01, 5.48704160e-01, ...,
                                                   6.14159540e-01, 2.34606080e-01, 3.36681628e-01],
                                                  [9.92149874e-01, 9.63872664e-01, 4.92213151e-01, ...,
                                                   3.93095095e-01, 4.89412507e-01, 6.08654658e-01],
                                                  [5.37480845e-01, 5.00420566e-01, 1.92164140e-01, ...,
                                                   9.06850090e-01, 3.10698928e-01, 3.07288262e-01],
                                          ...
                                                  [1.99030780e-01, 2.33889088e-01, 4.25963944e-01, ...,
                                                   7.57721772e-02, 3.09179984e-02, 1.52252551e-02],
                                                  [2.75607620e-01, 8.97809607e-01, 1.09684499e-04, ...,
                                                   1.16773014e-01, 6.32046940e-01, 1.89419867e-01],
                                                  [3.49003337e-01, 2.21797962e-01, 7.39140922e-01, ...,
                                                   7.78807244e-01, 7.95221138e-02, 2.98407146e-01]],
                                                 [[1.41137599e-01, 1.20473007e-01, 5.20769885e-01, ...,
                                                   4.56489725e-01, 9.25368638e-01, 9.80714344e-01],
                                                  [9.73212545e-01, 1.00600066e-01, 5.95941059e-01, ...,
                                                   5.77546236e-01, 5.35467949e-01, 4.31949006e-01],
                                                  [3.76814095e-01, 9.85196306e-01, 1.18982638e-01, ...,
                                                   5.14847642e-01, 7.32485135e-01, 4.54910505e-01],
                                                  ...,
                                                  [9.22120925e-01, 6.09786725e-01, 6.11135546e-01, ...,
                                                   7.14720948e-01, 3.05674544e-01, 1.08655577e-01],
                                                  [9.31272102e-01, 7.09868764e-02, 9.22907586e-01, ...,
                                                   8.87035004e-01, 2.62764053e-01, 6.04462399e-01],
                                                  [1.10832118e-01, 4.95192941e-01, 9.50893051e-01, ...,
                                                   4.16903438e-01, 5.13032394e-01, 6.69712544e-01]]])
                                          Coordinates:
                                            * time     (time) float64 240B 0.0 3.448 6.897 10.34 ... 93.1 96.55 100.0
                                          Dimensions without coordinates: x, y
                                          Attributes:
                                              units:        brightness
                                              description:  A generated random image stack
                                              long_name:    calcium image pixel brightness
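Once attributes are attached, they can be read back and updated like any Python dictionary. The following is a small sketch under assumed names: `img` mirrors the array built above, and the `history` key is a hypothetical addition for illustration. Because the default handling of attributes during computations has differed between xarray versions, the reduction below requests it explicitly with keep_attrs=True.

```python
import numpy as np
import xarray as xr

# Time coordinate carrying its own metadata.
time = xr.DataArray(
    data=np.linspace(0, 100, 30),
    name="time",
    dims=["time"],
    attrs={"units": "s", "description": "time samples for each image frame"},
)

# Image stack with metadata on the data variable itself.
img = xr.DataArray(
    data=np.random.random(size=(10, 20, 30)),
    name="image",
    dims=["x", "y", "time"],
    coords={"time": time},
    attrs={"units": "brightness", "long_name": "calcium image pixel brightness"},
)

# attrs is a plain dict, so values are read and written by key.
print(img.attrs["units"])
print(img.time.attrs["units"])
img.attrs["history"] = "generated with np.random.random"  # hypothetical key

# Explicitly carry the attributes through a reduction over time.
mean_img = img.mean("time", keep_attrs=True)
print(mean_img.attrs["long_name"])
```

Note that the coordinate keeps its own attrs separately from the data variable, which is why units for the time axis and units for pixel brightness can coexist on one object.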
                                              s
                                              description :
                                              time samples for each image frame
                                              array([  0.      ,   3.448276,   6.896552,  10.344828,  13.793103,  17.241379,
                                                      20.689655,  24.137931,  27.586207,  31.034483,  34.482759,  37.931034,
                                                      41.37931 ,  44.827586,  48.275862,  51.724138,  55.172414,  58.62069 ,
                                                      62.068966,  65.517241,  68.965517,  72.413793,  75.862069,  79.310345,
                                                      82.758621,  86.206897,  89.655172,  93.103448,  96.551724, 100.      ])
                                            • time
                                              PandasIndex
                                              PandasIndex(Index([               0.0, 3.4482758620689653,  6.896551724137931,
                                                     10.344827586206897, 13.793103448275861, 17.241379310344826,
                                                     20.689655172413794, 24.137931034482758, 27.586206896551722,
                                                     31.034482758620687,  34.48275862068965,  37.93103448275862,
                                                      41.37931034482759,  44.82758620689655, 48.275862068965516,
                                                      51.72413793103448, 55.172413793103445,  58.62068965517241,
                                                     62.068965517241374,  65.51724137931033,   68.9655172413793,
                                                      72.41379310344827,  75.86206896551724,   79.3103448275862,
                                                      82.75862068965517,  86.20689655172413,   89.6551724137931,
                                                      93.10344827586206,  96.55172413793103,              100.0],
                                                    dtype='float64', name='time'))
                                          • units :
                                            brightness
                                            description :
                                            A generated random image stack
                                            long_name :
                                            calcium image pixel brightness

                                          Exercise: View the attributes of the da DataArray with da.attrs

                                          Solution
                                          da.attrs
                                          {'units': 'brightness',
                                           'description': 'A generated random image stack',
                                           'long_name': 'calcium image pixel brightness'}

                                          Exercise: View the attributes of the time coordinate on the da DataArray with da.time.attrs:

                                          Solution
                                          da.time.attrs
                                          {'units': 's', 'description': 'time samples for each image frame'}

                                          Exercise: Plot the mean pixel brightness over time and check that some attributes are used automatically in the plot, with da.mean(dim=['x', 'y']).plot():

                                          Solution
                                          da.mean(dim=['x', 'y']).plot();

                                          XArray provides many more convenience features for analysis, but this is enough to get us started.

                                          Section 2: Creating HDF5-based NetCDF4 Files with XArray

                                          Once data is organized in an XArray structure, it can easily be saved to disk using scientific file formats such as NetCDF4, which is built on top of the HDF5 storage system. These formats are widely used in scientific computing because they support structured metadata, multidimensional datasets, and efficient storage of large arrays.

                                          Saving data in these formats allows large datasets to be stored and accessed efficiently without requiring them to be fully loaded into memory. It also makes the data portable and accessible to tools outside of Python.

                                          Exercises

                                          Exercise: Use da.to_netcdf() with the engine='netcdf4' option to create an HDF5-compatible file called example.nc.

                                          time = xr.DataArray(
                                              data = np.linspace(0, 100, 30),
                                              name = 'time',
                                              dims=['time'],
                                              attrs = {
                                                  'units': 's',
                                                  'description': 'time samples for each image frame'
                                              }
                                          )
                                          
                                          da = xr.DataArray(
                                              data=np.random.random(size=(10, 20, 30)),
                                              name='image',
                                              dims=['x', 'y', 'time'],
                                              coords = {
                                                  'time': time,
                                              },
                                              attrs = {
                                                  'units': 'brightness',
                                                  'description': 'A generated random image stack',
                                                  'long_name': 'calcium image pixel brightness',
                                              }
                                          )
                                          da
                                          <xarray.DataArray 'image' (x: 10, y: 20, time: 30)> Size: 48kB
                                          array([[[0.29989757, 0.8463877 , 0.40663498, ..., 0.24629935,
                                                   0.95116593, 0.70166196],
                                                  [0.77588588, 0.65273534, 0.55998213, ..., 0.66929204,
                                                   0.77767039, 0.50514103],
                                                  [0.61162945, 0.12880813, 0.41674473, ..., 0.97652357,
                                                   0.00874667, 0.10117324],
                                                  ...,
                                                  [0.27236194, 0.80752062, 0.32870814, ..., 0.33127256,
                                                   0.27448837, 0.04049907],
                                                  [0.75803561, 0.3023437 , 0.40953296, ..., 0.41166149,
                                                   0.05782473, 0.60460466],
                                                  [0.54209502, 0.77177583, 0.81081577, ..., 0.32133847,
                                                   0.86516611, 0.92231743]],
                                                 [[0.52935632, 0.71359921, 0.95500389, ..., 0.49523286,
                                                   0.34773767, 0.60304061],
                                                  [0.50776562, 0.085269  , 0.38566092, ..., 0.81686683,
                                                   0.78306769, 0.67995772],
                                                  [0.8333419 , 0.44137973, 0.33703999, ..., 0.46596195,
                                                   0.34835274, 0.87634407],
                                          ...
                                                  [0.1856812 , 0.03144001, 0.95300717, ..., 0.06460216,
                                                   0.06975456, 0.59354467],
                                                  [0.81160282, 0.56820701, 0.14425859, ..., 0.19822809,
                                                   0.22702473, 0.1643339 ],
                                                  [0.4079898 , 0.48072856, 0.6481851 , ..., 0.59914815,
                                                   0.49579475, 0.40973475]],
                                                 [[0.36633622, 0.15680477, 0.59611613, ..., 0.53226308,
                                                   0.95268626, 0.97610412],
                                                  [0.03067763, 0.93109174, 0.20496723, ..., 0.55886482,
                                                   0.95605829, 0.55176005],
                                                  [0.12329406, 0.4358826 , 0.46839423, ..., 0.69402656,
                                                   0.06989202, 0.84391449],
                                                  ...,
                                                  [0.6833302 , 0.21588141, 0.66163522, ..., 0.73072985,
                                                   0.72357252, 0.15604505],
                                                  [0.20859619, 0.53943158, 0.67767281, ..., 0.83549851,
                                                   0.02629358, 0.46670397],
                                                  [0.92274174, 0.07095888, 0.63037707, ..., 0.53206717,
                                                   0.12700903, 0.25080094]]])
                                          Coordinates:
                                            * time     (time) float64 240B 0.0 3.448 6.897 10.34 ... 93.1 96.55 100.0
                                          Dimensions without coordinates: x, y
                                          Attributes:
                                              units:        brightness
                                              description:  A generated random image stack
                                              long_name:    calcium image pixel brightness
                                          Solution
                                          da.to_netcdf('example.nc', engine='netcdf4')

                                          Exercise: Open the example.nc file in the HDF5 Viewer at https://myhdf5.hdfgroup.org/ to verify it is a valid HDF5 file, and use it to do the following tasks:

                                          1. View the time variable as a line plot.
                                          2. View the time values themselves in a matrix.
                                          3. View the image data as a heatmap.
                                          4. Find the “description” attribute for the image variable (hint: check the inspect tab).

                                          Changing the Encoding for a Variable to Save Space Using Compression: zlib and complevel

                                          Large scientific datasets often contain patterns that can be compressed effectively. NetCDF and HDF5 support built-in compression options, which can significantly reduce file size without altering the meaning of the data.

                                          Compression settings such as zlib and complevel allow the file format to store the data more efficiently on disk. In some cases, particularly for structured or low-entropy data, the resulting file can be much smaller while remaining fully compatible with standard tools. Other compression libraries are also supported, but here we’ll focus on the big picture: when and where compression helps (all options are found here, for the interested).

                                          The exercises in this section explore how compression affects file size for different types of data.

                                          Example: Save the da DataArray below with two different encodings: one without zlib compression, and one with zlib compression. How big of a file size reduction is there?

                                          da = xr.DataArray(np.random.random(1_000_000), name='data')
                                          da.to_netcdf('data1.nc', engine='netcdf4')
                                          utils.print_file_size('data1.nc', 'No Compression ')
                                          
                                          da.to_netcdf('data2.nc', engine='netcdf4', encoding={'data': {'zlib': True, 'complevel': 4}})
                                          utils.print_file_size('data2.nc', 'Yes Compression', )
                                          No Compression : 8.01 MB
                                          Yes Compression: 6.75 MB

                                          Exercise: Save the da DataArray below with two different encodings: one without zlib compression, and one with zlib compression. How big of a file size reduction is there?

                                          da = xr.DataArray(np.linspace(0, 10, 1_000_000), name='data')
                                          Solution
                                          da.to_netcdf('data1.nc', engine='netcdf4')
                                          utils.print_file_size('data1.nc', 'No Compression ')
                                          
                                          da.to_netcdf('data2.nc', engine='netcdf4', encoding={'data': {'zlib': True, 'complevel': 4}})
                                          utils.print_file_size('data2.nc', 'Yes Compression', )
                                          No Compression : 8.01 MB
                                          Yes Compression: 321.15 KB

                                          Exercise: Save the da DataArray below with two different encodings: one without zlib compression, and one with zlib compression. How big of a file size reduction is there?

                                          da = xr.DataArray(np.arange(1_000_000), name='data')
                                          Solution
                                          da.to_netcdf('data1.nc', engine='netcdf4')
                                          utils.print_file_size('data1.nc', 'No Compression ')
                                          
                                          da.to_netcdf('data2.nc', engine='netcdf4', encoding={'data': {'zlib': True, 'complevel': 4}})
                                          utils.print_file_size('data2.nc', 'Yes Compression', )
                                          No Compression : 8.01 MB
                                          Yes Compression: 27.65 KB

                                          Exercise: Save the da DataArray below with two different encodings: one without zlib compression, and one with zlib compression. How big of a file size reduction is there?

                                          da = xr.DataArray(np.zeros(1_000_000), name='data')
                                          Solution
                                          da.to_netcdf('data1.nc', engine='netcdf4')
                                          utils.print_file_size('data1.nc', 'No Compression ')
                                          
                                          da.to_netcdf('data2.nc', engine='netcdf4', encoding={'data': {'zlib': True, 'complevel': 4}})
                                          utils.print_file_size('data2.nc', 'Yes Compression', )
                                          No Compression : 8.01 MB
                                          Yes Compression: 16.03 KB
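
The trend across these results (random data barely shrinks, regular or constant data shrinks dramatically) comes from the compressor rather than the file format. As a rough illustration, the sketch below applies Python's built-in zlib module directly to the raw bytes of three similar arrays. It is an approximation under stated assumptions: the array length is smaller than in the exercises, and the NetCDF writer adds its own chunking and filters, so the exact sizes will not match the file sizes above.

```python
import zlib

import numpy as np

n = 100_000  # smaller than in the exercises, to keep the sketch quick
rng = np.random.default_rng(seed=0)

arrays = {
    "random": rng.random(n),                 # high entropy: few repeated byte patterns
    "arange": np.arange(n, dtype=np.int64),  # regular ramp: many repeated bytes
    "zeros": np.zeros(n),                    # a single repeated value
}

compressed_sizes = {}
for label, values in arrays.items():
    raw = values.tobytes()
    packed = zlib.compress(raw, 4)  # level 4, matching complevel=4 above
    compressed_sizes[label] = len(packed)
    ratio = len(packed) / len(raw)
    print(f"{label:>7}: {len(raw)} -> {len(packed)} bytes ({ratio:.1%} of original)")
```

The ordering is the point here: the more repetition the bytes contain, the more a compressor like zlib can remove.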

                                          Section 3: Analysis Requires Memory: Monitoring Memory Usage for Chained Pipelines

                                          Even when data is stored efficiently on disk, analysis pipelines can still consume large amounts of memory. This is particularly true when multiple operations are chained together, since intermediate results may temporarily allocate large arrays.

                                          In this section, we monitor the memory usage of different analysis pipelines while performing simple computations on an imaging dataset. By observing how memory usage changes over time, it becomes easier to see how different data-loading strategies affect resource consumption.

                                          These experiments illustrate an important principle: the way data is loaded and processed can matter just as much as the analysis itself.
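
To make the point about intermediate arrays concrete, here is a minimal sketch that measures peak allocation with the standard library's tracemalloc module, using plain NumPy. It does not use the notebook's utils.analyze_memory helper; the function names and array sizes are made up for illustration. Both functions compute the same variance, but the chained version builds full-size temporary arrays, while the block-wise version only ever holds one small block of intermediate results.

```python
import tracemalloc

import numpy as np


def peak_bytes(func):
    """Return the peak number of bytes allocated while func() runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


rng = np.random.default_rng(seed=0)
frames = rng.random((200, 100, 100))  # about 16 MB, allocated before tracing starts


def variance_chained():
    # (frames - mean) is a temporary array as large as the input.
    return ((frames - frames.mean()) ** 2).mean()


def variance_blockwise(block=20):
    mean = frames.mean()
    total = 0.0
    for start in range(0, frames.shape[0], block):
        piece = frames[start:start + block]    # a view, no copy
        total += ((piece - mean) ** 2).sum()   # temporaries are block-sized
    return total / frames.size


peak_chained = peak_bytes(variance_chained)
peak_blockwise = peak_bytes(variance_blockwise)
print(f"chained:   {peak_chained / 1e6:.1f} MB peak")
print(f"blockwise: {peak_blockwise / 1e6:.1f} MB peak")
```

This is the idea behind chunk-based analysis in miniature: the result is the same, but the peak memory depends on the block size rather than the dataset size.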

                                          Exercises

                                          utils.generate_calcium_data_file('calcium_data.nc')

                                          Example: Compute Two Different Mean Projections: one over all pixels, and one over a selection of the frame (a “Region of Interest”)

                                          1. Update the functions and run the cell
                                          def mean_all_data():
                                              return (
                                                  xr.load_dataarray('calcium_data.nc')
                                                  .mean(dim='time')
                                              )
                                          
                                          def mean_roi_data():
                                              return (
                                                  xr.load_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          2. Check that the functions work: plot each of the generated images.
                                          mean_all_data().plot()
                                          plt.figure()
                                          mean_roi_data().plot()

                                          3. How much memory does each of the two functions use? Plot a comparison between the two functions.
                                          utils.analyze_memory(
                                              mean_all_data,
                                              mean_roi_data,
                                          )

                                          Exercise: Modify the second of the mean-projection functions below to use xr.open_dataarray(), which opens the file but doesn’t load the data until it is requested.

                                          1. Update the functions and run the cell
                                          def mean_roi_data():
                                              return (
                                                  xr.load_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          
                                          def mean_roi_data2():
                                              return (
                                                  xr.load_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          Solution
                                          def mean_roi_data():
                                              return (
                                                  xr.load_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          
                                          def mean_roi_data2():
                                              return (
                                                  xr.open_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          2. Check that the functions still work as before: plot each of the generated images and compare them; despite having different code, they should show the same result.
                                          Solution
                                          mean_roi_data().plot()
                                          plt.figure()
                                          mean_roi_data2().plot()

                                          3. How much memory does each of the two functions use? Plot a comparison between the two functions. Is there a significant difference between the two?
                                          Solution
                                          utils.analyze_memory(
                                              mean_roi_data,
                                              mean_roi_data2,
                                          )

                                          Exercise: Modify the third of the mean-projection functions below to call xr.open_dataarray() with chunks='auto', which opens the file but doesn’t load the data until the full computation is requested (i.e., add .compute() to the end of the pipeline).

                                          1. Update the functions and run the cell
                                          def mean_roi_data():
                                              return (
                                                  xr.load_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          
                                          def mean_roi_data2():
                                              return (
                                                  xr.open_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          
                                          
                                          def mean_roi_data3():
                                              return (
                                                  xr.open_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                                  .compute()
                                              )
                                          Solution
                                          def mean_roi_data():
                                              return (
                                                  xr.load_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          
                                          def mean_roi_data2():
                                              return (
                                                  xr.open_dataarray('calcium_data.nc')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                              )
                                          
                                          
                                          def mean_roi_data3():
                                              return (
                                                  xr.open_dataarray('calcium_data.nc', chunks='auto')
                                                  .sel(x=slice(0, 100), y=slice(0, 100))
                                                  .mean(dim='time')
                                                  .compute()
                                              )
                                          2. Check that the functions all still work as before: plot each of the generated images and compare them. Despite having different code, they should show the same result.
                                          Solution
                                          mean_roi_data().plot()
                                          plt.figure()
                                          mean_roi_data2().plot()
                                          plt.figure()
                                          mean_roi_data3().plot()

                                          3. How much memory does each of the three functions use? Plot a comparison of the three. Is there a significant difference between them?
                                          Solution
                                          utils.analyze_memory(
                                              mean_roi_data,
                                              mean_roi_data2,
                                              mean_roi_data3,
                                          )
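To make the difference between the eager and the chunked versions concrete, the following is a minimal sketch on a small synthetic file. The array name `fluorescence`, the file name `demo_movie.nc`, and the array shape are assumptions made for illustration; they are not the notebook's data. Writing the file assumes a NetCDF backend such as netCDF4 is installed, and `chunks="auto"` assumes Dask is installed.

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Hypothetical stand-in for the notebook's data file: a small (time, y, x) movie.
movie = xr.DataArray(
    np.random.default_rng(0).random((50, 32, 32)),
    dims=("time", "y", "x"),
    name="fluorescence",
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo_movie.nc")
    movie.to_netcdf(path)

    # Eager: the whole array is read into a NumPy array immediately.
    eager = xr.load_dataarray(path)
    eager_mean = eager.isel(x=slice(0, 16), y=slice(0, 16)).mean(dim="time")

    # Chunked: the variable is wrapped in a Dask array; no values are read yet.
    with xr.open_dataarray(path, chunks="auto") as lazy:
        is_dask_backed = lazy.chunks is not None
        lazy_mean = lazy.isel(x=slice(0, 16), y=slice(0, 16)).mean(dim="time")
        still_lazy = lazy_mean.chunks is not None
        result = lazy_mean.compute()  # data is read and reduced here

# Both routes should give the same numbers.
xr.testing.assert_allclose(result, eager_mean)
print(is_dask_backed, still_lazy, result.shape)
```

Checking `.chunks` is a quick way to confirm that a pipeline is still lazy: it is `None` for NumPy-backed arrays and a tuple of chunk sizes for Dask-backed ones.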

                                          Section 4: Monitoring Dask’s Workflow

                                          When Dask executes chunked computations, it constructs a task graph describing how different pieces of the computation depend on each other. A distributed Dask client provides a dashboard that visualizes this process in real time, showing how tasks are scheduled, executed, and completed across workers.

                                          The dashboard allows users to observe how work is distributed, how memory usage changes over time, and how intermediate results flow through the computation graph. This visibility is especially valuable when analyzing large datasets or debugging performance issues.
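The task graph can also be inspected without the dashboard. The sketch below uses a plain Dask array of ones as a hypothetical stand-in for the imaging data (the shape and chunk sizes are assumptions) and counts the tasks Dask records before any arithmetic runs.

```python
import dask.array as da

# Hypothetical stack: 400 frames of 100 x 100 pixels, split into 8 chunks along time.
frames = da.ones((400, 100, 100), chunks=(50, 100, 100))

# Building the reduction only records tasks; nothing has been computed yet.
mean_image = frames.mean(axis=0)

graph = dict(mean_image.__dask_graph__())  # flatten to {task key: task}
n_chunks = frames.npartitions
n_tasks = len(graph)
print(f"{n_chunks} input chunks -> {n_tasks} tasks in the graph")

# The scheduler walks the graph only when the result is requested.
result = mean_image.compute()
```

The graph contains more tasks than input chunks because each chunk has to be created, reduced, and then combined with the partial results from the other chunks.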

                                          Exercise: Uncomment and run the following code to move computations out of the notebook process and into a “Distributed Dask” client. Note that utils.analyze_memory() will then no longer be able to measure the Dask-processed data, so monitoring has to be done through the Dask workers instead. Open the resulting web page and browse through its sections.

                                          #import dask.distributed
                                          #client = dask.distributed.Client()
                                          #client
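Some of the information shown on the dashboard can also be queried from the client object. The following is a small sketch under assumed settings: two single-threaded workers, `processes=False` (workers run as threads inside the current process, which is convenient for a quick check but differs from the separate-process setup the exercise describes), and `dashboard_address=":0"` to pick any free port.

```python
from dask.distributed import Client

# Assumed demo settings; the exercise above uses the defaults instead.
with Client(
    n_workers=2,
    threads_per_worker=1,
    processes=False,
    dashboard_address=":0",
) as client:
    # Number of workers currently registered with the scheduler.
    n_workers = len(client.scheduler_info()["workers"])
    # URL of the monitoring dashboard for this client.
    dashboard_url = client.dashboard_link
    print(f"{n_workers} workers; dashboard at {dashboard_url}")
```

Using the client as a context manager shuts the workers down when the block ends. In the notebook, the client is left open so that the dashboard stays available.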

                                          Exercise: Run the following code repeatedly while watching the client monitoring dashboard. Look through the different sections of the dashboard and answer the following questions:

                                          1. How many workers are processing the data?
                                          2. Do the workers just break up the work evenly from the beginning, or is there more complex cooperation happening between them?
                                          3. How many different tasks was the workflow broken down into?
                                          4. Is memory released between tasks?
                                          5. Is there a simple relationship between the tasks, or is a more complex compute graph being run?
                                          (xr.open_dataarray('calcium_data.nc', chunks='auto')
                                              .sel(x=slice(0, 100), y=slice(0, 100))
                                              .mean(dim='time')
                                              .rolling(x=7, center=True).mean()
                                              .dropna('x')
                                              .rolling(y=7, center=True).mean()
                                              .dropna('y')
                                              .compute()
                                              .plot()
                                          )
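The two `rolling(...).mean()` and `dropna(...)` pairs in the pipeline above smooth the mean image and trim its border. The sketch below shows the effect on a hypothetical 20 x 20 image (the size and values are assumptions chosen so the result is easy to check by hand).

```python
import numpy as np
import xarray as xr

# Hypothetical 20 x 20 mean image standing in for the time-averaged ROI.
image = xr.DataArray(
    np.arange(400, dtype=float).reshape(20, 20),
    dims=("y", "x"),
)

# A centered 7-pixel window needs 3 neighbors on each side, so the first and
# last 3 positions along the smoothed dimension become NaN.
smoothed_x = image.rolling(x=7, center=True).mean()
n_nan_columns = int(smoothed_x.isnull().all(dim="y").sum())

# Dropping the NaN border after each pass trims 6 pixels from that dimension.
smoothed = (
    smoothed_x.dropna("x")
    .rolling(y=7, center=True).mean()
    .dropna("y")
)

print(n_nan_columns, dict(smoothed.sizes))
```

Each smoothing pass therefore shrinks the image by `window - 1` pixels along its dimension, which is why the plotted result is slightly smaller than the selected region.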