Representing Electrophysiological Data with Neo

Representing Electrophysiological Data with Neo

Authors
Dr. Atle E. Rimehaug | Dr. Michael Denker

Neo (Part 1): Representing Electrophysiology Data

What is Neo?

Neo is a Python library to represent neurophysiological data stemming from heterogeneous sources. It aims to provide a standardized way work on such data with the idea that tools, such as analysis libraries, can accept Neo as a common data representation as input. In addition, Neo simplifies the interaction with proprietary data file formats.

Features:

  • Vast file format support: Neo reads from many open and proprietory file formats and organizes data in a common data representation.
  • Streamlined data objects: Neo data objects are optimized to be generic to capture almost any experiment, yet capture all the required aspects of a piece of data.
  • Proxy Objects: Growing file format support for on-demand reading of data from disk to support large files.

Why Use Neo?

Neo allows to simplify reading data from a large variety of file formats. Moreover, Neo structures give users the flexibility to forge data in a representation optimal for their needs, while retaining all important information about individual data assets.

  • Data Models: Neo defines common data modalities in a fully self-describing way, including physical units and relationships between individual signals.
  • Metadata aware: Individual Neo objects can be annotated with metadata, thus ensuring that data stays meaningful and connected to the experiment during the analysis process, supporting literate programming.
  • Interoperability: By adhering to Neo data representations, it becomes easy to incorporate other tools based on the same standard, giving more flexibility and less errors in data conversion.

Setup

Import Libraries

Neo objects are built on top of numpy arrays and use the quantities library (imported as pq) to attach physical units to the arrays.

import neo
import numpy as np
import quantities as pq

Section 1: Working With Physical Quantities

In research, we work with many physical quantities (e.g. voltage, frequency or time) and we use them at different orders of magnitude (e.g. seconds and milliseconds). This can make it difficult to keep track of a variable’s unit when we are combining and manipulating different quantities in our analysis. The quantities module addresses this problem by allowing us to attach physical units to our variables. When we combine variables with units, quantities will automatically perform the correct conversion or raise an error if we are trying to combine incompatible quantities.

Code Description
freq = 2 * pq.Hz Define a quantity of 2 Hertz and assign it to the variable freq
time = 2500 * pq.ms Define a quantity of 2500 milliseconds and assign it to the variable time
time.units Get the unit of time
time.magnitude Get the magnitude of time

Exercise: Create a variable called firing_rate and assign it the value 5 * pq.Hz.

Solution
firing_rate = 5 * pq.Hz

Exercise: Create a variable called t_start and assign it the value 100*pq.ms.

Solution
t_start = 100 * pq.ms

Exercise: Create a variable called t_stop and assign it the value 3 * pq.s

Solution
t_stop = 3 * pq.s

Exercise: Compute the difference between t_stop and t_start and assign the result to a new variable duration. Then, print that variable.

Solution
duration = t_stop - t_start
duration
array(2.9) * s

Exercise: Get the .magnitude and the .units of duration.

Solution
duration.units, duration.magnitude
(array(1.) * s, array(2.9))

Exercise: Try adding the variables firing_rate and duration. What error message do you observe?

Solution
firing_rate+duration
ValueError Unable to convert between units of "s" and "Hz"

Exercise: Multiply firing_rate by duration and assign the result to a new variable n_spikes. Print that variable.

Solution
n_spikes = firing_rate * duration
n_spikes
array(14.5) * s*Hz

Section 2: Continuous Signals: AnalogSignal

Background

An AnalogSignal represents data that is sampled at regular intervals, such as a local field potential (LFP), an EEG channel, or an intracellular membrane potential. It is essentially a NumPy array with three extra pieces of information: the physical units of the samples, the sampling_rate, and the time of the first sample (t_start). From these, Neo can compute the time stamp of every sample for you.

An AnalogSignal is two-dimensional with shape (time, channel): each column is one channel and all channels share the same time axis. Because the units are attached to the data, Neo also converts between compatible units automatically when you combine signals.

Exercises

In this section you will create AnalogSignal objects, inspect their time axis and metadata, select channels and time samples by slicing, and combine signals with automatic unit handling.

Code Description
neo.AnalogSignal(arr, units='uV', sampling_rate=1000*pq.Hz) Create a signal from a NumPy array arr of shape (time, channel).
neo.AnalogSignal(..., t_start=120*pq.ms) Set the time of the first sample (default is 0 s).
sig.shape The (time, channel) shape of the signal.
sig.times The time stamp of every sample, with units.
sig.t_start, sig.t_stop Time of the first and last sample.
sig.sampling_rate Samples per second.
sig.sampling_period Time between samples.
sig[:, 0] Select a channel (column).
sig[:100] Select a range of time samples (rows).

Example: Create an AnalogSignal from a random (1000, 4) array with units of millivolts ('mV') and a sampling rate of 30 kHz, and assign it to sig. Display sig to check that it has 4 channels with 1000 samples each.

sig = neo.AnalogSignal(np.random.random((1000, 4)), units='mV',
                       sampling_rate=30000*pq.Hz)
sig
AnalogSignal with 4 channels of length 1000; units mV; datatype float64
sampling rate: 30000.0 Hz
time: 0.0 s to 0.03333333333333333 s

Exercise: Create an AnalogSignal from a random (50, 2) array with unit of microvolts, a sampling rate of 10 kHz, a first sample at 120 ms, and assign it to sig. Display sig to check that it has 2 channels with 50 samples each.

Solution
sig = neo.AnalogSignal(np.random.random((50, 2)), units='uV',
                       sampling_rate=10000*pq.Hz,
                       t_start=120*pq.ms)
sig
AnalogSignal with 2 channels of length 50; units uV; datatype float64
sampling rate: 10000.0 Hz
time: 120.0 ms to 125.0 ms

Exercise: Inspect the shape of sig. The first number is the number of time samples, the second is the number of channels.

Solution
sig.shape
(50, 2)

Exercise: Get the time stamps of all samples with sig.times.

Solution
sig.times
array([120. , 120.1, 120.2, 120.3, 120.4, 120.5, 120.6, 120.7, 120.8,
       120.9, 121. , 121.1, 121.2, 121.3, 121.4, 121.5, 121.6, 121.7,
       121.8, 121.9, 122. , 122.1, 122.2, 122.3, 122.4, 122.5, 122.6,
       122.7, 122.8, 122.9, 123. , 123.1, 123.2, 123.3, 123.4, 123.5,
       123.6, 123.7, 123.8, 123.9, 124. , 124.1, 124.2, 124.3, 124.4,
       124.5, 124.6, 124.7, 124.8, 124.9]) * ms

Exercise: Get the time of the first sample (t_start) and the end of the signal (t_stop).

Solution
sig.t_start, sig.t_stop
(array(120.) * ms, array(125.) * ms)

Exercise: Get the duration of sig which is the total length of the signal in time. How does it relate to t_start and t_stop?

Solution
sig.duration
array(0.005) * 1/Hz

Exercise: Get the sampling_period (the time between two consecutive samples).

Solution
sig.sampling_period
array(0.0001) * 1/Hz

Exercise: Select the first channel of sig (the first column).

Solution
sig[:, 0]
AnalogSignal with 1 channels of length 50; units uV; datatype float64
sampling rate: 10000.0 Hz
time: 120.0 ms to 125.0 ms

Exercise: Select the second channel (index 1) of sig (the second column).

Solution
sig[:, 1]
AnalogSignal with 1 channels of length 50; units uV; datatype float64
sampling rate: 10000.0 Hz
time: 120.0 ms to 125.0 ms

Exercise: Select the first 10 time samples of sig (the first 10 rows).

Solution
sig[:10]
AnalogSignal with 2 channels of length 10; units uV; datatype float64
sampling rate: 10000.0 Hz
time: 120.0 ms to 121.0 ms

Exercise: Create two single-sample signals, one in microvolts and one in millivolts, then add them together. Notice that Neo converts the units automatically, so the result is expressed in microvolts.

Solution
sig_uv = neo.AnalogSignal([[1, 2, 3]], units='uV', sampling_rate=1000*pq.Hz)
sig_mv = neo.AnalogSignal([[1, 2, 3]], units='mV', sampling_rate=1000*pq.Hz)
sig_uv + sig_mv
AnalogSignal with 3 channels of length 1; units uV; datatype float64
sampling rate: 1000.0 Hz
time: 0.0 s to 0.001 s

Exercise: Create two single-sample signals with the same units ('uV') but different sampling rates (1000*pq.Hz and 2000*pq.Hz). Try to add them. What error do you observe? How does this differ from the incompatible-units error earlier?

Solution
sig_slow = neo.AnalogSignal([[1, 2, 3]], units='uV', sampling_rate=1000*pq.Hz)
sig_fast = neo.AnalogSignal([[1, 2, 3]], units='uV', sampling_rate=2000*pq.Hz)
try:
    sig_slow + sig_fast
except Exception as e:
    print(type(e).__name__, e)
ValueError Inconsistent values of sampling_rate

Section 3: Spikes, Events and Epochs

Background

Not all data is a continuous signal. Neo provides three further objects for the event-like data that is typical of electrophysiology experiments:

  • SpikeTrain: the times at which a neuron (or sorted unit) fired. A spike train object needs a start time (t_start) and a stop time (t_stop) so that the recording window is unambiguous, even if the first or last spike occurs well inside it.
  • Event: discrete reference time points, such as a stimulus onset or a trial trigger. Each time point can carry a text label.
  • Epoch: reference time ranges, each with a time, a duration, and a label; useful for marking trials, stimulation periods, or analysis windows.

Exercises

In this section you will create spike trains, events, and epochs, read back their times, labels, and derived quantities, and select epochs by their label.

Code Description
neo.SpikeTrain([1.5, 8.0, 11.2], units='ms', t_start=0*pq.ms, t_stop=50*pq.ms) Create a spike train from a list of spike times.
st.t_start, st.t_stop The start and stop time of a recording window of a spike train.
neo.Event(times=[1, 2, 3], units='ms', labels=['a', 'b', 'c']) Create an event from time points and matching labels.
neo.Epoch(times=[3, 9], durations=[1, 1], units='ms', labels=['x', 'y']) Create epochs from start times, durations, and labels.
ep.times, ep.durations, ep.labels The start times, durations, and labels of an epoch.
ep[ep.labels == 'Go'] Select the epoch(s) with a given label using a boolean mask.
ep[np.isin(ep.labels, ['Go', 'Reward'])] Select epochs matching any of several labels.
np.unique(ev.labels) The set of distinct labels in an event.

Example: Create a SpikeTrain with three spikes (in milliseconds) recorded in a window from 0 to 300 ms and assign it to a variable st. Display st to check that the number of spikes and start and stop times of the recording window are correct.

st = neo.SpikeTrain([1, 4, 5.7], units='ms', t_start=0*pq.ms, t_stop=300*pq.ms)
st
SpikeTrain containing 3 spikes; units ms; datatype float64 
time: 0.0 ms to 300.0 ms

Exercise: Create a SpikeTrain named st with spikes at 10, 25, 80 and 120 ms, recorded in a window from 0 to 200 ms and assign it to a variable st. Display st to check that the number of spikes and start and stop times of the recording window are correct.

Solution
st = neo.SpikeTrain([10, 25, 80, 120], units='ms', t_start=0*pq.ms, t_stop=200*pq.ms)
st
SpikeTrain containing 4 spikes; units ms; datatype float64 
time: 0.0 ms to 200.0 ms

Exercise: Print st.t_start, st.t_stop, and st.duration for the spike train st created in the previous exercise (spikes at 10, 25, 80, 120 ms). What is the total length of the observation window?

Solution
print(st.t_start)
print(st.t_stop)
print(st.duration)
0.0 ms
200.0 ms
200.0 ms

Exercise: Rescale the spike times of st to seconds using st.rescale('s'). Print the result. How do the values relate to the original millisecond times?

Solution
print(st.rescale('s'))
[0.01  0.025 0.08  0.12 ] s

Exercise: Use boolean indexing to select only spikes from st that fired after 50*pq.ms: st[st.times > 50*pq.ms]. Print the result and its len. Which of the original spikes at 10, 25, 80, 120 ms remain?

Solution
late_spikes = st[st.times > 50*pq.ms]
print(late_spikes)
print(len(late_spikes))
[ 80. 120.] ms
2

Exercise: Create an Event named ev marking three stimulus onsets at 3, 9 and 15 ms, with labels 'cue', 'go' and 'reward'.

Solution
ev = neo.Event(times=[3, 9, 15], units='ms', labels=['cue', 'go', 'reward'])
ev
Event containing 3 events with labels; time units ms; datatype int64 

Exercise: Create an Epoch named ep with four epochs starting at 3, 9, 15 and 21 ms, each lasting 1 ms, labelled 'Start', 'Wait', 'Go', 'Reward'.

Solution
ep = neo.Epoch(times=[3, 9, 15, 21], durations=[1, 1, 1, 1], units='ms',
               labels=['Start', 'Wait', 'Go', 'Reward'])
ep
Epoch containing 4 epochs with labels; time units ms; datatype int64 

Exercise: Select the epoch labelled 'Go' by building a boolean mask with ep.labels == 'Go', use it to index ep, and read the start time(s) of the matching epoch(s) with .times.

Solution
ep[ep.labels == 'Go'].times
array([15]) * ms

Exercise: Select the epoch labelled 'Reward' and read its durations.

Solution
ep[ep.labels == 'Reward'].durations
array([1]) * ms

Exercise: Use np.isin(ep.labels, [...]) to build a mask that matches the epochs labelled either 'Go' or 'Reward', then read their start times.

Solution
ep[np.isin(ep.labels, ['Go', 'Reward'])].times
array([15, 21]) * ms

Exercise: Compute the end time of each epoch in ep by adding its durations to its times.

Solution
ep.times + ep.durations
array([ 4, 10, 16, 22]) * ms

Exercise: Set ep.name = 'trial_periods' and ep.description = 'Time intervals for each behavioral event'. Print both attributes.

Solution
ep.name = 'trial_periods'
ep.description = 'Time intervals for each behavioral event'
print(ep.name)
print(ep.description)
trial_periods
Time intervals for each behavioral event

Section 4: Annotations and Saving

Background

A strength of Neo is that data stays connected to its description. Every Neo object can carry metadata in several ways:

  • The name and description attributes give a human-readable label.
  • annotate(key=value) attaches arbitrary metadata to the whole object, stored in annotations.
  • array_annotate(key=[...]) attaches one value per channel (or per spike, or per event), stored in array_annotations. This is ideal for per-channel information such as electrode IDs, and it lets you select data by its metadata.

Finally, Neo can write this fully-annotated structure to disk. The NIX format preserves the complete Neo data model, including all annotations. You open a file with neo.io.NixIO, using mode 'ow' to overwrite (write) and 'ro' to read.

Neo’s file I/O works at the level of a Block, which is the top-level container that groups all data objects belonging to a recording. A Block holds one or more Segment objects, each of which groups all data objects that share a common time axis (e.g., everything from one continuous recording epoch). Think of Block and Segment simply as grouping elements, envelopes you need to wrap your signals, spike trains, events, and epochs before saving them to disk. We will use these elements here to demonstrate saving, but you will work with Block and Segment in detail in the next session.

Exercises

In this section you will annotate a signal at the object and channel level, use an array annotation to select a subset of channels, and save and reload a block in the NIX format.

Code Description
sig.name = 'LFP', sig.description = '...' Set human-readable labels.
sig.annotate(quality='excellent') Attach a key-value annotation to the whole object.
sig.annotations The dictionary of object-level annotations.
sig.array_annotate(electrode_id=[1, 2, 3, 4]) Attach one value per channel.
sig.array_annotations The dictionary of per-channel annotations.
np.nonzero(sig.array_annotations['group'] == 'B')[0] Find the channel indices matching a condition.
sig[:, idx] Select the channels at the given indices.
neo.Block(name='my recording') Create a Block, the top-level grouping container (covered in detail in the next session).
neo.Segment() Create a Segment, groups all data sharing a time axis (covered in detail in the next session).
seg.analogsignals.append(sig) Place a signal inside a segment.
block.segments.append(seg) Place a segment inside a block.
with neo.io.NixIO('data.nix', 'ow') as io:
         io.write_block(block)
Write a block to a NIX file.
with neo.io.NixIO('data.nix', 'ro') as io:
         block = io.read_block()
Read a block from a NIX file.

Example: Create an 8-channel signal with 100 samples and a sampling rate of 1000 Hz. Name it “MUA” and give it a description.

sig = neo.AnalogSignal(np.random.randn(50, 8), units='uV', sampling_rate=1000*pq.Hz)
sig.name = 'LFP'
sig.description = 'simulated multi-unit activity'

Exercise: Create an 8-channel signal with 50 samples and a sampling rate of 1000 Hz. Name it “LFP” and give it a description.

Solution
sig = neo.AnalogSignal(np.random.randn(50, 8), units='uV', sampling_rate=1000*pq.Hz)
sig.name = 'LFP'
sig.description = 'simulated local field potential'

Example: Attach an annotation stating that the quality of the signal was excellent.

sig.annotate(quality='excellent')

Exercise: Attach an annotation stating that the quality of the signal was poor. Print the annotation.

Solution
sig.annotate(quality='poor')
sig.annotations
{'quality': 'poor'}

Exercise: Add an annotation recorded_by='experimenter_1' to sig. Print sig.annotations again. Does a second .annotate() call extend or overwrite the previous annotations?

Solution
sig.annotate(recorded_by='experimenter_1')
print(sig.annotations)
{'quality': 'poor', 'recorded_by': 'experimenter_1'}

Exercise: Attach a per-channel array annotation to sig giving each of the 8 channels an electrode_id from 1 to 8.

Solution
sig.array_annotate(
    electrode_id=[1, 2, 3, 4, 5, 6, 7, 8],)
sig.array_annotations
{'electrode_id': array([1, 2, 3, 4, 5, 6, 7, 8])}

Exercise: Attach a per-channel array annotation to sig giving each of the 8 channels an electrode_id from 1 to 8 and a tetrode_group label, with the first four channels in group 'A' and the last four in group 'B'.

Solution
sig.array_annotate(
    electrode_id=[1, 2, 3, 4, 5, 6, 7, 8],
    tetrode_group=['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'])
sig.array_annotations
{'electrode_id': array([1, 2, 3, 4, 5, 6, 7, 8]),
 'tetrode_group': array(['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'], dtype='<U1')}

Exercise: Use the tetrode_group array annotation to find the indices of the channels in group 'B', then select those channels into a new signal. Verify that its array_annotations only contain group 'B'.

Solution
group_b = np.nonzero(sig.array_annotations['tetrode_group'] == 'B')[0]
sig_b = sig[:, group_b]
sig_b.array_annotations
{'electrode_id': array([5, 6, 7, 8]),
 'tetrode_group': array(['B', 'B', 'B', 'B'], dtype='<U1')}

Exercise: Create an Event named ev_trials with 6 time points spaced 200 ms apart starting at 0 ms, each labelled 'trial_start'. Use .array_annotate() to attach trial_id=np.arange(6) and trial_type=np.array(['A','B','A','B','A','B']). Then build a boolean mask that selects only type 'A' trials and slice the event. Print len and .array_annotations of the result.

Solution
ev_trials = neo.Event(
    times=np.arange(6) * 200 * pq.ms,
    labels=np.array(['trial_start'] * 6)
)
ev_trials.array_annotate(
    trial_id=np.arange(6),
    trial_type=np.array(['A', 'B', 'A', 'B', 'A', 'B'])
)
mask_a = ev_trials.array_annotations['trial_type'] == 'A'
ev_a = ev_trials[mask_a]
print(len(ev_a))
print(ev_a.array_annotations)
3
{'trial_id': array([0, 2, 4]), 'trial_type': array(['A', 'A', 'A'], dtype='<U1')}

Exercise: Put sig into a block with one segment, then save the block to a file named my_dataset.nix using neo.io.NixIO in 'ow' (overwrite) mode.

Solution
block = neo.Block(name='my dataset')
seg = neo.Segment()
seg.analogsignals.append(sig)
block.segments.append(seg)

with neo.io.NixIO('my_dataset.nix', 'ow') as io:
    io.write_block(block)

Exercise: Read the block back from my_dataset.nix using 'ro' (read-only) mode, and confirm that the channel annotations survived the round trip.

Solution
with neo.io.NixIO('my_dataset.nix', 'ro') as io:
    loaded = io.read_block()
loaded.segments[0].analogsignals[0].array_annotations
{'electrode_id': array([1, 2, 3, 4, 5, 6, 7, 8]),
 'tetrode_group': array(['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'], dtype='<U1')}