Process a recording by channel group

In this tutorial, we will walk through how to preprocess and sort a recording separately for each channel group. A channel group is a subset of channels grouped by some feature, for example the channels on each shank of a multi-shank Neuropixels recording.

Why preprocess by channel group?

Certain preprocessing steps depend on the spatial arrangement of the channels. For example, common average referencing (CAR) computes the average over channels at each time point and subtracts it from every channel. In such a scenario it may make sense to group channels so that the average is computed only over spatially close channels.
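To make the effect concrete, here is a minimal NumPy sketch of common average referencing applied globally and per group. It is an illustration only, not SpikeInterface's implementation; the array shapes, the two-shank layout, the shared offset, and the function name car_per_group are all assumptions chosen for the example.

```python
import numpy as np

def car_per_group(traces, groups):
    """Subtract, at each time point, the mean over the channels of each group.

    traces: array of shape (num_samples, num_channels)
    groups: array of shape (num_channels,) giving the group of each channel
    """
    referenced = traces.astype(float)  # astype returns a copy
    for group in np.unique(groups):
        in_group = groups == group
        group_mean = referenced[:, in_group].mean(axis=1, keepdims=True)
        referenced[:, in_group] -= group_mean
    return referenced

rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 4)  # 8 channels on 2 hypothetical shanks
traces = rng.normal(size=(100, 8))
traces[:, groups == 1] += 5.0  # offset shared only by the second shank

# Global CAR: one average over all channels at each time point
global_car = traces - traces.mean(axis=1, keepdims=True)
# Per-group CAR: one average per shank at each time point
grouped_car = car_per_group(traces, groups)

# Per-group referencing removes the shank-specific offset entirely
print(np.abs(grouped_car[:, groups == 1].mean(axis=1)).max())
# Global referencing leaves roughly half of the offset on the second shank
print(global_car[:, groups == 1].mean())
```

With global referencing, the offset on one shank leaks into the other shank as well, which is the kind of cross-group contamination that grouping avoids.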

Why sort by channel group?

When sorting, we may want to completely separate channel groups so that we can consider their signals in isolation. If recording from a long silicon probe, we might want to sort different brain areas separately, for example using a different sorter for the hippocampus, the thalamus, or the cerebellum.

Splitting a Recording by Channel Group

In this example, we create a 384-channel recording with 4 shanks. However, this could be any recording in which the channels are grouped in some way, for example a multi-tetrode recording with channel groups representing the channels on each individual tetrode.

First, let’s import the parts of SpikeInterface we need into Python, and generate our toy recording:

import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
from spikeinterface import aggregate_channels
from probeinterface import generate_tetrode, ProbeGroup
import numpy as np

# Create a toy 384-channel recording with 4 shanks (each shank contains 96 channels)
recording, _ = se.toy_example(duration=[1.00], num_segments=1, num_channels=384)
four_shank_groupings = np.repeat([0, 1, 2, 3], 96)
recording.set_property("group", four_shank_groupings)

print(recording.get_channel_groups())
"""
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
"""

We can split a recording into multiple recordings, one for each channel group, with the split_by() method.

split_recording_dict = recording.split_by("group")

Splitting a recording by channel group returns a dictionary containing separate recordings, one for each channel group:

print(split_recording_dict)
"""
{0: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
                       10.99 MiB, 1: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
                       10.99 MiB, 2: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
                       10.99 MiB, 3: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
                       10.99 MiB}
"""

Preprocessing a Recording by Channel Group

If a preprocessing function is given a dictionary of recordings, it will apply the preprocessing separately to each recording in the dict, and return a dictionary of preprocessed recordings. Hence we can pass the split_recording_dict in the same way as we would pass a single recording to any preprocessing function.

shifted_recording = spre.phase_shift(split_recording_dict)
filtered_recording = spre.bandpass_filter(shifted_recording)
referenced_recording = spre.common_reference(filtered_recording)
good_channels_recording = spre.detect_and_remove_bad_channels(referenced_recording)
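The dict-in, dict-out behaviour described above can be pictured as a simple dispatch on the input type. The following is a conceptual sketch only, not SpikeInterface's actual code: the helper apply_to_recordings and the stand-in subtract_mean step are hypothetical, and plain lists of numbers take the place of recording objects.

```python
def apply_to_recordings(preprocess, recording_or_dict, **kwargs):
    """Apply `preprocess` to one recording, or to each recording in a dict."""
    if isinstance(recording_or_dict, dict):
        # Same keys in the output as in the input, one result per group
        return {
            key: preprocess(recording, **kwargs)
            for key, recording in recording_or_dict.items()
        }
    return preprocess(recording_or_dict, **kwargs)

# Stand-in preprocessing step acting on a list of samples
def subtract_mean(samples):
    mean = sum(samples) / len(samples)
    return [value - mean for value in samples]

split = {0: [1.0, 2.0, 3.0], 1: [10.0, 20.0, 30.0]}
processed = apply_to_recordings(subtract_mean, split)
print(processed)
# {0: [-1.0, 0.0, 1.0], 1: [-10.0, 0.0, 10.0]}
```

Each group is processed using only its own data, and the keys are carried through unchanged, so the output can be fed into the next step in the same way.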

If needed, we could aggregate the recordings back together using the aggregate_channels function. Note: you do not need to do this to sort the data (see Sorting a Recording by Channel Group).

combined_preprocessed_recording = aggregate_channels(good_channels_recording)

Now, when combined_preprocessed_recording is used in sorting, plotting, or whenever calling its get_traces() method, the data will have been preprocessed separately per channel group (then concatenated back together under the hood).

Note

The splitting and aggregation of channels for preprocessing is flexible. Under the hood, aggregate_channels() keeps track of when a recording was split. When get_traces() is called, the preprocessing is still performed per-group, even though the recording is now aggregated.

To ensure data is preprocessed by channel group, the preprocessing step must be applied separately to each split channel group recording. For example, the code below will NOT preprocess by channel group:

split_recording = recording.split_by("group")
split_recording_as_list = list(split_recording.values())
combined_recording = aggregate_channels(split_recording_as_list)

# will NOT preprocess by channel group, because the reference is
# computed on the already-aggregated recording.
referenced_recording = spre.common_reference(combined_recording)

In general, it is not recommended to apply aggregate_channels() more than once. This will slow down get_traces() calls and may result in unpredictable behaviour.

Sorting a Recording by Channel Group

We can also sort a recording for each channel group separately. It is not necessary to preprocess a recording by channel group in order to sort by channel group.

There are two ways to sort a recording by channel group. First, we can pass a dictionary to the run_sorter function. Since the preprocessing-by-group method above returns a dict, we can simply pass this output. Alternatively, for more control, we can loop over the recordings ourselves.

Option 1: Automatic splitting (Recommended)

Simply pass the split recording to the run_sorter function, as if it were a non-split recording. This will return a dict of sortings, with the same keys as the dict of recordings that was passed to run_sorter.

from spikeinterface.sorters import run_sorter

# split_by() returns a dict of recordings, one per group
split_recording = recording.split_by("group")

# do preprocessing if needed
pp_recording = spre.bandpass_filter(split_recording)

dict_of_sortings = run_sorter(
    sorter_name='kilosort4',
    recording=pp_recording,
    folder='my_kilosort4_sorting'
)

Option 2: Manual splitting

In this example, we loop over the preprocessed recordings, one per channel group, and sort each one separately. We store the sorting objects in a dictionary for later use.

You might do this if you want extra control e.g. to apply bespoke steps to different groups.

split_preprocessed_recording = preprocessed_recording.split_by("group")

sortings = {}
for group, sub_recording in split_preprocessed_recording.items():
    sorting = run_sorter(
        sorter_name='kilosort2',
        # pass this group's recording, not the whole dict
        recording=sub_recording,
        folder=f"folder_KS2_group{group}"
    )
    sortings[group] = sorting
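One way to organise bespoke per-group choices is a small lookup table consulted inside the loop. The sketch below only builds the keyword arguments that could be passed to run_sorter for each group, and no sorter is actually run. The group-to-sorter mapping, the default, the folder naming scheme, and the helper build_sorter_kwargs are all hypothetical.

```python
# Hypothetical mapping from group key to the sorter to use for that group
sorter_for_group = {0: "kilosort4", 1: "kilosort4", 2: "mountainsort5"}
default_sorter = "kilosort4"

def build_sorter_kwargs(group_keys):
    """Return, per group, the keyword arguments for one run_sorter call."""
    jobs = {}
    for group in group_keys:
        # Fall back to the default sorter for groups without an entry
        sorter_name = sorter_for_group.get(group, default_sorter)
        jobs[group] = {
            "sorter_name": sorter_name,
            "folder": f"sorting_{sorter_name}_group{group}",
        }
    return jobs

jobs = build_sorter_kwargs([0, 1, 2, 3])
for group, kwargs in jobs.items():
    print(group, kwargs)

# Inside a loop over the split recordings, one could then call:
#   sortings[group] = run_sorter(recording=sub_recording, **jobs[group])
```

Keeping the per-group choices in one table makes it easy to see which sorter was used for which group and gives every group its own output folder.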

Creating a SortingAnalyzer by Channel Group

The code above generates a dictionary of recording objects and a dictionary of sorting objects. When making a SortingAnalyzer, we can pass these dictionaries and a single analyzer will be created, with the recordings and sortings appropriately aggregated.

The dictionary of recordings and dictionary of sortings must have the same keys. E.g. if you use split_by("group"), the keys of your dict of recordings will be the values of the group property of the recording. Then the dict of sortings should also have these keys. Note that if you use the internal functions, like we do in the code-block below, you don’t need to keep track of keys yourself. SpikeInterface will do this for you automatically.
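If you build the two dictionaries by hand, a quick check that their keys agree can catch mismatches early. The helper below is a hypothetical convenience function, not part of SpikeInterface, and the placeholder strings stand in for real recording and sorting objects.

```python
def check_matching_keys(dict_of_recordings, dict_of_sortings):
    """Raise a ValueError if the two dicts do not share exactly the same keys."""
    recording_keys = set(dict_of_recordings)
    sorting_keys = set(dict_of_sortings)
    if recording_keys != sorting_keys:
        missing_sortings = sorted(recording_keys - sorting_keys)
        missing_recordings = sorted(sorting_keys - recording_keys)
        raise ValueError(
            f"Keys differ: no sorting for {missing_sortings}, "
            f"no recording for {missing_recordings}"
        )

# Matching keys: passes silently
check_matching_keys({0: "rec0", 1: "rec1"}, {0: "sort0", 1: "sort1"})

# Mismatched keys: raises a ValueError naming the offending groups
try:
    check_matching_keys({0: "rec0", 1: "rec1"}, {0: "sort0", 2: "sort2"})
except ValueError as error:
    print(error)
```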

The code for creating a SortingAnalyzer from dicts of recordings and sortings is very similar to that for creating a sorting analyzer from a single recording and sorting:

dict_of_recordings = preprocessed_recording.split_by("group")
dict_of_sortings = run_sorter(sorter_name="mountainsort5", recording = dict_of_recordings)

analyzer = create_sorting_analyzer(sorting=dict_of_sortings, recording=dict_of_recordings)

The code above creates a single sorting analyzer called analyzer. You can select the units from one of the “group”s as follows:

aggregation_keys = analyzer.get_sorting_property("aggregation_key")
unit_ids_group_0 = analyzer.unit_ids[aggregation_keys == 0]
group_0_analyzer = analyzer.select_units(unit_ids=unit_ids_group_0)