.. _process_by_group: Process a recording by channel group ==================================== In this tutorial, we will walk through how to preprocess and sort a recording separately for *channel groups*. A channel group is a subset of channels grouped by some feature - for example a multi-shank Neuropixels recording in which the channels are grouped by shank. **Why preprocess by channel group?** Certain preprocessing steps depend on the spatial arrangement of the channels. For example, common average referencing (CAR) averages over channels (separately for each time point) and subtracts the average. In such a scenario it may make sense to group channels so that this averaging is performed only over spatially close channel groups. **Why sort by channel group?** When sorting, we may want to completely separately channel groups so we can consider their signals in isolation. If recording from a long silicon probe, we might want to sort different brain areas separately, for example using a different sorter for the hippocampus, the thalamus, or the cerebellum. Splitting a Recording by Channel Group -------------------------------------- In this example, we create a 384-channel recording with 4 shanks. However this could be any recording in which the channel are grouped in some way, for example a multi-tetrode recording with channel groups representing the channels on each individual tetrodes. First, let's import the parts of SpikeInterface we need into Python, and generate our toy recording: .. code-block:: python import spikeinterface.extractors as se import spikeinterface.preprocessing as spre from spikeinterface import aggregate_channels from probeinterface import generate_tetrode, ProbeGroup import numpy as np # Create a toy 384 channel recording with 4 shanks (each shank contain 96 channels) recording, _ = se.toy_example(duration=[1.00], num_segments=1, num_channels=384) four_shank_groupings = np.repeat([0, 1, 2, 3], 96) recording.set_property("group", four_shank_groupings) print(recording.get_channel_groups()) """ array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]) """ We can split a recording into multiple recordings, one for each channel group, with the :py:func:`~split_by` method. .. code-block:: python split_recording_dict = recording.split_by("group") Splitting a recording by channel group returns a dictionary containing separate recordings, one for each channel group: .. code-block:: python print(split_recording_dict) """ {0: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype 10.99 MiB, 1: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype 10.99 MiB, 2: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype 10.99 MiB, 3: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype 10.99 MiB} """ Preprocessing a Recording by Channel Group ------------------------------------------ If a preprocessing function is given a dictionary of recordings, it will apply the preprocessing seperately to each recording in the dict, and return a dictionary of preprocessed recordings. Hence we can pass the ``split_recording_dict`` in the same way as we would pass a single recording to any preprocessing function. .. code-block:: python shifted_recordings = spre.phase_shift(split_recording_dict) filtered_recording = spre.bandpass_filter(shifted_recording) referenced_recording = spre.common_reference(filtered_recording) good_channels_recording = spre.detect_and_remove_bad_channels(filtered_recording) If needed, we could aggregate the recordings back together using the ``aggregate_channels`` function. Note: you do not need to do this to sort the data (see :ref:`sorting-by-channel-group`). .. code-block:: python combined_preprocessed_recording = aggregate_channels(good_channels_recording) Now, when ``combined_preprocessed_recording`` is used in sorting, plotting, or whenever calling its :py:func:`~get_traces` method, the data will have been preprocessed separately per-channel group (then concatenated back together under the hood). .. note:: The splitting and aggregation of channels for preprocessing is flexible. Under the hood, :py:func:`~aggregate_channels` keeps track of when a recording was split. When :py:func:`~get_traces` is called, the preprocessing is still performed per-group, even though the recording is now aggregated. To ensure data is preprocessed by channel group, the preprocessing step must be applied separately to each split channel group recording. For example, the below example will NOT preprocess by channel group: .. code-block:: python split_recording = recording.split_by("group") split_recording_as_list = list(**split_recording.values()) combined_recording = aggregate_channels(split_recording_as_list) # will NOT preprocess by channel group. filtered_recording = common_reference(combined_recording) In general, it is not recommended to apply :py:func:`~aggregate_channels` more than once. This will slow down :py:func:`~get_traces` calls and may result in unpredictable behaviour. .. _sorting-by-channel-group: Sorting a Recording by Channel Group ------------------------------------ We can also sort a recording for each channel group separately. It is not necessary to preprocess a recording by channel group in order to sort by channel group. There are two ways to sort a recording by channel group. First, we can pass a dictionary to the ``run_sorter`` function. Since the preprocessing-by-group method above returns a dict, we can simply pass this output. Alternatively, for more control, we can loop over the recordings ourselves. **Option 1 : Automatic splitting (Recommended)** Simply pass the split recording to the ``run_sorter`` function, as if it was a non-split recording. This will return a dict of sortings, with the same keys as the dict of recordings that were passed to ``run_sorter``. .. code-block:: python split_recording = raw_recording.split_by("group") # is a dict of recordings # do preprocessing if needed pp_recording = spre.bandpass_filter(split_recording) dict_of_sortings = run_sorter( sorter_name='kilosort4', recording=pp_recording, folder='my_kilosort4_sorting' ) **Option 2: Manual splitting** In this example, we loop over all preprocessed recordings that are grouped by channel, and apply the sorting separately. We store the sorting objects in a dictionary for later use. You might do this if you want extra control e.g. to apply bespoke steps to different groups. .. code-block:: python split_preprocessed_recording = preprocessed_recording.split_by("group") sortings = {} for group, sub_recording in split_preprocessed_recording.items(): sorting = run_sorter( sorter_name='kilosort2', recording=split_preprocessed_recording, folder=f"folder_KS2_group{group}" ) sortings[group] = sorting Creating a SortingAnalyzer by Channel Group ------------------------------------------- The code above generates a dictionary of recording objects and a dictionary of sorting objects. When making a :ref:`SortingAnalyzer `, we can pass these dictionaries and a single analyzer will be created, with the recordings and sortings appropriately aggregated. The dictionary of recordings and dictionary of sortings must have the same keys. E.g. if you use ``split_by("group")``, the keys of your dict of recordings will be the values of the ``group`` property of the recording. Then the dict of sortings should also have these keys. Note that if you use the internal functions, like we do in the code-block below, you don't need to keep track of keys yourself. SpikeInterface will do this for you automatically. The code for create ``SortingAnalyzer`` from dicts of recordings and sortings is very similar to that for creating a sorting analyzer from a single recording and sorting: .. code-block:: python dict_of_recordings = preprocessed_recording.split_by("group") dict_of_sortings = run_sorter(sorter_name="mountainsort5", recording = dict_of_recordings) analyzer = create_sorting_analyzer(sorting=dict_of_sortings, recording=dict_of_recordings) The code above creates a *single* sorting analyzer called :code:`analyzer`. You can select the units from one of the "group"s as follows: .. code-block:: python aggretation_keys = analyzer.get_sorting_property("aggregation_key") unit_ids_group_0 = analyzer.unit_ids[aggretation_keys == 0] group_0_analzyer = analyzer.select_units(unit_ids = unit_ids_group_0)