Motor and Speech Imagery EEG Dataset

Posted on 2023-11-01, 06:25. Authored by Natasha Padfield, KENNETH P CAMILLERI, TRACEY CAMILLERI, MARVIN K BUGEJA, SIMON G FABRI

Overview and Methodology

This dataset contains motor imagery (MI) and speech imagery (SI) electroencephalogram (EEG) data recorded from 5 healthy subjects with a mean age of 24.4 years. MI involves the subject imagining movements in their limbs, whereas SI involves the subject imagining speaking words in their mind (thought-speech).

The data was recorded using BioSemi ActiveTwo EEG recording equipment at a sampling frequency of 2.048 kHz. 24 channels of EEG data from the 10-20 system are available in the dataset. Four classes of data were recorded for each of the MI and SI paradigms. For MI, left-hand, right-hand, legs and tongue MI tasks were recorded; for SI, the words ‘left’, ‘right’, ‘up’ and ‘down’ were recorded. Data for the idle state, when the subject is mentally relaxed and not executing any task, was also recorded.

Forty trials were recorded for each of the classes. These trials were recorded over four runs, with two runs used to record MI trials and two to record SI trials. The runs were interleaved: the first and third runs recorded MI trials, and the second and fourth runs recorded SI trials. During each run, twenty trials for each class in the paradigm were recorded, in random order. Twenty trials of the idle state were also recorded during each run. The dataset therefore contains eighty idle state trials, forty recorded during MI runs and forty recorded during SI runs.
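The trial counts above follow from simple arithmetic, sketched below. The constant and variable names are illustrative only and are not part of the dataset.

```python
# Trial-count bookkeeping implied by the recording protocol described above.
# All names here are illustrative; none of them appear in the dataset files.
TRIALS_PER_CLASS_PER_RUN = 20
RUNS_PER_PARADIGM = 2            # runs 1 and 3 for MI, runs 2 and 4 for SI
PARADIGMS = ("MI", "SI")

# Each task class (e.g. left-hand MI) is recorded in both runs of its paradigm.
task_trials_per_class = TRIALS_PER_CLASS_PER_RUN * RUNS_PER_PARADIGM

# The idle state is recorded in every run, so it accumulates across paradigms.
idle_trials_per_paradigm = TRIALS_PER_CLASS_PER_RUN * RUNS_PER_PARADIGM
idle_trials_total = idle_trials_per_paradigm * len(PARADIGMS)

print(task_trials_per_class)     # 40
print(idle_trials_per_paradigm)  # 40
print(idle_trials_total)         # 80
```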

Subjects were guided through the recording runs by a graphical user interface that issued instructions to them. At the start of a run, subjects were given one minute to settle down before the cued trials began. During a trial, a fixation cross first appeared on-screen, indicating that the subject should remain relaxed but aware that the next trial would soon begin. After 2 s, a cue appeared on-screen for 1.25 s, indicating the particular task the subject should execute. The subject started executing the task as soon as they saw the cue and continued after it had disappeared, until the fixation cross appeared again. The cues consisted of a left-facing arrow (for left-hand MI or ‘left’ SI), a right-facing arrow (for right-hand MI or ‘right’ SI), an upward-facing arrow (for tongue MI or ‘up’ SI) and a downward-facing arrow (for legs MI or ‘down’ SI). Each trial lasted 4 s. Between runs, subjects were given a 3–5-minute break.

The data was re-referenced to channel Cz and then mean-centered. It was also passed through an anti-aliasing filter and down-sampled to 1 kHz before being stored in .mat files for the data repository. The anti-aliasing filter was a low-pass filter with a cutoff frequency of 500 Hz, implemented using the lowpass function in MATLAB, which produces 60 dB of attenuation above the cutoff and automatically compensates for filter-induced delays.
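For readers working outside MATLAB, the preprocessing steps can be approximated in Python as in the rough sketch below. It is not the authors' pipeline. SciPy's resample_poly, with its built-in zero-phase FIR anti-aliasing filter, stands in for the MATLAB lowpass call followed by down-sampling. The function name preprocess, the cz_index argument and the choice to mean-center each channel over the whole array passed in are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import resample_poly

FS_RAW = 2048   # Hz, recording rate stated in the description
FS_OUT = 1000   # Hz, rate of the stored data


def preprocess(raw, cz_index):
    """Approximate the described preprocessing (hypothetical helper).

    raw      : array of shape (channels, samples) sampled at FS_RAW
    cz_index : row index of channel Cz in `raw` (assumed to be known)
    """
    raw = np.asarray(raw, dtype=float)
    # Re-reference: subtract the Cz signal from every channel.
    reref = raw - raw[cz_index]
    # Mean-center each channel over time (assumed to be per channel).
    centered = reref - reref.mean(axis=1, keepdims=True)
    # Rational resampling 2048 Hz -> 1000 Hz, since 2048 * 125 / 256 = 1000.
    # resample_poly applies its own low-pass filter before decimating, so it
    # replaces, rather than reproduces, the MATLAB lowpass step.
    return resample_poly(centered, up=125, down=256, axis=1)
```

With this sketch, 4 s of raw data (8192 samples at 2.048 kHz) becomes 4000 samples per channel, and the Cz row itself is all zeros after re-referencing.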

Files

The dataset consists of 10 MAT-files, named X_Subject_Y.mat, where X is the acronym denoting the imagery type, either MI for motor imagery data or SI for speech imagery data, and Y is the subject number. Each file contains the trials for each run in the structure variables ‘run_1’ and ‘run_2’. Within each run structure there are two variables:

‘EEG_Data’, a matrix containing the EEG data formatted as: [number of trials x channels x data samples]. The number of data samples is 4000 since the length of each trial was 4s, sampled at 1kHz. The relationship between the EEG channels and the channel number in the second dimension of this matrix is documented in the table stored within the ‘ChannelLocations.mat’ file, which is included with the dataset;
‘labels’, a vector indicating which cue was issued, with the following numbers being used to represent the different cues: 1 – Right, 2 – Left, 3 – Up, 4 – Down, 5 – Idle, 6 – Fixation Cross.
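
The file layout described above could be read in Python roughly as follows. This is a minimal sketch. It assumes the files are stored in a MAT format that scipy.io.loadmat can read (MATLAB v7.3 files would need an HDF5 reader instead). The helper names load_run and select_cue and the CUE_NAMES dictionary are hypothetical and not part of the dataset.

```python
import numpy as np
from scipy.io import loadmat

# Cue codes as listed in the dataset description.
CUE_NAMES = {1: "Right", 2: "Left", 3: "Up", 4: "Down",
             5: "Idle", 6: "Fixation Cross"}


def load_run(path, run_name):
    """Return (eeg, labels) for one run structure, e.g. 'run_1' or 'run_2'.

    eeg    : array of shape (trials, channels, samples)
    labels : 1-D integer array with one cue code per trial
    """
    # struct_as_record=False exposes MATLAB struct fields as attributes.
    mat = loadmat(path, squeeze_me=True, struct_as_record=False)
    run = mat[run_name]
    eeg = np.asarray(run.EEG_Data)
    labels = np.asarray(run.labels).astype(int).ravel()
    return eeg, labels


def select_cue(eeg, labels, cue_code):
    """Keep only the trials whose label equals cue_code."""
    return eeg[labels == cue_code]
```

For example, following the stated naming pattern, load_run("MI_Subject_1.mat", "run_1") followed by select_cue(eeg, labels, 2) would return the trials cued with the left-facing arrow, which in an MI file corresponds to left-hand imagery.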

Acknowledgements

The authors acknowledge that data collection for this project was funded through the project “Setting up of transdisciplinary research and knowledge exchange (TRAKE) complex at the University of Malta (ERDF.01.124)”, which is co-financed by the European Union through the European Regional Development Fund 2014–2020. The data was recorded by the Centre for Biomedical Cybernetics at the University of Malta.

Funding

ERDF.01.124

Project Name

User-intuitive Continuous Brain Control of a Smart Wheelchair (BrainCon)

Keywords

Electroencephalography, Brain-computer interfaces, Speech processing systems, Thought and thinking

Licence

CC BY 4.0