Motor and Speech Imagery EEG Dataset
Overview and Methodology
This dataset contains motor imagery (MI) and speech imagery (SI) electroencephalogram (EEG) data recorded from 5 healthy subjects with a mean age of 24.4 years. MI involves the subject imagining movements in their limbs, whereas SI involves the subject imagining speaking words in their mind (thought-speech).
The data was recorded using BioSemi ActiveTwo EEG recording equipment at a sampling frequency of 2.048 kHz. 24 channels of EEG data from the 10-20 system are available in the dataset. Four classes of data were recorded for each of the MI and SI paradigms. For MI, left-hand, right-hand, legs and tongue tasks were recorded; for SI, the words ‘left’, ‘right’, ‘up’ and ‘down’ were recorded. Data for the idle state, when the subject is mentally relaxed and not executing any task, was also recorded.
Forty trials were recorded for each of the classes. These trials were recorded over four runs, with two runs being used to record MI trials, and two to record SI trials. The runs were interleaved, meaning that the first and third runs were used to record MI, and the second and fourth runs were used to record SI trials. During each run, twenty trials for each class in the paradigm were recorded. These trials were randomly ordered. Note that during each run, twenty trials of the idle state were also recorded. This means that in this database there are actually eighty idle state trials, with forty being recorded during MI runs and forty being recorded during SI runs.
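The trial counts described above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the constant and function names are hypothetical and are not part of the dataset; only the numbers come from the description.

```python
# Illustrative tally of the trial counts described above.
# All names are hypothetical; only the numbers come from the text.
MI_CLASSES = ["left-hand", "right-hand", "legs", "tongue"]
SI_CLASSES = ["left", "right", "up", "down"]
RUN_ORDER = ["MI", "SI", "MI", "SI"]   # interleaved runs 1 to 4
TRIALS_PER_CLASS_PER_RUN = 20


def runs_for(paradigm):
    """Number of runs dedicated to one paradigm ('MI' or 'SI')."""
    return RUN_ORDER.count(paradigm)


def trials_per_class(paradigm):
    """Total trials recorded for each task class of a paradigm."""
    return runs_for(paradigm) * TRIALS_PER_CLASS_PER_RUN


def idle_trials_total():
    """Idle trials are recorded in every run, MI and SI alike."""
    return len(RUN_ORDER) * TRIALS_PER_CLASS_PER_RUN


def trials_per_run(classes):
    """Task classes plus the idle state, twenty trials each."""
    return (len(classes) + 1) * TRIALS_PER_CLASS_PER_RUN


print(trials_per_class("MI"), idle_trials_total(), trials_per_run(MI_CLASSES))
```

Each run therefore holds 100 trials (four task classes plus idle), and each paradigm contributes forty of the eighty idle trials.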
Subjects were guided through the data recording runs by a graphical user interface which issued instructions to them. At the start of a run, subjects were given one minute to settle down before the cued trials began. During a trial, a fixation cross first appeared on-screen, indicating to the subject to remain relaxed but aware that the next trial would soon begin. After 2 s a cue appeared on-screen for 1.25 s, indicating the particular task the subject should execute. The subject started executing the task as soon as they saw the cue, and continued even after it had disappeared, until the fixation cross appeared again. The cues consisted of a left-facing arrow (for left-hand MI or ‘left’ SI), a right-facing arrow (for right-hand MI or ‘right’ SI), an upward-facing arrow (for tongue MI or ‘up’ SI) and a downward-facing arrow (for legs MI or ‘down’ SI). Each trial lasted 4 seconds. Between runs, subjects were given a 3–5-minute break.
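The trial timeline can be converted into sample indices as a rough guide for segmenting the stored data. The Python sketch below assumes that the 4 s trial duration is measured from fixation-cross onset and that the data is at the stored rate of 1 kHz; neither the alignment of the stored trials nor the names used here are specified by the dataset, so treat them as assumptions.

```python
# Sketch of the trial timeline in samples.
# Assumption: the 4 s trial is measured from fixation-cross onset.
# All names are hypothetical and not part of the dataset.
FS_STORED_HZ = 1000   # sampling rate of the stored data (1 kHz)
FIXATION_S = 2.0      # fixation cross shown before the cue
CUE_S = 1.25          # time the cue stays on-screen
TRIAL_S = 4.0         # total trial duration


def to_samples(seconds, fs=FS_STORED_HZ):
    """Convert a time in seconds to the nearest sample index."""
    return round(seconds * fs)


def trial_events(fs=FS_STORED_HZ):
    """Sample indices of the main events within one trial."""
    return {
        "fixation_on": to_samples(0.0, fs),
        "cue_on": to_samples(FIXATION_S, fs),
        "cue_off": to_samples(FIXATION_S + CUE_S, fs),
        "trial_end": to_samples(TRIAL_S, fs),
    }


print(trial_events())
```

Under these assumptions the imagery period spans the samples from cue onset to the end of the trial.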
The data was re-referenced to channel Cz and then mean-centered. The data was also passed through an anti-aliasing filter and down-sampled to 1 kHz before being stored in .mat files for the data repository. The anti-aliasing filter was a low-pass filter with a cutoff frequency of 500 Hz, implemented using the lowpass function in MATLAB, which produces 60 dB of attenuation above the cutoff and automatically compensates for filter-induced delays.
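The first two preprocessing steps can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' MATLAB code: the dictionary layout (channel name mapped to a list of samples), the function names, and the choice to mean-center each channel separately are assumptions made for illustration. Filtering and down-sampling are omitted.

```python
# Minimal sketch of Cz re-referencing followed by mean-centering.
# Assumptions: `data` maps channel name -> list of samples, and
# mean-centering is applied to each channel separately.


def rereference(data, ref="Cz"):
    """Subtract the reference channel from every channel, sample by sample."""
    ref_signal = data[ref]
    return {
        ch: [x - r for x, r in zip(sig, ref_signal)]
        for ch, sig in data.items()
    }


def mean_center(data):
    """Remove each channel's own mean so that it averages to zero."""
    centered = {}
    for ch, sig in data.items():
        mu = sum(sig) / len(sig)
        centered[ch] = [x - mu for x in sig]
    return centered


# Toy example with two channels and three samples each.
toy = {"Cz": [1.0, 2.0, 3.0], "C3": [2.0, 4.0, 6.0]}
processed = mean_center(rereference(toy))
print(processed["C3"])
```

After re-referencing, the reference channel itself becomes identically zero, which is why Cz carries no information in a Cz-referenced recording.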
The dataset consists of 10 MAT-files, named X_Subject_Y.mat, where X is the acronym denoting the brain imagery type, either MI for motor imagery data or SI for speech imagery data, and Y is the subject number. Each file contains the trials for each run in the structure variables ‘run_1’ and ‘run_2’. Within each run structure there are two variables:
The authors acknowledge that data collection for this project was funded through the project: “Setting up of transdisciplinary research and knowledge exchange (TRAKE) complex at the University of Malta (ERDF.01.124)”, which is being co-financed by the European Union through the European Regional Development Fund 2014–2020. The data was recorded by the Centre for Biomedical Cybernetics at the University of Malta.