
IIST BCI Dataset-2 for Selected Common Marathi Words
  • Shubham Tayade,
  • Parvathy S S,
  • Nancy Sunil,
  • Charu Chauhan,
  • S Sumitra,
  • B S Manoj
Shubham Tayade
Indian Institute of Space Science and Technology (IIST)

Corresponding Author: [email protected]

Parvathy S S
A J College of Science and Technology, Thonnakkal
Nancy Sunil
A J College of Science and Technology, Thonnakkal
Charu Chauhan
Indian Institute of Space Science and Technology (IIST)
S Sumitra
Indian Institute of Space Science and Technology (IIST)
B S Manoj
Indian Institute of Space Science and Technology (IIST)

Abstract

Brain-Computer Interface (BCI) solutions for patients with neurodegenerative disorders require datasets in the languages those patients speak, and BCI research is sometimes restricted by the lack of such datasets. For example, Marathi, a prominent language spoken by over 83 million people in India, lacks BCI datasets for research purposes. To address this gap, we created a dataset comprising electroencephalography (EEG) signal samples of selected common Marathi words. The EEG samples were captured with the OpenBCI Cyton device while volunteers pronounced commonly used Marathi words. The dataset encompasses three main categories: (i) vocal utterances of the Marathi words, (ii) vocal utterances of the English translations of these words, and (iii) silent pronunciation (sub-vocalization) of the Marathi words. We compiled data for 100 distinct words, each with recordings for these three categories. Ten trials were conducted for each word. This dataset is valuable for developing BCI solutions to assist Marathi-speaking patients with neurodegenerative diseases. Such solutions can use Machine Learning (ML) classifiers and Deep Learning methods to translate EEG signals into Marathi words, both vocal and sub-vocal.
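
The structure described in the abstract (100 words, three categories, ten trials) can be sketched as a simple trial index. This is only an illustration: the category labels and the function name are hypothetical, the abstract does not describe the dataset's file layout, and it is an assumption here that the ten trials apply to each word in each category. The abstract also does not say whether the count is per volunteer or in total, so the number below is just the product of the three stated figures.

```python
from itertools import product

# Hypothetical labels for the three categories named in the abstract.
CATEGORIES = ("marathi_vocal", "english_vocal", "marathi_subvocal")
NUM_WORDS = 100        # distinct words, as stated in the abstract
TRIALS_PER_WORD = 10   # assumed to apply per word and per category


def build_trial_index(num_words=NUM_WORDS, categories=CATEGORIES,
                      trials=TRIALS_PER_WORD):
    """Return one (word_id, category, trial) tuple per expected recording."""
    return list(product(range(1, num_words + 1),
                        categories,
                        range(1, trials + 1)))


index = build_trial_index()
print(len(index))  # number of word/category/trial combinations
```

An index of this kind can be useful for checking that every expected recording is present before training a classifier.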
07 Mar 2024: Submitted to TechRxiv
14 Mar 2024: Published in TechRxiv