This paper introduces a novel algorithm for speech synthesis from neural activity recorded with invasive electroencephalography (EEG). The proposed system offers a promising communication solution for individuals with severe speech impairments. Central to our approach is the pairing of high-gamma-band time-frequency features computed from the EEG recordings with our NeuroIncept Decoder architecture, a neural network that combines Convolutional Neural Networks (CNNs) and Gated Recurrent Units (GRUs) to reconstruct audio spectrograms from neural activity patterns. The model achieves robust mean correlation coefficients between predicted and actual spectrograms, although inter-subject variability suggests that neural processing differs markedly across participants. Overall, our study highlights the potential of neural decoding to restore communicative abilities in individuals with speech disorders and paves the way for future advances in brain-computer interface technology.
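The high-gamma time-frequency features mentioned above are commonly obtained by band-pass filtering each electrode into the high-gamma range and taking the analytic-signal envelope. The sketch below illustrates that generic pipeline only; the function name, band edges (70–150 Hz), and filter order are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def high_gamma_envelope(eeg, fs, band=(70.0, 150.0), order=4):
    """Band-pass each channel into an assumed high-gamma range and take the
    Hilbert (analytic-signal) envelope as a time-frequency feature.

    eeg: array of shape (channels, samples); fs: sampling rate in Hz.
    Illustrative sketch, not the paper's implementation.
    """
    nyq = 0.5 * fs
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, eeg, axis=-1)    # zero-phase band-pass
    return np.abs(hilbert(filtered, axis=-1))  # instantaneous amplitude

# Toy check: a 100 Hz tone (inside the band) keeps a large envelope,
# while a 10 Hz tone (outside the band) is strongly attenuated.
fs = 1000.0
t = np.arange(0, 2.0, 1.0 / fs)
eeg = np.vstack([np.sin(2 * np.pi * 100 * t), np.sin(2 * np.pi * 10 * t)])
env = high_gamma_envelope(eeg, fs)
```

Zero-phase filtering (`filtfilt`) is used so the envelope stays temporally aligned with the neural events, which matters when features are later paired frame-by-frame with spectrogram targets.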
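The evaluation the abstract reports, a mean correlation coefficient between predicted and actual spectrograms, is typically computed as the Pearson correlation per frequency bin, averaged over bins. A minimal sketch of that metric follows; the function name and the array shapes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mean_spectrogram_correlation(pred, target):
    """Pearson correlation per frequency bin between predicted and actual
    spectrograms (shape: bins x frames), averaged over bins.
    Illustrative metric sketch, not the paper's exact evaluation code.
    """
    corrs = []
    for p, t in zip(pred, target):
        p = p - p.mean()
        t = t - t.mean()
        denom = np.sqrt((p * p).sum() * (t * t).sum())
        corrs.append((p * t).sum() / denom if denom > 0 else 0.0)
    return float(np.mean(corrs))

# Sanity check on synthetic data: a spectrogram correlates perfectly with
# itself, and a noisy copy still correlates highly but below 1.
rng = np.random.default_rng(0)
target = rng.standard_normal((80, 200))            # 80 bins x 200 frames
noisy = target + 0.5 * rng.standard_normal(target.shape)
score_self = mean_spectrogram_correlation(target, target)
score_noisy = mean_spectrogram_correlation(noisy, target)
```

Averaging per-bin correlations (rather than correlating the flattened arrays) keeps quiet frequency bands from being drowned out by high-energy ones, which is one reason this form of the metric is common in speech-decoding studies.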