This work critically analyzes existing models for open-vocabulary EEG-to-Text translation. We identify a crucial limitation: previous studies often employed implicit teacher forcing during evaluation, artificially inflating performance metrics. They also lacked a critical benchmark: comparing model performance on pure noise inputs. We propose a methodology to distinguish models that genuinely learn from EEG signals from those that merely memorize training data. Our analysis reveals that model performance on noise inputs can be comparable to that on real EEG data. These findings highlight the need for stricter evaluation practices in EEG-to-Text research, emphasizing transparent reporting and rigorous benchmarking against noise inputs. Such practices will yield more reliable assessments of model capabilities and pave the way for robust EEG-to-Text communication systems.
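The noise-input benchmark described above can be sketched minimally as follows. This is an illustrative toy, not the paper's actual pipeline: `memorizing_model` is a hypothetical stand-in for a decoder that ignores its input, and the batches are placeholder arrays rather than real EEG features. The key points it demonstrates are (1) decoding must be free-running, with no ground-truth tokens fed back in, and (2) outputs on real inputs should be compared against outputs on pure noise of the same shape.

```python
import random

def evaluate(model_generate, inputs):
    """Free-running decoding: the model sees only the input features,
    never the ground-truth target tokens (no teacher forcing)."""
    return [model_generate(x) for x in inputs]

def memorizing_model(_x):
    # Hypothetical failure case: a model that ignores its input and
    # emits a memorized training sentence behaves identically on
    # EEG features and on pure noise.
    return "he was a man of the people"

random.seed(0)
# Placeholder batches: 4 trials x 8 features each. In a real setup,
# eeg_batch would hold recorded EEG features; noise_batch is Gaussian
# noise with the same shape, as the proposed sanity check requires.
eeg_batch = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
noise_batch = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]

out_eeg = evaluate(memorizing_model, eeg_batch)
out_noise = evaluate(memorizing_model, noise_batch)

# Identical (or comparably scored) outputs on signal and on noise are
# a red flag: the model is not conditioning on the EEG input at all.
print(out_eeg == out_noise)  # True for this memorizing model
```

In a full evaluation, one would score both output lists against the reference transcripts (e.g. with BLEU or ROUGE); a model whose noise-input scores match its EEG-input scores has learned little from the brain signals themselves.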