Decoding speech from non-invasive brain signals is challenging. For the LibriBrain 2025 Speech Detection task, we propose a novel two-step framework that bypasses direct reconstruction of speech from the brain signal. First, a contrastive learning model retrieves, from a large-scale audio library (LibriVox), the speech segment that matches a given test MEG recording. Second, a speech detection model generates the binary silence/speech label sequence directly from the retrieved audio. With this approach, our team, Sherlock Holmes, achieved first place in the extended track (F1 score: 0.962), demonstrating that leveraging external audio databases is a highly effective strategy for this task.
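The two-step pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the MEG and audio embeddings are taken as already computed by some contrastive encoder, retrieval is done by cosine similarity, and a simple frame-energy threshold stands in for the speech detection model. All function names, `frame_len`, and `threshold` are hypothetical.

```python
import numpy as np


def retrieve_segment(meg_embedding, audio_embeddings):
    """Step 1 (assumed form): index of the library segment whose embedding
    has the highest cosine similarity to the MEG embedding."""
    meg = meg_embedding / np.linalg.norm(meg_embedding)
    lib = audio_embeddings / np.linalg.norm(audio_embeddings, axis=1, keepdims=True)
    return int(np.argmax(lib @ meg))


def detect_speech(audio, frame_len, threshold):
    """Step 2 (stand-in): label each non-overlapping frame as speech (1) or
    silence (0) by comparing its RMS energy to a threshold."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return (rms > threshold).astype(int)


def meg_to_labels(meg_embedding, audio_embeddings, audio_segments,
                  frame_len=4, threshold=0.1):
    """Retrieve the matching audio segment, then derive labels from it."""
    idx = retrieve_segment(meg_embedding, audio_embeddings)
    return idx, detect_speech(audio_segments[idx], frame_len, threshold)


# Toy usage: three library segments with hand-made embeddings.
audio_embeddings = np.eye(3)
audio_segments = [
    np.zeros(12),
    np.concatenate([np.zeros(4), 0.5 * np.ones(4), np.zeros(4)]),
    np.ones(12),
]
meg_embedding = np.array([0.1, 0.9, 0.2])  # closest to segment 1
idx, labels = meg_to_labels(meg_embedding, audio_embeddings, audio_segments)
```

The point of the sketch is that the labels never depend on the MEG signal directly: once retrieval picks the right segment, label quality is bounded only by the audio-side detector.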