Surface electromyography (EMG) is a promising modality for silent speech interfaces, but its effectiveness depends heavily on sensor placement and channel availability. In this work, we investigate how individual EMG channels and combinations of channels contribute to speech reconstruction performance. Our findings reveal that although certain channels are individually more informative, the highest performance comes from subsets that exploit complementary relationships among channels. We also analyze phoneme classification accuracy under channel ablations and observe interpretable patterns that reflect the anatomical roles of the underlying muscles. To address the performance degradation caused by channel reduction, we pretrain models on full 8-channel data using random channel dropout and fine-tune them on reduced-channel subsets. Fine-tuning consistently outperforms training from scratch in the 4- to 6-channel settings, with the best dropout strategy depending on the number of channels. These results suggest that performance degradation from sensor reduction can be mitigated through pretraining and channel-aware design, supporting the development of lightweight and practical EMG-based silent speech systems.
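As an illustration of the random channel dropout mentioned above, the following is a minimal sketch rather than the authors' implementation. It assumes that dropout is applied by zeroing whole channels, that inputs have shape `(batch, time, channels)`, and that the number of dropped channels is drawn uniformly between 0 and a hypothetical hyperparameter `max_drop`; the abstract specifies none of these details. The helper names `random_channel_dropout` and `select_channels` are likewise hypothetical.

```python
import numpy as np


def random_channel_dropout(emg, max_drop, rng):
    """Zero out a random subset of channels independently for each example.

    emg      -- array of shape (batch, time, channels)
    max_drop -- assumed hyperparameter: most channels zeroed per example
    rng      -- a numpy.random.Generator
    Returns the masked array and the (batch, channels) keep-mask.
    """
    batch, _, n_channels = emg.shape
    if not 0 <= max_drop < n_channels:
        raise ValueError("max_drop must be in [0, n_channels)")

    mask = np.ones((batch, n_channels), dtype=emg.dtype)
    for i in range(batch):
        # Assumption: the number of dropped channels is uniform on 0..max_drop.
        n_drop = rng.integers(0, max_drop + 1)
        dropped = rng.choice(n_channels, size=n_drop, replace=False)
        mask[i, dropped] = 0

    # Broadcast the per-example channel mask over the time axis.
    return emg * mask[:, None, :], mask


def select_channels(emg, keep):
    """Keep only the listed channel indices, as in a reduced-channel setting."""
    return emg[..., list(keep)]
```

During pretraining, each 8-channel batch would pass through `random_channel_dropout` before reaching the model; for fine-tuning, `select_channels` would restrict the input to the chosen subset. Whether missing channels are zeroed or removed from the input altogether is a design choice the abstract leaves open.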