Surface electromyography (sEMG) provides a direct neural interface for decoding muscle activity and offers a promising foundation for keyboard-free text input in wearable and mixed-reality systems. Previous sEMG-to-text studies have mainly focused on recognizing letters directly from sEMG signals, an important first step toward translating muscle activity into text. Building on this foundation, we present MyoText, a hierarchical framework that decodes sEMG signals into text through physiologically grounded intermediate stages. MyoText first classifies finger activations from multichannel sEMG using a CNN-BiLSTM-Attention model, then applies ergonomic typing priors to infer letters, and finally reconstructs full sentences with a fine-tuned T5 transformer. This modular design mirrors the natural hierarchy of typing, linking muscle intent to language output and reducing the search space for decoding. Evaluated on 30 users from the emg2qwerty dataset, MyoText outperforms baseline methods, achieving 85.4% finger-classification accuracy, a 5.4% character error rate (CER), and a 6.5% word error rate (WER). Beyond accuracy gains, this methodology establishes a principled pathway from neuromuscular signals to text, providing a blueprint for virtual and augmented-reality typing interfaces that operate entirely without physical keyboards. By integrating ergonomic structure with transformer-based linguistic reasoning, MyoText advances the feasibility of seamless, wearable neural input for future ubiquitous computing environments.
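The search-space reduction from ergonomic typing priors can be sketched in a few lines. This is a minimal illustrative example, not the paper's implementation: it assumes the standard QWERTY touch-typing finger-to-key assignment, so a predicted finger constrains the 26-way letter decision to at most six candidates; all names here (`FINGER_KEYS`, `candidate_letters`) are hypothetical.

```python
# Hypothetical sketch of the ergonomic-prior stage: once a finger is
# classified from sEMG, letter decoding only needs to choose among the
# keys that finger types on a standard QWERTY touch-typing layout.
# The mapping below is the conventional touch-typing assignment, not
# taken from the paper.

FINGER_KEYS = {
    "L_pinky":  set("qaz"),
    "L_ring":   set("wsx"),
    "L_middle": set("edc"),
    "L_index":  set("rtfgvb"),
    "R_index":  set("yuhjnm"),
    "R_middle": set("ik"),
    "R_ring":   set("ol"),
    "R_pinky":  set("p"),
}

def candidate_letters(finger: str) -> set:
    """Return the letters reachable by `finger`, shrinking the
    26-letter search space to at most 6 candidates."""
    return FINGER_KEYS[finger]
```

A downstream language model (the fine-tuned T5 stage in MyoText) can then resolve the remaining ambiguity among these few candidates using sentence context.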