Voice dictation is increasingly used for text entry, especially in mobile scenarios. However, the speech-based experience is disrupted when users must return to a screen and keyboard to review and edit the text. While existing dictation systems focus on improving transcription and error correction, little is known about how to support speech input across the entire text creation process, including composition, reviewing, and editing. We conducted an experiment in which ten pairs of participants took on the roles of authors and typists to work on a text authoring task. By analysing the natural language patterns of both authors and typists, we identified new challenges and opportunities for the design of future dictation interfaces, including the ambiguity of human dictation, the differences between audio-only dictation and dictation with a screen, and various forms of passive and active assistance that future systems could provide.