This paper explores the design and development of a language-based interface for dynamic mission programming of autonomous underwater vehicles (AUVs). The proposed 'Word2Wave' (W2W) framework enables interactive programming and parameter configuration of AUVs for remote subsea missions. The W2W framework includes: (i) a set of novel language rules and command structures for efficient language-to-mission mapping; (ii) a GPT-based prompt engineering module for training data generation; (iii) a small language model (SLM)-based sequence-to-sequence learning pipeline for mission command generation from human speech or text; and (iv) a novel user interface for 2D mission map visualization and human-machine interfacing. The proposed learning pipeline adapts the T5-Small SLM, which effectively learns the language-to-mission mapping from processed language data and provides robust and efficient performance. In addition to a benchmark evaluation against state-of-the-art methods, we conduct a user interaction study to demonstrate the effectiveness of W2W over commercial AUV programming interfaces. Across participants, W2W-based mission programming required less than 10% of the time needed with traditional interfaces; it was deemed a simpler and more natural paradigm for subsea mission programming, with a usability score of 76.25. W2W opens up promising research opportunities in hands-free AUV mission programming for efficient subsea deployments.