Autonomous Underwater Robots (AURs) operate in challenging underwater environments, including low visibility and harsh water conditions. Such conditions present challenges for software engineers developing perception modules for the AUR software. To successfully carry out these tasks, deep learning has been incorporated into the AUR software to support its operations. However, the unique challenges of underwater environments pose difficulties for deep learning models, which often rely on labeled data that is scarce and noisy. This may undermine the trustworthiness of AUR software that relies on perception modules. Vision-Language Models (VLMs) offer promising solutions for AUR software as they generalize to unseen objects and remain robust in noisy conditions by inferring information from contextual cues. Despite this potential, their performance and uncertainty in underwater environments remain understudied from a software engineering perspective. Motivated by the needs of an industrial partner in assurance and risk management for maritime systems to assess the potential use of VLMs in this context, we present an empirical evaluation of VLM-based perception modules within the AUR software. We assess their ability to detect underwater trash by computing performance, uncertainty, and their relationship, to enable software engineers to select appropriate VLMs for their AUR software.
翻译:自主水下机器人(AURs)在具有挑战性的水下环境中运行,包括低能见度和恶劣的水体条件。这些条件为开发AUR软件感知模块的软件工程师带来了挑战。为成功执行这些任务,深度学习已被集成到AUR软件中以支持其操作。然而,水下环境的独特挑战给深度学习模型带来了困难,这些模型通常依赖于稀缺且带噪声的标注数据。这可能会损害依赖感知模块的AUR软件的可信度。视觉语言模型(VLMs)为AUR软件提供了有前景的解决方案,因为它们能够泛化到未见过的物体,并通过从上下文线索推断信息在噪声条件下保持鲁棒性。尽管具有这种潜力,从软件工程的角度来看,它们在水下环境中的性能和不确定性仍未得到充分研究。基于一家海事系统保证与风险管理领域工业合作伙伴评估VLMs在此背景下潜在应用的需求,我们对AUR软件中基于VLM的感知模块进行了实证评估。我们通过计算性能、不确定性及其相互关系来评估它们检测水下垃圾的能力,以使软件工程师能够为其AUR软件选择合适的VLMs。