PEACE: Prompt Engineering Automation for CLIPSeg Enhancement in Aerial Robotics

From industrial to space robotics, safe landing is an essential component of flight operations. With the growing interest in artificial intelligence, we direct our attention to learning-based safe landing approaches. This paper extends our previous work, DOVESEI, which focused on a reactive UAV system that harnesses the capabilities of open-vocabulary image segmentation. Prompt-based safe landing zone segmentation using an open-vocabulary model is no longer just an idea; DOVESEI has shown it to be feasible. However, heuristically selecting the words of a prompt is not a reliable solution, because a fixed prompt cannot account for a changing environment, and the consequences can be detrimental if the observed environment is not well represented by the given prompt. Therefore, we introduce PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), which powers DOVESEI by automating prompt generation and engineering so that the system adapts to data distribution shifts. Our system is capable of performing safe landing operations with collision avoidance at altitudes as low as 20 meters using only monocular cameras and image segmentation. We take advantage of DOVESEI's dynamic focus to circumvent abrupt fluctuations in the terrain segmentation between frames in a video stream. PEACE shows promising improvements in prompt generation and engineering for aerial images compared to the standard prompt used for CLIP and CLIPSeg. Combining DOVESEI and PEACE, our system was able to improve successful safe landing zone selections by 58.62% compared to using only DOVESEI. All the source code is open source and available online.

