To navigate autonomously in real-world environments, especially in search and rescue operations, Unmanned Aerial Vehicles (UAVs) need comprehensive maps to ensure safety. However, the prevalent metric maps often lack the semantic information crucial for holistic scene comprehension. In this paper, we propose a system that constructs a probabilistic metric map enriched with object information extracted from RGB-D images of the environment. Our approach combines a state-of-the-art YOLOv8-based object detection framework at the front end with a 2D SLAM method, CartoGrapher, at the back end. To track and position the semantic object classes extracted by the front end, we employ the BoT-SORT method. A novel association method is introduced to extract the positions of objects and project them onto the metric map. Unlike previous research, our approach accounts for reliable navigation in environments containing various hollow-bottom objects. The output of our system is a probabilistic map whose representation is enriched with object-specific attributes: class, accurate position, and height. We conducted a number of experiments to evaluate the proposed approach. The results show that the robot can effectively produce augmented semantic maps containing several objects (notably chairs and desks). Furthermore, our system is evaluated on an embedded computer, a Jetson Xavier AGX unit, to demonstrate its use in real-world applications.
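The abstract does not spell out the association method, so the following is only a minimal sketch of one plausible way to place a detected object on a 2D metric map, not the authors' implementation. It assumes a pinhole camera model, a camera mounted at the robot origin and looking along the robot's forward axis, a 2D robot pose supplied by the SLAM back end, and a grid map with a known origin and resolution. All names (`Intrinsics`, `Pose2D`, `locate_detection`, and the parameter values) are hypothetical.

```python
import math
from typing import NamedTuple, Tuple


class Intrinsics(NamedTuple):
    """Pinhole camera intrinsics in pixels (hypothetical values supplied by caller)."""
    fx: float
    fy: float
    cx: float
    cy: float


class Pose2D(NamedTuple):
    """Robot pose in the map frame, as a 2D SLAM back end might report it."""
    x: float
    y: float
    yaw: float  # radians, counter-clockwise from the map x-axis


def pixel_to_camera(u: float, v: float, depth_m: float,
                    k: Intrinsics) -> Tuple[float, float, float]:
    """Back-project a pixel with metric depth into the camera optical frame
    (x right, y down, z forward)."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return x, y, depth_m


def camera_to_map(p_cam: Tuple[float, float, float],
                  pose: Pose2D) -> Tuple[float, float]:
    """Transform a camera-frame point to map-frame (x, y).
    Assumption: the camera sits at the robot origin and looks along robot +x."""
    forward = p_cam[2]   # optical z maps to robot x (forward)
    left = -p_cam[0]     # optical x (right) maps to robot -y
    mx = pose.x + math.cos(pose.yaw) * forward - math.sin(pose.yaw) * left
    my = pose.y + math.sin(pose.yaw) * forward + math.cos(pose.yaw) * left
    return mx, my


def map_to_cell(mx: float, my: float, origin: Tuple[float, float],
                resolution: float) -> Tuple[int, int]:
    """Convert map coordinates (metres) to a (row, col) grid cell index."""
    col = math.floor((mx - origin[0]) / resolution)
    row = math.floor((my - origin[1]) / resolution)
    return row, col


def locate_detection(bbox: Tuple[float, float, float, float], depth_m: float,
                     k: Intrinsics, pose: Pose2D,
                     origin: Tuple[float, float] = (0.0, 0.0),
                     resolution: float = 0.05):
    """Place the centre of a bounding box (x1, y1, x2, y2) on the grid map."""
    u = 0.5 * (bbox[0] + bbox[2])
    v = 0.5 * (bbox[1] + bbox[3])
    mx, my = camera_to_map(pixel_to_camera(u, v, depth_m, k), pose)
    return (mx, my), map_to_cell(mx, my, origin, resolution)
```

In a full system, `bbox` would come from the detector and tracker, `depth_m` from the depth image (for example, a median over the box), and `pose` from the SLAM back end. The object class and an estimated height could then be stored alongside the returned cell.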