Enriching a robot's representation of its operating environment is a challenging task that aims to bridge the gap between low-level sensor readings and high-level semantic understanding. A rich representation often requires computationally demanding architectures, and purely point-cloud-based detection systems struggle with the everyday objects that the robot has to handle. To overcome these issues, we propose a graph-based framework that builds a semantic representation of robot environments from multiple sources. To acquire information from the environment, the framework combines classical computer vision tools with modern computer vision cloud services, keeping computation feasible on onboard hardware. By incorporating an ontology hierarchy with over 800 object classes, the framework achieves cross-domain adaptability, eliminating the need for environment-specific tools. The proposed approach can also handle small objects and integrate them into the semantic representation of the environment. The approach is implemented in the Robot Operating System (ROS), using the RViz visualizer for environment representation. This work is a first step towards a general-purpose framework that facilitates intuitive interaction and navigation across different domains.
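To make the idea of a graph-based semantic representation backed by an ontology hierarchy more concrete, the following is a minimal sketch in Python. It is not the implementation described in this work: the `SemanticGraph` class, the `ancestors` helper, the five-entry toy ontology (standing in for the hierarchy of over 800 classes), the relation names, and the `source` tags are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology (child class -> parent class).
# It stands in for the much larger hierarchy mentioned in the abstract.
ONTOLOGY = {
    "mug": "container",
    "bottle": "container",
    "container": "object",
    "chair": "furniture",
    "furniture": "object",
}


def ancestors(label):
    """Return the parent classes of a label, nearest parent first."""
    chain = []
    while label in ONTOLOGY:
        label = ONTOLOGY[label]
        chain.append(label)
    return chain


@dataclass
class SemanticGraph:
    """Nodes are detected entities; edges are named relations between them."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: list = field(default_factory=list)  # (source id, relation, target id)

    def add_object(self, node_id, label, position, source):
        # Store the detected label together with all of its ontology ancestors,
        # so that queries can be made at any level of the hierarchy.
        self.nodes[node_id] = {
            "label": label,
            "position": position,
            "source": source,  # assumed tag for the detector that produced the node
            "classes": [label] + ancestors(label),
        }

    def relate(self, src, relation, dst):
        self.edges.append((src, relation, dst))

    def find(self, cls):
        """Return the sorted ids of all nodes that belong to class `cls`."""
        return sorted(n for n, a in self.nodes.items() if cls in a["classes"])


# Build a small example graph from detections of two assumed sources.
graph = SemanticGraph()
graph.add_object("chair_1", "chair", (1.0, 2.0, 0.0), source="classical_cv")
graph.add_object("mug_1", "mug", (1.1, 2.0, 0.8), source="cloud_service")
graph.add_object("bottle_1", "bottle", (3.0, 0.5, 0.9), source="cloud_service")
graph.relate("mug_1", "near", "chair_1")

print(graph.find("container"))  # small objects grouped by a shared parent class
```

Storing the ancestor classes on each node lets a query such as `graph.find("container")` return both the mug and the bottle without any environment-specific rules, which is one plausible way an ontology hierarchy could support cross-domain queries.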