ROS (Robot Operating System) packages have become an increasingly popular type of software artifact for reuse in robotic software development. However, finding ROS packages that closely match a system's functional requirements among the vast number of available packages is a nontrivial task with current search methods. Traditional search methods for ROS packages typically involve entering keywords related to robotic tasks into general-purpose search engines or code hosting platforms to obtain an approximate list of potentially suitable packages. The accuracy of these methods remains relatively low because task-related keywords may not precisely match the functionalities that the packages offer. To improve search accuracy, this paper presents a novel semantic-based search approach that relies on a semantic-level ROS Package Knowledge Graph (RPKG) to automatically retrieve the most suitable ROS packages. First, to construct the RPKG, we employ multi-dimensional feature extraction techniques to extract semantic concepts from a dataset of ROS package text descriptions; the extracted semantic features yield a substantial number of entities and relationships. We then create a small robot domain-specific corpus and further fine-tune a pre-trained language model, BERT-ROS, to generate embeddings that effectively represent the semantics of the extracted features. These embeddings enable semantic-level understanding and comparison during ROS package search within the RPKG. Second, we introduce a novel semantic matching-based search algorithm that incorporates the weighted similarities of multiple features from user search queries and retrieves more accurate ROS packages than the traditional keyword search method.
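As a rough illustration of the weighted multi-feature matching described in the abstract, the following minimal sketch ranks packages by a weighted sum of per-feature cosine similarities. It is not the paper's algorithm: the feature names (`function`, `hardware`), the weights, the toy two-dimensional vectors, and the helper names `cosine`, `weighted_score`, and `search` are all assumptions made for illustration. In practice the vectors would presumably come from an embedding model such as BERT-ROS.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def weighted_score(query, package, weights):
    """Weighted average of per-feature similarities.

    A feature missing from either side contributes zero similarity
    (an assumption of this sketch, not something the abstract states).
    """
    total_weight = sum(weights.values())
    if total_weight == 0.0:
        return 0.0
    score = 0.0
    for feature, weight in weights.items():
        if feature in query and feature in package:
            score += weight * cosine(query[feature], package[feature])
    return score / total_weight

def search(query, packages, weights, top_k=3):
    """Return the top_k (name, score) pairs, highest score first."""
    scored = [(name, weighted_score(query, feats, weights))
              for name, feats in packages.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy data: each package has one embedding per feature.
packages = {
    "slam_pkg": {"function": [1.0, 0.0], "hardware": [0.0, 1.0]},
    "nav_pkg": {"function": [1.0, 0.0], "hardware": [1.0, 0.0]},
    "arm_pkg": {"function": [0.0, 1.0], "hardware": [1.0, 0.0]},
}
query = {"function": [1.0, 0.0], "hardware": [0.0, 1.0]}
weights = {"function": 0.7, "hardware": 0.3}  # illustrative weights only

results = search(query, packages, weights)
for name, score in results:
    print(f"{name}: {score:.2f}")
```

With these toy inputs, `slam_pkg` matches the query on both features and ranks first, `nav_pkg` matches only on the more heavily weighted `function` feature and ranks second, and `arm_pkg` matches on neither. Raising the weight of a feature increases its influence on the ranking, which is the intuition behind combining several feature similarities rather than relying on a single keyword match.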