Tracking objects that move within dynamic environments is a core challenge in robotics. Recent research has advanced this topic significantly; however, many existing approaches remain inefficient because they rely on heavy foundation models. To address this limitation, we propose LOST-3DSG, a lightweight open-vocabulary 3D scene graph designed to track dynamic objects in real-world environments. Our method adopts a semantic approach to entity tracking based on word2vec and sentence embeddings, enabling an open-vocabulary representation while avoiding the need to store dense CLIP visual features. As a result, LOST-3DSG outperforms approaches that rely on high-dimensional visual embeddings. We evaluate our method through qualitative and quantitative experiments conducted in a real 3D environment using a TIAGo robot. The results demonstrate the effectiveness and efficiency of LOST-3DSG in dynamic object tracking. Code and supplementary material are publicly available on the project website at https://lab-rococo-sapienza.github.io/lost-3dsg/.
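To illustrate the idea of label-based semantic tracking, the following is a minimal sketch (not the authors' implementation): it associates a newly detected open-vocabulary label with an existing scene-graph node by comparing sentence embeddings rather than stored visual features. The encoder name, the `match_label` helper, and the similarity threshold are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumption, not the LOST-3DSG codebase): match a new object
# label to existing scene-graph nodes via sentence-embedding similarity,
# avoiding the storage of dense per-object visual embeddings.
from sentence_transformers import SentenceTransformer, util

# Lightweight sentence encoder; the specific model is an illustrative choice.
model = SentenceTransformer("all-MiniLM-L6-v2")

def match_label(new_label: str, node_labels: list[str], threshold: float = 0.6):
    """Return the index of the best-matching existing node label,
    or None if nothing is semantically close enough."""
    if not node_labels:
        return None
    new_emb = model.encode(new_label, convert_to_tensor=True)
    node_embs = model.encode(node_labels, convert_to_tensor=True)
    sims = util.cos_sim(new_emb, node_embs)[0]  # cosine similarity to each node
    best = int(sims.argmax())
    return best if float(sims[best]) >= threshold else None

# Usage: an open-vocabulary detection "mug" can be associated with an
# existing "coffee cup" node purely through text embeddings.
print(match_label("mug", ["coffee cup", "office chair", "laptop"]))
```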