Tracking objects that move within dynamic environments is a core challenge in robotics. Recent research has advanced this topic significantly; however, many existing approaches remain inefficient because they rely on heavy foundation models. To address this limitation, we propose LOST-3DSG, a lightweight open-vocabulary 3D scene graph designed to track dynamic objects in real-world environments. Our method adopts a semantic approach to entity tracking based on word2vec and sentence embeddings, enabling an open-vocabulary representation while avoiding the need to store dense CLIP visual features. As a result, LOST-3DSG outperforms approaches that rely on high-dimensional visual embeddings. We evaluate our method through qualitative and quantitative experiments conducted in a real 3D environment using a TIAGo robot. The results demonstrate the effectiveness and efficiency of LOST-3DSG in dynamic object tracking. Code and supplementary material are publicly available on the project website at https://lab-rococo-sapienza.github.io/lost-3dsg/.
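To illustrate the kind of embedding-based entity tracking described above, the sketch below matches a newly detected object label to existing scene-graph nodes by cosine similarity over label embeddings, merging with the best node above a threshold and otherwise spawning a new one. This is a minimal illustration, not the paper's implementation: the tiny 3-D vectors, the `match_detection` helper, and the 0.9 threshold are all hypothetical stand-ins for the actual learned word2vec / sentence embeddings.

```python
import numpy as np

# Toy label embeddings standing in for word2vec / sentence-embedding vectors.
# These 3-D vectors are illustrative only; real embeddings are learned and
# much higher-dimensional.
EMBEDDINGS = {
    "mug":    np.array([0.9, 0.1, 0.0]),
    "cup":    np.array([0.85, 0.2, 0.05]),
    "laptop": np.array([0.0, 0.1, 0.95]),
}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_detection(label, graph_nodes, threshold=0.9):
    """Associate a detected label with the most similar scene-graph node,
    or return None when no node is similar enough (a new node is needed)."""
    query = EMBEDDINGS[label]
    best_node, best_sim = None, -1.0
    for node in graph_nodes:
        sim = cosine(query, EMBEDDINGS[node])
        if sim > best_sim:
            best_node, best_sim = node, sim
    if best_sim >= threshold:
        return best_node, best_sim
    return None, best_sim

# A detection labelled "cup" merges with the semantically close "mug" node,
# while "laptop" is dissimilar and would start a new node.
print(match_detection("cup", ["mug", "laptop"]))
```

Because matching operates on compact text embeddings rather than dense per-object visual features, the scene graph stays lightweight while remaining open-vocabulary: any label with an embedding can be compared against existing nodes.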