Geometrically accurate and semantically expressive map representations have proven invaluable for robot deployment and task planning in unknown environments. Nevertheless, real-time, open-vocabulary semantic understanding of large-scale unknown environments remains an open challenge, mainly due to its computational requirements. In this paper we present FindAnything, an open-world mapping framework that incorporates vision-language information into dense volumetric submaps. Thanks to the use of vision-language features, FindAnything combines pure geometric and open-vocabulary semantic information for a higher level of scene understanding. The framework stores open-vocabulary information efficiently by aggregating features at the object level: pixelwise vision-language features are aggregated over eSAM segments, which are in turn integrated into object-centric volumetric submaps, providing a mapping from open-vocabulary queries to 3D geometry that is scalable in terms of memory usage as well. We demonstrate that FindAnything performs on par with the state of the art in terms of semantic accuracy while being substantially faster and more memory-efficient, allowing its deployment in large-scale environments and on resource-constrained devices, such as MAVs. We show that the real-time capabilities of FindAnything make it useful for downstream tasks, such as autonomous MAV exploration in a simulated Search and Rescue scenario. Project Page: https://ethz-mrl.github.io/findanything/.
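The object-level aggregation described above can be sketched in a few lines: per-pixel vision-language features are averaged within each segment mask and normalized, and a text query is matched against the resulting per-object features by cosine similarity. This is a minimal illustrative sketch, not the paper's implementation; the function names, the NumPy-based averaging, and the pre-computed feature/mask inputs are all assumptions.

```python
import numpy as np

def aggregate_segment_features(pixel_features, segment_mask, num_segments):
    """Average per-pixel vision-language features within each segment.

    pixel_features: (H, W, D) array of per-pixel embeddings (e.g. from a
        vision-language model); segment_mask: (H, W) integer segment ids
        (e.g. from a SAM-style segmenter). Returns (num_segments, D)
        L2-normalized per-object features.
    """
    D = pixel_features.shape[-1]
    feats = np.zeros((num_segments, D))
    for s in range(num_segments):
        pix = pixel_features[segment_mask == s]
        if len(pix) > 0:
            f = pix.mean(axis=0)
            feats[s] = f / (np.linalg.norm(f) + 1e-8)  # unit-normalize
    return feats

def query_objects(object_features, text_embedding):
    """Cosine similarity between a query embedding and per-object features."""
    q = text_embedding / (np.linalg.norm(text_embedding) + 1e-8)
    return object_features @ q
```

In the full system these per-object features would be attached to object-centric volumetric submaps, so a language query resolves directly to 3D geometry; here they are just rows of a matrix.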