Object pose estimation is a fundamental problem in computer vision and plays a critical role in virtual reality and embodied intelligence, where agents must understand and interact with objects in 3D space. Recently, score-based generative models have partially resolved the rotational-symmetry ambiguity in category-level pose estimation, but their efficiency remains limited by the high sampling cost of score-based diffusion. In this work, we propose RFM-Pose, a new framework that accelerates category-level 6D object pose generation while actively evaluating sampled hypotheses. To improve sampling efficiency, we adopt a flow-matching generative model and generate pose candidates along an optimal-transport path from a simple prior to the pose distribution. To further refine these candidates, we cast the flow-matching sampling process as a Markov decision process and apply proximal policy optimization to fine-tune the sampling policy. In particular, we interpret the flow field as a learnable policy and map an estimator to a value network, enabling joint optimization of pose generation and hypothesis scoring within a reinforcement learning framework. Experiments on the REAL275 benchmark demonstrate that RFM-Pose achieves favorable performance while significantly reducing computational cost. Moreover, like prior work, our approach can be readily adapted to object pose tracking and attains competitive results in that setting.
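To illustrate why flow matching samples more cheaply than score-based diffusion, the following is a minimal, hypothetical sketch (not the paper's implementation): Euler integration of a learned velocity field from a prior sample toward the data distribution. For the optimal-transport (rectified) conditional path, the velocity between a prior sample `x0` and a target `x1` is constant, so very few integration steps suffice. All names here (`sample_flow`, `velocity_field`) are illustrative assumptions.

```python
import numpy as np

def sample_flow(velocity_field, x0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps.

    Flow-matching sampling transports a sample from a simple prior
    along the learned velocity field toward the target distribution,
    typically with far fewer steps than score-based diffusion needs.
    """
    x = np.asarray(x0, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_field(x, t)  # one Euler step along the flow
    return x

# Toy example: for the optimal-transport path between x0 and x1 the
# conditional velocity is the constant x1 - x0, so even a single Euler
# step lands exactly on the target.
x0 = np.zeros(3)
x1 = np.array([0.5, -1.0, 2.0])
v = lambda x, t: x1 - x0
print(sample_flow(v, x0, num_steps=1))
```

In RFM-Pose the velocity field would instead be a trained network over the 6D pose manifold, and each Euler step becomes one action of the sampling policy that PPO fine-tunes.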