Monocular normal estimation aims to estimate the normal map from a single RGB image of an object under arbitrary lighting. Existing methods rely on deep models to directly predict normal maps. However, they often suffer from 3D misalignment: while the estimated normal maps may appear visually plausible, the reconstructed surfaces often fail to align with the true geometric details. We argue that this misalignment stems from the current paradigm: the model struggles to distinguish and reconstruct varying geometry represented in normal maps, because differences in the underlying geometry are reflected only through relatively subtle color variations. To address this issue, we propose a new paradigm that reformulates normal estimation as shading sequence estimation, where shading sequences are more sensitive to geometric variation. Building on this paradigm, we present RoSE, a method that leverages image-to-video generative models to predict shading sequences. The predicted shading sequences are then converted into normal maps by solving a simple ordinary least-squares problem. To enhance robustness and better handle complex objects, RoSE is trained on a synthetic dataset, MultiShade, with diverse shapes, materials, and lighting conditions. Experiments demonstrate that RoSE achieves state-of-the-art performance on real-world benchmark datasets for object-based monocular normal estimation.
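The shading-to-normal conversion described above can be illustrated with a minimal sketch. It assumes Lambertian shading under known per-frame directional lights, so each pixel's shading values across the sequence satisfy an overdetermined linear system in its normal; the function name, array shapes, and the assumption of known light directions are illustrative, not part of RoSE's actual specification:

```python
import numpy as np

def normals_from_shading(shadings, lights):
    """Recover per-pixel normals from a shading sequence by ordinary least squares.

    shadings: (T, H, W) array, shading intensity of each pixel under T lights.
    lights:   (T, 3) array, unit light direction for each frame (assumed known).

    Under a Lambertian model, shading s_t = l_t . n, so stacking the T
    observations per pixel gives a linear system L n = s solved by lstsq.
    """
    T, H, W = shadings.shape
    S = shadings.reshape(T, -1)                       # (T, H*W) observations
    N, *_ = np.linalg.lstsq(lights, S, rcond=None)    # (3, H*W) raw normals
    N = N / (np.linalg.norm(N, axis=0, keepdims=True) + 1e-8)  # unit length
    return N.reshape(3, H, W)
```

With at least three non-coplanar light directions the system is full rank and the normals are determined up to the final normalization; in practice, shadowed or clipped pixels would violate the linear model and need masking.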