Image Masking for Robust Self-Supervised Monocular Depth Estimation - 专知论文

会员服务 ·

0

估计/估计量 · 稳健性 · 掩码 · Learning · Performer ·

2023 年 2 月 1 日

Image Masking for Robust Self-Supervised Monocular Depth Estimation

翻译：面向鲁棒自监督单目深度估计的图像掩码方法

Hemang Chawla,Kishaan Jeeveswaran,Elahe Arani,Bahram Zonooz

from arxiv, Accepted at 2023 IEEE International Conference on Robotics and Automation (ICRA)

Self-supervised monocular depth estimation is a salient task for 3D scene understanding. Learned jointly with monocular ego-motion estimation, several methods have been proposed to predict accurate pixel-wise depth without using labeled data. Nevertheless, these methods focus on improving performance under ideal conditions without natural or digital corruptions. The general absence of occlusions is assumed even for object-specific depth estimation. These methods are also vulnerable to adversarial attacks, which is a pertinent concern for their reliable deployment in robots and autonomous driving systems. We propose MIMDepth, a method that adapts masked image modeling (MIM) for self-supervised monocular depth estimation. While MIM has been used to learn generalizable features during pre-training, we show how it could be adapted for direct training of monocular depth estimation. Our experiments show that MIMDepth is more robust to noise, blur, weather conditions, digital artifacts, occlusions, as well as untargeted and targeted adversarial attacks.

翻译：自监督单目深度估计是三维场景理解中的一项重要任务。通过与单目自运动估计联合学习，现有多种方法能够在无需标注数据的情况下预测精确的逐像素深度。然而，这些方法主要关注理想条件下的性能提升，并未考虑自然或数字形式的图像退化。即便是针对特定物体的深度估计，也普遍假设不存在遮挡。此外，这些方法对对抗性攻击较为脆弱，这对其在机器人和自动驾驶系统中的可靠部署构成了关键挑战。我们提出MIMDepth方法，该方法将掩码图像建模（MIM）自适应应用于自监督单目深度估计。尽管MIM此前多用于预训练阶段学习通用特征，我们展示了如何将其直接用于单目深度估计的训练过程。实验表明，MIMDepth对噪声、模糊、天气条件、数字伪影、遮挡以及无目标和有目标对抗攻击均具有更强的鲁棒性。

0

相关内容

估计/估计量

估计/估计量

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

专知会员服务

21+阅读 · 2020年6月13日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【深度估计| 2019最新综述】单目深度估计方法综述（Monocular Depth Estimation: A Survey）

专知会员服务

69+阅读 · 2019年11月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【泡泡一分钟】FarSight：从户外图像中实现远距离深度估计

【泡泡一分钟】FarSight：从户外图像中实现远距离深度估计

泡泡机器人SLAM

11+阅读 · 2019年5月22日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

泡泡机器人SLAM

25+阅读 · 2019年1月17日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

PTEN和ChABC双基因干预的脂肪间充质干细胞移植治疗大鼠急性脊髓损伤的疗效及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

低温低氧应激相关分子在大脑学习记忆功能损伤中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

油酰乙醇胺对缺血性脑卒中神经血管稳态重构的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

框架的冗余度

国家自然科学基金

0+阅读 · 2012年12月31日

TGF-β/Smads信号通路在干细胞移植治疗化疗损伤性卵巢早衰中的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于MOF功能化的一维光子晶体的构建及其传感性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于三维差时投影法的超分辨率车辆重建算法研究

国家自然科学基金

0+阅读 · 2010年12月31日

Pim激酶家族参与Abl介导细胞转化的分子机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

多基线InSAR获取高精度DEM技术研究

国家自然科学基金

1+阅读 · 2008年12月31日

DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling

Arxiv

0+阅读 · 2023年3月24日

Learning 3D-aware Image Synthesis with Unknown Pose Distribution

Arxiv

0+阅读 · 2023年3月23日

PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes

Arxiv

0+阅读 · 2023年3月23日

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Arxiv

0+阅读 · 2023年3月22日

Hand Avatar: Free-Pose Hand Animation and Rendering from Monocular Video

Arxiv

0+阅读 · 2023年3月22日

Rigidity-Aware Detection for 6D Object Pose Estimation

Arxiv

1+阅读 · 2023年3月22日

Fully Self-Supervised Depth Estimation from Defocus Clue

Arxiv

0+阅读 · 2023年3月22日

When Doubly Robust Methods Meet Machine Learning for Estimating Treatment Effects from Real-World Data: A Comparative Study

Arxiv

0+阅读 · 2023年3月22日

Monocular Visual-Inertial Depth Estimation

Arxiv

0+阅读 · 2023年3月21日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

VIP会员

文章信息

相关主题

估计/估计量

最新内容

美国从乌克兰无人机战争中学习经验

美国从乌克兰无人机战争中学习经验

专知会员服务

1+阅读 · 今天15:03

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

专知会员服务

0+阅读 · 今天14:31

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

专知会员服务

0+阅读 · 今天14:29

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

专知会员服务

12+阅读 · 6月20日

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

专知会员服务

4+阅读 · 6月19日

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

专知会员服务

7+阅读 · 6月19日

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

专知会员服务

6+阅读 · 6月18日

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

专知会员服务

8+阅读 · 6月18日

《廉价自杀式无人机战争的军事战略影响：乌克兰和伊朗案例研究》

《廉价自杀式无人机战争的军事战略影响：乌克兰和伊朗案例研究》

专知会员服务

11+阅读 · 6月18日

《面向反无人机作战的联邦式可解释射频–光电/红外情报融合：边缘人工智能优化、电子战韧性及分布式监视验证》

《面向反无人机作战的联邦式可解释射频–光电/红外情报融合：边缘人工智能优化、电子战韧性及分布式监视验证》

专知会员服务

11+阅读 · 6月18日

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

专知会员服务

7+阅读 · 6月17日

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

专知会员服务

12+阅读 · 6月17日

学习数据的几何：形状空间分析数学综述

学习数据的几何：形状空间分析数学综述

专知会员服务

8+阅读 · 6月17日

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

专知会员服务

21+阅读 · 6月17日

定向能反无人机系统最新发展动态

定向能反无人机系统最新发展动态

专知会员服务

10+阅读 · 6月17日

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

专知会员服务

21+阅读 · 2020年6月13日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【深度估计| 2019最新综述】单目深度估计方法综述（Monocular Depth Estimation: A Survey）

专知会员服务

69+阅读 · 2019年11月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

美国从乌克兰无人机战争中学习经验

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【泡泡一分钟】FarSight：从户外图像中实现远距离深度估计

【泡泡一分钟】FarSight：从户外图像中实现远距离深度估计

泡泡机器人SLAM

11+阅读 · 2019年5月22日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

泡泡机器人SLAM

25+阅读 · 2019年1月17日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling

Arxiv

0+阅读 · 2023年3月24日

Learning 3D-aware Image Synthesis with Unknown Pose Distribution

Arxiv

0+阅读 · 2023年3月23日

PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes

Arxiv

0+阅读 · 2023年3月23日

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Arxiv

0+阅读 · 2023年3月22日

Hand Avatar: Free-Pose Hand Animation and Rendering from Monocular Video

Arxiv

0+阅读 · 2023年3月22日

Rigidity-Aware Detection for 6D Object Pose Estimation

Arxiv

1+阅读 · 2023年3月22日

Fully Self-Supervised Depth Estimation from Defocus Clue

Arxiv

0+阅读 · 2023年3月22日

When Doubly Robust Methods Meet Machine Learning for Estimating Treatment Effects from Real-World Data: A Comparative Study

Arxiv

0+阅读 · 2023年3月22日

Monocular Visual-Inertial Depth Estimation

Arxiv

0+阅读 · 2023年3月21日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

相关基金

PTEN和ChABC双基因干预的脂肪间充质干细胞移植治疗大鼠急性脊髓损伤的疗效及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

低温低氧应激相关分子在大脑学习记忆功能损伤中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

油酰乙醇胺对缺血性脑卒中神经血管稳态重构的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

框架的冗余度

国家自然科学基金

0+阅读 · 2012年12月31日

TGF-β/Smads信号通路在干细胞移植治疗化疗损伤性卵巢早衰中的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于MOF功能化的一维光子晶体的构建及其传感性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于三维差时投影法的超分辨率车辆重建算法研究

国家自然科学基金

0+阅读 · 2010年12月31日

Pim激酶家族参与Abl介导细胞转化的分子机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

多基线InSAR获取高精度DEM技术研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员