Single-view depth estimation refers to the ability to derive three-dimensional information per pixel from a single two-dimensional image. It is an ill-posed problem, because multiple depth solutions are consistent with the 3D geometry observed from a single view. While deep neural networks have proven effective at estimating depth from a single view, the majority of current methods are deterministic. Accounting for uncertainty in the predictions can avoid disastrous consequences in fields such as autonomous driving or medical robotics. We address this problem by quantifying the uncertainty of supervised single-view depth with Bayesian deep neural networks. However, there are scenarios, notably endoscopic imaging in medicine, where such annotated depth data is not available. To alleviate this lack of data, we present a method that improves synthetic-to-real domain transfer: an uncertainty-aware teacher-student architecture trained in a self-supervised manner that takes the teacher's uncertainty into account. Given the vast amount of unannotated data and the difficulty of capturing annotated depth in medical minimally invasive procedures, we advocate a fully self-supervised approach that requires only RGB images and the geometric and photometric calibration of the endoscope. In endoscopic imaging, the camera and light sources are co-located at a small distance from the target surfaces. In this setup, brighter areas of the image are nearer to the camera, while darker areas are farther away. Building on this observation, we exploit the fact that, for a given albedo and surface orientation, pixel brightness is inversely proportional to the square of the distance to the surface. We therefore propose illumination as a strong single-view self-supervisory signal for deep neural networks.
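The uncertainty-aware teacher-student idea can be sketched as a per-pixel loss in which the teacher's predictive variance down-weights the supervision it provides to the student. This is a minimal illustrative sketch, assuming a heteroscedastic (Gaussian negative log-likelihood style) weighting; the function name and the exact form of the loss are assumptions, not the paper's published formulation.

```python
import numpy as np

def uncertainty_weighted_student_loss(student_depth, teacher_depth, teacher_var):
    """Hypothetical sketch of an uncertainty-aware teacher-student loss.

    Pixels where the teacher is uncertain (large predictive variance)
    contribute less to the student's supervision. The heteroscedastic
    weighting used here is an assumption for illustration only.
    """
    residual = (student_depth - teacher_depth) ** 2
    # Down-weight squared residuals by the teacher's per-pixel variance;
    # the log term prevents the trivial solution of inflating the variance.
    weighted = residual / teacher_var + np.log(teacher_var)
    return float(np.mean(weighted))
```

With unit teacher variance the loss reduces to the mean squared residual, and raising the variance at a pixel shrinks that pixel's influence on the student.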
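The inverse-square relation between pixel brightness and distance can be turned into a per-pixel photometric self-supervision signal: the depth predicted by the network implies a brightness, which is compared against the observed image. The sketch below is a minimal illustration under the assumption of a co-located camera and light source and a single scalar gain `gain` folding together albedo, shading, and light intensity; all names are hypothetical.

```python
import numpy as np

def illumination_selfsup_loss(pred_depth, observed_brightness, gain=1.0):
    """Minimal sketch of illumination-based self-supervision.

    With the light source co-located with the camera, brightness falls off
    as B ~ gain / d**2 for a given albedo and surface orientation. The
    predicted depth is penalized when the brightness it implies disagrees
    with the observed image. The scalar `gain` is an illustrative
    simplification of the endoscope's photometric calibration.
    """
    # Clip depth away from zero to avoid division blow-up at degenerate pixels.
    implied_brightness = gain / np.clip(pred_depth, 1e-6, None) ** 2
    return float(np.mean((implied_brightness - observed_brightness) ** 2))
```

When the observed brightness exactly matches the inverse-square prediction the loss vanishes, so gradient descent pushes the depth map toward the geometry consistent with the illumination.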