Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation

Despite advances in depth estimation, flying points remain a persistent failure mode: near object boundaries, depth estimators often predict spurious 3D points in the empty space between foreground and background surfaces. We trace this artifact to a standard modeling choice: assigning each pixel a single depth hypothesis. At boundaries, a pixel can straddle a foreground and a background surface, so its true depth is ambiguous between the two. A model that predicts a single depth cannot keep both possibilities, so training instead pulls the prediction toward an intermediate depth that lies on neither surface. We address this with MDA, a mixture-density representation that lets the model predict multiple depth hypotheses and their associated probabilities for each pixel. Near boundaries, different hypotheses can align with different surfaces, and the decoded depth is selected from one of these hypotheses rather than placed in the empty space between them. Across different backbones, MDA substantially improves boundary reconstruction and largely removes flying-point artifacts even under severe input blur, while adding negligible runtime overhead. The same mixture-density framework naturally extends to transparent objects, where it predicts multiple depth layers at transparent pixels, and to sky regions, where a dedicated component separates the unbounded sky from finite-depth regions, producing flying-point-free skylines. Project Page: https://biansy000.github.io/mda-site/.
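To make the decoding argument concrete, the following is a minimal sketch, not the authors' implementation. It assumes each pixel carries a short list of depth hypotheses with probabilities, and that decoding picks the most probable hypothesis; the abstract says only that the decoded depth is "selected from one of these hypotheses", so the argmax rule, the function names, and the toy numbers are illustrative assumptions. Averaging the hypotheses stands in for a single-depth estimate that has been pulled toward an intermediate value.

```python
# Toy illustration: averaging hypotheses produces a flying point,
# selecting a single hypothesis does not.
# All names and numbers are hypothetical, not taken from the paper.

def decode_mean(depths, probs):
    """Probability-weighted average of the hypotheses (single-depth behaviour)."""
    return sum(d * p for d, p in zip(depths, probs))

def decode_mode(depths, probs):
    """Return the hypothesis with the highest probability (assumed decoding rule)."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return depths[best]

# A boundary pixel straddling a foreground surface at 1 m
# and a background surface at 5 m.
depths = [1.0, 5.0]
probs = [0.6, 0.4]

mean_depth = decode_mean(depths, probs)  # lands between the two surfaces
mode_depth = decode_mode(depths, probs)  # lands on the foreground surface

print(f"mean decoding: {mean_depth:.2f} m")
print(f"mode decoding: {mode_depth:.2f} m")
```

In this toy case the averaged depth falls in the empty space between the two surfaces, which is exactly where a flying point would appear, while the selected depth lies on one of the surfaces.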

