Monocular depth estimation (MDE) is a useful tool for robotic perception, but its predictions are often uncertain and inaccurate in challenging environments such as surgical scenes, where textureless surfaces, specular reflections, and occlusions are common. To address this, we propose ProbeMDE, a cost-aware active sensing framework that combines RGB images with sparse proprioceptive measurements for MDE. Our approach uses an ensemble of MDE models to predict dense depth maps conditioned on both RGB images and a sparse set of known depth measurements obtained via proprioception, that is, at points where the robot has touched the environment in a known configuration. We quantify predictive uncertainty via the ensemble's variance and compute the gradient of this uncertainty with respect to candidate measurement locations. To prevent mode collapse while selecting maximally informative locations to touch, we apply Stein Variational Gradient Descent (SVGD) over this gradient map. We validate our method in both simulated and physical experiments on central airway obstruction surgical phantoms. Our results demonstrate that our approach outperforms baseline methods across standard depth estimation metrics, achieving higher accuracy while minimizing the number of required proprioceptive measurements. Project page: https://brittonjordan.github.io/probe_mde/
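To make the selection step concrete, the following is a minimal sketch of how ensemble variance and SVGD could interact when choosing touch locations. It is not the ProbeMDE implementation. It assumes the per-pixel log-variance of the ensemble can stand in for an unnormalized log-density over image locations, that its finite-difference gradient plays the role of the gradient map, and that an RBF kernel with a median-heuristic bandwidth is used. The function names (`ensemble_variance`, `svgd_step`, `select_probe_locations`) and all parameter values are hypothetical.

```python
import numpy as np


def ensemble_variance(depth_preds):
    """Per-pixel variance across ensemble members; depth_preds has shape (K, H, W)."""
    return depth_preds.var(axis=0)


def svgd_step(particles, score, step_size=0.5):
    """One SVGD update with an RBF kernel.

    particles: (n, d) candidate locations.
    score:     (n, d) gradient of the (assumed) log-density at each particle.
    """
    n = particles.shape[0]
    diffs = particles[:, None, :] - particles[None, :, :]  # diffs[i, j] = x_i - x_j
    sq_dists = (diffs ** 2).sum(axis=-1)
    # Median heuristic for the kernel bandwidth (an assumption of this sketch).
    bandwidth = np.median(sq_dists) / np.log(n + 1) + 1e-8
    kernel = np.exp(-sq_dists / bandwidth)  # symmetric (n, n) matrix
    # Attraction: kernel-weighted scores pull particles toward high uncertainty.
    attract = kernel @ score
    # Repulsion: kernel gradient pushes particles apart, which discourages
    # all particles from collapsing onto a single mode.
    repulse = (2.0 / bandwidth) * (kernel[:, :, None] * diffs).sum(axis=1)
    return particles + step_size * (attract + repulse) / n


def select_probe_locations(depth_preds, n_probes=4, n_iters=200, seed=0):
    """Return integer (row, col) touch candidates and the variance map."""
    var_map = ensemble_variance(depth_preds)
    # Assumption: log-variance acts as an unnormalized log-density.
    log_p = np.log(var_map + 1e-6)
    grad_row, grad_col = np.gradient(log_p)
    height, width = var_map.shape
    upper = np.array([height - 1, width - 1], dtype=float)
    rng = np.random.default_rng(seed)
    particles = rng.uniform(0.0, 1.0, size=(n_probes, 2)) * upper
    for _ in range(n_iters):
        idx = np.rint(particles).astype(int)
        # Nearest-pixel lookup of the gradient map at each particle.
        score = np.stack(
            [grad_row[idx[:, 0], idx[:, 1]], grad_col[idx[:, 0], idx[:, 1]]],
            axis=1,
        )
        particles = np.clip(svgd_step(particles, score), 0.0, upper)
    return np.rint(particles).astype(int), var_map


if __name__ == "__main__":
    # Synthetic stand-in for K=5 ensemble depth predictions on a 32x32 image.
    preds = np.random.default_rng(1).normal(size=(5, 32, 32))
    locations, variance = select_probe_locations(preds)
    print(locations)
```

The repulsive term is the part that matters for the mode-collapse argument: with plain gradient ascent on the uncertainty map, several candidates can converge to the same peak, whereas the kernel gradient spreads them over distinct uncertain regions.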