Recent advances in machine learning have greatly benefited object detection and 6D pose estimation. However, textureless and metallic objects still pose a significant challenge because they offer few visual cues and because CNNs are biased toward texture. To address this issue, we propose a strategy for inducing a shape bias in CNN training. In particular, by randomizing the textures applied to object surfaces during data rendering, we create training data without consistent textural cues. This methodology integrates seamlessly into existing data rendering engines and adds negligible computational overhead to data rendering and network training. Our findings demonstrate that the shape bias we induce via randomized texturing improves over existing approaches using style transfer. We evaluate with three detectors and two pose estimators. For the most recent object detector and for pose estimation in general, estimation accuracy improves for textureless and metallic objects. Additionally, we show that our approach increases pose estimation accuracy in the presence of image noise and strong illumination changes. Code and datasets are publicly available at github.com/hoenigpeter/randomized_texturing.
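The core idea of randomized texturing, as described in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the texture file names, and the per-scene sampling scheme are all assumptions made for illustration, and the actual rendering step is left out. The sketch only plans which texture each object would receive in each rendered scene, so that no object keeps a consistent texture across the training set.

```python
import random
from typing import Dict, List, Sequence


def assign_random_textures(
    object_ids: Sequence[str],
    texture_pool: Sequence[str],
    rng: random.Random,
) -> Dict[str, str]:
    """Pick an independent random texture for every object in one scene.

    Hypothetical helper: a real pipeline would hand the chosen texture
    to the rendering engine's material system.
    """
    return {obj: rng.choice(texture_pool) for obj in object_ids}


def plan_scenes(
    object_ids: Sequence[str],
    texture_pool: Sequence[str],
    num_scenes: int,
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Draw a fresh texture assignment for each scene to be rendered.

    Because textures are re-sampled per scene, texture carries no
    consistent information about object identity; shape still does.
    """
    rng = random.Random(seed)  # seeded so the plan is reproducible
    return [
        assign_random_textures(object_ids, texture_pool, rng)
        for _ in range(num_scenes)
    ]


if __name__ == "__main__":
    # Placeholder object and texture names, assumed for this example.
    objects = ["obj_01", "obj_02"]
    textures = ["wood.png", "marble.png", "fabric.png", "noise.png"]
    for i, scene in enumerate(plan_scenes(objects, textures, num_scenes=3)):
        print(i, scene)
```

Sampling happens once per scene rather than once per dataset, which is what removes the consistent textural cue; the cost is a single random choice per object, in line with the negligible overhead the abstract reports.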