Localizing target objects in images is an important task in computer vision. It is often the first step towards solving a variety of applications in autonomous driving, maintenance, quality assurance, robotics, and augmented reality. Best-in-class solutions for this task rely on deep neural networks, which require representative training data for best performance. Creating datasets of sufficient quality, variety, and size is often difficult, error-prone, and expensive. This is where the method of luminance keying can help: it provides a simple yet effective way to record high-quality data for training object detection and segmentation models. We extend previous work that applied luminance keying to the common YCB-V set of household objects by recording the remaining objects of the YCB superset. The additional variety of objects (transparent objects, multiple color variations, non-rigid objects) further demonstrates the usefulness of luminance keying and can be used to test the applicability of the approach to new 2D object detection and segmentation algorithms.
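The core keying step can be illustrated with a minimal sketch: given an image recorded against a uniform low-luminance backdrop, a per-pixel luminance threshold separates the object from the background. The function name, the Rec. 601 luma weights, and the threshold value below are illustrative assumptions, not the exact pipeline of the cited work.

```python
import numpy as np

def luminance_key_mask(rgb: np.ndarray, threshold: float = 30.0) -> np.ndarray:
    """Segment the foreground object from a dark keying backdrop.

    rgb: HxWx3 uint8 image recorded against a low-luminance background.
    threshold: luminance cutoff in [0, 255]; pixels above it count as object.
    (Threshold and luma weights are illustrative choices.)
    """
    # Rec. 601 luma approximation; any perceptual luminance estimate works.
    luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return luma > threshold

# Example: a bright object patch on a near-black backdrop.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[1:3, 1:3] = 200  # synthetic "object" pixels
mask = luminance_key_mask(img)
```

In practice the resulting binary mask doubles as a free segmentation label, which is what makes luminance keying attractive for generating training annotations at low cost.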