Estimating perceptual attributes of materials directly from images is a challenging task because of their complex, not fully understood interactions with external factors such as geometry and lighting. Supervised deep learning models have recently been shown to outperform traditional approaches, but they rely on large datasets of human-annotated images to predict perception accurately. Obtaining reliable annotations is costly, and the problem is aggravated by the limited ability of these models to generalise to different aspects of appearance. In this work, we show how a much smaller set of human annotations ("strong labels") can be effectively augmented with automatically derived "weak labels" in the context of learning a low-dimensional image-computable gloss metric. We evaluate three alternative weak labels for predicting human gloss perception from limited annotated data. Incorporating weak labels improves our gloss prediction beyond the current state of the art. Moreover, it enables a substantial reduction in human annotation costs without sacrificing accuracy, whether working with rendered images or real photographs.