Image quality plays an important role in the performance of deep neural networks (DNNs), which have been widely shown to be sensitive to changes in imaging conditions. Large-scale datasets often contain images captured under a wide range of conditions, prompting a need to quantify and understand their underlying quality distribution in order to better characterize DNN performance and robustness. Aligning the sensitivities of image quality metrics with those of DNNs ensures that quality estimates can act as proxies for image or dataset difficulty, independent of the task models trained or evaluated on the data. Conventional image quality assessment (IQA) seeks to measure quality in alignment with human perceptual judgments; here we instead seek a quality measure that is not only sensitive to imaging conditions but also well aligned with DNN sensitivities. We first ask whether conventional IQA metrics are also informative of DNN performance. To answer this question, we reframe IQA from a causal perspective and examine the conditions under which quality metrics are predictive of DNN performance. We show theoretically and empirically that current IQA metrics are weak predictors of DNN performance in the context of classification. We then use our causal framework to provide an alternative formulation and a new image quality metric that is more strongly correlated with DNN performance and can act as a prior on performance without training new task models. Our approach provides a means to directly estimate the quality distribution of large-scale image datasets, toward characterizing the relationship between dataset composition and DNN performance.
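One simple way to probe whether a quality metric is informative of classifier performance is to rank-correlate per-image quality scores with per-image correctness. The sketch below is only an illustration under stated assumptions: it is not the causal formulation or the new metric described in the abstract, the helper names (`ranks`, `pearson`, `spearman`) are hypothetical, and the scores and correctness labels are made-up toy values rather than results.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = mean_rank
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy, made-up values (assumption): one quality score per image, and whether
# a hypothetical classifier labeled that image correctly (1) or not (0).
quality_scores = [0.91, 0.85, 0.40, 0.77, 0.22, 0.63, 0.15, 0.58]
correct = [1, 1, 0, 1, 0, 1, 0, 0]

rho = spearman(quality_scores, correct)
print(f"Spearman correlation between quality and correctness: {rho:.3f}")
```

A coefficient near zero on real data would suggest the metric carries little information about classifier correctness, while a value near one would suggest it tracks difficulty closely. A correlation of this kind is only descriptive and does not by itself capture the causal conditions the abstract refers to.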