Super-resolution (SR) is an ill-posed inverse problem with a large set of feasible solutions that are consistent with a given low-resolution image. Various deterministic algorithms aim to find a single solution that balances fidelity and perceptual quality; however, this trade-off often causes visual artifacts that introduce ambiguity in information-centric applications. Diffusion models (DMs), on the other hand, excel at generating a diverse set of feasible SR images that span the solution space. The challenge is then how to determine the most likely solution among this set in a trustworthy manner. We observe that quantitative measures, such as PSNR, LPIPS, and DISTS, are not reliable indicators for resolving ambiguous cases. To this end, we propose employing human feedback: we ask human subjects to select a small number of likely samples, and we ensemble the averages of the selected samples. This strategy leverages the high-quality image generation capabilities of DMs while recognizing the importance of obtaining a single trustworthy solution, especially in use cases such as the identification of specific digits or letters, where generating multiple feasible solutions may not lead to a reliable outcome. Experimental results demonstrate that our proposed strategy provides more trustworthy solutions than state-of-the-art SR methods.
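As a rough illustration of the selection-and-averaging idea, the following sketch averages the DM samples that human subjects picked most often. It is only a sketch under stated assumptions: the function name `ensemble_selected`, the vote-count representation of human feedback, and the `top_k` cutoff are hypothetical choices made for illustration, and the exact aggregation rule used in the experiments may differ.

```python
import numpy as np


def ensemble_selected(samples, votes, top_k=3):
    """Average the top_k SR samples chosen most often by human raters.

    samples: array-like of shape (N, H, W) or (N, H, W, C), one entry per
             candidate SR image generated for the same low-resolution input.
    votes:   length-N array-like; votes[i] is the number of raters who
             marked sample i as likely (hypothetical feedback format).
    top_k:   number of most-voted samples to average (assumed parameter).

    Returns the pixel-wise mean image and the indices of the samples used.
    """
    samples = np.asarray(samples, dtype=np.float64)
    votes = np.asarray(votes)
    if samples.shape[0] != votes.shape[0]:
        raise ValueError("samples and votes must have the same length")
    if not 1 <= top_k <= samples.shape[0]:
        raise ValueError("top_k must be between 1 and the number of samples")

    # Sort by descending vote count; the stable sort keeps the lower index
    # first when two samples receive the same number of votes.
    order = np.argsort(-votes, kind="stable")[:top_k]

    # Pixel-wise average of the selected candidates.
    return samples[order].mean(axis=0), order
```

Calling `ensemble_selected(samples, votes, top_k=2)` on a stack of candidate images returns a single averaged image together with the indices of the two most-voted candidates, which keeps the output to one solution rather than a set.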