Deep image prior (DIP) and its variants have shown remarkable potential for solving inverse problems in computer vision without any extra training data. Practical DIP models are often substantially overparameterized. During fitting, these models first learn mostly the desired visual content and then pick up the modeling and observational noise, i.e., they overfit. The practicality of DIP therefore often depends critically on good early stopping (ES) that captures this transition period. In this regard, the majority of DIP works for vision tasks demonstrate only the potential of the models: they report the peak performance against the ground truth but provide no clue about how to operationally obtain near-peak performance without access to the ground truth. In this paper, we set out to break this practicality barrier of DIP and propose an efficient ES strategy that consistently detects near-peak performance across several vision tasks and DIP variants. Based on a simple measure of dispersion of consecutive DIP reconstructions, our ES method not only outpaces the existing ones, which work only in very narrow domains, but also remains effective when combined with a number of methods that try to mitigate the overfitting. The code is available at https://github.com/sun-umn/Early_Stopping_for_DIP.
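To make the idea of stopping on the dispersion of consecutive reconstructions concrete, the following is a minimal sketch, not the released implementation. The abstract does not specify the dispersion measure or the stopping rule, so the sketch assumes a variance computed over a sliding window of recent reconstructions and a patience-based rule that stops once that variance has not reached a new minimum for a fixed number of iterations. The names `windowed_variance`, `detect_stopping_point`, `window_size`, and `patience` are hypothetical.

```python
from collections import deque

import numpy as np


def windowed_variance(window):
    """Mean squared distance of each reconstruction from the window mean."""
    stack = np.stack(window)                      # shape: (W, ...)
    mean = stack.mean(axis=0)
    sq_dist = ((stack - mean) ** 2).reshape(len(window), -1).sum(axis=1)
    return float(sq_dist.mean())


def detect_stopping_point(reconstructions, window_size=5, patience=10):
    """Return (iteration, reconstruction) where windowed variance is lowest.

    `reconstructions` is any iterable yielding one array per DIP iteration.
    No ground truth is used. Assumed rule: stop once the variance has not
    reached a new minimum for `patience` consecutive iterations.
    """
    window = deque(maxlen=window_size)            # keeps only the last W items
    best_var, best_iter, best_recon = float("inf"), None, None
    for t, x in enumerate(reconstructions):
        window.append(np.asarray(x, dtype=float))
        if len(window) < window_size:
            continue                              # not enough history yet
        var = windowed_variance(window)
        if var < best_var:
            best_var, best_iter, best_recon = var, t, window[-1]
        elif t - best_iter >= patience:
            break                                 # variance stopped improving
    return best_iter, best_recon
```

In a fitting loop, `reconstructions` could be a generator that yields the network output after each optimizer step, so that fitting halts as soon as the loop breaks. A larger `window_size` smooths the variance curve at the cost of memory, and a larger `patience` reduces the risk of stopping at a spurious early minimum.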