Deep neural networks trained end-to-end to map a measurement of a (noisy) image to a clean image perform very well on a variety of linear inverse problems. Current methods are trained on only a few hundred or a few thousand images, as opposed to the millions of examples that deep networks are trained on in other domains. In this work, we study whether major performance gains can be expected from scaling up the training set size. We consider image denoising, accelerated magnetic resonance imaging, and super-resolution, and empirically determine the reconstruction quality as a function of training set size while simultaneously scaling the network size. For all three tasks, we find that an initially steep power-law scaling already slows significantly at moderate training set sizes. Extrapolating those scaling laws suggests that even training on millions of images would not significantly improve performance. To understand the expected behavior, we analytically characterize the performance of a linear estimator learned with early-stopped gradient descent. The result formalizes the intuition that once the error induced by learning the signal model is small relative to the error floor, more training examples do not improve performance.
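The saturation described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the reconstruction error is a power-law term in the training set size N plus a constant error floor; this functional form and the values of `scale`, `alpha`, and `floor` are hypothetical and are not the fitted scaling laws of this work. The sketch computes the local log-log slope of the error, which starts near the assumed exponent and approaches zero once the power-law term is small relative to the floor.

```python
import math


def reconstruction_error(n, scale=1.0, alpha=0.5, floor=0.01):
    """Hypothetical error model: power-law term in n plus a constant floor."""
    return scale * n ** (-alpha) + floor


def local_exponent(n, factor=2.0, **kwargs):
    """Log-log slope of the modeled error between n and factor * n."""
    e1 = reconstruction_error(n, **kwargs)
    e2 = reconstruction_error(factor * n, **kwargs)
    return (math.log(e2) - math.log(e1)) / math.log(factor)


if __name__ == "__main__":
    # Synthetic values only; nothing here is measured data.
    for n in (100, 1_000, 10_000, 100_000, 1_000_000):
        err = reconstruction_error(n)
        slope = local_exponent(n)
        print(f"N={n:>9,d}  error={err:.5f}  local slope={slope:+.3f}")
```

Under these assumed parameters, the local slope is close to the exponent at small N and close to zero at large N, which is the qualitative pattern of a steep power law that flattens once the error floor dominates.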