The behavior of a neural network is irrevocably determined by the specific loss and data used during training. However, it is often desirable to tune the model at inference time based on external factors, such as the preferences of the user or dynamic characteristics of the data. This is especially important for balancing the perception-distortion trade-off in ill-posed image-to-image translation tasks. In this work, we propose to optimize a parametric tunable convolutional layer, which includes a number of different kernels, using a parametric multi-loss, which includes an equal number of objectives. Our key insight is to use a shared set of parameters to dynamically interpolate both the objectives and the kernels. During training, these parameters are sampled at random to explicitly optimize all possible combinations of objectives and consequently disentangle their effects into the corresponding kernels. During inference, these parameters become interactive inputs of the model, enabling reliable and consistent control over its behavior. Extensive experimental results demonstrate that our tunable convolutions work effectively as a drop-in replacement for traditional convolutions in existing neural networks at virtually no extra computational cost, outperforming state-of-the-art control strategies in a wide range of applications, including image denoising, deblurring, super-resolution, and style transfer.
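To make the shared-parameter idea concrete, the following is a minimal sketch in plain Python, reduced to one dimension and not taken from the paper's implementation. It assumes the kernels are blended linearly by the parameters, that the same parameters weight the per-objective losses, and that training draws the parameters uniformly at random and normalizes them to sum to 1. All function names (`sample_params`, `interpolate_kernels`, `tunable_conv1d`, `multi_loss`) and the two example kernels are hypothetical.

```python
import random


def sample_params(num_objectives, rng):
    """Draw random non-negative weights that sum to 1 (assumed sampling scheme)."""
    raw = [rng.random() + 1e-9 for _ in range(num_objectives)]
    total = sum(raw)
    return [r / total for r in raw]


def interpolate_kernels(kernels, params):
    """Blend equally sized 1-D kernels into a single kernel (assumed linear blend)."""
    size = len(kernels[0])
    return [sum(p * k[i] for p, k in zip(params, kernels)) for i in range(size)]


def conv1d_valid(signal, kernel):
    """Plain 'valid' cross-correlation of a 1-D signal with a 1-D kernel."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def tunable_conv1d(signal, kernels, params):
    """Blend the kernels first, then run one ordinary convolution."""
    return conv1d_valid(signal, interpolate_kernels(kernels, params))


def multi_loss(losses, params):
    """Weight the per-objective loss values with the same shared parameters."""
    return sum(p * loss for p, loss in zip(params, losses))


# Hypothetical kernels: one smoothing, one sharpening.
kernels = [[0.25, 0.5, 0.25], [-1.0, 2.0, -1.0]]
signal = [1.0, 2.0, 4.0, 8.0, 16.0]

# Training-time view: random parameters drive both the layer and the loss.
params = sample_params(len(kernels), random.Random(0))
output = tunable_conv1d(signal, kernels, params)
loss = multi_loss([0.8, 0.3], params)  # placeholder per-objective loss values

# Inference-time view: the caller sets the parameters directly.
print(tunable_conv1d(signal, kernels, [1.0, 0.0]))
```

Because the kernels are blended before the convolution runs, the layer performs a single convolution regardless of how many kernels it holds, which is one plausible reading of the "virtually no extra computational cost" claim. By linearity, the result equals the same weighted blend of the individual convolution outputs.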