Many important computer vision tasks are naturally formulated with a non-differentiable objective. The standard training procedure for neural networks is therefore not directly applicable, since back-propagation requires the gradient of the objective with respect to the model output. Most deep learning methods sidestep the problem sub-optimally by training with a proxy loss that was originally designed for another task and is not tailored to the objective at hand. Such a proxy loss may or may not align well with the original non-differentiable objective, and designing an appropriate proxy for a novel task may not be feasible for a non-specialist. This thesis makes four main contributions toward bridging the gap between the non-differentiable objective and the training loss function. Throughout the thesis, we call a loss function a surrogate loss if it is a differentiable approximation of the non-differentiable objective, and we use the terms objective and evaluation metric interchangeably. The contributions make the training of neural networks more scalable: when the evaluation metric is decomposable, training extends to new tasks in a nearly labor-free manner, which will help researchers working on novel tasks. For non-decomposable evaluation metrics, the differentiable components developed for the recall@k surrogate, such as sorting and counting, can also be reused to create new surrogates.
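To illustrate what differentiable sorting and counting can look like for a recall@k surrogate, the following is a minimal sketch and not the formulation used in the thesis. It assumes that both the rank computation and the top-k membership test are relaxed by replacing the Heaviside step with a sigmoid. The function names and the temperatures `tau_rank` and `tau_count` are hypothetical and chosen only for this example.

```python
import math


def sigmoid(x, tau):
    """Smooth stand-in for the Heaviside step; tau is a temperature (assumed)."""
    z = x / tau
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hard_recall_at_k(scores, relevant, k):
    """Exact, non-differentiable recall@k for a single query."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    top_k = set(order[:k])
    positives = [i for i, r in enumerate(relevant) if r]
    return sum(i in top_k for i in positives) / len(positives)


def soft_recall_at_k(scores, relevant, k, tau_rank=0.01, tau_count=0.1):
    """Sigmoid-relaxed recall@k (illustrative assumption, not the thesis method)."""
    positives = [i for i, r in enumerate(relevant) if r]
    total = 0.0
    for i in positives:
        # Soft sorting: rank = 1 + soft number of items that score higher.
        rank = 1.0 + sum(
            sigmoid(scores[j] - scores[i], tau_rank)
            for j in range(len(scores))
            if j != i
        )
        # Soft counting: close to 1 when rank <= k, close to 0 otherwise.
        total += sigmoid(k + 0.5 - rank, tau_count)
    return total / len(positives)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.3, 0.1]  # similarity of each item to the query
    relevant = [1, 0, 1, 0]        # binary relevance labels
    print(hard_recall_at_k(scores, relevant, k=2))
    print(soft_recall_at_k(scores, relevant, k=2))
```

With small temperatures the relaxed value stays close to the exact metric, while larger temperatures give smoother but less faithful values. Written with the tensor operations of an automatic differentiation framework, the same expressions would yield gradients with respect to the scores; the pure-Python version above only shows the arithmetic.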