This paper illustrates the central role of loss functions in data-driven decision making, providing a comprehensive survey of their influence in cost-sensitive classification (CSC) and reinforcement learning (RL). We demonstrate how different regression loss functions affect the sample efficiency and adaptivity of value-based decision-making algorithms. Across multiple settings, we prove that algorithms using the binary cross-entropy loss achieve first-order bounds scaling with the optimal policy's cost, and are much more efficient than algorithms using the commonly used squared loss. Moreover, we prove that distributional algorithms using the maximum likelihood loss achieve second-order bounds scaling with the policy's variance, which are even sharper than first-order bounds. In particular, this establishes a provable benefit of distributional RL. We hope that this paper serves as a guide to analyzing decision-making algorithms with varying loss functions, and inspires the reader to seek out better loss functions to improve any decision-making algorithm.
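The contrast between the two regression losses can be sketched as follows. This is an illustrative example, not code from the paper, and it assumes values (costs) have been normalized to [0, 1] so that the binary cross-entropy loss is well defined; the function names are ours:

```python
import numpy as np

def squared_loss(pred, target):
    # Standard squared regression loss.
    return (pred - target) ** 2

def bce_loss(pred, target, eps=1e-12):
    # Binary cross-entropy regression loss, valid for pred, target in [0, 1].
    pred = np.clip(pred, eps, 1 - eps)
    return -(target * np.log(pred) + (1 - target) * np.log(1 - pred))

# With a small true cost (target near 0), the cross-entropy loss penalizes
# overestimates far more sharply than the squared loss -- the rough intuition
# behind first-order bounds that scale with the optimal policy's cost.
preds = np.array([0.01, 0.5, 0.99])
target = 0.0
print(squared_loss(preds, target))  # grows at most to 1
print(bce_loss(preds, target))      # blows up as pred -> 1
```

The squared loss is bounded by 1 on this range, whereas the cross-entropy loss diverges as the prediction approaches the wrong boundary, so minimizing it forces much tighter accuracy when the optimal cost is small.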