From Contextual Data to Newsvendor Decisions: On the Actual Performance of Data-Driven Algorithms

In this work, we explore a framework for contextual decision-making to study how the relevance and quantity of past data affect the performance of a data-driven policy. We analyze a contextual Newsvendor problem in which a decision-maker needs to trade off an underage cost against an overage cost in the face of uncertain demand. We consider a setting in which past demands observed under "close by" contexts come from close-by distributions, and we analyze the performance of data-driven algorithms through a notion of context-dependent worst-case expected regret. We analyze the broad class of Weighted Empirical Risk Minimization (WERM) policies, which weigh past data according to their similarity in the contextual space. This class includes classical policies such as ERM, k-Nearest Neighbors and kernel-based policies. Our main methodological contribution is to characterize exactly the worst-case regret of any WERM policy on any given configuration of contexts. To the best of our knowledge, this provides the first understanding of tight performance guarantees in any contextual decision-making problem, with past literature focusing on upper bounds via concentration inequalities. We instead take an optimization approach and isolate a structure in the Newsvendor loss function that allows us to reduce the infinite-dimensional optimization problem over worst-case distributions to a simple line search. This in turn allows us to unveil fundamental insights that were obscured by previous general-purpose bounds. We characterize the actual guaranteed performance as a function of the contexts, and we provide granular insights on the learning curve of algorithms.
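To make the WERM class concrete, the following is a minimal sketch, not the authors' implementation. It relies on the standard fact that a minimizer of a weighted empirical Newsvendor loss is a weighted quantile of past demands at level b / (b + h), where b is the unit underage cost and h the unit overage cost. All names (`werm_order`, `erm_weights`, `knn_weights`, `kernel_weights`), the scalar contexts, the absolute-difference distance, the Gaussian-type kernel, and the sample data are assumptions made for illustration only.

```python
import math

def newsvendor_loss(q, d, b, h):
    """Cost of ordering q when demand is d: b per unit short, h per unit left over."""
    return b * max(d - q, 0.0) + h * max(q - d, 0.0)

def werm_order(demands, weights, b, h):
    """Return a minimizer of sum_i weights[i] * newsvendor_loss(q, demands[i], b, h).

    The minimizer is the smallest past demand whose cumulative weight
    reaches the fraction b / (b + h) of the total weight.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    target = b / (b + h) * total
    cumulative = 0.0
    for d, w in sorted(zip(demands, weights)):
        if w <= 0:
            continue  # samples with zero weight do not influence the decision
        cumulative += w
        if cumulative >= target - 1e-12:  # small tolerance for float round-off
            return d
    return max(demands)

def erm_weights(contexts, x):
    """ERM: every past sample counts equally, regardless of its context."""
    return [1.0] * len(contexts)

def knn_weights(contexts, x, k):
    """k-NN: unit weight on the k past contexts closest to x, zero elsewhere."""
    order = sorted(range(len(contexts)), key=lambda i: abs(contexts[i] - x))
    nearest = set(order[:k])
    return [1.0 if i in nearest else 0.0 for i in range(len(contexts))]

def kernel_weights(contexts, x, bandwidth):
    """Kernel policy: weights decay smoothly with distance (Gaussian-type kernel assumed)."""
    return [math.exp(-((c - x) / bandwidth) ** 2) for c in contexts]

# Hypothetical data: demand grows with the (scalar) context.
contexts = [0.0, 0.1, 0.5, 0.9, 1.0]
demands = [10, 12, 20, 30, 32]
b, h = 3.0, 1.0   # critical ratio b / (b + h) = 0.75
x = 0.0           # context in which the decision is made

q_erm = werm_order(demands, erm_weights(contexts, x), b, h)
q_knn = werm_order(demands, knn_weights(contexts, x, k=2), b, h)
q_kernel = werm_order(demands, kernel_weights(contexts, x, bandwidth=0.2), b, h)
```

On this toy data, ERM orders 30 because it pools demands from distant contexts, while the 2-nearest-neighbor and kernel policies both order 12 by concentrating weight on samples near x. This is only meant to illustrate how the weighting scheme changes the decision; it says nothing about the worst-case regret characterized in the paper.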
