Private Optimal Inventory Policy Learning for Feature-based Newsvendor with Unknown Demand

The data-driven newsvendor problem with features has recently emerged as a significant area of research, driven by the proliferation of data across sectors such as retail, supply chains, e-commerce, and healthcare. Because feature-based analysis often relies on sensitive customer or organizational data, protecting individual privacy is crucial for maintaining trust and confidence. Despite its importance, privacy preservation in the context of inventory planning remains unexplored. A key challenge is the nonsmoothness of the newsvendor loss function, which sets this problem apart from existing work on privacy-preserving algorithms in other settings. This paper introduces a novel approach to estimating a privacy-preserving optimal inventory policy within the $f$-differential privacy framework, an extension of classical $(\epsilon, \delta)$-differential privacy with several appealing properties. We develop a clipped noisy gradient descent algorithm based on convolution smoothing for optimal inventory estimation that simultaneously addresses three main challenges: (1) an unknown demand distribution and a nonsmooth loss function; (2) provable privacy guarantees for individual-level data; and (3) desirable statistical precision. We derive finite-sample high-probability bounds on both the estimation error of the optimal policy parameter and the regret. By leveraging the structure of the newsvendor problem, we attain a faster excess population risk bound than the one obtained from an indiscriminate application of existing results for general nonsmooth convex losses. Our bound matches that for strongly convex and smooth loss functions. Our numerical experiments demonstrate that the proposed method can achieve desirable privacy protection with only a marginal increase in cost.
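To make the idea of clipped noisy gradient descent on a convolution-smoothed newsvendor loss concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm. It assumes a linear policy $q = x^\top \beta$, a Gaussian smoothing kernel, per-sample gradient clipping, and a Gaussian-mechanism noise scale calibrated for $\mu$-Gaussian differential privacy (one member of the $f$-differential privacy family) using the standard composition rule over the iterations. The names `b` (unit underage cost), `h` (unit overage cost), `bandwidth`, `clip`, `lr`, and `steps` are hypothetical and do not come from the abstract.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def normal_cdf(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(z, dtype=float) / math.sqrt(2.0)))


def newsvendor_cost(beta, X, d, b, h):
    """Average (nonsmooth) newsvendor cost of the linear policy q = X @ beta."""
    q = X @ beta
    return float(np.mean(b * np.maximum(d - q, 0.0) + h * np.maximum(q - d, 0.0)))


def private_newsvendor_gd(X, d, b, h, mu, bandwidth=0.5, clip=1.0,
                          lr=0.3, steps=100, seed=0):
    """Clipped noisy gradient descent on a Gaussian-smoothed newsvendor loss.

    Assumption (not from the source): each step releases the average of
    per-sample gradients clipped to norm `clip`. Replacing one record changes
    that average by at most 2 * clip / n, so Gaussian noise with standard
    deviation 2 * clip * sqrt(steps) / (n * mu) makes each step
    (mu / sqrt(steps))-GDP and the `steps`-fold composition mu-GDP.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    sigma = 2.0 * clip * math.sqrt(steps) / (n * mu)
    for _ in range(steps):
        u = d - X @ beta  # demand minus order quantity
        # Derivative of the smoothed cost with respect to the order quantity:
        # the indicator 1{d > q} is replaced by Phi(u / bandwidth).
        slope = h - (b + h) * normal_cdf(u / bandwidth)
        grads = slope[:, None] * X  # per-sample gradients w.r.t. beta
        norms = np.linalg.norm(grads, axis=1)
        scale = np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        clipped = grads * scale[:, None]
        noisy_grad = clipped.mean(axis=0) + sigma * rng.standard_normal(p)
        beta = beta - lr * noisy_grad
    return beta
```

A hypothetical call would look like `beta_hat = private_newsvendor_gd(X, d, b=3.0, h=1.0, mu=1.0)`, where `X` is an `n`-by-`p` feature matrix and `d` a length-`n` demand vector. Smaller `mu` means stronger privacy and larger injected noise. Clipping bounds each record's influence but can bias the gradient, so the step size, clipping threshold, and bandwidth would need to be tuned for any real use.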
