We study multi-product inventory control problems in which a manager makes sequential replenishment decisions based on partial historical information in order to minimize cumulative losses. Our motivation is to consider general demands, losses, and dynamics, going beyond standard models, which usually rely on newsvendor-type losses, fixed dynamics, and unrealistic i.i.d. demand assumptions. We propose MaxCOSD, an online algorithm with provable guarantees even for problems with non-i.i.d. demands and stateful dynamics, such as perishability. We consider what we call non-degeneracy assumptions on the demand process and argue that they are necessary for learning to be possible.