An Online Gradient-Based Caching Policy with Logarithmic Complexity and Regret Guarantees

Commonly used caching policies, such as LRU and LFU, perform optimally only for specific traffic patterns. Even advanced machine-learning-based methods, which detect patterns in historical request data, struggle when future requests deviate from past trends. Recently, a new class of policies has emerged that makes no assumptions about the request arrival process. These policies solve an online optimization problem, which lets them adapt continuously to changing request patterns. They offer theoretical guarantees on regret, defined as the gap between the gain of the online policy and the gain of the optimal static cache allocation in hindsight. Nevertheless, the high computational complexity of these solutions hinders their practical adoption. In this study, we introduce a gradient-based online caching policy, the first to achieve logarithmic computational complexity in the catalog size together with regret guarantees. As requests arrive, the policy dynamically adjusts the probabilities of including items in the cache, and these probabilities drive the cache update decisions. The low complexity makes the policy applicable to real-world traces featuring millions of requests and items, a scale that has been out of reach for existing policies with regret guarantees. To the best of our knowledge, our experimental results show for the first time that the regret guarantees of gradient-based caching policies bring significant benefits in scenarios of practical interest.
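To make the regret metric concrete, the following is a minimal sketch and not the policy proposed in this work. It assumes a unit gain per cache hit and equal-size items, and it uses LRU as a stand-in online policy. The function names (`static_hindsight_hits`, `lru_hits`, `regret`) are hypothetical and chosen for illustration only.

```python
from collections import Counter, OrderedDict


def static_hindsight_hits(requests, cache_size):
    """Hits of the best fixed cache, chosen with full knowledge of the trace.

    Assumption: unit gain per hit, so the best static allocation simply
    stores the cache_size most requested items for the whole trace.
    """
    counts = Counter(requests)
    return sum(count for _, count in counts.most_common(cache_size))


def lru_hits(requests, cache_size):
    """Hits of a plain LRU cache that starts empty (stand-in online policy)."""
    cache = OrderedDict()
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # mark as most recently used
        else:
            cache[item] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits


def regret(requests, cache_size, policy_hits):
    """Gap between the hindsight-optimal static gain and the policy's gain."""
    best_static = static_hindsight_hits(requests, cache_size)
    return best_static - policy_hits(requests, cache_size)


if __name__ == "__main__":
    # Round-robin over 3 items with room for 2: LRU always evicts the item
    # that is requested next, while a fixed cache serves 2 of the 3 items.
    trace = [1, 2, 3] * 4
    print(regret(trace, cache_size=2, policy_hits=lru_hits))
```

On this round-robin trace LRU never hits, whereas any fixed pair of items serves two thirds of the requests, so the regret of LRU grows linearly with the trace length. Policies with regret guarantees are designed so that this gap grows sublinearly for every request sequence.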
