A Learning-Based Caching Mechanism for Edge Content Delivery

With the advent of 5G networks and the rise of the Internet of Things (IoT), Content Delivery Networks (CDNs) are increasingly extending to the network edge. This shift introduces distinct challenges, chiefly limited cache storage and diverse request patterns. Edge environments can host traffic classes with varied object-size distributions and object-access patterns, which makes traditional caching strategies that rely on metrics such as request frequency or time intervals less effective. Optimizing edge caching nevertheless remains crucial: improved byte hit rates at the edge alleviate the load on the network backbone, minimize operational costs, and expedite content delivery to end users. In this paper, we introduce HR-Cache, a comprehensive learning-based caching framework grounded in Hazard Rate (HR) ordering, a rule originally formulated to compute an upper bound on cache performance. HR-Cache uses this rule to guide future object eviction decisions. It employs a lightweight machine learning model that learns from caching decisions made under HR ordering and then predicts the "cache-friendliness" of incoming requests. Objects deemed "cache-averse" are placed into the cache as priority candidates for eviction. Extensive experiments show that HR-Cache consistently improves byte hit rates over existing state-of-the-art methods with minimal prediction overhead. On three real-world traces and one synthetic trace, HR-Cache consistently achieves 2.2-14.6% greater WAN traffic savings than LRU, and it outperforms both heuristic caching strategies and the state-of-the-art learning-based algorithm.
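
The placement rule described in the abstract (cache-averse objects enter the cache as priority eviction candidates) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes an LRU-ordered queue as the underlying structure, which the abstract does not specify, and it takes the model's prediction as a ready-made boolean `cache_friendly`; the HR-ordering labels and the model training are not shown. The class name `AverseAwareLRU` and its methods are hypothetical.

```python
from collections import OrderedDict


class AverseAwareLRU:
    """Byte-capacity cache with LRU order (an assumption of this sketch).

    Objects predicted cache-averse are placed at the eviction end of the
    queue instead of the most-recently-used end.
    """

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()  # key -> size; the first entry is evicted first
        self.hit_bytes = 0
        self.total_bytes = 0

    def request(self, key, size, cache_friendly):
        """Serve one request and return True on a hit.

        `cache_friendly` stands in for the learned model's prediction.
        """
        self.total_bytes += size
        hit = key in self.items
        if hit:
            self.hit_bytes += size
        elif size > self.capacity:
            return False  # object can never fit, so bypass the cache
        else:
            # Make room by evicting from the front of the queue.
            while self.used + size > self.capacity:
                _, evicted_size = self.items.popitem(last=False)
                self.used -= evicted_size
            self.items[key] = size
            self.used += size
        # Friendly objects go to the MRU end; averse ones go to the
        # front, where they are the next eviction candidates.
        self.items.move_to_end(key, last=cache_friendly)
        return hit

    def byte_hit_rate(self):
        """Fraction of requested bytes served from the cache."""
        return self.hit_bytes / self.total_bytes if self.total_bytes else 0.0
```

As a usage illustration, take a 100-byte cache and request `a` (40 bytes, friendly), `b` (40 bytes, averse), and `c` (40 bytes, friendly). Admitting `c` evicts `b` rather than `a`, so a later request for `a` is a hit, whereas plain LRU would have evicted `a`.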
