This paper studies an online replication problem for distributed data access. The goal is to dynamically create and delete data copies in a multi-server system over time so as to minimize the total storage and network cost of serving access requests. We study the problem in the emerging learning-augmented setting, assuming simple binary predictions about inter-request times at individual servers. We develop an online algorithm and prove that it is ($\frac{5+\alpha}{3}$)-consistent (its competitive ratio under perfect predictions) and ($1 + \frac{1}{\alpha}$)-robust (its competitive ratio under arbitrarily bad predictions), where $\alpha \in (0, 1]$ is a hyper-parameter representing the level of distrust in the predictions. We also study the impact of mispredictions on the competitive ratio of the proposed algorithm and adapt it to achieve bounded robustness while retaining its consistency. We further establish a lower bound of $\frac{3}{2}$ on the consistency of any deterministic learning-augmented algorithm. Finally, we evaluate our algorithms experimentally using real data access traces.
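The trade-off between the two guarantees can be made concrete by evaluating the stated bounds for a few values of $\alpha$. The sketch below only tabulates the formulas $\frac{5+\alpha}{3}$ and $1 + \frac{1}{\alpha}$ from the abstract; it does not implement the replication algorithm, and the function names (`consistency_bound`, `robustness_bound`, `tradeoff_table`) are hypothetical, chosen for illustration.

```python
from fractions import Fraction

# Lower bound on the consistency of any deterministic
# learning-augmented algorithm, as stated in the abstract.
CONSISTENCY_LOWER_BOUND = Fraction(3, 2)


def consistency_bound(alpha):
    """Consistency guarantee (5 + alpha) / 3 from the abstract."""
    return (5 + alpha) / 3


def robustness_bound(alpha):
    """Robustness guarantee 1 + 1 / alpha from the abstract."""
    return 1 + 1 / alpha


def tradeoff_table(alphas):
    """Return (alpha, consistency, robustness) rows for alpha in (0, 1]."""
    rows = []
    for alpha in alphas:
        if not 0 < alpha <= 1:
            raise ValueError(f"alpha must lie in (0, 1], got {alpha}")
        rows.append((alpha, consistency_bound(alpha), robustness_bound(alpha)))
    return rows


if __name__ == "__main__":
    # Exact rational arithmetic keeps the bounds free of rounding error.
    sample_alphas = [Fraction(1, 10), Fraction(1, 4), Fraction(1, 2), Fraction(1)]
    for alpha, cons, rob in tradeoff_table(sample_alphas):
        print(f"alpha={alpha}: consistency={cons}, robustness={rob}")
```

Under these formulas, $\alpha = 1$ gives a ratio of 2 for both guarantees. Shrinking $\alpha$ (more trust in the predictions) pushes the consistency bound toward $\frac{5}{3}$, which stays above the $\frac{3}{2}$ lower bound, while the robustness bound grows without limit.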