A growing line of work shows how learned predictions can be used to break through worst-case barriers and improve the running time of algorithms. However, incorporating predictions into data structures with strong theoretical guarantees remains underdeveloped. This paper takes a step in this direction by showing that predictions can be leveraged in the fundamental online list labeling problem. In this problem, n items arrive over time and must be stored in sorted order in an array of size Θ(n). The array slot of an element is its label, and the goal is to maintain sorted order while minimizing the total number of elements moved (i.e., relabeled). We design a new list labeling data structure and bound its performance in two models. In the worst-case learning-augmented model, we give guarantees in terms of the error in the predictions. Our data structure provides strong guarantees: it is optimal for any prediction error and guarantees the best-known worst-case bound even when the predictions are entirely erroneous. We also consider a stochastic error model and bound the performance in terms of the expectation and variance of the error. Finally, the theoretical results are demonstrated empirically. In particular, we show that our data structure has strong performance on real temporal data sets where predictions are constructed from elements that arrived in the past, as is typical in practice.
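To make the cost model concrete, the following is a minimal Python sketch of the problem setting only. It is a naive baseline and not the data structure proposed in the paper: it uses no predictions, and the class name `NaiveListLabeling`, its fields, and its shifting rule are assumptions introduced for illustration. Items are assumed to be distinct integers. Each item is placed in a free slot between its predecessor and successor when one exists; otherwise neighbors are shifted toward the nearest free slot, and every shifted element counts as one move (relabeling).

```python
from typing import List, Optional


class NaiveListLabeling:
    """Keeps distinct items sorted in a fixed array with gaps and counts moves.

    Hypothetical baseline for illustration; not the paper's data structure.
    """

    def __init__(self, capacity: int) -> None:
        self.slots: List[Optional[int]] = [None] * capacity
        self.moves = 0  # total number of relabelings, the cost being minimized

    def insert(self, item: int) -> None:
        m = len(self.slots)
        # Labels (slot indices) of the item's predecessor and successor.
        pred = max((i for i, x in enumerate(self.slots)
                    if x is not None and x < item), default=-1)
        succ = min((i for i, x in enumerate(self.slots)
                    if x is not None and x > item), default=m)
        if succ - pred > 1:
            # Every slot strictly between pred and succ is free: no moves needed.
            self.slots[(pred + succ) // 2] = item
            return
        # No gap: locate the nearest free slot on each side.
        right = next((i for i in range(succ, m) if self.slots[i] is None), None)
        left = next((i for i in range(pred, -1, -1) if self.slots[i] is None), None)
        if left is None and right is None:
            raise OverflowError("array is full")
        if left is None or (right is not None and right - succ <= pred - left):
            # Shift slots[succ:right] one step right; the item takes slot succ.
            self.slots[succ + 1:right + 1] = self.slots[succ:right]
            self.slots[succ] = item
            self.moves += right - succ
        else:
            # Shift slots[left+1:pred+1] one step left; the item takes slot pred.
            self.slots[left:pred] = self.slots[left + 1:pred + 1]
            self.slots[pred] = item
            self.moves += pred - left
```

For instance, inserting 1, 2, 3, 4 in that order into a five-slot array costs nothing for the first three items, but the fourth finds no free slot to its right and forces the three earlier items to shift left, for a total of 3 moves. Sequences like this are what a better labeling strategy, with or without predictions, tries to handle more cheaply.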