A growing line of work shows how learned predictions can be used to break through worst-case barriers and improve the running time of algorithms. However, incorporating predictions into data structures with strong theoretical guarantees remains underdeveloped. This paper takes a step in this direction by showing that predictions can be leveraged in the fundamental online list labeling problem. In this problem, n items arrive over time and must be stored in sorted order in an array of size Theta(n). The array slot of an element is its label, and the goal is to maintain sorted order while minimizing the total number of elements moved (i.e., relabeled). We design a new list labeling data structure and bound its performance in two models. In the worst-case learning-augmented model, we give guarantees in terms of the error in the predictions. Our data structure provides strong guarantees: it is optimal for any prediction error and guarantees the best-known worst-case bound even when the predictions are entirely erroneous. We also consider a stochastic error model and bound the performance in terms of the expectation and variance of the error. Finally, we demonstrate the theoretical results empirically. In particular, we show that our data structure performs well on real temporal data sets where predictions are constructed from elements that arrived in the past, as is typically done in practice.
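To make the problem statement and its cost model concrete, the following is a minimal sketch of a naive list labeling baseline. It is not the data structure proposed in the paper and uses no predictions; it only illustrates what "label" and "elements moved" mean. The class name `NaiveListLabeling`, the midpoint placement rule, and the nearest-free-slot shifting strategy are all assumptions made for illustration.

```python
from typing import List, Optional


class NaiveListLabeling:
    """Toy baseline (hypothetical, not the paper's structure).

    Items are kept in sorted order in a sparse array; an item's index
    is its label. `moves` counts how many existing items are relabeled.
    """

    def __init__(self, capacity: int) -> None:
        self.slots: List[Optional[int]] = [None] * capacity
        self.moves = 0  # total relabels of already-stored items

    def insert(self, x: int) -> None:
        slots, n = self.slots, len(self.slots)
        # Successor: first occupied slot holding an item greater than x.
        pos = next(
            (i for i, s in enumerate(slots) if s is not None and s > x), n
        )
        # Predecessor: last occupied slot before the successor.
        pred = next(
            (i for i in range(pos - 1, -1, -1) if slots[i] is not None), -1
        )
        if pos - pred > 1:
            # A gap exists between the neighbors: no item has to move.
            slots[(pred + pos) // 2] = x
            return
        # No gap: find the nearest free slot on either side and shift.
        right = next((j for j in range(pos, n) if slots[j] is None), None)
        left = next((j for j in range(pred, -1, -1) if slots[j] is None), None)
        if right is None and left is None:
            raise OverflowError("array is full")
        if left is None or (right is not None and right - pos <= pred - left):
            # Shift slots[pos..right-1] one step right, then place x at pos.
            slots[pos + 1:right + 1] = slots[pos:right]
            slots[pos] = x
            self.moves += right - pos
        else:
            # Shift slots[left+1..pred] one step left, then place x at pred.
            slots[left:pred] = slots[left + 1:pred + 1]
            slots[pred] = x
            self.moves += pred - left

    def labels(self) -> dict:
        """Map each stored item to its current label (array index)."""
        return {s: i for i, s in enumerate(self.slots) if s is not None}
```

As a usage example, inserting 10, 20, 15, 12 in that order into a 4-slot array leaves the array as `[10, 12, 15, 20]` with `moves == 2`: the first two insertions land in gaps for free, while each of the last two forces one neighbor to be relabeled. A learning-augmented structure would aim to use predictions to place items so that fewer such shifts are needed.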