Consistent hashing is used in distributed systems and networking applications to spread data evenly and efficiently across a cluster of nodes. In this paper, we present MementoHash, a novel consistent hashing algorithm that eliminates known limitations of state-of-the-art algorithms while keeping optimal performance and minimal memory usage. We describe the algorithm in detail, provide a pseudo-code implementation, and formally establish its theoretical guarantees. To measure the efficacy of MementoHash, we compare its memory usage and lookup time to those of state-of-the-art algorithms, namely AnchorHash, DxHash, and JumpHash. Unlike JumpHash, MementoHash can handle random failures. Moreover, MementoHash does not require fixing the overall capacity of the cluster (as AnchorHash and DxHash do), allowing it to scale indefinitely. Because the number of removed nodes affects the performance of all the considered algorithms, we conduct experiments in three scenarios: stable (no removed nodes), one-shot removals (90% of the nodes removed at once), and incremental removals. We report results averaged over cluster sizes ranging from ten to one million nodes. The results indicate that our algorithm achieves optimal lookup performance and minimal memory usage in its best-case scenario. It performs better than AnchorHash and DxHash in its average-case scenario and at least as well as those two algorithms in its worst-case scenario. However, the worst-case scenario for MementoHash occurs when more than 70% of the nodes fail, which is an unlikely scenario. Therefore, MementoHash shows the best performance during the regular life cycle of a cluster.
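To make the JumpHash limitation mentioned above concrete, the following is a minimal Python sketch of the publicly known jump consistent hash routine. It is not MementoHash and is not taken from this paper; the function name `jump_hash` and the demo loop are illustrative assumptions. Because the routine maps a key to a bucket index in the contiguous range [0, n), growing or shrinking the cluster at its tail relocates only the keys of the last bucket, whereas removing an arbitrary bucket in the middle has no direct representation.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate unsigned 64-bit arithmetic


def jump_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket index in [0, num_buckets).

    Sketch of the widely published jump consistent hash; the name and
    signature are illustrative, not an API defined by the paper.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # Linear congruential step used as a per-key pseudo-random generator.
        key = (key * 2862933555777941757 + 1) & MASK64
        # Jump forward to the next bucket index at which this key would move.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


if __name__ == "__main__":
    # Adding one bucket at the tail: a key either stays put or moves to
    # the newly added bucket, never to another existing bucket.
    n = 10
    moved = sum(1 for k in range(10_000) if jump_hash(k, n) != jump_hash(k, n + 1))
    print(f"keys relocated when growing from {n} to {n + 1} buckets: {moved}")
```

The sketch keeps no per-node state, which is the source of both its minimal memory footprint and its inability to express the failure of an arbitrary node.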