Index structures are fundamental for efficient query processing on large-scale datasets. Learned indexes model the indexing process as a prediction problem to overcome the inherent trade-offs of traditional indexes. However, most existing learned indexes optimize for only a narrow set of objectives, such as query latency or space usage, and neglect other practical evaluation dimensions such as update efficiency and stability. Moreover, many learned indexes rely on assumptions about data distributions or workloads and lack theoretical guarantees under unknown or evolving scenarios, which limits their generality in real-world systems. In this paper, we propose LMG, a robust and efficient learned index framework designed for multi-dimensional performance balance. LMG integrates a decoupled routing structure with theoretical $O(1)$ complexity for fixed key types and an optimal error threshold training algorithm that approaches $O(1)$ overhead in practice. The framework further enhances query performance by optimizing gap allocation. Extensive evaluations show that LMG achieves competitive or leading performance across all key evaluation dimensions, including bulk loading (up to 7.55$\times$ faster), point queries (up to 1.68$\times$ faster), range queries (up to 11.41$\times$ faster), and mixed read-write throughput (up to 3.50$\times$ faster). In addition, LMG ensures robust long-term stability and high space efficiency (up to 6.26$\times$ smaller footprint). These results demonstrate that LMG significantly mitigates the multi-dimensional performance trade-offs often observed in state-of-the-art approaches, offering a balanced and efficient framework.