We introduce an extension to the CLRS algorithmic learning benchmark that prioritizes scalability and the use of sparse representations. Many algorithms in CLRS require global memory or information exchange, which is mirrored in its execution model: it constructs fully connected rather than sparse graphs from the underlying problem. Although CLRS aims to assess how well learned algorithms generalize to larger instances, the existing execution model is a significant constraint because its memory and runtime requirements make it hard to scale. However, many important algorithms do not require a fully connected graph; these algorithms, primarily distributed in nature, align closely with the message-passing paradigm employed by Graph Neural Networks. Hence, we propose SALSA-CLRS, an extension of the current CLRS benchmark designed specifically with scalability and sparseness in mind. Our approach includes adapted algorithms from the original CLRS benchmark and introduces new problems from distributed and randomized algorithms. Moreover, we perform a thorough empirical evaluation of our benchmark. Code is publicly available at https://github.com/jkminder/SALSA-CLRS.
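To make the scaling argument concrete, the following is a minimal sketch that is not taken from the SALSA-CLRS codebase. The function names, the average degree of 4, and the toy minimum-propagation rule are illustrative assumptions; the sketch only compares how many directed edges a fully connected graph needs against a sparse one, and shows one message-passing round over an explicit edge list.

```python
def dense_edge_count(n: int) -> int:
    """Directed edges in a fully connected graph on n nodes (no self-loops)."""
    return n * (n - 1)


def sparse_edge_count(n: int, avg_degree: int) -> int:
    """Directed edges when every node has avg_degree outgoing edges (assumed)."""
    return n * avg_degree


def message_passing_step(values, edges):
    """One synchronous round: each node keeps the minimum of its own value
    and the values sent along its incoming edges (src, dst)."""
    new_values = list(values)
    for src, dst in edges:
        new_values[dst] = min(new_values[dst], values[src])
    return new_values


if __name__ == "__main__":
    # Edge storage grows quadratically for the dense model, linearly for the sparse one.
    for n in (16, 160, 1600):
        print(n, dense_edge_count(n), sparse_edge_count(n, avg_degree=4))

    # A 3-node path graph, stored as a sparse bidirectional edge list.
    path_edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
    print(message_passing_step([3, 1, 2], path_edges))
```

The work per round is proportional to the number of edges, so on large instances a sparse edge list costs far less in both memory and runtime than a fully connected graph.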