In algorithm selection research, the discussion of algorithm features has been largely overshadowed by the emphasis on problem features. Although a few empirical studies have provided evidence that algorithm features are effective, the potential benefits of incorporating them into algorithm selection models, and their suitability for different scenarios, remain unclear. In this paper, we address this gap by proposing the first provable guarantee for algorithm selection based on algorithm features, taking a generalization perspective. We analyze the benefits and costs associated with algorithm features and investigate how the generalization error is affected by different factors. Specifically, we examine adaptive and predefined algorithm features under transductive and inductive learning paradigms, respectively, and derive upper bounds on the generalization error based on the Rademacher complexity of the corresponding models. Our theoretical findings not only provide tight upper bounds, but also offer analytical insights into the impact of various factors, such as the training scale of problem instances and candidate algorithms, model parameters, feature values, and distributional differences between the training and test data. Notably, we demonstrate how models benefit from algorithm features in complex scenarios involving many algorithms, and prove a positive correlation between the generalization error bound and the $\chi^2$-divergence between distributions.
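For readers unfamiliar with the two quantities named above, their standard textbook definitions are recalled here. These are generic forms, not the bounds derived in this paper; the symbols $\mathcal{F}$, $S = (z_1, \dots, z_n)$, $\sigma_i$, $\delta$, $P$, and $Q$ are notation assumed for illustration only. The empirical Rademacher complexity of a function class $\mathcal{F}$ on a sample $S$ is

$$
\hat{\mathfrak{R}}_S(\mathcal{F}) = \mathbb{E}_{\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(z_i)\right],
$$

where the $\sigma_i$ are independent and uniform on $\{-1, +1\}$. For $\mathcal{F}$ taking values in $[0, 1]$ and an i.i.d. sample, the classical bound states that, with probability at least $1 - \delta$, every $f \in \mathcal{F}$ satisfies

$$
\mathbb{E}[f(z)] \le \frac{1}{n} \sum_{i=1}^{n} f(z_i) + 2\,\mathfrak{R}_n(\mathcal{F}) + \sqrt{\frac{\log(1/\delta)}{2n}},
$$

with $\mathfrak{R}_n(\mathcal{F}) = \mathbb{E}_S\big[\hat{\mathfrak{R}}_S(\mathcal{F})\big]$. For discrete distributions $P$ and $Q$ with $q(x) > 0$ wherever $p(x) > 0$, the $\chi^2$-divergence is

$$
\chi^2(P \,\|\, Q) = \sum_{x} \frac{\big(p(x) - q(x)\big)^2}{q(x)}.
$$

It equals zero exactly when $P = Q$ and grows as the two distributions move apart, which is the sense in which a bound that increases with it penalizes a mismatch between training and test data.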