Recent years have witnessed an upsurge in research on, and applications of, machine learning on graphs. However, manually designing the optimal machine learning algorithm for each graph dataset and task is inflexible, labor-intensive, and dependent on expert knowledge, which limits adaptivity and applicability. Automated machine learning (AutoML) on graphs, which aims to automatically design the optimal machine learning algorithm for a given graph dataset and task, has received considerable attention. However, no existing library fully supports AutoML on graphs. To fill this gap, we present Automated Graph Learning (AutoGL), the first dedicated library for automated machine learning on graphs. AutoGL is open-source, easy to use, and easy to extend. Specifically, we propose a three-layer architecture consisting of backends that interface with devices, a complete automated graph learning pipeline, and supported graph applications. The pipeline further contains five functional modules: auto feature engineering, neural architecture search, hyper-parameter optimization, model training, and auto ensemble, covering the majority of existing AutoML methods on graphs. For each module, we provide numerous state-of-the-art methods together with flexible base classes and APIs, which allow easy usage and customization. We further provide experimental results to showcase the usage of the AutoGL library. We also present AutoGL-light, a lightweight version of AutoGL that facilitates customizing pipelines and enriching applications, as well as benchmarks for graph neural architecture search. The code of AutoGL is publicly available at https://github.com/THUMNLab/AutoGL.
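To make the five-module pipeline concrete, the following is a minimal, library-free sketch of how such stages might be chained. It does not use AutoGL's API: every name here (`feature_engineering`, `make_model`, `run_pipeline`, the toy graph) is hypothetical. As a simplification, architecture search and hyper-parameter optimization are collapsed into a single grid search, and "training" is trivial because the toy models have no learned weights.

```python
from itertools import product
from statistics import mean

# Hypothetical toy graph: adjacency lists plus one binary label per node.
EDGES = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
LABELS = {0: 1, 1: 1, 2: 1, 3: 0}


def feature_engineering(edges):
    """Stage 1: derive one feature per node (here, the node degree)."""
    return {node: float(len(nbrs)) for node, nbrs in edges.items()}


def make_model(hops, threshold):
    """Build a candidate model: `hops` rounds of neighbour averaging, then a threshold."""
    def predict(edges, feats):
        h = dict(feats)
        for _ in range(hops):
            # Each node averages its own value with its neighbours' values.
            h = {n: mean([h[n]] + [h[m] for m in edges[n]]) for n in edges}
        return {n: int(h[n] >= threshold) for n in edges}
    return predict


def accuracy(pred, labels):
    """Fraction of nodes whose predicted label matches the true label."""
    return mean(int(pred[n] == labels[n]) for n in labels)


def run_pipeline(edges, labels, top_k=3):
    feats = feature_engineering(edges)
    scored = []
    # Stages 2-3: search over architecture (hops) and hyper-parameter (threshold).
    for hops, threshold in product([0, 1, 2], [1.5, 2.0, 2.5]):
        model = make_model(hops, threshold)  # Stage 4: nothing to fit in this toy
        scored.append((accuracy(model(edges, feats), labels), hops, threshold, model))
    scored.sort(key=lambda item: item[0], reverse=True)
    best = scored[:top_k]
    # Stage 5: ensemble the top-k candidates by majority vote.
    votes = [model(edges, feats) for _, _, _, model in best]
    ensemble = {n: int(2 * sum(v[n] for v in votes) > len(votes)) for n in edges}
    return ensemble, best[0][:3]


if __name__ == "__main__":
    predictions, (score, hops, threshold) = run_pipeline(EDGES, LABELS)
    print(predictions, score, hops, threshold)
```

In a real system each stage would sit behind its own base class so that methods can be swapped independently, which is the design the abstract describes; the sketch only shows the order in which the stages feed one another.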