Graph Neural Network (GNN) architectures are defined by their implementations of update and aggregation modules. While many works focus on new ways to parametrise the update modules, the aggregation modules receive comparatively little attention. Because aggregation functions are difficult to parametrise, most current methods select a ``standard aggregator'' such as $\mathrm{mean}$, $\mathrm{sum}$, or $\mathrm{max}$. Although this selection is often made without justification, it has been shown that the choice of aggregator has a significant impact on performance, and that the best choice is problem-dependent. Since aggregation is a lossy operation, selecting the most appropriate aggregator is crucial for minimising information loss. In this paper, we present GenAgg, a generalised aggregation operator, which parametrises a function space that includes all standard aggregators. In our experiments, we show that GenAgg is able to represent the standard aggregators with much higher accuracy than baseline methods. We also show that using GenAgg as a drop-in replacement for an existing aggregator in a GNN often leads to a significant boost in performance across various tasks.
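To make the lossy, problem-dependent nature of aggregation concrete, the following minimal sketch applies the three standard aggregators to small sets of scalar neighbour features and shows which inputs each one fails to distinguish. It also includes a two-parameter power mean as one illustrative way a single parametrised function can cover $\mathrm{mean}$ and $\mathrm{sum}$ exactly and approach $\mathrm{max}$. This power mean is an assumption chosen for illustration and is not GenAgg's parametrisation; the names `agg_mean`, `agg_sum`, `agg_max`, and `power_mean` are hypothetical.

```python
def agg_mean(xs):
    return sum(xs) / len(xs)


def agg_sum(xs):
    return sum(xs)


def agg_max(xs):
    return max(xs)


def power_mean(xs, p, alpha=0.0):
    """Illustrative parametrised aggregator (NOT GenAgg's formulation).

    Computes n**alpha * (mean(x**p))**(1/p) for strictly positive inputs.
    p=1, alpha=0 gives the mean; p=1, alpha=1 gives the sum;
    large p approaches the max.
    """
    n = len(xs)
    return n ** alpha * (sum(x ** p for x in xs) / n) ** (1.0 / p)


# Hypothetical neighbourhoods of scalar features.
a = [1.0, 2.0, 3.0]
b = [2.0, 2.0, 2.0]
c = [2.0, 2.0]

# mean and sum cannot tell a from b, but max can.
print(agg_mean(a) == agg_mean(b), agg_sum(a) == agg_sum(b), agg_max(a) == agg_max(b))

# mean and max cannot tell b from c, but sum can.
print(agg_mean(b) == agg_mean(c), agg_max(b) == agg_max(c), agg_sum(b) == agg_sum(c))

# One parametrised function covering the three standard aggregators.
print(power_mean(a, p=1.0))             # mean
print(power_mean(a, p=1.0, alpha=1.0))  # sum
print(power_mean(a, p=200.0))           # close to max
```

Which aggregator preserves the relevant information depends on whether the task is sensitive to neighbourhood size, to extreme values, or to averages, which is why a learnable family is attractive. The power mean above only handles positive inputs and reaches $\mathrm{max}$ only in the limit, so it should be read as a sketch of the idea rather than a usable replacement.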