Document-level relation extraction typically relies on text-based encoders and hand-coded pooling heuristics to aggregate information learned by the encoder. In this paper, we leverage the intrinsic graph processing capabilities of the Transformer model and propose replacing hand-coded pooling methods with new tokens in the input, which are designed to aggregate information via explicit graph relations in the computation of attention weights. We introduce a joint text-graph Transformer model and a graph-assisted declarative pooling (GADePo) specification of the input, which provides explicit and high-level instructions for information aggregation. GADePo allows the pooling process to be guided by domain-specific knowledge or desired outcomes but still learned by the Transformer, leading to more flexible and customisable pooling strategies. We evaluate our method across diverse datasets and models and show that our approach yields promising results that are consistently better than those achieved by the hand-coded pooling functions.
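To make the pooling idea concrete, the following is a minimal NumPy sketch of how an added pooling token might aggregate the tokens of an entity's mentions through graph relations that enter the attention computation. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names `build_input` and `relation_aware_attention`, the two relation labels, the single attention head, and the choice to add a relation embedding to the keys are all hypothetical.

```python
import numpy as np

def build_input(token_states, mention_spans, pool_vec):
    """Append one pooling token per entity and link it to its mention tokens.

    Relation ids are an assumption of this sketch:
    0 = no relation, 1 = pooling token -> mention token, 2 = the reverse.
    """
    n_tok, n_ent = len(token_states), len(mention_spans)
    x = np.vstack([token_states, np.tile(pool_vec, (n_ent, 1))])
    rel_ids = np.zeros((n_tok + n_ent, n_tok + n_ent), dtype=int)
    for e, spans in enumerate(mention_spans):
        for start, end in spans:  # end is exclusive
            rel_ids[n_tok + e, start:end] = 1
            rel_ids[start:end, n_tok + e] = 2
    return x, rel_ids

def relation_aware_attention(x, rel_ids, w_q, w_k, w_v, rel_emb):
    """Single-head attention whose score for pair (i, j) also depends on an
    embedding of the graph relation label rel_ids[i, j]."""
    d = w_q.shape[1]
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # each (n, d)
    r = rel_emb[rel_ids]                          # (n, n, d)
    scores = (q @ k.T + np.einsum("id,ijd->ij", q, r)) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy usage: 6 text tokens, 2 entities; entity 0 has two mentions.
rng = np.random.default_rng(0)
d = 8
tokens = rng.normal(size=(6, d))
mentions = [[(0, 2), (4, 5)], [(2, 3)]]
x, rel_ids = build_input(tokens, mentions, rng.normal(size=d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
rel_emb = np.vstack([np.zeros(d), rng.normal(size=(2, d))])  # id 0 adds nothing
out, weights = relation_aware_attention(x, rel_ids, w_q, w_k, w_v, rel_emb)
entity_vectors = out[len(tokens):]  # one pooled vector per entity
```

In this sketch the relation matrix only declares which tokens each pooling token is connected to. How strongly the pooling token attends to them depends on the weights and relation embeddings, which are random here and would be learned in a trained model.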