Many of the texts we encounter daily are connected to one another. For example, Wikipedia articles refer to other articles via hyperlinks, scientific papers relate to others via citations or (co)authors, and tweets relate via users who follow each other or reshare content. Hence, a graph-like structure can represent these connections and be seen as capturing the "context" of the texts. The question thus arises whether extracting such context information and integrating it into a language model might facilitate a better automated understanding of the text. In this study, we experimentally demonstrate that incorporating graph-based contextualization into a BERT model improves its performance on a classification task. Specifically, on the Pubmed dataset, we observed a reduction in error from 8.51% to 7.96%, while increasing the number of parameters by just 1.6%. Our source code is available at https://github.com/tryptofanik/gc-bert