Representation learning is the first step in automating tasks such as research paper recommendation, classification, and retrieval. Given the accelerating rate of research publication and the recognised benefits of interdisciplinary research, systems that help researchers discover and understand relevant work from beyond their immediate field of knowledge are vital. This work explores different methods of research paper representation (or document embedding) to identify those capable of preserving the interdisciplinary implications of research papers in their embeddings. In addition to evaluating state-of-the-art document embedding methods on an interdisciplinary citation prediction task, we propose a novel Graph Neural Network (GNN) architecture designed to preserve the key interdisciplinary implications of research articles in citation network node embeddings. Our proposed method outperforms other GNN-based methods in interdisciplinary citation prediction without compromising overall citation prediction performance.