The relation classification task assigns the proper semantic relation to a pair of subject and object entities. It plays a crucial role in text mining applications such as knowledge graph construction and entity interaction discovery in biomedical text. Current relation classification models employ additional procedures to identify multiple relations in a single sentence. Furthermore, they overlook the imbalanced predictions pattern, which arises because only a few valid relations need positive labels within a relatively large predefined relation set. We propose a multiple relations classification model that tackles these issues through a customized output architecture and by exploiting additional input features. Our findings suggest that handling the imbalanced predictions leads to significant improvements, even with a modest training design. The results demonstrate superior performance on benchmark datasets commonly used in relation classification. To the best of our knowledge, this work is the first to recognize the imbalanced predictions pattern within the relation classification task.