Coreference resolution, the task of identifying textual mentions that refer to the same entity, remains difficult for pronouns, whose antecedents are often hard to identify. Existing methods often treat pronoun resolution as a task separate from mention detection, potentially missing valuable information. This study proposes the first end-to-end neural network system for Persian pronoun resolution, leveraging pre-trained Transformer models such as ParsBERT. Our system jointly optimizes mention detection and antecedent linking, achieving an F1 improvement of 3.37 on the Mehr corpus over the previous state-of-the-art system, which relied on rule-based and statistical methods. This improvement demonstrates the effectiveness of combining neural networks with linguistic models, potentially marking a notable advance in Persian pronoun resolution and paving the way for further research in this under-explored area.