The paper studies methods for extracting information about dialogue participants (personas) and for evaluating the quality of that extraction in Russian. To train models for this task, the Multi-Session Chat dataset was translated into Russian using multiple translation models, which improved the quality of the resulting data. A metric based on the F-score is presented for evaluating extraction models; it uses a trained classifier to identify the dialogue participant to whom a persona belongs. Experiments were conducted with MBart, FRED-T5, Starling-7B (which is based on Mistral), and Encoder2Encoder models. All models showed insufficient recall on the persona extraction task. Incorporating the NCE loss improved precision at the expense of recall, and increasing model size improved persona extraction.
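To make the precision/recall trade-off mentioned above concrete, here is a minimal sketch of a generic F-score over extracted persona statements. It is not the paper's metric: the abstract does not specify how the trained classifier enters the computation, so the `matches` predicate, the `exact_match` stand-in, and the function name `persona_f_score` are all hypothetical and used only for illustration.

```python
from typing import Callable, Sequence, Tuple


def persona_f_score(
    extracted: Sequence[str],
    reference: Sequence[str],
    matches: Callable[[str, str], bool],
    beta: float = 1.0,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-beta) for one dialogue.

    `matches` is a hypothetical predicate deciding whether an extracted
    persona statement corresponds to a reference one. A learned model
    could be plugged in here; that is an assumption, not the paper's design.
    """
    # Extracted statements that correspond to at least one reference statement.
    hit_extracted = sum(any(matches(e, r) for r in reference) for e in extracted)
    # Reference statements that were recovered by at least one extraction.
    hit_reference = sum(any(matches(e, r) for e in extracted) for r in reference)

    precision = hit_extracted / len(extracted) if extracted else 0.0
    recall = hit_reference / len(reference) if reference else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


def exact_match(a: str, b: str) -> bool:
    """Toy matcher (assumption): case-insensitive string equality."""
    return a.strip().lower() == b.strip().lower()


if __name__ == "__main__":
    extracted = ["I have a dog", "I live in Moscow"]
    reference = ["i have a dog", "I like tea", "I play chess", "I am a teacher"]
    print(persona_f_score(extracted, reference, exact_match))
```

In this toy input, one of the two extracted statements is correct but only one of the four reference statements is recovered, so recall is lower than precision and pulls the F-score down. This mirrors the kind of low-recall behavior the abstract reports, though the numbers here are invented for the illustration.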