Text summarization is an essential task in natural language processing, and researchers have developed various approaches over the years, ranging from rule-based systems to neural networks. However, no single model or approach performs well on every type of text. We propose a system that recommends the most suitable summarization model for a given text. The proposed system employs a fully connected neural network that analyzes the input content and predicts which summarizer is expected to achieve the highest ROUGE score on that input. The meta-model selects among four different summarization models developed for the Slovene language, using different properties of the input, in particular its Doc2Vec document representation. The four Slovene summarization models address different challenges associated with text summarization in a less-resourced language. We evaluate the proposed SloMetaSum model automatically and parts of it manually. The results show that the system successfully automates the otherwise manual step of selecting the best model.
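The selection step described above can be illustrated with a minimal sketch. It is not the SloMetaSum implementation: the summarizer names, layer sizes, and random (untrained) weights are all assumptions made for illustration, and a random vector stands in for a real Doc2Vec representation. The sketch only shows the general shape of the idea: a fully connected network maps a document vector to one score per candidate summarizer, and the training target for a document would be the summarizer with the highest ROUGE score.

```python
import math
import random

# Hypothetical names for the four candidate summarizers (not from the source).
SUMMARIZERS = ["summarizer_a", "summarizer_b", "summarizer_c", "summarizer_d"]


def best_summarizer_label(rouge_scores):
    """Training target: index of the summarizer with the highest ROUGE score."""
    return max(range(len(rouge_scores)), key=lambda i: rouge_scores[i])


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    if activation == "relu":
        out = [max(0.0, v) for v in out]
    return out


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def recommend(doc_vector, layers):
    """Return the recommended summarizer name and the class probabilities."""
    hidden = dense(doc_vector, layers[0], activation="relu")
    probs = softmax(dense(hidden, layers[1]))
    best = max(range(len(probs)), key=lambda i: probs[i])
    return SUMMARIZERS[best], probs


rng = random.Random(0)
EMBED_DIM, HIDDEN_DIM = 8, 16  # illustrative sizes, not taken from the paper
layers = [
    init_layer(EMBED_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, len(SUMMARIZERS), rng),
]

# Stand-in for a Doc2Vec document vector; a real system would infer it from text.
doc_vector = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
choice, probs = recommend(doc_vector, layers)
print(choice, [round(p, 3) for p in probs])
```

Because the weights here are untrained, the recommendation is arbitrary. In a trained system, labels produced in the manner of `best_summarizer_label` would supervise the network so that its highest-probability class tracks the summarizer with the best ROUGE score.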