With the continued advancement of artificial intelligence, natural language processing has come into wide use across many fields. Generating summaries of Chinese news, however, remains challenging. First, Chinese news text is semantically complex and information-dense, which makes it difficult to extract the key information. Second, a news summary must be concise and clear, focusing on the main content and avoiding redundancy. Third, characteristics specific to the Chinese language, such as polysemy and the need for word segmentation, further complicate summary generation. To address these challenges, this paper studies an information extraction method for the LCSTS dataset based on an improved BERTSum-LSTM model. We modify the BERTSum-LSTM model to improve its performance on Chinese news summary generation. Experimental results show that the proposed method performs well on this task and is of practical value for building news summaries.