Text summarization is the process of condensing a piece of text into fewer sentences while preserving its content. A chat transcript, in this context, is a textual copy of a digital or online conversation between a customer (caller) and one or more agents. This paper presents a locally developed hybrid method that first combines extractive and abstractive summarization techniques to compress ill-punctuated or unpunctuated chat transcripts into more readable, punctuated summaries, and then optimizes the overall quality of summarization through reinforcement learning. Extensive testing, evaluation, comparison, and validation have demonstrated the efficacy of this approach for large-scale deployment of chat transcript summarization in the absence of manually generated reference (annotated) summaries.