Abstractive dialogue summarization aims to generate a concise and fluent summary that covers the salient information in a dialogue among two or more interlocutors. It has attracted great attention in recent years, owing to the rapid emergence of social communication platforms and the urgent need to understand and digest dialogue information efficiently. Unlike the news or articles handled in traditional document summarization, dialogues have unique characteristics that pose additional challenges, including different language styles and formats, scattered information, flexible discourse structures and unclear topic boundaries. This survey provides a comprehensive investigation of existing work on abstractive dialogue summarization, from scenarios and approaches to evaluation. It divides the task into two broad categories according to the type of input dialogue, i.e., open-domain and task-oriented, and presents a taxonomy of existing techniques along three directions, namely, injecting dialogue features, designing auxiliary training tasks and using additional data. Datasets for the different scenarios and widely accepted evaluation metrics are also summarized for completeness. After that, the trends in scenarios and techniques are summarized, together with insights into the correlations between extensively exploited features and different scenarios. Based on these analyses, we recommend future directions including more controlled and complicated scenarios, technical innovations and comparisons, publicly available datasets in special domains, etc.