Key Point Analysis (KPA) was recently proposed for deriving fine-grained insights from collections of textual comments. KPA extracts the main points in the data as a list of concise sentences or phrases, termed key points, and quantifies their prevalence. Although key points are more expressive than word clouds and key phrases, a long, flat list of key points, which often express related ideas at varying levels of granularity, can still be hard to make sense of. To address this limitation of KPA, we introduce the task of organizing a given set of key points into a hierarchy according to their specificity. Such hierarchies may be viewed as a novel type of Textual Entailment Graph. We develop ThinkP, a high-quality benchmark dataset of key point hierarchies for business and product reviews, obtained by consolidating multiple annotations. We compare different methods for predicting pairwise relations between key points and for inferring a hierarchy from these pairwise predictions. In particular, for the task of computing pairwise key point relations, we achieve significant gains over existing strong baselines by applying directional distributional similarity methods to a novel distributional representation of key points, and we further boost performance via weak supervision.
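The two-stage pipeline described above (score directed pairs of key points, then infer a hierarchy from those scores) can be illustrated with a minimal sketch. The abstract does not specify which directional measure, representation, or inference procedure is used, so everything here is an assumption: Weeds precision stands in as one well-known directional distributional similarity measure, each key point is represented by a hypothetical weighted feature dictionary, and `build_hierarchy` is a simple greedy parent-selection heuristic rather than the paper's method.

```python
from typing import Dict, List, Optional


def weeds_precision(specific: Dict[str, float], general: Dict[str, float]) -> float:
    """Share of the specific item's feature weight that the general item also covers.

    Assumption: features are a hypothetical distributional representation
    (e.g. weighted identifiers of comments matched to the key point).
    """
    total = sum(specific.values())
    if total == 0:
        return 0.0
    shared = sum(w for f, w in specific.items() if f in general)
    return shared / total


def build_hierarchy(scores: List[List[float]], threshold: float = 0.5) -> List[Optional[int]]:
    """Greedily attach each key point to its best-scoring parent.

    scores[i][j] is an assumed score that key point i is more specific
    than key point j. Returns parent[i], or None if i is a root.
    """
    n = len(scores)
    parent: List[Optional[int]] = [None] * n
    # Candidate (child, parent) edges, highest score first.
    edges = sorted(
        ((scores[i][j], i, j) for i in range(n) for j in range(n) if i != j),
        reverse=True,
    )
    for score, child, cand in edges:
        if score < threshold:
            break  # remaining edges are all below the threshold
        if parent[child] is not None:
            continue  # child already has a parent
        # Walk up from the candidate parent; reject the edge if it
        # would make the child its own ancestor (a cycle).
        node: Optional[int] = cand
        while node is not None and node != child:
            node = parent[node]
        if node == child:
            continue
        parent[child] = cand
    return parent


# Toy data: feature names c1..c5 are made up for illustration.
key_points = [
    {"c1": 1.0, "c2": 1.0, "c3": 1.0, "c4": 1.0},  # "The hotel was clean"
    {"c1": 1.0, "c2": 1.0},                        # "The room was spotless"
    {"c5": 1.0},                                   # "Staff was friendly"
]
scores = [[weeds_precision(a, b) for b in key_points] for a in key_points]
parents = build_hierarchy(scores, threshold=0.6)
print(parents)
```

In this toy setup the second key point is fully covered by the first, so it is attached beneath it, while the unrelated third key point stays a root. The asymmetry of the score is what lets the sketch distinguish a general key point from a specific one; a symmetric similarity would not.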