Opinion summarization is the task of creating summaries that capture popular opinions from user reviews. In this paper, we introduce Geodesic Summarizer (GeoSumm), a novel system for unsupervised extractive opinion summarization. GeoSumm uses an encoder-decoder based representation learning model that represents text as a distribution over latent semantic units. It generates these representations by performing dictionary learning over pre-trained text representations at multiple decoder layers. We then use these representations to quantify the relevance of review sentences with a novel scoring mechanism based on approximate geodesic distance. We use the relevance scores to identify popular opinions and compose general and aspect-specific summaries. Our proposed model, GeoSumm, achieves state-of-the-art performance on three opinion summarization datasets. We perform additional experiments to analyze the functioning of our model and showcase the generalization ability of GeoSumm across different domains.
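To make the scoring idea concrete, the following is a minimal sketch of relevance scoring by approximate geodesic distance. It is not GeoSumm's implementation: the abstract does not specify the base metric, the graph construction, or the aggregation. Here we assume that each sentence is already given as a probability distribution over latent units, that the Hellinger distance serves as the local metric, that geodesics are approximated by shortest paths on a k-nearest-neighbour graph (in the style of Isomap), and that a sentence is more relevant when its mean geodesic distance to all other sentences is smaller. All function names and the toy data are hypothetical.

```python
import heapq
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (assumed local metric)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph; edge weights are Hellinger distances."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((hellinger(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def shortest_paths(graph, source):
    """Dijkstra's algorithm; path lengths approximate geodesic distances."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def relevance_scores(points, k=2):
    """Score = negative mean approximate geodesic distance to all other sentences."""
    n = len(points)
    graph = knn_graph(points, k)
    scores = []
    for i in range(n):
        dist = shortest_paths(graph, i)
        # Each edge is at most 1 and a path has at most n - 1 edges, so n is
        # a safe penalty for sentences that are unreachable in the graph.
        total = sum(d if math.isfinite(d) else float(n) for j, d in dist.items() if j != i)
        scores.append(-total / (n - 1))
    return scores


def extract_summary(sentences, points, budget=2, k=2):
    """Pick the `budget` highest-scoring sentences, kept in original order."""
    scores = relevance_scores(points, k)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:budget])]


# Toy data (hypothetical): three sentences share a dominant latent unit, one does not.
sentences = ["Great battery.", "Battery lasts long.", "Battery life is good.", "Box was dented."]
points = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.75, 0.15, 0.1], [0.1, 0.1, 0.8]]
summary = extract_summary(sentences, points, budget=2, k=2)
```

In this toy setting the sentence about the dented box lies far from the cluster of battery opinions along the graph, so it receives the lowest score and is left out of the summary. An aspect-specific variant would need an additional aspect signal, which this sketch does not attempt to model.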