Contrastive opinion extraction aims to extract a structured summary or key points organised as positive and negative viewpoints towards a common aspect or topic. Most recent work on unsupervised key point extraction is largely built on sentence clustering or on opinion summarisation based on the popularity of opinions expressed in text. However, these methods tend to generate aspect clusters with incoherent sentences, conflicting viewpoints, and redundant aspects. To address these problems, we propose a novel unsupervised Contrastive OpinioN Extraction model, called Cone, which learns disentangled latent aspect and sentiment representations based on pseudo aspect and sentiment labels by combining contrastive learning with iterative aspect/sentiment clustering refinement. Besides extracting contrastive opinions, Cone can also quantify the relative popularity of aspects and their associated sentiment distributions. The model has been evaluated on both a hotel review dataset and a Twitter dataset about COVID vaccines. The results show that, despite using no label supervision or aspect-denoted seed words, Cone outperforms a number of competitive baselines on contrastive opinion extraction. The output of Cone can be used to offer better recommendations of products and services online.
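To make the idea of pairing pseudo labels with contrastive learning concrete, the following is a minimal sketch of one refinement round, not Cone's actual implementation. It assumes sentence embeddings are already available as a NumPy array, uses plain k-means with farthest-first initialisation as a stand-in clusterer, and uses a supervised-contrastive-style loss as a stand-in objective. The function names `kmeans_pseudo_labels` and `contrastive_loss` and the toy data are hypothetical.

```python
import numpy as np


def kmeans_pseudo_labels(x, k, iters=20):
    """Assign a pseudo label to each row of x with k-means (assumed stand-in clusterer)."""
    # Farthest-first initialisation: start from x[0], then repeatedly pick
    # the point farthest from all centroids chosen so far (deterministic).
    centroids = [x[0]]
    for _ in range(k - 1):
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(x[d.argmax()])
    centroids = np.array(centroids, dtype=float)

    for _ in range(iters):
        # Squared distance from every point to every centroid, shape (n, k).
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return labels


def contrastive_loss(z, labels, temperature=0.5):
    """Pull rows sharing a pseudo label together and push the rest apart."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    not_self = ~np.eye(len(z), dtype=bool)
    sim = np.where(not_self, sim, -np.inf)  # exclude self-pairs
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_counts = positives.sum(axis=1)
    has_pos = pos_counts > 0  # skip anchors with no same-label partner
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / pos_counts[has_pos]).mean())


# Toy "embeddings": two groups pointing in different directions.
embeddings = np.array([
    [5.0, 0.0], [5.1, 0.0], [5.0, 0.1], [4.9, 0.0],
    [0.0, 5.0], [0.0, 5.1], [0.1, 5.0], [0.0, 4.9],
])
pseudo_labels = kmeans_pseudo_labels(embeddings, k=2)
loss = contrastive_loss(embeddings, pseudo_labels)
```

In a full model, the loss would drive an update of the encoder that produces the embeddings, after which the clustering step would be rerun on the new embeddings; the sketch stops after a single round and trains nothing.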