Existing studies on comparative opinion mining have mainly focused on explicit comparative expressions, which are uncommon in real-world reviews. This leaves implicit comparisons, where users express preferences across separate reviews, largely underexplored. We introduce SUDO, a novel dataset for implicit comparative opinion mining from same-user reviews, allowing reliable inference of user preferences even without explicit comparative cues. SUDO comprises 4,150 annotated review pairs (15,191 sentences) with a bi-level structure capturing aspect-level mentions and review-level preferences. We benchmark this task using two families of baselines: traditional machine learning models and language model-based models. Experimental results show that while the latter outperform the former, overall performance remains moderate, revealing the inherent difficulty of the task and establishing SUDO as a challenging and valuable benchmark for future research.