Aspect-based sentiment analysis (ABSA) is a long-standing research interest in opinion mining, and in recent years researchers have gradually shifted their focus from simple ABSA subtasks to end-to-end multi-element ABSA tasks. However, the datasets currently used in this research are limited to individual elements of specific tasks, usually focus on in-domain settings, ignore implicit aspects and opinions, and are small in scale. To address these issues, we propose a large-scale Multi-Element Multi-Domain dataset (MEMD) that covers all four ABSA elements across five domains, comprising nearly 20,000 review sentences and 30,000 quadruples annotated with explicit and implicit aspects and opinions. We also evaluate generative and non-generative baselines on multiple ABSA subtasks under the open-domain setting, and the results show that open-domain ABSA, as well as the mining of implicit aspects and opinions, remains a challenge yet to be addressed. The dataset is publicly released at \url{https://github.com/NUSTM/MEMD-ABSA}.