Opinion target extraction (OTE), also called aspect extraction (AE), is a fundamental task in opinion mining that aims to extract the targets (or aspects) on which opinions have been expressed. Recent work focuses on cross-domain OTE, a setting typical of real-world scenarios in which the training and testing distributions differ. Most methods use domain adversarial neural networks, which aim to reduce the gap between the labelled source domain and the unlabelled target domain in order to improve target domain performance. However, this approach aligns only the overall feature distributions and does not account for class-wise feature alignment, leading to suboptimal results. Semi-supervised learning (SSL) has been explored as a solution, but it is limited by the quality of the pseudo-labels generated by the model. Inspired by the theoretical foundations of domain adaptation [2], we propose a new SSL approach that selects unlabelled target samples on which the outputs of the domain-specific teacher and student networks disagree, in an effort to boost target domain performance. Extensive experiments on benchmark cross-domain OTE datasets show that this approach is effective and performs consistently well in settings with large domain shifts.
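As a rough illustration of the selection idea, the sketch below picks unlabelled target sentences on which a teacher and a student tagger predict different tag sequences. It is a minimal sketch under stated assumptions, not the procedure used in this work: the abstract does not specify the disagreement criterion, so the argmax comparison, the `O`/`B`/`I` tag set, and the function names `argmax_tags` and `select_disagreements` are all hypothetical.

```python
from typing import List, Sequence

# Assumed data layout: one entry per sentence, one probability row per token,
# one probability per tag. This layout is an illustrative assumption.
SentenceProbs = List[List[float]]


def argmax_tags(probs: SentenceProbs, labels: Sequence[str]) -> List[str]:
    """Map each token's probability row to its most likely tag."""
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in probs]


def select_disagreements(
    teacher_probs: List[SentenceProbs],
    student_probs: List[SentenceProbs],
    labels: Sequence[str] = ("O", "B", "I"),  # hypothetical tag set
) -> List[int]:
    """Return indices of sentences where the two taggers' tag sequences differ.

    Assumption: "disagreement" means the argmax tag sequences are not identical.
    Other criteria (e.g. a divergence between distributions) are equally plausible.
    """
    selected = []
    for idx, (t, s) in enumerate(zip(teacher_probs, student_probs)):
        if argmax_tags(t, labels) != argmax_tags(s, labels):
            selected.append(idx)
    return selected


# Toy example: two unlabelled target sentences.
teacher = [
    [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]],  # teacher tags: O, B
    [[0.7, 0.2, 0.1]],                     # teacher tags: O
]
student = [
    [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]],    # student tags: O, O
    [[0.6, 0.3, 0.1]],                     # student tags: O
]
disagreeing = select_disagreements(teacher, student)
```

In this toy input only the first sentence is selected, since the two taggers differ on its second token. How the selected samples are then used for training is not described in the abstract and is left out of the sketch.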