We propose an agentic data augmentation method for Aspect-Based Sentiment Analysis (ABSA) that uses iterative generation and verification to produce high-quality synthetic training examples. To isolate the effect of the agentic structure, we also develop a closely matched prompting-based baseline that uses the same model and instructions. Both methods are evaluated on three ABSA subtasks (Aspect Term Extraction (ATE), Aspect Sentiment Classification (ATSC), and Aspect Sentiment Pair Extraction (ASPE)), four SemEval datasets, and two encoder-decoder models: T5-Base and Tk-Instruct. Our results show that agentic augmentation preserves labels in the augmented data better than raw prompting, especially when the task requires generating aspect terms. When combined with real data, agentic augmentation also yields larger gains, consistently outperforming prompting-based generation. These benefits are most pronounced for T5-Base, while the more heavily pretrained Tk-Instruct exhibits smaller improvements. As a result, augmented data helps T5-Base achieve performance comparable to that of Tk-Instruct.