We introduce Arctic-ABSA, a collection of powerful models for real-world aspect-based sentiment analysis (ABSA). Our models are tailored to commercial needs and trained on a large corpus of public data alongside carefully generated synthetic data, yielding a dataset 20 times larger than SemEval14. We extend typical ABSA models by expanding the number of sentiment classes from the standard three (positive, negative, neutral) to five, adding mixed and unknown classes, while also jointly predicting overall text sentiment and supporting multiple languages. We experiment with reasoning injection by fine-tuning on Chain-of-Thought (CoT) examples and introduce a novel reasoning-pretraining technique for encoder-only models that significantly improves downstream fine-tuning and generalization. Our 395M-parameter encoder and 8B-parameter decoder achieve up to 10 percentage points higher accuracy than GPT-4o and Claude 3.5 Sonnet, while setting new state-of-the-art results on the SemEval14 benchmark. A single multilingual model maintains 87-91% accuracy across six languages without degrading English performance. We release ABSA-mix, a large-scale benchmark aggregating 17 public ABSA datasets across 92 domains.
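The five-class joint task described above can be sketched as a minimal output schema. This is an illustrative assumption only: the class names, types, and the `ABSAPrediction` structure below are not the paper's actual labels or API.

```python
from dataclasses import dataclass
from enum import Enum

# The five sentiment classes: the standard three plus the added
# "mixed" and "unknown" classes (label strings here are illustrative).
class Sentiment(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    MIXED = "mixed"
    UNKNOWN = "unknown"

@dataclass
class ABSAPrediction:
    """Joint output: per-aspect sentiments plus overall text sentiment."""
    aspects: dict[str, Sentiment]
    overall: Sentiment

# Hypothetical prediction for: "The battery life is great, but the
# screen scratches easily. Shipping was fine."
pred = ABSAPrediction(
    aspects={
        "battery life": Sentiment.POSITIVE,
        "screen": Sentiment.NEGATIVE,
        "shipping": Sentiment.NEUTRAL,
    },
    overall=Sentiment.MIXED,
)
print(pred.overall.value)  # mixed
```

The joint formulation means a single forward pass supplies both the per-aspect labels and the document-level sentiment, rather than running separate aspect and document classifiers.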