ChatGPT has recently drawn considerable attention from both the research community and the public. We are particularly interested in whether it can serve as a universal sentiment analyzer. To this end, we provide a preliminary evaluation of ChatGPT's understanding of the opinions, sentiments, and emotions expressed in text. Specifically, we evaluate it in four settings: standard evaluation, polarity shift evaluation, open-domain evaluation, and sentiment inference evaluation. The evaluation covers 18 benchmark datasets and 5 representative sentiment analysis tasks, and we compare ChatGPT with fine-tuned BERT and the corresponding state-of-the-art (SOTA) models on each end task. We also conduct a human evaluation and present qualitative case studies to gain a deeper understanding of its sentiment analysis capabilities.