A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation

The proliferation of misinformation and harmful narratives in online discourse has underscored the critical need for effective Counter Narrative (CN) generation techniques. However, existing automatic evaluation methods often lack interpretability and fail to capture the nuanced relationship between generated CNs and human perception. Aiming to achieve a higher correlation with human judgments, this paper proposes a novel approach to assess generated CNs that uses a Large Language Model (LLM) as an evaluator. By comparing generated CNs pairwise in a tournament-style format, we establish a model ranking pipeline that achieves a correlation of $0.88$ with human preference. As an additional contribution, we leverage LLMs as zero-shot (ZS) CN generators and conduct a comparative analysis of chat, instruct, and base models, exploring their respective strengths and limitations. Through meticulous evaluation, including fine-tuning experiments, we elucidate the differences in performance and responsiveness to domain-specific data. We conclude that chat-aligned models in the ZS setting are the best option for carrying out the task, provided they do not refuse to generate an answer due to security concerns.
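
The pairwise, tournament-style ranking described in the abstract could be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: it assumes a round-robin schedule, ranks models by total pairwise wins, and uses Spearman's rho to compare rankings, none of which the abstract specifies. The names `rank_models`, `spearman`, and `toy_judge` are hypothetical, and `toy_judge` is a placeholder standing in for an LLM evaluator call.

```python
from itertools import combinations
from typing import Callable, Dict, List


def rank_models(
    outputs: Dict[str, List[str]],
    judge: Callable[[str, str], int],
) -> List[str]:
    """Rank models by round-robin pairwise wins (assumed schedule).

    outputs maps a model name to its CNs, aligned by input index.
    judge(cn_a, cn_b) returns 0 if cn_a is preferred, 1 if cn_b is.
    """
    wins = {name: 0 for name in outputs}
    for a, b in combinations(outputs, 2):
        for cn_a, cn_b in zip(outputs[a], outputs[b]):
            winner = a if judge(cn_a, cn_b) == 0 else b
            wins[winner] += 1
    # Sort by wins (descending); ties are broken by name for determinism.
    return sorted(wins, key=lambda name: (-wins[name], name))


def spearman(rank_x: List[str], rank_y: List[str]) -> float:
    """Spearman's rho between two tie-free rankings of the same items."""
    n = len(rank_x)
    pos_y = {name: i for i, name in enumerate(rank_y)}
    d2 = sum((i - pos_y[name]) ** 2 for i, name in enumerate(rank_x))
    return 1 - 6 * d2 / (n * (n**2 - 1))


def toy_judge(cn_a: str, cn_b: str) -> int:
    """Placeholder for an LLM judge: simply prefers the longer text."""
    return 0 if len(cn_a) >= len(cn_b) else 1


# Hypothetical toy data: one CN per model for a single input.
toy_outputs = {
    "base": ["no"],
    "instruct": ["not true"],
    "chat": ["that claim is not true"],
}
llm_ranking = rank_models(toy_outputs, toy_judge)
```

In practice, `toy_judge` would be replaced by a function that prompts an LLM with both CNs and parses its preference, and the resulting ranking would be compared against a human preference ranking with `spearman`.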
