One way to personalize chatbot interactions is to establish common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are forms of messaging that aim to answer concerns expressed about vaccination. Tailoring responses in this domain is difficult, since opinions often have seemingly little ideological overlap. We define the task of tailoring vaccine interventions to a Common-Ground Opinion (CGO). Tailoring a response to a CGO involves meaningfully improving the answer by relating it to an opinion or belief the reader holds. In this paper we introduce TAILOR-CGO, a dataset for evaluating how well responses are tailored to provided CGOs. We benchmark several major LLMs on this task, finding that GPT-4-Turbo performs significantly better than the others. We also build automatic evaluation metrics, including an efficient and accurate BERT model that outperforms finetuned LLMs, investigate how to successfully tailor vaccine messaging to CGOs, and provide actionable recommendations from this investigation. Code and model weights: https://github.com/rickardstureborg/tailor-cgo Dataset: https://huggingface.co/datasets/DukeNLP/tailor-cgo