Factuality is a crucial requirement in information-seeking dialogue: the system should respond to the user's queries so that the responses are meaningful and aligned with the knowledge provided to the system. However, most modern large language models suffer from hallucinations, that is, they generate responses that are not supported by, or that contradict, the knowledge source. To mitigate the issue and increase the faithfulness of information-seeking dialogue systems, we introduce BeInfo, a simple yet effective method that applies behavioural tuning to aid information-seeking dialogue. Relying on three standard datasets, we show that models tuned with BeInfo become considerably more faithful to the knowledge source, both on datasets and domains seen during BeInfo-tuning and on unseen domains when applied in a zero-shot manner. In addition, we show that models with 3B parameters (e.g., Flan-T5) tuned with BeInfo demonstrate strong performance on data from real 'production' conversations and outperform GPT4 when tuned on a limited amount of such realistic in-domain dialogues.