Large language models (LLMs), with their strong zero-shot topic extraction capabilities, offer an alternative to probabilistic topic modelling and closed-set topic classification approaches. As zero-shot topic extractors, LLMs are expected to understand human instructions and generate relevant, non-hallucinated topics from the given documents. However, LLM-based topic modelling approaches often struggle to generate topics at the granularity specified in human instructions, which results in many near-duplicate topics. Furthermore, methods for addressing hallucinated topics generated by LLMs have not yet been investigated. In this paper, we address the issues of topic granularity and hallucination to improve LLM-based topic modelling. To this end, we introduce a novel approach that leverages Direct Preference Optimisation (DPO) to fine-tune open-source LLMs, such as Mistral-7B. Our approach does not rely on traditional human annotation to rank preferred answers; instead, it employs a reconstruction pipeline to modify raw topics generated by LLMs, thus enabling a fast and efficient training and inference framework. Comparative experiments show that our fine-tuning approach not only significantly improves the LLM's capability to produce more coherent, relevant, and precise topics, but also reduces the number of hallucinated topics.
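To make the training signal concrete, the following is a minimal sketch, not the pipeline used in this paper. It assumes that a preference pair consists of the raw LLM topic list (rejected) and a reconstructed list (chosen), and it evaluates the standard DPO loss on the summed log-probabilities of both answers. The helper `reconstruct_topics`, its merging rule, and the default `beta` are hypothetical choices made only for illustration.

```python
import math


def reconstruct_topics(raw_topics):
    """Hypothetical stand-in for a reconstruction step (an assumption, not
    the paper's pipeline): drop near-duplicate topics that differ only in
    case, whitespace, or a trailing plural 's'."""
    seen, kept = set(), []
    for topic in raw_topics:
        # Normalise case and whitespace, then strip trailing 's' characters.
        key = " ".join(topic.lower().split()).rstrip("s")
        if key not in seen:
            seen.add(key)
            kept.append(topic)
    return kept


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Each argument is the summed log-probability of an answer under the
    policy being trained or under the frozen reference model. beta is an
    assumed default, not a value taken from the paper.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) written as a numerically stable softplus(-logits).
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Toy preference pair: the raw output is "rejected", its reconstruction "chosen".
raw = ["Climate change", "climate  change", "Climate changes", "Energy policy"]
chosen = reconstruct_topics(raw)
```

In this sketch the loss equals log 2 when the policy and the reference model agree on both answers, and it falls as the policy assigns relatively more probability to the reconstructed answer than the reference does.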