With recent progress in large language models (LLMs), there is growing interest in applying them to complex, challenging problems. Modern LLMs can process long contexts and generate verbalized explanations, which gives them significant potential in real-world applications. However, a critical hurdle in deploying LLMs for practical decision-making is their inability to provide reliable, quantitative probabilities. Task-specific fine-tuning of LLMs with traditional discriminative objectives (as used for encoder-only models) can yield probability estimates, but it often leads to catastrophic forgetting and linguistic collapse. The model then loses its ability to generate explanations, which severely undermines its interpretability and usability. To address this challenge, we propose CLSGen, a novel LLM fine-tuning framework designed for binary classification tasks. CLSGen comprises a new model architecture, training methodology, and data construction strategy that together enable robust probability estimation without sacrificing the model's inherent explanation-generation capabilities. Experimental results across multiple benchmark datasets demonstrate that models fine-tuned with CLSGen outperform existing baselines on classification metrics (AUROC and F1-score). The generated explanations also show strong alignment between predicted labels and justifications, as well as high readability.