Instruction tuning is a standard paradigm for adapting large language models (LLMs), but modern instruction datasets are large, noisy, and redundant, making full-data fine-tuning costly and often unnecessary. Existing data selection methods either build expensive gradient datastores or assign static scores from a weak proxy, largely ignoring evolving uncertainty, and thus missing a key source of LLM interpretability. We propose GRADFILTERING, an objective-agnostic, uncertainty-aware data selection framework that utilizes a small GPT-2 proxy with a LoRA ensemble and aggregates per-example gradients into a Gradient Signal-to-Noise Ratio (G-SNR) utility. Our method matches or surpasses random subsets and strong baselines in most LLM-as-a-judge evaluations as well as in human assessment. Moreover, GRADFILTERING-selected subsets converge faster than competitive filters under the same compute budget, reflecting the benefit of uncertainty-aware scoring.
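To make the scoring idea concrete, the following is a minimal sketch of a gradient signal-to-noise utility computed over an ensemble. The abstract does not give the exact G-SNR formula, so the definition used here (squared norm of the ensemble-mean gradient divided by the summed ensemble variance) is an assumption, as are the function names `gsnr_scores` and `select_top_k`, the `eps` stabilizer, and the toy data. In practice the gradients would come from the LoRA ensemble members of the proxy model rather than from a random generator.

```python
import numpy as np


def gsnr_scores(grads, eps=1e-8):
    """Score each example by an assumed gradient signal-to-noise ratio.

    grads: array of shape (M, N, D) holding, for each of M ensemble
    members, a D-dimensional gradient for each of N examples.
    NOTE: this formula is an illustrative assumption, not necessarily
    the definition used by GRADFILTERING.
    """
    mean = grads.mean(axis=0)          # (N, D) ensemble-mean gradient
    var = grads.var(axis=0)            # (N, D) ensemble variance per coordinate
    signal = (mean ** 2).sum(axis=1)   # (N,) squared norm of the mean
    noise = var.sum(axis=1)            # (N,) total variance across coordinates
    return signal / (noise + eps)


def select_top_k(scores, k):
    """Return indices of the k highest-scoring examples (hypothetical helper)."""
    return np.argsort(-scores)[:k]


# Toy data: examples 0-2 receive consistent gradients across members,
# examples 3-5 receive gradients dominated by member-to-member noise.
rng = np.random.default_rng(0)
M, N, D = 4, 6, 8
base = rng.normal(size=(N, D))
noise_scale = np.array([0.1, 0.1, 0.1, 5.0, 5.0, 5.0])
grads = base[None] + noise_scale[None, :, None] * rng.normal(size=(M, N, D))

scores = gsnr_scores(grads)
selected = select_top_k(scores, k=3)
print("scores:", np.round(scores, 3))
print("selected indices:", sorted(selected.tolist()))
```

Under this assumed definition, examples whose gradients agree across ensemble members score high, while examples whose gradients vary strongly between members score low, which is one way to read "uncertainty-aware" selection.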