Even when aggregate accuracy is high, state-of-the-art NLP models often fail systematically on specific subgroups of data, resulting in unfair outcomes and eroding user trust. Additional data collection may not help address these weaknesses, because such challenging subgroups may be unknown to users and underrepresented in both existing and new data. We propose Targeted Data Generation (TDG), a framework that automatically identifies challenging subgroups and generates new data for those subgroups using large language models (LLMs) with a human in the loop. TDG estimates the expected benefit and potential harm of data augmentation for each subgroup, and selects the subgroups most likely to improve within-group performance without hurting overall performance. In our experiments, TDG significantly improves accuracy on challenging subgroups for state-of-the-art sentiment analysis and natural language inference models, while also improving overall test accuracy.
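The selection step described above (estimate benefit and harm per subgroup, then keep the subgroups likely to help without hurting overall performance) can be illustrated with a minimal sketch. This is not the implementation used in TDG: the `SubgroupEstimate` container, the `select_subgroups` function, and the `min_gain`, `max_harm`, and `top_k` parameters are hypothetical names introduced here for illustration, and the simple thresholding rule is an assumption about how such a filter might look.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubgroupEstimate:
    """Hypothetical container for per-subgroup estimates (not from the paper)."""
    name: str
    expected_gain: float  # estimated within-group accuracy gain from augmentation
    expected_harm: float  # estimated drop in overall accuracy from augmentation


def select_subgroups(
    estimates: List[SubgroupEstimate],
    min_gain: float = 0.0,
    max_harm: float = 0.0,
    top_k: Optional[int] = None,
) -> List[SubgroupEstimate]:
    """Keep subgroups whose estimated gain exceeds min_gain and whose
    estimated harm does not exceed max_harm, ranked by gain (highest first).

    The thresholds are illustrative assumptions, not values from the paper.
    """
    eligible = [
        e for e in estimates
        if e.expected_gain > min_gain and e.expected_harm <= max_harm
    ]
    ranked = sorted(eligible, key=lambda e: e.expected_gain, reverse=True)
    return ranked if top_k is None else ranked[:top_k]
```

A caller would pass one estimate per candidate subgroup and augment only the subgroups returned. A subgroup with a large estimated gain is still dropped if its estimated harm exceeds the tolerance, which mirrors the stated goal of not hurting overall performance.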