The growing popularity of large language models has increased their use for summarizing, predicting, and generating text, making it vital to help researchers and engineers understand how and why they work. We present KnowledgeVis, a human-in-the-loop visual analytics system for interpreting language models using fill-in-the-blank sentences as prompts. By comparing predictions between sentences, KnowledgeVis reveals learned associations that intuitively connect what language models learn during training to downstream natural language tasks. The system helps users create and test multiple prompt variations, analyze predicted words using a novel semantic clustering technique, and discover insights using interactive visualizations. Collectively, these visualizations help users identify the likelihood and uniqueness of individual predictions, compare sets of predictions between prompts, and summarize patterns and relationships between predictions across all prompts. We demonstrate the capabilities of KnowledgeVis with feedback from six NLP experts and with three use cases: (1) probing biomedical knowledge in two domain-adapted models; and, across three general-purpose models, (2) evaluating harmful identity stereotypes and (3) discovering facts and relationships.