We present a human-in-the-loop dashboard tailored to diagnosing potential spurious features that NLI models rely on for predictions. The dashboard enables users to generate diverse and challenging examples by drawing inspiration from GPT-3 suggestions. Users can also receive feedback from a trained NLI model on how challenging a newly created example is and refine the example based on that feedback. Through our investigation, we discover several types of spurious correlations that impact the reasoning of NLI models, which we group into three categories: Semantic Relevance, Logical Fallacies, and Bias. Based on our findings, we identify and describe various research opportunities, including diversifying training data and assessing NLI models' robustness by creating adversarial test suites.