Pre-trained transformer-based language models are becoming increasingly popular due to their exceptional performance on various benchmarks. However, concerns persist regarding the presence of hidden biases within these models, which can lead to discriminatory outcomes and reinforce harmful stereotypes. To address this issue, we propose Finspector, a human-centered visual inspection tool designed to detect biases in different categories through log-likelihood scores generated by language models. The goal of the tool is to enable researchers to easily identify potential biases using visual analytics, ultimately contributing to a fairer and more just deployment of these models in both academic and industrial settings. Finspector is available at https://github.com/IBM/finspector.
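To make the idea of comparing log-likelihood scores concrete, the following minimal sketch scores a pair of sentences that differ in a single term and reports the gap between them. It is an illustration under stated assumptions, not Finspector's implementation: a toy add-one-smoothed bigram model stands in for a pre-trained transformer, and the names `train_bigram`, `log_likelihood`, `bias_gap`, and the three-sentence corpus are all hypothetical.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_likelihood(sentence, unigrams, bigrams):
    """Sum of add-one-smoothed bigram log-probabilities for one sentence."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from producing log(0).
        prob = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log(prob)
    return total


def bias_gap(sent_a, sent_b, unigrams, bigrams):
    """Positive when the model finds sent_a more likely than sent_b."""
    return (log_likelihood(sent_a, unigrams, bigrams)
            - log_likelihood(sent_b, unigrams, bigrams))


# Hypothetical toy corpus; a real study would score with a pre-trained model.
corpus = [
    "the doctor said he was late",
    "the doctor said he was tired",
    "the nurse said she was late",
]
unigrams, bigrams = train_bigram(corpus)

sent_a = "the doctor said he was late"
sent_b = "the doctor said she was late"
gap = bias_gap(sent_a, sent_b, unigrams, bigrams)
print(f"log-likelihood gap (a - b): {gap:.3f}")
```

A gap far from zero across many such pairs would suggest that the model systematically prefers one variant. Which distribution of gaps to inspect, and how to visualize it, is left to the tool itself.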