Content Warning: This work contains examples that potentially implicate stereotypes, associations, and other harms that could be offensive to individuals in certain social groups. Large pre-trained language models are acknowledged to carry social biases towards different demographics, which can further amplify existing stereotypes in our society and cause even more harm. Text-to-SQL is an important task whose models are mainly adopted by administrative industries, where unfair decisions may lead to catastrophic consequences. However, existing Text-to-SQL models are trained on clean, neutral datasets, such as Spider and WikiSQL. This, to some extent, covers up social biases in models under ideal conditions; these biases may nevertheless emerge in real application scenarios. In this work, we aim to uncover and categorize social biases in Text-to-SQL models. We summarize the categories of social biases that may occur in structured data for Text-to-SQL models. We build test benchmarks and reveal that models with similar task accuracy can contain social biases at very different rates. We show how to take advantage of our methodology to uncover and assess social biases in the downstream Text-to-SQL task. We will release our code and data.