Recent advances in language technology and artificial intelligence have enabled considerable success across domains such as law, medicine, and mental health. AI-based language models for tasks such as judgement prediction have recently been proposed for the legal sector. However, these models are rife with encoded social biases picked up from the training data. While bias and fairness have been studied across NLP, most studies situate themselves within a Western context. In this work, we present an initial investigation of fairness from the Indian perspective in the legal domain. We highlight the propagation of learnt algorithmic biases in the bail prediction task for models trained on Hindi legal documents. We evaluate the fairness gap using demographic parity and show that a decision tree model trained for the bail prediction task has an overall fairness disparity of 0.237 between input features associated with Hindus and Muslims. Additionally, we highlight the need for further research on fairness and bias in applying AI in the legal sector, with a specific focus on the Indian context.
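To make the fairness measure concrete, the following is a minimal sketch of how a demographic parity gap can be computed: the difference in the rate of positive predictions (for example, bail granted) between groups. The function name `demographic_parity_gap`, the group labels, and the toy predictions are all hypothetical and chosen for illustration; they are not the data, code, or exact evaluation procedure behind the reported 0.237.

```python
from collections import defaultdict


def demographic_parity_gap(predictions, groups, positive_label=1):
    """Return (gap, rates) for a set of model predictions.

    rates maps each group to its share of positive predictions;
    gap is the difference between the highest and lowest rate.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        if pred == positive_label:
            positives[group] += 1
    rates = {g: positives[g] / totals[g] for g in totals}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


# Made-up predictions (1 = positive outcome) for two hypothetical groups.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["group_a"] * 4 + ["group_b"] * 4

gap, rates = demographic_parity_gap(preds, groups)
print(rates)
print(gap)
```

A gap of 0 would mean both groups receive positive predictions at the same rate; larger values indicate a larger disparity under this particular definition.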