Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection

Algorithmic bias often arises from differential subgroup validity, in which predictive relationships vary across groups. For example, in toxic language detection, the language of toxic comments can vary markedly depending on which demographic group is targeted. In such settings, trained models can be dominated by the relationships that best fit the majority group, leading to disparate performance. We propose framing toxicity detection as multi-task learning (MTL), allowing a model to specialize on the relationships that are relevant to each demographic group while also leveraging shared properties across groups. In this framing, each task corresponds to identifying toxicity against a particular demographic group. However, traditional MTL requires labels for all tasks to be present for every data point. To address this, we propose Conditional MTL (CondMTL), wherein the loss function for each task considers only the training examples relevant to that task's demographic group. This lets us learn group-specific representations in each branch that are not cross-contaminated by irrelevant labels. Results on synthetic and real data show that CondMTL improves predictive recall over various baselines, both overall and for the minority demographic group in particular, while maintaining similar overall accuracy.
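The conditional loss described above can be illustrated with a small sketch. The abstract does not give the exact loss formulation, so the following is only one plausible reading: each branch computes a binary cross-entropy averaged over the examples that belong to its own group, and examples from other groups are masked out. The function names, the group labels `"A"` and `"B"`, and the choice of a plain unweighted sum over branches are all assumptions made for illustration, not the authors' implementation.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """BCE for one predicted probability p and one binary label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def conditional_task_loss(probs, labels, groups, task_group):
    """Mean BCE over only the examples whose group equals task_group.

    Examples from other groups are skipped entirely, so they contribute
    nothing to this branch's loss (assumed reading of "conditional").
    Returns 0.0 if the batch has no example for this group.
    """
    losses = [
        binary_cross_entropy(p, y)
        for p, y, g in zip(probs, labels, groups)
        if g == task_group
    ]
    return sum(losses) / len(losses) if losses else 0.0


def cond_mtl_loss(branch_probs, labels, groups):
    """Unweighted sum of the conditional losses of all branches.

    branch_probs maps a group name to that branch's predicted
    probabilities for every example in the batch.
    """
    return sum(
        conditional_task_loss(probs, labels, groups, group)
        for group, probs in branch_probs.items()
    )


# Hypothetical batch: two examples targeting group A, two targeting group B.
labels = [1, 0, 1, 0]
groups = ["A", "A", "B", "B"]
branch_probs = {
    "A": [0.9, 0.2, 0.5, 0.5],  # branch A's outputs on all four examples
    "B": [0.5, 0.5, 0.8, 0.1],  # branch B's outputs on all four examples
}
total = cond_mtl_loss(branch_probs, labels, groups)
```

In this sketch, whatever branch A predicts for the group-B examples has no effect on branch A's loss, which is the sense in which a branch is not contaminated by labels that are irrelevant to it. In a real model the branches would typically sit on top of a shared encoder, and the same masking would be applied to tensors inside a deep learning framework.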
