Our society is plagued by several biases, including racial, caste, and gender bias. In fact, several years ago, most of these notions were unheard of. Passed down through generations and amplified along the way, these biases have led to situations in which certain groups in society treat them as expected norms. One notable example is gender bias. Whether in politics, lifestyle, or the corporate world, general differences are observed in the involvement of the two groups. This uneven distribution, being part of society at large, shows up in recorded data as well. Machine learning depends almost entirely on the availability of data, and the idea of learning from data and making predictions assumes that the data defines the expected behavior at large. Hence, models trained on biased data inherit those biases, and given the current popularity of ML in products, this can become a major obstacle on the path to equality and justice. This work studies and attempts to alleviate gender bias in language-vision models, particularly in the task of image captioning. We study the extent of the impact of gender bias in existing datasets and propose a methodology to mitigate its impact in caption-based language-vision models.