To foster collaboration and inclusivity in Open Source Software (OSS) projects, it is crucial to understand and detect patterns of toxic language that may drive contributors away, especially those from underrepresented communities. Although machine learning-based toxicity detection tools trained on domain-specific data have shown promise, their design lacks an understanding of the unique nature and triggers of toxicity in OSS discussions, highlighting the need for further investigation. In this study, we employ Moral Foundations Theory (MFT) to examine the relationship between moral principles and toxicity in OSS. Specifically, we analyze toxic communications in GitHub issue threads to identify and understand five types of moral principles exhibited in text, and explore their potential association with toxic behavior. Our preliminary findings suggest a possible link between moral principles and toxic comments in OSS communications, with each moral principle associated with at least one type of toxicity. The potential of MFT in toxicity detection warrants further investigation.