Large Language Models (LLMs) have become integral to Software Engineering (SE) and are increasingly used in development workflows. However, their widespread adoption raises concerns about the presence and propagation of toxic language, that is, harmful or offensive content that can foster exclusionary environments. This paper provides a comprehensive review of recent research (2020-2024) on toxicity detection and mitigation, focusing on both SE-specific and general-purpose datasets. We examine annotation and pre-processing techniques, assess detection methodologies, and evaluate mitigation strategies, particularly those leveraging LLMs. Additionally, we conduct an ablation study demonstrating the effectiveness of LLM-based rewriting for reducing toxicity. This review is limited to studies published within the specified timeframe and to the domain of toxicity in LLMs and SE; emerging methods or datasets beyond this scope may therefore fall outside its purview. By synthesizing existing work and identifying open challenges, this review highlights key areas for future research to ensure the responsible deployment of LLMs in SE and beyond.