The widespread dissemination of toxic online posts is increasingly damaging to society. However, research on detecting toxic language in Chinese has lagged significantly. Existing datasets lack fine-grained annotation of toxic types and expressions and ignore samples with indirect toxicity. In addition, introducing lexical knowledge is crucial for detecting the toxicity of posts, yet doing so has remained a challenge for researchers. In this paper, we facilitate the fine-grained detection of Chinese toxic language. First, we build Monitor Toxic Frame, a hierarchical taxonomy for analyzing toxic types and expressions. Then, we present ToxiCN, a fine-grained dataset that includes both direct and indirect toxic samples. We also build an insult lexicon containing implicit profanity and propose Toxic Knowledge Enhancement (TKE) as a benchmark, which incorporates lexical features to detect toxic language. In the experimental stage, we demonstrate the effectiveness of TKE. Finally, we give a systematic quantitative and qualitative analysis of the findings.