Hate speech detection has received considerable research attention due to the scale of content created on social media. Despite this attention and the sensitive nature of the task, privacy preservation in hate speech detection remains under-studied. Most research has focused on centralised machine learning infrastructures, which risk leaking data. In this paper, we show that federated machine learning can help address the privacy concerns inherent to hate speech detection while obtaining up to a 6.81% improvement in F1-score.