This paper investigates political bias in emotion inference models used for sentiment analysis (SA) in social science research. Machine learning models often reflect biases in their training data, which undermines the validity of their outputs. While previous research has highlighted gender and race biases, our study focuses on political bias, an underexplored yet pervasive issue that can skew the interpretation of text data across a wide array of studies. We conducted a bias audit of a Polish sentiment analysis model developed in our lab. By analyzing valence predictions for names and sentences involving Polish politicians, we uncovered systematic differences associated with political affiliation. Our findings indicate that annotations by human raters propagate political biases into the model's predictions. To mitigate this, we pruned the training dataset of texts mentioning these politicians and observed a reduction in bias, though not its complete elimination. Given the significant implications of political bias in SA, we urge caution in employing these models for social science research. We recommend critical examination of SA results and propose lexicon-based systems as a more ideologically neutral alternative. This paper underscores the need for ongoing scrutiny and methodological adjustment to ensure the reliability and impartiality of machine learning in academic and applied contexts.
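The two procedures named in the abstract, the audit of valence predictions by political affiliation and the pruning of training texts that mention the audited politicians, can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names `audit_valence_by_group` and `prune_mentions`, the sentence templates, the placeholder names, and the toy scoring function are all assumptions made for illustration, and any callable that maps a string to a valence score could stand in for the model.

```python
from statistics import mean
from typing import Callable, Dict, List


def audit_valence_by_group(
    predict: Callable[[str], float],
    names_by_group: Dict[str, List[str]],
    templates: List[str],
) -> Dict[str, float]:
    """Mean predicted valence per group over every name/template pair.

    `predict` is a hypothetical stand-in for the audited model.
    Each template must contain a `{name}` placeholder.
    """
    return {
        group: mean(
            predict(template.format(name=name))
            for name in names
            for template in templates
        )
        for group, names in names_by_group.items()
    }


def prune_mentions(texts: List[str], names: List[str]) -> List[str]:
    """Drop training texts that mention any audited name.

    Assumption: case-insensitive substring matching. In an inflected
    language such as Polish, declined forms of a name would need
    lemmatization or pattern matching to be caught.
    """
    lowered = [name.lower() for name in names]
    return [t for t in texts if not any(n in t.lower() for n in lowered)]


def toy_predict(text: str) -> float:
    """Toy model with a built-in bias, used only to exercise the audit."""
    if "Alpha" in text:
        return 0.5
    if "Beta" in text:
        return -0.5
    return 0.0


# Placeholder names and groups, not real politicians or parties.
scores = audit_valence_by_group(
    toy_predict,
    {"party_x": ["Alpha"], "party_y": ["Beta"]},
    ["{name}", "{name} spoke today."],
)
kept = prune_mentions(["Alpha spoke.", "Nice weather."], ["Alpha"])
```

A gap between the per-group means in `scores` is the kind of systematic difference the audit looks for; in practice it would be tested statistically rather than read off directly. Retraining on the pruned texts and repeating the audit would show whether the gap shrinks.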