This work covers the development and explainability of machine learning models that predict political leaning from parliamentary transcriptions. We concentrate on the Slovenian parliament and its heated debate on the European migrant crisis, using transcriptions from 2014 to 2020. We develop both classical machine learning models and transformer language models to predict whether a parliamentarian is left- or right-leaning based on their speeches on the topic of migrants. With both types of models showing strong predictive performance, we go on to explain their decisions. Using explainability techniques, we identify the keywords and phrases that most strongly influence predictions of political leaning on this topic: left-leaning parliamentarians use concepts such as people and unity and speak about refugees, whereas right-leaning parliamentarians use concepts such as nationality and focus more on illegal migrants. This research illustrates that understanding the reasoning behind predictions is not only beneficial for AI engineers seeking to improve their models but can also serve as a tool in the qualitative analysis steps of interdisciplinary research.