We present a method based on natural language processing (NLP) for studying the influence of interest groups (lobbies) on the law-making process in the European Parliament (EP). We collect and analyze novel datasets of lobbies' position papers and of speeches made by members of the EP (MEPs). By comparing these texts on the basis of semantic similarity and entailment, we discover interpretable links between MEPs and lobbies. In the absence of a ground-truth dataset of such links, we validate them indirectly by comparing the discovered links with a dataset of retweet links between MEPs and lobbies, which we curate, and with the publicly disclosed meetings of MEPs. Our best method achieves an AUC score of 0.77 and performs significantly better than several baselines. Moreover, an aggregate analysis of the discovered links between groups of related lobbies and political groups of MEPs matches what the ideologies of the groups would lead one to expect (e.g., center-left groups are associated with social causes). We believe that this work, which encompasses the methodology, datasets, and results, is a step towards enhancing the transparency of the intricate decision-making processes within democratic institutions.
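To make the validation idea concrete, the following is a minimal sketch of scoring MEP-lobby text pairs and measuring how well those scores rank known links, summarized as an AUC. It rests on stated assumptions and is not the method used in this work: bag-of-words cosine similarity is a crude stand-in for the semantic similarity and entailment models, the function names `cosine_similarity` and `auc` are hypothetical, and the example texts and labels are invented for illustration only.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of bag-of-words count vectors.

    Assumption: a simple lexical stand-in for semantic similarity.
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def auc(scores, labels):
    """Probability that a random positive pair outscores a random negative pair.

    Ties count as one half. Labels are 1 (link observed) or 0 (no link).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: (MEP speech, lobby position paper, link observed?).
pairs = [
    ("we must cut emissions from cars", "cars must cut emissions by 2030", 1),
    ("farm subsidies protect rural jobs", "rural jobs depend on farm subsidies", 1),
    ("we must cut emissions from cars", "rural jobs depend on farm subsidies", 0),
    ("farm subsidies protect rural jobs", "cars must cut emissions by 2030", 0),
]
scores = [cosine_similarity(speech, paper) for speech, paper, _ in pairs]
labels = [label for _, _, label in pairs]
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to ranking pairs at random, and 1.0 to ranking every linked pair above every unlinked one, which gives a reference scale for the reported 0.77.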