Municipal meeting minutes record key decisions in local democratic processes. Unlike parliamentary proceedings, which typically adhere to standardized formats, municipal minutes encode voting outcomes in highly heterogeneous, free-form narrative text that varies widely across municipalities, posing significant challenges for automated extraction. In this paper, we introduce VotIE (Voting Information Extraction), a new information extraction task aimed at identifying structured voting events in narrative deliberative records, and establish the first benchmark for this task using Portuguese municipal minutes, building on the recently introduced CitiLink corpus. Our experiments yield two key findings. First, under standard in-domain evaluation, fine-tuned encoders, specifically XLM-R-CRF, achieve the strongest performance, reaching 93.2% macro F1 and outperforming generative approaches. Second, in a cross-municipality setting that evaluates transfer to unseen administrative contexts, these fine-tuned encoders suffer substantial performance degradation, whereas few-shot LLMs prove more robust, with significantly smaller declines in performance. Despite this generalization advantage, the high computational cost of generative models currently constrains their practicality. As a result, lightweight fine-tuned encoders remain the more practical option for large-scale, real-world deployment. To support reproducible research in administrative NLP, we publicly release our benchmark, trained models, and evaluation framework.