This work introduces UstanceBR, a multimodal corpus in the Brazilian Portuguese Twitter domain for target-based stance prediction. The corpus comprises 86.8k labelled stances towards selected target topics, together with extensive network information about the users who published these stances on social media. In this article, we describe the corpus's multimodal data and present a number of usage examples in both in-domain and zero-shot stance prediction based on text- and network-related information, which are intended to provide initial baseline results for future studies in the field.