This work introduces UstanceBR, a multimodal corpus in the Brazilian Portuguese Twitter domain for target-based stance prediction. The corpus comprises 86.8k labelled stances towards selected target topics, together with extensive network information about the users who published these stances on social media. In this article we describe the corpus's multimodal data and present a number of usage examples in both in-domain and zero-shot stance prediction based on text- and network-related information, which are intended to provide initial baseline results for future studies in the field.