This work introduces UstanceBR, a multimodal corpus in the Brazilian Portuguese Twitter domain for target-based stance prediction. The corpus comprises 86.8k labelled stances towards selected target topics, together with extensive network information about the users who published these stances on social media. In this article we describe the corpus's multimodal data and present a number of usage examples in both in-domain and zero-shot stance prediction based on text- and network-related information, which are intended to provide initial baseline results for future studies in the field.