Comprehending characters' personalities is a crucial aspect of story reading. As readers engage with a story, their understanding of a character evolves with new events and information, and they can perceive multiple fine-grained aspects of the character's personality. This leads to a natural problem of situated and fine-grained personality understanding. The problem has not been studied in the NLP field, primarily due to the lack of appropriate datasets that mimic the process of book reading. We present PersoNet, the first labeled dataset for this problem. Our novel annotation strategy annotates user notes from online reading apps as a proxy for the original books. Experiments and human studies indicate that our dataset construction is both efficient and accurate, and that the task relies heavily on long-term context for accurate predictions by both machines and humans. The dataset is available at https://github.com/Gorov/personet_acl23.