Characters in novels have typically been modeled by their presence in narrative scenes, considering aspects such as their actions, named mentions, and dialogue. This conception of character places heavy emphasis on the main character, who is present in the most scenes. In this work, we instead adopt a framing developed from a new literary theory that proposes a six-component structural model of character. This model enables a comprehensive approach to character that accounts for the narrator-character distinction and includes a component neglected by prior methods: discussion by other characters. We compare general-purpose LLMs with task-specific transformers for operationalizing this model of character on major 19th-century British realist novels. Our methods yield both component-level and graph representations of character discussion. We then demonstrate that these representations allow us to approach literary questions at scale through a new computational lens. Specifically, we explore Woloch's classic "the one vs the many" theory of character centrality and the gendered dynamics of character discussion.