Understanding how humans communicate and perceive narratives is important for media technology research and development, particularly now that tools and algorithms for creating high-quality content are readily available to amateur users. Over time, narrative media develop recognizable patterns of features that recur across similar artifacts. Genre is one such grouping of narrative artifacts that share patterns, tropes, and story structures. While much work has been done on genre-based classification of text and video, we present a novel approach to multi-modal genre analysis of comics and manga-style visual narratives. We present a systematic feature analysis of an annotated dataset that includes a variety of Western and Eastern visual books with annotations for high-level narrative patterns. We then present a detailed analysis of the contributions of high-level features to genre classification for this medium, and we highlight some of the limitations and challenges of our existing computational approaches in modeling subjective labels. Our contributions to the community are: a dataset of annotated manga books, a multi-modal analysis of visual panels and text in a constrained and popular medium through high-level features, and a systematic process for incorporating subjective narrative patterns in computational models.