In the digital age, readers value quantitative journalism that is clear, concise, analytical, and human-centred. To understand complex topics, they often piece together scattered facts from multiple articles. Visual storytelling can transform fragmented information into clear, engaging narratives, yet its use with unstructured online articles remains largely unexplored. To fill this gap, we present Compendia, an automated system that analyzes online articles in response to a user's query and generates a coherent data story tailored to the user's informational needs. Compendia addresses key challenges of storytelling from unstructured text through four modules: Online Article Retrieval, which gathers relevant articles; Data Fact Extraction, which identifies, validates, and refines quantitative facts; Fact Organization, which clusters and merges related facts into coherent thematic groups; and Visual Storytelling, which transforms the organized facts into narratives with visualizations in an interactive scrollytelling interface. We evaluated Compendia through a quantitative analysis, which confirmed its accuracy in fact extraction and organization, and through two user studies with 16 participants, which demonstrated its usability, effectiveness, and ability to produce engaging visual stories for open-ended queries.
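To make the Data Fact Extraction and Fact Organization stages concrete, the following is a toy sketch and not Compendia's implementation. It assumes that a "fact" is any sentence containing a number, and it replaces clustering with simple keyword matching against a caller-supplied theme list. The names `DataFact`, `extract_facts`, and `organize_facts` are hypothetical and do not come from the system itself.

```python
from __future__ import annotations

import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFact:
    """Hypothetical record for one quantitative fact (illustrative only)."""
    sentence: str
    value: float
    unit: str  # "%" if the number is a percentage, otherwise ""


# Assumption: a fact is signalled by a number, optionally followed by "%".
NUMBER = re.compile(r"(\d+(?:\.\d+)?)\s*(%?)")


def extract_facts(article: str) -> list[DataFact]:
    """Split an article into sentences and keep those that contain a number."""
    facts = []
    for sentence in re.split(r"(?<=[.!?])\s+", article.strip()):
        match = NUMBER.search(sentence)
        if match:
            facts.append(DataFact(sentence, float(match.group(1)), match.group(2)))
    return facts


def organize_facts(facts: list[DataFact], themes: list[str]) -> dict[str, list[DataFact]]:
    """Group facts by the first theme keyword found, dropping exact duplicates.

    Keyword matching is a stand-in for the clustering and merging step.
    """
    groups: dict[str, list[DataFact]] = defaultdict(list)
    for fact in facts:
        text = fact.sentence.lower()
        theme = next((t for t in themes if t in text), "other")
        if fact not in groups[theme]:
            groups[theme].append(fact)
    return dict(groups)


if __name__ == "__main__":
    articles = [
        "Solar capacity grew 24% in 2023. Analysts were surprised.",
        "Wind output rose 9.5% last year.",
    ]
    facts = [fact for article in articles for fact in extract_facts(article)]
    for theme, group in organize_facts(facts, ["solar", "wind"]).items():
        print(theme, [(f.value, f.unit) for f in group])
```

A real system would presumably also need the validation and refinement steps that the abstract mentions, which this sketch omits. The retrieval and visual storytelling stages are left out as well.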