Narratives about economic events and policies are widely recognised as influential drivers of economic and business behaviour. Yet the statistical identification of narrative emergence remains underdeveloped. Narratives evolve gradually, exhibit subtle shifts in content, and may exert influence disproportionate to their observable frequency, making it difficult to determine when observed changes reflect genuine structural shifts rather than routine variation in language use. We propose a statistical framework for detecting narrative emergence in longitudinal text corpora using Latent Dirichlet Allocation (LDA). We define emergence as a sustained increase in a topic's relative prominence over time and set out how such trajectories should be interpreted, recognising that topic proportions are latent, model-estimated quantities rather than directly observed frequencies. We illustrate the approach using a corpus of academic publications in economics spanning 1970–2018, where Nobel Prize-recognised contributions serve as externally observable signals of influential narratives. Topics associated with these contributions display sustained increases in estimated prevalence that coincide with periods of heightened citation activity and broader disciplinary recognition. These findings indicate that model-based topic trajectories can reflect identifiable shifts in economic discourse and provide a statistically grounded basis for analysing thematic change in longitudinal textual data.
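To make the definition of emergence as a sustained increase in a topic's relative prominence more concrete, the following is a minimal sketch of one possible operationalisation. It is not the framework proposed in the paper. It assumes that yearly topic shares have already been estimated by an LDA model and treats them as fixed numbers, ignoring the estimation uncertainty that the abstract stresses. The function names, the rolling-window rule, and the parameters `window`, `min_run` and `min_slope` are all hypothetical choices made for illustration.

```python
from typing import Optional, Sequence


def window_slope(values: Sequence[float]) -> float:
    """Ordinary least-squares slope of `values` against 0, 1, ..., n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def first_sustained_increase(
    shares: Sequence[float],
    window: int = 5,
    min_run: int = 3,
    min_slope: float = 0.0,
) -> Optional[int]:
    """Illustrative rule (an assumption, not the paper's test).

    `shares` holds one estimated topic share per period, in time order.
    Returns the start index of the first rolling window that opens a run
    of `min_run` consecutive windows whose fitted slope exceeds
    `min_slope`, or None if no such run exists.
    """
    run = 0
    for start in range(len(shares) - window + 1):
        if window_slope(shares[start:start + window]) > min_slope:
            run += 1
            if run >= min_run:
                return start - min_run + 1
        else:
            run = 0  # the increase was not sustained; start counting again
    return None


# Hypothetical yearly shares for one topic: flat, then steadily rising.
example_shares = [0.125] * 6 + [0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
print(first_sustained_increase(example_shares))
```

Requiring several consecutive windows with a positive slope is one simple way to separate a sustained rise from a one-off spike. A rule that respects the latent nature of topic proportions would also need to propagate the model's estimation uncertainty, which this sketch does not attempt.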