This paper introduces a novel changepoint detection framework that combines ensemble statistical methods with Large Language Models (LLMs) to improve both the accuracy of detection and the interpretability of regime changes in time series data. It addresses two limitations in the field. First, individual detection methods have complementary strengths and weaknesses that depend on data characteristics, so method selection is non-trivial and prone to suboptimal results. Second, automated, contextual explanations for detected changes are largely absent. The proposed ensemble method aggregates results from ten distinct changepoint detection algorithms, achieving superior performance and robustness compared to individual methods. In addition, an LLM-powered explanation pipeline automatically generates contextual narratives that link detected changepoints to potential real-world historical events. For private or domain-specific data, a Retrieval-Augmented Generation (RAG) solution grounds the explanations in user-provided documents. The open-source Python framework demonstrates practical utility in diverse domains, including finance, political science, and environmental science, turning raw statistical output into actionable insights for analysts and decision-makers.
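To make the ensemble idea concrete, the following is a minimal sketch of one way several detectors' outputs could be aggregated. It is an illustration under stated assumptions, not the framework's actual aggregation rule, which the abstract does not specify. The function name `aggregate_changepoints`, the parameters `tolerance` and `min_votes`, and the detector names are all hypothetical.

```python
from statistics import median


def aggregate_changepoints(detections, tolerance=3, min_votes=2):
    """Merge changepoint indices reported by several detectors.

    detections: dict mapping a detector name to a list of integer indices.
    An index joins the current group when it is at most `tolerance` away
    from the previous index in that group; a group is kept when at least
    `min_votes` distinct detectors contribute to it.
    Returns a list of (consensus_index, vote_count) tuples.

    Assumption: simple proximity voting, chosen for illustration only.
    """
    # Flatten to (index, detector) pairs, sorted by index.
    points = sorted(
        (idx, name) for name, idxs in detections.items() for idx in idxs
    )

    clusters = []
    for idx, name in points:
        # Extend the last cluster if this point is close to its last member,
        # otherwise start a new cluster.
        if clusters and idx - clusters[-1][-1][0] <= tolerance:
            clusters[-1].append((idx, name))
        else:
            clusters.append([(idx, name)])

    consensus = []
    for cluster in clusters:
        voters = {name for _, name in cluster}
        if len(voters) >= min_votes:
            # Use the median reported index as the consensus location.
            location = int(median(idx for idx, _ in cluster))
            consensus.append((location, len(voters)))
    return consensus


# Hypothetical outputs from three detectors run on the same series.
example = {
    "detector_a": [50, 120],
    "detector_b": [52, 200],
    "detector_c": [49, 121],
}
result = aggregate_changepoints(example)
print(result)  # [(50, 3), (120, 2)]
```

In this sketch the isolated detection at index 200 is discarded because only one detector reports it, which is the kind of robustness to single-method errors that an ensemble aims for. Raising `min_votes` trades recall for precision.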