Discovering causal direction from temporal observational data is particularly challenging for symbolic sequences, where functional models and noise assumptions are often unavailable. We propose a \emph{Dictionary-Based Pattern Entropy ($DPE$)} framework that infers both the direction of causation and the specific subpatterns driving changes in the effect variable. The framework integrates \emph{Algorithmic Information Theory} (AIT) and \emph{Shannon Information Theory}. Causation is interpreted as the emergence of compact, rule-based patterns in the candidate cause that systematically constrain the effect. $DPE$ constructs direction-specific dictionaries and quantifies their influence using entropy-based measures, linking deterministic pattern structure to stochastic variability in a principled way. Causal direction is inferred via a minimum-uncertainty criterion, which selects the direction exhibiting stronger and more consistent pattern-driven organization. As summarized in Table 7, $DPE$ performs reliably across diverse synthetic systems, including delayed bit-flip perturbations, AR(1) coupling, 1D skew-tent maps, and sparse processes, where it matches or outperforms competing AIT-based methods ($ETC_E$, $ETC_P$, $LZ_P$). On biological and ecological datasets its performance is competitive, although alternative methods show advantages in specific genomic settings. Overall, the results demonstrate that minimizing pattern-level uncertainty yields a robust, interpretable, and broadly applicable framework for causal discovery.