Measurements of systems taken along a continuous functional dimension, such as time or space, are ubiquitous across fields, from the physical and biological sciences to economics and engineering. Such measurements can be viewed as realisations of an underlying smooth process sampled over the continuum. However, traditional methods for independence testing and causal learning are not directly applicable to such data, as they do not take into account the dependence along the functional dimension. By using specifically designed kernels, we introduce statistical tests for bivariate, joint, and conditional independence for functional variables. Our method not only extends the Hilbert-Schmidt independence criterion (HSIC) and its d-variate version (d-HSIC) to functional data, but also allows us to introduce a test for conditional independence by defining a novel statistic for the conditional permutation test (CPT) based on the Hilbert-Schmidt conditional independence criterion (HSCIC), with optimised regularisation strength estimated through an evaluation rejection rate. Our empirical evaluation of the size and power of these tests on synthetic functional data shows good performance, and we then exemplify their application to several constraint- and regression-based causal structure learning problems, including both synthetic examples and real socio-economic data.
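
To make the bivariate case concrete, the following is a minimal sketch of an HSIC permutation test applied to curves observed on a common, evenly spaced grid. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`gaussian_gram`, `hsic_statistic`, `hsic_permutation_test`) are hypothetical, and the kernel choice (a Gaussian kernel on a Riemann-sum approximation of the L2 distance, with a median-heuristic bandwidth) is assumed for illustration and may differ from the specifically designed kernels referred to above.

```python
import numpy as np


def gaussian_gram(curves, grid_step):
    """Gram matrix of a Gaussian kernel on approximate L2 distances.

    curves: array of shape (n_samples, n_gridpoints), each row one curve
    sampled on the same evenly spaced grid (an assumption of this sketch).
    """
    sq_norms = np.sum(curves**2, axis=1)
    # Squared L2 distance between curves, approximated by a Riemann sum.
    sq_dists = (sq_norms[:, None] + sq_norms[None, :] - 2.0 * curves @ curves.T) * grid_step
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off
    # Median heuristic for the bandwidth (assumed, not taken from the source).
    positive = sq_dists[sq_dists > 0]
    bandwidth_sq = np.median(positive) if positive.size else 1.0
    return np.exp(-sq_dists / (2.0 * bandwidth_sq))


def hsic_statistic(K, L):
    """Biased empirical HSIC estimate: trace(K H L H) / n^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centring matrix
    return np.trace(K @ H @ L @ H) / n**2


def hsic_permutation_test(x_curves, y_curves, grid_step, n_permutations=200, seed=0):
    """Return (observed HSIC, permutation p-value) for paired samples of curves."""
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x_curves, grid_step)
    L = gaussian_gram(y_curves, grid_step)
    observed = hsic_statistic(K, L)
    n = K.shape[0]
    exceed = 0
    for _ in range(n_permutations):
        # Permuting the sample order of Y breaks any X-Y dependence.
        perm = rng.permutation(n)
        if hsic_statistic(K, L[np.ix_(perm, perm)]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value suggests rejecting the null hypothesis that the two functional variables are independent. The permutation approach is used here because it requires no distributional approximation of the null, at the cost of recomputing the statistic for each permutation.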