Modern data analysis across diverse disciplines increasingly relies on time series. Many of these datasets exhibit cyclostationarity, where patterns approximately repeat in a regular manner, often across multiple time scales, such as daily, weekly or yearly cycles. In this context, statistical inference is essential to distinguish genuine underlying effects from random variability. While tools such as Analysis of Variance (ANOVA) provide this kind of inference, they often lack interpretability and struggle with the complexities of multivariate data. To address these limitations, we propose a unified pipeline for the exploratory analysis of cyclostationary time series using ANOVA Simultaneous Component Analysis (ASCA). ASCA is an extension of ANOVA that works in both the univariate and the multivariate case. By combining inference with the visualization capabilities of Principal Component Analysis (PCA), ASCA makes the estimated effects easier to interpret. ASCA's capabilities are well established in the analysis of experimental data, but they remain largely unexplored for observational data such as time series. Our workflow introduces an algorithmic approach to modeling time-dependent data with ASCA, enabling control over multiple cyclostationary time scales while also accounting for the specific challenges of this type of data, such as autocorrelation. Furthermore, we observe that, owing to its multivariate nature, ASCA separates variability across factors better than ANOVA in unbalanced designs. We demonstrate the efficacy of this methodology through two real-world case studies: water temperature trends in mountain lakes in the Sierra Nevada, Spain, and airborne pollen trends recorded over 30 years in the city of Granada, Spain.
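To make the idea of combining ANOVA with PCA concrete, the following is a minimal sketch of an ASCA-style decomposition on synthetic hourly data. It is not the pipeline proposed here: the function names (`level_means`, `asca_sketch`), the two time-scale factors (hour of day and day of week) and the simulated signals are all assumptions chosen for illustration. Effects are estimated one factor at a time by sequential deflation, which is a simplification, and the sketch does not address autocorrelation or permutation-based inference.

```python
import numpy as np

def level_means(X, labels):
    """Effect matrix: every row is replaced by the mean of its factor level."""
    effect = np.zeros_like(X, dtype=float)
    for level in np.unique(labels):
        mask = labels == level
        effect[mask] = X[mask].mean(axis=0)
    return effect

def asca_sketch(X, factors):
    """Split centered data into per-factor effect matrices, then run PCA on each.

    Hypothetical helper: factors are handled sequentially (each one is fitted
    to what the previous ones left unexplained), which is an assumption of
    this sketch rather than the method described in the text.
    """
    residual = X - X.mean(axis=0)          # remove the grand mean
    total_ss = (residual ** 2).sum()       # total sum of squares
    results = {}
    for name, labels in factors.items():
        effect = level_means(residual, labels)
        residual = residual - effect       # deflate before the next factor
        # PCA of the effect matrix via singular value decomposition
        U, s, Vt = np.linalg.svd(effect, full_matrices=False)
        results[name] = {
            "effect": effect,
            "scores": U * s,               # one score row per observation
            "loadings": Vt.T,              # one loading row per variable
            "explained": (effect ** 2).sum() / total_ss,
        }
    results["residual"] = residual
    results["residual_fraction"] = (residual ** 2).sum() / total_ss
    return results

# Synthetic example: 28 days of hourly readings for three variables.
rng = np.random.default_rng(0)
n = 24 * 28
t = np.arange(n)
hour = t % 24                              # daily time scale (24 levels)
weekday = (t // 24) % 7                    # weekly time scale (7 levels)

daily = np.sin(2 * np.pi * hour / 24)      # simulated daily cycle
weekly = (weekday >= 5).astype(float)      # simulated weekend shift

X = np.column_stack([
    2.0 * daily + 0.1 * rng.normal(size=n),
    1.0 * daily + 0.5 * weekly + 0.1 * rng.normal(size=n),
    0.5 * weekly + 0.1 * rng.normal(size=n),
])

results = asca_sketch(X, {"hour": hour, "weekday": weekday})
for name in ("hour", "weekday"):
    print(name, "fraction of variance:", round(results[name]["explained"], 3))
print("residual fraction:", round(results["residual_fraction"], 3))
```

Plotting the first columns of `results["hour"]["scores"]` against hour of day, alongside the matching loadings, would give the kind of per-factor view that PCA adds on top of the ANOVA decomposition. A significance test would additionally require a permutation scheme that respects the temporal dependence in the data, which this sketch leaves out.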