Independent component (IC) models are a standard tool for representing multivariate data in statistics, signal processing, and machine learning. Although IC models are used extensively, comparatively little attention has been given to goodness-of-fit tests for assessing their compatibility with data. We develop the first goodness-of-fit test for IC models that is supported by a theoretical validity guarantee when the data dimension and sample size diverge proportionally. This is made possible by the fact that the test does not rely on a pre-whitening step, which often limits the applicability of other goodness-of-fit tests in high dimensions. Our theoretical analysis is complemented by numerical experiments that demonstrate the test's size control and power under a range of conditions. In addition, we provide examples involving gene-expression data to illustrate that the test has potential for effective diagnostic use in practice.