Data depth has emerged as a valuable nonparametric tool for ranking multivariate samples. A central contribution to depth-based two-sample comparison is the Q statistic (Liu and Singh, 1993), a quality index. Unlike traditional methods, data depth does not require the assumption of normality and satisfies four fundamental properties. Many existing two-sample homogeneity tests, which assess mean and/or scale changes between distributions, often suffer from low statistical power or indeterminate asymptotic distributions. To overcome these challenges, we introduce DEEPEAST (depth-explored same-attraction sample-to-sample central-outward ranking), a technique that improves statistical power in two-sample tests via the same-attraction function. We propose two novel and powerful depth-based test statistics, the sum test statistic and the product test statistic, which are rooted in Q statistics, share a "common attractor", and are applicable across all depth functions. We further prove the asymptotic distributions of these statistics for various depth functions. To assess the power gain, we apply three depth functions: Mahalanobis depth (Liu and Singh, 1993), spatial depth (Brown, 1958; Gower, 1974), and projection depth (Liu, 1992). In two-sample simulations that use a strategic block permutation algorithm, our sum and product statistics exhibit superior power and compare favourably with popular methods in the literature. We further validate the tests on Raman spectral data acquired from cellular and tissue samples, where they discriminate effectively between healthy and cancerous samples.
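As a rough illustration of the building blocks named above, the following sketch computes Mahalanobis depth and the empirical Q statistic of Liu and Singh (1993), then calibrates it with a plain permutation test. It is a sketch under stated assumptions, not the proposed method: the function names are hypothetical, the sum and product statistics are not reproduced because the abstract does not define them, and the ordinary label permutation used here stands in for the strategic block permutation algorithm, whose details are not given.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Mahalanobis depth of each row of `points` relative to the sample `reference`.

    Depth = 1 / (1 + squared Mahalanobis distance to the reference mean).
    """
    mu = reference.mean(axis=0)
    # atleast_2d keeps the covariance a matrix even for one-dimensional data.
    cov = np.atleast_2d(np.cov(reference, rowvar=False))
    diff = points - mu
    # solve() avoids forming the explicit inverse of the covariance matrix.
    d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return 1.0 / (1.0 + d2)


def q_statistic(x, y):
    """Empirical Q statistic: average over y_j of the fraction of x_i whose
    depth (relative to x) is no larger than the depth of y_j (relative to x).

    Values near 1/2 suggest similar distributions; small values suggest that
    y is more outlying than x.
    """
    depth_x = mahalanobis_depth(x, x)
    depth_y = mahalanobis_depth(y, x)
    return float(np.mean(depth_x[None, :] <= depth_y[:, None]))


def permutation_pvalue(x, y, n_perm=200, seed=0):
    """One-sided p-value for small Q, using ordinary label permutation
    (an assumption; NOT the block permutation scheme mentioned in the text)."""
    rng = np.random.default_rng(seed)
    observed = q_statistic(x, y)
    pooled = np.vstack([x, y])
    m = len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if q_statistic(pooled[idx[:m]], pooled[idx[m:]]) <= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (count + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(80, 2))
    y = rng.normal(loc=1.0, size=(80, 2))  # location-shifted sample
    q, p = permutation_pvalue(x, y)
    print(f"Q = {q:.3f}, permutation p-value = {p:.3f}")
```

The test rejects for small Q because a location shift or scale increase in the second sample pushes its points toward the outskirts of the first. Any other depth function, such as spatial or projection depth, could replace `mahalanobis_depth` without changing the rest of the sketch.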