To generalize histogram statistics to higher dimensions, density estimation via discrepancy based sequential partition (DSP) was proposed [D. Li, K. Yang, W. Wong, Advances in Neural Information Processing Systems (2016) 1099-1107]. DSP learns an adaptive piecewise constant approximation defined on a binary sequential partition of the underlying domain, using the star discrepancy to measure the uniformity of the particle distribution. However, computing the star discrepancy is NP-hard, and the star discrepancy is invariant under neither reflection nor rotation. To address this, we replace the star discrepancy with the mixture discrepancy and with a comparison of moments, leading to density estimation via mixture discrepancy based sequential partition (DSP-mix) and density estimation via moments based sequential partition (MSP), respectively. Both DSP-mix and MSP are computationally tractable and invariant under reflection and rotation. Numerical experiments on reconstructing $d$-dimensional mixtures of Gaussians and Betas for $d = 2, 3, \dots, 6$ demonstrate that DSP-mix and MSP both run approximately ten times faster than DSP while maintaining the same accuracy.
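To illustrate why the mixture discrepancy is tractable, the following minimal sketch evaluates its squared value from the closed-form expression commonly given in the uniform design literature, at a cost of $O(n^2 d)$ for $n$ points in $[0,1]^d$. The function name `mixture_discrepancy_sq` is hypothetical, and it is an assumption that this is the exact variant and normalization used in DSP-mix; the abstract does not state the formula.

```python
import numpy as np

def mixture_discrepancy_sq(x):
    """Squared mixture discrepancy of n points in [0, 1]^d (x has shape (n, d)).

    Assumption: uses the standard closed form with constants 19/12, 5/3, 15/8;
    the abstract itself does not spell out the formula.
    """
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    c = np.abs(x - 0.5)  # distance of each coordinate from the cube centre

    # Constant term: integral of the kernel over the unit cube.
    term1 = (19.0 / 12.0) ** d

    # Single sum over points, product over coordinates.
    term2 = np.prod(5.0 / 3.0 - 0.25 * c - 0.25 * c**2, axis=1).sum()

    # Double sum over point pairs via broadcasting to shape (n, n, d).
    diff = np.abs(x[:, None, :] - x[None, :, :])
    kernel = (15.0 / 8.0
              - 0.25 * c[:, None, :] - 0.25 * c[None, :, :]
              - 0.75 * diff + 0.5 * diff**2)
    term3 = np.prod(kernel, axis=2).sum()

    return term1 - 2.0 * term2 / n + term3 / n**2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((200, 3))
    print(mixture_discrepancy_sq(pts))        # original points
    print(mixture_discrepancy_sq(1.0 - pts))  # reflected through the centre
```

Because every term depends on the points only through $|x_{ij} - 1/2|$ and $|x_{ij} - x_{kj}|$, reflecting the points through the centre of the cube or permuting the coordinate axes leaves the value unchanged. In a sequential partition, the points in each sub-rectangle would presumably be rescaled to the unit cube before this function is applied; that rescaling step is an assumption here and is not shown.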