This article addresses the estimation of distribution elements over a high-dimensional binary hypercube from multivariate binary data. A popular approach to this problem, optimizing Walsh basis coefficients, is made more interpretable by recasting it as a "Fourier-Walsh" diagonalization. Allowing monotonic transformations of the resulting matrix elements yields a versatile binary density estimator, which is the main contribution of this article. The Aitchison and Aitken kernel is shown to emerge from a constrained exponential form of this estimator, and relaxing these constraints yields a flexible variable-weighted version of the kernel that retains positive-definiteness. Estimators within this unifying framework combine well with one another and span the extremes of the speed-flexibility trade-off, so they can serve a wide range of statistical inference and learning problems.
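For readers unfamiliar with the Aitchison and Aitken kernel, the sketch below shows its standard form for binary data: each coordinate contributes a factor `lam` when the query and the sample agree and `1 - lam` when they differ, with `lam` in [0.5, 1]. The function name `aitchison_aitken_density`, the toy data, and the use of one weight per coordinate to illustrate a "variable-weighted" kernel are assumptions made for illustration; they are not taken from the article and may differ from its parameterization.

```python
import itertools
import numpy as np


def aitchison_aitken_density(samples, queries, lam):
    """Kernel density estimate on {0,1}^d with an Aitchison-Aitken style kernel.

    samples: (n, d) array of 0/1 observations.
    queries: (m, d) array of 0/1 points at which to evaluate the estimate.
    lam:     scalar, or length-d sequence (hypothetical per-coordinate weights),
             with each value in [0.5, 1].
    """
    samples = np.asarray(samples)
    queries = np.asarray(queries)
    lam = np.broadcast_to(np.asarray(lam, dtype=float), (samples.shape[1],))

    # mismatch[i, k, j] is True where query i and sample k differ in coordinate j.
    mismatch = queries[:, None, :] != samples[None, :, :]

    # Factor per coordinate: lam on agreement, 1 - lam on disagreement.
    per_coord = np.where(mismatch, 1.0 - lam, lam)

    # Product over coordinates gives the kernel; average over samples gives the estimate.
    kernel = per_coord.prod(axis=2)
    return kernel.mean(axis=1)


# Toy data (made up for illustration): four observations on the 3-cube.
samples = np.array([[0, 0, 1], [0, 1, 1], [1, 1, 1], [0, 0, 1]])
grid = np.array(list(itertools.product([0, 1], repeat=3)))  # all 8 vertices

shared = aitchison_aitken_density(samples, grid, 0.8)
weighted = aitchison_aitken_density(samples, grid, [0.9, 0.6, 1.0])
```

Because the factors for each coordinate sum to one over its two possible values, both estimates sum to one over the hypercube. Setting `lam` to 1 recovers the empirical frequencies and setting it to 0.5 gives the uniform distribution, which is one way to see the smoothing role of the parameter.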