In $\mathbb R^d$, it is well known that cumulants provide an alternative to moments that can achieve the same goals with numerous benefits, such as lower-variance estimators. In this paper we extend cumulants to reproducing kernel Hilbert spaces (RKHS) using tools from tensor algebras and show that they are computationally tractable via a kernel trick. These kernelized cumulants provide a new set of all-purpose statistics; the classical maximum mean discrepancy and Hilbert-Schmidt independence criterion arise as the degree-one objects in our general construction. We argue both theoretically and empirically (on synthetic, environmental, and traffic data analysis) that going beyond degree one has several advantages and can be achieved with the same computational complexity and, in our experiments, minimal overhead.
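For orientation, the classical maximum mean discrepancy mentioned above can be computed from kernel evaluations alone, which is the sense in which a kernel trick makes such statistics tractable. The following is a minimal sketch of the standard biased (V-statistic) estimator of the squared MMD, not the construction of this paper; the Gaussian kernel, the bandwidth value, and the function names `gaussian_kernel`, `mean_kernel`, and `mmd_squared` are illustrative assumptions.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats.

    The bandwidth default is an arbitrary illustrative choice.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, kernel):
    """Average of kernel(x, y) over all pairs (x, y) with x in xs and y in ys."""
    total = sum(kernel(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of the squared maximum mean discrepancy.

    Only kernel evaluations are needed; the RKHS feature map is never
    computed explicitly.
    """
    return (
        mean_kernel(xs, xs, kernel)
        + mean_kernel(ys, ys, kernel)
        - 2.0 * mean_kernel(xs, ys, kernel)
    )


# Hypothetical toy samples: two well-separated point sets on the real line.
sample_p = [(0.0,), (1.0,)]
sample_q = [(5.0,), (6.0,)]
print(mmd_squared(sample_p, sample_q))
```

The estimate is zero when both samples coincide and grows as the samples separate. The double loop over pairs costs a number of kernel evaluations quadratic in the sample sizes.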