We study the problem of bivariate discrete or continuous probability density estimation under low-rank constraints. For discrete distributions, we assume that the two-dimensional array to be estimated is a low-rank probability matrix. In the continuous case, we assume that the density with respect to the Lebesgue measure satisfies a generalized multi-view model, meaning that it is $\beta$-H{\"o}lder and can be decomposed as a sum of $K$ components, each of which is a product of one-dimensional functions. In both settings, we propose estimators that achieve, up to logarithmic factors, the minimax optimal convergence rates under such low-rank constraints. In the discrete case, the proposed estimator is adaptive to the rank $K$. In the continuous case, our estimator converges at the $L_1$ rate $\min((K/n)^{\beta/(2\beta+1)}, n^{-\beta/(2\beta+2)})$ up to logarithmic factors, and it is adaptive to the unknown support, to the smoothness $\beta$, and to the unknown number of separable components $K$. We present efficient algorithms for computing our estimators.
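To make the continuous-case rate concrete, the following minimal sketch evaluates $\min((K/n)^{\beta/(2\beta+1)}, n^{-\beta/(2\beta+2)})$ with logarithmic factors and constants ignored. The function names `l1_rate` and `crossover_rank` are illustrative and not taken from the paper. The crossover value follows from equating the two terms: the low-rank term is the smaller one exactly when $K \le n^{1/(2\beta+2)}$. The first exponent, $\beta/(2\beta+1)$, has the form usually associated with one-dimensional $\beta$-H{\"o}lder estimation, and the second, $\beta/(2\beta+2)$, with the two-dimensional case.

```python
import math


def l1_rate(K, n, beta):
    """Rate min((K/n)^(beta/(2beta+1)), n^(-beta/(2beta+2))), log factors ignored."""
    # Term driven by the K separable components (one-dimensional exponent).
    low_rank = (K / n) ** (beta / (2 * beta + 1))
    # Term that does not depend on K (two-dimensional exponent).
    unconstrained = n ** (-beta / (2 * beta + 2))
    return min(low_rank, unconstrained)


def crossover_rank(n, beta):
    """Value of K at which the two terms coincide: n^(1/(2beta+2))."""
    return n ** (1 / (2 * beta + 2))


if __name__ == "__main__":
    n, beta = 10_000, 1.0
    for K in (1, 5, 10, 50):
        print(f"K={K:>3}  rate={l1_rate(K, n, beta):.4f}")
    print(f"crossover K = {crossover_rank(n, beta):.2f}")
```

Under these assumptions, the low-rank structure improves the rate only while $K$ stays below the crossover; beyond it, the bound reverts to the term that does not depend on $K$.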