We study the problem of bivariate discrete or continuous probability density estimation under low-rank constraints. For discrete distributions, we assume that the two-dimensional array to estimate is a low-rank probability matrix. In the continuous case, we assume that the density with respect to the Lebesgue measure satisfies a generalized multi-view model, meaning that it is $\beta$-H{\"o}lder and can be decomposed as a sum of $K$ components, each of which is a product of one-dimensional functions. In both settings, we propose estimators that achieve, up to logarithmic factors, the minimax optimal convergence rates under such low-rank constraints. In the discrete case, the proposed estimator is adaptive to the rank $K$. In the continuous case, our estimator converges with the $L_1$ rate $\min((K/n)^{\beta/(2\beta+1)}, n^{-\beta/(2\beta+2)})$ up to logarithmic factors, and it is adaptive to the unknown support as well as to the smoothness $\beta$ and to the unknown number of separable components $K$. We present efficient algorithms for computing our estimators.
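To make the continuous-case rate concrete, the following minimal sketch evaluates $\min((K/n)^{\beta/(2\beta+1)}, n^{-\beta/(2\beta+2)})$ for given $K$, $n$, and $\beta$. It ignores constants and logarithmic factors, and the function names `l1_rate` and `crossover_rank` are illustrative assumptions, not part of the proposed estimators. Equating the two terms shows that the low-rank term is the smaller one exactly when $K \le n^{1/(2\beta+2)}$.

```python
def l1_rate(K, n, beta):
    """Rate min((K/n)^(beta/(2beta+1)), n^(-beta/(2beta+2))), without constants or log factors."""
    low_rank_term = (K / n) ** (beta / (2 * beta + 1))   # one-dimensional-type rate, scaled by K
    full_term = n ** (-beta / (2 * beta + 2))            # two-dimensional beta-Holder rate
    return min(low_rank_term, full_term)


def crossover_rank(n, beta):
    """Value of K at which the two terms coincide: K = n^(1/(2beta+2))."""
    return n ** (1 / (2 * beta + 2))


if __name__ == "__main__":
    n, beta = 10**6, 1.0
    for K in (1, 10, 100):
        print(K, l1_rate(K, n, beta))
    print("crossover K:", crossover_rank(n, beta))
```

Under these assumptions, for $\beta = 1$ and $n = 10^6$ the crossover sits near $K \approx 32$: below it the low-rank structure improves the rate, and above it the rate falls back to the unconstrained two-dimensional one.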