We study a distributional generalization of the matrix completion problem in which each entry of the target matrix is a probability distribution rather than a scalar. In this setting, only a subset of matrix entries is observed, and even for observed entries, the underlying distributions are not directly accessible; instead, we observe finitely many samples drawn from them. To represent distributional entries, we employ kernel mean embeddings and introduce a notion of Tucker rank for distribution-valued matrices to capture their low-rank structure. The infinite-dimensional nature of kernel embeddings poses significant methodological challenges. To address these challenges, we introduce functional unfolding operators that link the proposed distributional low-rank structure to the classical Tucker rank for finite-dimensional tensors. Based on this framework, we propose a novel estimator for distributional matrix completion. We establish non-asymptotic error bounds that characterize the statistical performance of the estimator. Extensive experiments on synthetic data and a real-world application demonstrate the effectiveness of the proposed method.
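To make the sample-based representation concrete, the following is a minimal sketch of how a distribution observed only through finitely many samples can be handled via its empirical kernel mean embedding. It is not the estimator proposed in this work: the Gaussian kernel, the bandwidth, the one-dimensional samples, and the helper names `gaussian_gram` and `mmd_squared` are all assumptions chosen for illustration. The sketch computes the squared RKHS distance between two empirical embeddings (the squared maximum mean discrepancy), which only requires kernel evaluations and so sidesteps the infinite-dimensional embedding itself.

```python
import numpy as np


def gaussian_gram(x, y, bandwidth=1.0):
    """Gram matrix k(x_i, y_j) of a Gaussian kernel on 1-D samples (assumed kernel)."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Squared RKHS distance between the empirical mean embeddings of x and y.

    Uses the biased (V-statistic) form:
    mean k(x, x') - 2 * mean k(x, y) + mean k(y, y').
    """
    return (
        gaussian_gram(x, x, bandwidth).mean()
        - 2.0 * gaussian_gram(x, y, bandwidth).mean()
        + gaussian_gram(y, y, bandwidth).mean()
    )


# Hypothetical data: three matrix entries, each observed through 200 samples.
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=200)  # entry drawn from N(0, 1)
b = rng.normal(0.0, 1.0, size=200)  # another entry with the same distribution
c = rng.normal(3.0, 1.0, size=200)  # entry with a shifted distribution

same = mmd_squared(a, b)       # small: embeddings of matching distributions are close
different = mmd_squared(a, c)  # larger: embeddings of distinct distributions are far apart
```

In this toy setup, entries with the same underlying distribution have nearly identical embeddings, while the shifted entry is clearly separated, which is the kind of geometry a low-rank structure on embeddings would exploit.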