We consider tensor factorizations based on sparse measurements of the components of relatively high-rank tensors. The measurements are designed so that the underlying graph of interactions is a random graph. This setup is useful when a substantial amount of data is missing, as in the completion of relatively high-rank matrices for recommendation systems, which are heavily used in social network services. To gain theoretical insight into this setup, we consider statistical inference of the tensor factorization in a high-dimensional limit, which we call the dense limit, where the graphs are large and dense but not fully connected. We build message-passing algorithms and test them in a Bayes-optimal teacher-student setting in some specific cases. We also develop a replica theory, based on a cumulant expansion, to examine the performance of statistical inference in the dense limit. The latter approach allows one to avoid blind use of a Gaussian ansatz, which fails in some fully connected systems.