We consider tensor factorizations based on sparse measurements of the components of relatively high-rank tensors. The measurements are designed so that the underlying graph of interactions is a random graph. This setup is useful when a substantial amount of data is missing, as in the completion of relatively high-rank matrices for recommendation systems, which are heavily used in social network services. To obtain theoretical insight into this setup, we consider statistical inference of the tensor factorization in a high-dimensional limit, which we call the dense limit, where the graphs are large and dense but not fully connected. We build message-passing algorithms and test them in a Bayes-optimal teacher-student setting in some specific cases. We also develop a replica theory, based on a cumulant expansion, to examine the performance of statistical inference in the dense limit. The latter approach allows one to avoid blind use of the Gaussian ansatz, which fails in some fully connected systems.