Tucker decomposition is a powerful tensor model for multi-aspect data. It captures low-rank structure by decomposing grid-structured data into interactions between a core tensor and a set of object representations (factors). A fundamental assumption of this decomposition is that each aspect, or mode, contains a finite number of objects, corresponding to discrete indices of the data entries. However, much real-world data does not naturally fit this setting. For example, geographic data is indexed by continuous latitude and longitude coordinates and cannot be fit by tensor models directly. To generalize Tucker decomposition to such scenarios, we propose Functional Bayesian Tucker Decomposition (FunBaT). We treat continuous-indexed data as the interaction between the Tucker core and a group of latent functions. We use Gaussian processes (GPs) as functional priors to model the latent functions, and then, to reduce computational cost, convert the GPs into state-space priors by constructing equivalent stochastic differential equations (SDEs). We further develop an efficient inference algorithm for scalable posterior approximation based on advanced message-passing techniques. We demonstrate the advantage of our method on both synthetic data and several real-world applications.
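The contrast between discrete and continuous indexing can be illustrated with a minimal sketch. It is not the FunBaT implementation: the helper names (`tucker_entry`, `functional_tucker_entry`) and the fixed latent functions are hypothetical, chosen only for illustration, and no GP prior, SDE conversion, or inference step is included. In the classical model an entry is obtained by contracting the core with one factor row per mode; in the functional view each row is replaced by a vector-valued function evaluated at a continuous coordinate.

```python
import numpy as np

def tucker_entry(core, factors, index):
    """Classical Tucker: entry at a discrete index tuple (i_1, ..., i_K)."""
    out = core
    for U, i in zip(factors, index):
        # Contract the leading remaining mode of the core with row i of U.
        out = np.tensordot(U[i], out, axes=(0, 0))
    return float(out)

def functional_tucker_entry(core, latent_fns, coords):
    """Functional view: factor rows become functions of continuous coordinates."""
    out = core
    for f, x in zip(latent_fns, coords):
        # f(x) returns a vector whose length matches the mode's rank.
        out = np.tensordot(f(x), out, axes=(0, 0))
    return float(out)

# Hypothetical setup: a 3-mode core with ranks (2, 3, 2).
rng = np.random.default_rng(0)
core = rng.normal(size=(2, 3, 2))

# Illustrative latent functions (assumed here, not learned from data).
latent_fns = [
    lambda x: np.array([np.sin(x), np.cos(x)]),
    lambda x: np.array([1.0, x, x ** 2]),
    lambda x: np.array([np.exp(-x), x]),
]

# Evaluating the functions on a grid recovers ordinary factor matrices.
grid = np.linspace(0.0, 1.0, 5)
factors = [np.stack([f(g) for g in grid]) for f in latent_fns]

# On grid points the two views agree; off the grid only the functional one applies.
on_grid_discrete = tucker_entry(core, factors, (1, 2, 3))
on_grid_functional = functional_tucker_entry(
    core, latent_fns, (grid[1], grid[2], grid[3])
)
off_grid = functional_tucker_entry(core, latent_fns, (0.37, 0.81, 0.05))
```

In this sketch the functional form needs no predefined grid, which is the property that lets coordinates such as latitude and longitude be used directly as indices.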