Coupled tensor decompositions (CTDs) perform data fusion by linking factors from different datasets. Although many CTDs have already been proposed, existing work does not address two important challenges of data fusion: 1) the datasets are often heterogeneous, constituting different "views" of a given phenomenon (multimodality); and 2) each dataset can contain personalized or dataset-specific information, constituting distinct factors that are not coupled with other datasets. In this work, we introduce a personalized CTD framework that tackles these challenges. A flexible model is proposed in which each dataset is represented as the sum of two components: one related to a common tensor through a multilinear measurement model, and another specific to each dataset. Both the common and distinct components are assumed to admit a polyadic decomposition. This generalizes several existing CTD models. We provide easily interpretable conditions for specific and generic uniqueness of the decomposition. These conditions employ uni-mode uniqueness of the individual datasets and properties of the measurement model. Two algorithms are proposed to compute the common and distinct components: a semi-algebraic one and a coordinate-descent optimization method. Experimental results illustrate the advantage of the proposed framework compared with state-of-the-art approaches.
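As a rough illustration of the model described above, one possible way to write it for third-order tensors is sketched below. The notation is an assumption made here for illustration and is not taken from the paper: $\mathcal{Y}_k$ denotes the $k$-th dataset, $\mathcal{Z}$ the common tensor, $\boldsymbol{\Psi}_k$ the dataset-specific component, $\mathbf{P}_{k,n}$ hypothetical mode-$n$ measurement matrices (one instance of a multilinear measurement model), $\times_n$ the mode-$n$ product, $\circ$ the outer product, and $R$, $L_k$ the numbers of common and distinct rank-one terms.

```latex
\begin{align}
  % Each dataset = measured common tensor + dataset-specific part (assumed form)
  \mathcal{Y}_k &= \mathcal{Z} \times_1 \mathbf{P}_{k,1} \times_2 \mathbf{P}_{k,2} \times_3 \mathbf{P}_{k,3}
                   + \boldsymbol{\Psi}_k, \qquad k = 1, \dots, K, \\
  % Common component admits a polyadic decomposition with R terms
  \mathcal{Z} &= \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r, \\
  % Distinct component admits a polyadic decomposition with L_k terms
  \boldsymbol{\Psi}_k &= \sum_{\ell=1}^{L_k} \mathbf{a}_{k,\ell} \circ \mathbf{b}_{k,\ell} \circ \mathbf{c}_{k,\ell}.
\end{align}
```

Under this reading, setting every $\boldsymbol{\Psi}_k$ to zero would leave a coupled model with only shared factors, which is one way to see how a model of this kind can generalize existing CTDs.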