The study of representations is widespread across fields, including neuroscience, psychology, and artificial intelligence. Although representations are often studied and compared through similarities between stimuli, current methods offer only limited access to the dimensions that shape these representations, and their results are often difficult to interpret. To address these limitations, we introduce Similarity-Based Representation Factorization (SRF), a general computational method for recovering low-dimensional, non-negative, interpretable embeddings from similarity matrices derived from measured data. Across simulations and many neural, behavioral, and computational datasets, SRF recovers interpretable dimensions from diverse forms of representational data, even when the data are very sparsely sampled and incomplete. The dimensions derived from these datasets match those obtained by task-specific models, predict independent behavioral properties, improve exploratory analysis, and offer higher statistical power for confirmatory hypothesis testing than direct comparisons of similarity matrices. Together, these results establish SRF as a general-purpose method with broad applications for uncovering, understanding, and using the dimensions underlying representations.
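To make the idea of recovering a low-dimensional, non-negative embedding from a similarity matrix concrete, the following is a minimal sketch of generic symmetric non-negative matrix factorization, which approximates a similarity matrix S by W Wᵀ with W ≥ 0. This is an illustration under stated assumptions, not the SRF algorithm itself: the abstract does not specify SRF's objective or optimizer, so the damped multiplicative update used here is a standard stand-in, the function names `symmetric_nmf` and `reconstruction_error` are hypothetical, and the sketch assumes a fully observed, non-negative, symmetric S (it does not handle the sparse, incomplete data mentioned above).

```python
import numpy as np


def symmetric_nmf(S, k, n_iter=500, beta=0.5, eps=1e-12, seed=0):
    """Approximate a non-negative symmetric matrix S by W @ W.T with W >= 0.

    Generic symmetric NMF with damped multiplicative updates; an
    illustrative stand-in, NOT the SRF procedure from the paper.
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    # Non-negative random start, roughly scaled to the magnitude of S.
    W = rng.random((n, k)) * np.sqrt(S.mean() / k)
    for _ in range(n_iter):
        numer = S @ W
        denom = W @ (W.T @ W) + eps  # eps guards against division by zero
        # Every factor is >= 1 - beta > 0, so W stays non-negative.
        W *= (1.0 - beta) + beta * numer / denom
    return W


def reconstruction_error(S, W):
    """Relative Frobenius error of the approximation S ~ W @ W.T."""
    return np.linalg.norm(S - W @ W.T) / np.linalg.norm(S)


# Synthetic check: build S from a known non-negative embedding.
rng = np.random.default_rng(1)
W_true = rng.random((30, 3))   # 30 stimuli, 3 latent dimensions
S = W_true @ W_true.T          # similarity matrix (inner products)

W_init = symmetric_nmf(S, k=3, n_iter=0)  # the random starting point
W_hat = symmetric_nmf(S, k=3)             # the fitted embedding

err_init = reconstruction_error(S, W_init)
err_final = reconstruction_error(S, W_hat)
print(f"relative error: start {err_init:.3f}, fitted {err_final:.3f}")
```

Each column of `W_hat` plays the role of one candidate dimension, and each row gives a stimulus's non-negative loadings on those dimensions. The recovered columns are only identifiable up to permutation and the usual ambiguities of non-negative factorization, so comparing them to `W_true` would require matching columns first.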