Representations are studied across many fields, including neuroscience, psychology, and artificial intelligence. Although representations are often studied and compared through the similarities between stimuli, current methods give only limited access to the dimensions that shape these representations, and their results are often hard to interpret. To overcome these challenges, we introduce Similarity-Based Representation Factorization (SRF), a general computational method for recovering low-dimensional, non-negative, interpretable embeddings from similarity matrices derived from measured data. Across simulations and many neural, behavioral, and computational datasets, SRF recovers interpretable dimensions from diverse forms of representational data, even when the data are very sparsely sampled and incomplete. The dimensions derived from these datasets match those obtained by task-specific models, predict independent behavioral properties, improve exploratory analysis, and offer higher statistical power for confirmatory hypothesis testing than direct comparison of similarity matrices. Together, these results establish SRF as a general-purpose method with broad applications for uncovering, understanding, and using the dimensions underlying representations.
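To make the idea of a non-negative embedding recovered from a similarity matrix concrete, the following is a minimal sketch of a generic symmetric non-negative matrix factorization, which approximates a similarity matrix S by W Wᵀ with W ≥ 0. It is an illustration only and not the SRF algorithm: the abstract does not state SRF's objective or optimizer. The function names (`symmetric_nmf`, `relative_error`), the damped multiplicative update, the optional mask for unmeasured similarities, and all parameter values are assumptions made for this example.

```python
import numpy as np


def symmetric_nmf(S, k, mask=None, n_iter=500, beta=0.5, eps=1e-12, seed=0):
    """Approximate S by W @ W.T with W >= 0 (generic sketch, not SRF itself)."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    # mask[i, j] = 1 where a similarity was measured, 0 where it is missing.
    M = np.ones_like(S) if mask is None else np.asarray(mask, dtype=float)
    S_obs = np.where(M > 0, S, 0.0)
    rng = np.random.default_rng(seed)
    # Scale the random start so W @ W.T is on the order of the observed values.
    scale = np.sqrt(max(S_obs.sum() / max(M.sum(), 1.0), eps) / k)
    W = rng.random((n, k)) * scale
    for _ in range(n_iter):
        numer = S_obs @ W
        denom = (M * (W @ W.T)) @ W + eps
        # Damped multiplicative update; W stays non-negative if S is non-negative.
        W *= (1.0 - beta) + beta * numer / denom
    return W


def relative_error(S, W, mask=None):
    """Frobenius reconstruction error on observed entries, relative to S."""
    M = np.ones_like(S) if mask is None else np.asarray(mask, dtype=float)
    return np.linalg.norm(M * (S - W @ W.T)) / np.linalg.norm(M * S)


# Synthetic example: 30 stimuli, 3 latent non-negative dimensions.
rng = np.random.default_rng(1)
W_true = rng.random((30, 3))
S = W_true @ W_true.T

# Hide roughly 30% of the pairwise similarities (symmetric mask).
upper = np.triu(rng.random((30, 30)) < 0.7)
mask = (upper | upper.T).astype(float)

W_hat = symmetric_nmf(S, k=3, mask=mask)
print("relative error on observed entries:", relative_error(S, W_hat, mask))
```

Each column of `W_hat` plays the role of one candidate dimension, and each row gives a stimulus's non-negative loadings on those dimensions. Like any non-convex factorization, this sketch can converge to a local optimum, so results depend on the random initialization.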