Advances in computational power and hardware efficiency have made it possible to tackle increasingly complex, high-dimensional problems. While artificial intelligence (AI) has achieved remarkable results, the interpretability of high-dimensional solutions remains challenging. A critical issue is the comparison of multidimensional quantities, which is essential in techniques such as Principal Component Analysis (PCA) and k-means clustering. Common metrics such as cosine similarity, Euclidean distance, and Manhattan distance are often used for such comparisons, for example when comparing muscular synergies of the human motor control system. However, their applicability and interpretability diminish as dimensionality increases. This paper provides a comprehensive analysis of the effects of dimensionality on these metrics. Our results reveal significant limitations of cosine similarity, particularly its dependence on the dimensionality of the vectors, which leads to biased and poorly interpretable outcomes. To address this, we introduce the Dimension Insensitive Euclidean Metric (DIEM), which demonstrates superior robustness and generalizability across dimensions. DIEM maintains consistent variability and eliminates the biases observed in traditional metrics, making it a reliable tool for high-dimensional comparisons. This novel metric has the potential to replace cosine similarity, providing a more accurate and insightful method for analyzing multidimensional data in fields ranging from neuromotor control to machine learning and deep learning.
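The dependence on dimensionality described above can be illustrated with a small numerical experiment. The following is only a minimal sketch, not the analysis performed in the paper: it assumes vector entries drawn uniformly from [0, 1), and the function names, trial count, and chosen dimensions are hypothetical choices made for illustration. It covers only the three conventional metrics; DIEM is not implemented here because its definition is not given in this abstract.

```python
import math
import random
import statistics


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def metric_spread(metric, dim, trials=500, seed=0):
    """Mean and standard deviation of `metric` over random vector pairs.

    Assumption for illustration: entries are i.i.d. uniform on [0, 1).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        a = [rng.random() for _ in range(dim)]
        b = [rng.random() for _ in range(dim)]
        values.append(metric(a, b))
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    metrics = (cosine_similarity, euclidean_distance, manhattan_distance)
    for dim in (2, 10, 100):
        for metric in metrics:
            mean, std = metric_spread(metric, dim)
            print(f"dim={dim:4d}  {metric.__name__:20s} mean={mean:.3f}  std={std:.3f}")
```

Under these assumptions, the spread of cosine similarity between random vectors shrinks as the dimension grows, while the mean Euclidean and Manhattan distances increase with the dimension, so a given value of any of these metrics means something different at each dimension.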