A central objective of machine learning is to identify structure and patterns in data. Advances in data acquisition have increasingly produced datasets whose observations possess rich geometric form, giving rise to shape spaces that encode variability in object geometry. Such datasets arise across a wide range of disciplines, including biology, medicine, anthropology, and computer vision, where subtle geometric differences often carry important scientific information. Traditional machine learning methods, however, are frequently ill-equipped to account for the nonlinear geometric structure underlying these data. This survey synthesizes a rapidly growing body of work on shape space analysis, which provides a mathematical and computational framework for the study of geometric data. Drawing on ideas from differential geometry, statistics, and machine learning, we organize the literature around a common analytical pipeline: shape representation and parameterization, the rigorous construction of robust geodesic metrics, statistical analysis on shape spaces, and geometry-aware learning methods. We discuss how these tools enable the characterization of shape variability, the comparison of geometric objects, and the analysis of structural trajectories across populations and time. To illustrate the breadth of the field, we highlight applications spanning multiple scales of biological organization, including studies of subcellular morphology and primate tooth evolution. Across these and many other domains, researchers face common challenges arising from complex, nonlinear, and often unaligned geometric variation. The review concludes by identifying key theoretical and computational challenges, as well as emerging opportunities driven by increasingly large and diverse geometric datasets.