Scientists, engineers, biologists, and technology specialists routinely use image segmentation to extract shape ensembles containing many thousands of curves that represent patterns in observations and measurements. These large curve ensembles support inferences about important changes when comparing and contrasting images. We introduce novel pattern recognition formalisms combined with inference methods over large ensembles of segmented curves. Our formalism accurately approximates the eigenspaces of composite integral operators to motivate discrete, dual representations of curves collocated at quadrature nodes. The approximations are projected onto underlying matrix manifolds, and the resulting separable shape tensors constitute rigid-invariant decompositions of curves into generalized (linear) scale variations and complementary (nonlinear) undulations. With thousands of curves segmented from pairs of images, we demonstrate how data-driven features of separable shape tensors inform explainable binary classification using a product maximum mean discrepancy. The approach requires no labeled data, builds interpretable feature spaces in seconds without high-performance computing, and detects discrepancies too subtle for cursory visual inspection.
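To make the decomposition and the two-sample comparison concrete, the following is a minimal sketch and not the authors' implementation. It assumes each segmented curve is stored as an `(n, 2)` NumPy array of planar points with the same `n` for every curve, uses a polar decomposition of the centered points as a stand-in for the separable shape tensor, and uses a product of two Gaussian kernels with hand-picked bandwidths inside a biased maximum mean discrepancy estimate. All function names (`separable_factors`, `invariant_features`, `product_mmd2`) and the bandwidth parameters are hypothetical.

```python
import numpy as np

def separable_factors(X):
    """Polar-type split of centered curve points: Xc = U @ P (illustrative)."""
    Xc = X - X.mean(axis=0)                       # remove translation
    W, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = W @ Vt                                    # (n, 2), orthonormal columns: undulation factor
    P = Vt.T @ np.diag(s) @ Vt                    # (2, 2), symmetric positive definite: linear scale factor
    return U, P

def invariant_features(X):
    """Features unchanged by planar rotations, reflections, and translations."""
    U, P = separable_factors(X)
    proj = U @ U.T                                # projector onto the column space of U
    log_scale = np.log(np.linalg.eigvalsh(P))     # ascending eigenvalues; assumes a full-rank curve
    return proj.ravel(), log_scale

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth**2))

def product_mmd2(curves_a, curves_b, bw_shape=1.0, bw_scale=1.0):
    """Biased squared MMD estimate with a product kernel (undulation x scale)."""
    Ga, Sa = (np.array(z) for z in zip(*[invariant_features(X) for X in curves_a]))
    Gb, Sb = (np.array(z) for z in zip(*[invariant_features(X) for X in curves_b]))

    def k(G1, S1, G2, S2):
        # Product of a kernel on the undulation part and a kernel on the scale part.
        return gaussian_gram(G1, G2, bw_shape) * gaussian_gram(S1, S2, bw_scale)

    return (k(Ga, Sa, Ga, Sa).mean()
            + k(Gb, Sb, Gb, Sb).mean()
            - 2.0 * k(Ga, Sa, Gb, Sb).mean())
```

Calling `product_mmd2(curves_a, curves_b)` on two lists of curves, one list per image, returns a nonnegative statistic that grows as the two ensembles differ in scale or undulation. Because the kernel factorizes, each factor can also be evaluated separately to see which part drives a discrepancy. In practice the bandwidths and a significance threshold (for example from a permutation test) would need to be chosen for the data at hand.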