In many applications of X-ray computed tomography, unsupervised segmentation of the reconstructed 3D volume is an important step in the image processing chain that precedes further investigation of the digitized object. The goal is to train a clustering algorithm on the volume that produces a voxelwise classification by assigning a cluster index to each voxel. However, clustering methods such as K-Means typically have a runtime that grows polynomially with the dataset size, so they are rarely applicable to large volumes. In this work, we introduce a novel clustering technique based on random sampling that allows for the voxelwise classification of arbitrarily large volumes. The presented method conducts efficient linear passes over the data to extract a representative random sample of fixed size, on which the classifier is trained. A final linear pass then performs the segmentation by assigning a cluster index to each individual voxel. Quantitative and qualitative evaluations show that excellent results can be achieved even with a very small sample size. Consequently, unsupervised segmentation by means of clustering becomes feasible for arbitrarily large volumes.
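The sample-then-segment pipeline described above can be illustrated with a minimal sketch. The abstract does not specify how the random sample is drawn or which features are clustered, so everything below is an assumption made for illustration: reservoir sampling (Algorithm R) stands in for the sampling pass, raw scalar intensities stand in for the voxel features, and a plain one-dimensional Lloyd's K-Means stands in for the classifier. All function names (`reservoir_sample`, `kmeans_1d`, `segment`, `synthetic_chunks`) are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, Sequence


def reservoir_sample(chunks: Iterable[Sequence[float]], size: int, seed: int = 0) -> List[float]:
    """One linear pass that keeps a uniform random sample of fixed size (Algorithm R)."""
    rng = random.Random(seed)
    sample: List[float] = []
    seen = 0
    for chunk in chunks:
        for value in chunk:
            seen += 1
            if len(sample) < size:
                sample.append(value)         # fill the reservoir first
            else:
                j = rng.randrange(seen)      # uniform in [0, seen)
                if j < size:
                    sample[j] = value        # replace with probability size / seen
    return sample


def nearest(value: float, centers: Sequence[float]) -> int:
    """Index of the cluster center closest to a scalar intensity."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def kmeans_1d(sample: Sequence[float], k: int, iterations: int = 50) -> List[float]:
    """Lloyd's K-Means on scalar values; assumes len(sample) >= k."""
    ordered = sorted(sample)
    # Deterministic initialisation: spread the centers over sample quantiles.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        sums = [0.0] * k
        counts = [0] * k
        for value in sample:
            idx = nearest(value, centers)
            sums[idx] += value
            counts[idx] += 1
        updated = [sums[i] / counts[i] if counts[i] else centers[i] for i in range(k)]
        if updated == centers:
            break
        centers = updated
    return centers


def segment(chunks: Iterable[Sequence[float]], centers: Sequence[float]) -> Iterator[List[int]]:
    """Final linear pass: assign a cluster index to every voxel, chunk by chunk."""
    for chunk in chunks:
        yield [nearest(value, centers) for value in chunk]


def synthetic_chunks(n_chunks: int = 20, chunk_len: int = 500, seed: int = 1) -> Iterator[List[float]]:
    """Toy two-material volume streamed in chunks (illustrative data only)."""
    rng = random.Random(seed)
    for _ in range(n_chunks):
        yield [rng.gauss(0.2, 0.02) if rng.random() < 0.5 else rng.gauss(0.8, 0.02)
               for _ in range(chunk_len)]


# Pass 1: draw a fixed-size sample; train on the sample only; pass 2: label every voxel.
sample = reservoir_sample(synthetic_chunks(), size=200)
centers = sorted(kmeans_1d(sample, k=2))
labels = [label for chunk in segment(synthetic_chunks(), centers) for label in chunk]
```

In this sketch the training cost depends only on the sample size (200 values here), while both passes over the volume are linear and need only one chunk in memory at a time, which is the property the abstract relies on for arbitrarily large volumes.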