The development of machine learning (ML) models based on the computed tomography (CT) imaging modality has been a major focus of recent research in the medical imaging domain. Incorporating a robust feature engineering approach can substantially improve the performance of these models. Topological data analysis (TDA), a recent development based on the mathematical field of algebraic topology, examines data from a topological perspective, extracting deeper insight and higher-dimensional structure from it. Persistent homology (PH), a fundamental tool in TDA, can extract topological features such as connected components, cycles, and voids from the data. A popular approach to constructing PH from 3D CT images is the 3D cubical complex filtration, a method adapted for grid-structured data. However, this approach may not always yield the best performance and can become computationally expensive on higher-resolution CT images. This study introduces a novel patch-based PH construction approach tailored for volumetric medical imaging data, in particular the CT modality. A wide range of experiments was conducted on several datasets of 3D CT images to comprehensively analyze the performance of the proposed method under various parameters and to benchmark it against the 3D cubical complex algorithm. Our results highlight the superiority of the patch-based TDA approach in terms of both classification performance and time efficiency. The proposed approach outperformed the cubical complex method, achieving average improvements of 10.38%, 6.94%, 2.06%, 11.58%, and 8.51% in accuracy, AUC, sensitivity, specificity, and F1 score, respectively, across all datasets. Finally, we provide a convenient Python package, Patch-TDA, to facilitate the use of the proposed approach.
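The abstract does not say how patches are formed or how PH is computed on them, so the following is only an illustrative sketch of the general idea of working on sub-volumes rather than the full grid. It assumes non-overlapping cubic patches and drops trailing voxels that do not fill a whole patch; the function name `extract_patches` and the parameter `patch_size` are hypothetical and are not taken from the Patch-TDA package.

```python
import numpy as np


def extract_patches(volume: np.ndarray, patch_size: int) -> np.ndarray:
    """Split a 3D volume into non-overlapping cubic patches.

    Assumption for this sketch: voxels beyond the last full patch along
    each axis are discarded. Returns an array of shape
    (n_patches, patch_size, patch_size, patch_size).
    """
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume")
    p = patch_size
    d, h, w = (s // p for s in volume.shape)
    if min(d, h, w) == 0:
        raise ValueError("patch_size exceeds a volume dimension")
    # Crop so every axis is an exact multiple of the patch size.
    cropped = volume[: d * p, : h * p, : w * p]
    # Split each axis into (grid index, within-patch index), then move the
    # three grid indices to the front so each patch is contiguous.
    blocks = cropped.reshape(d, p, h, p, w, p).transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(d * h * w, p, p, p)


# Toy 4x4x4 "scan" split into 2x2x2 patches.
volume = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)
patches = extract_patches(volume, patch_size=2)
```

In a pipeline of this kind, each patch (or some summary of it) would then be handed to a persistent homology routine, which keeps the size of any single filtration small compared with a cubical complex built on the whole volume. How the paper actually does this step is not stated in the abstract.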