This work presents a comparative evaluation of two fundamentally different feature extraction paradigms, Histogram of Oriented Gradients (HOG) and Topological Data Analysis (TDA), for medical image classification, with a focus on retinal fundus imagery. HOG captures local structural information by modeling the distribution of gradient orientations within spatial regions, effectively encoding texture and edge patterns. In contrast, TDA, implemented through cubical persistent homology, extracts global topological descriptors that characterize shape, connectivity, and intensity-based structure across the image. We evaluate both approaches on the publicly available APTOS retinal fundus dataset on two classification tasks: binary classification (normal vs. diabetic retinopathy (DR)) and five-class DR severity grading. From each image, we extract 26,244 HOG features and 800 TDA features, and use each feature set independently to train seven classical machine learning models (logistic regression, random forest, XGBoost, support vector machines, decision trees, k-nearest neighbors, and Extra Trees) under 10-fold cross-validation. XGBoost achieves the best performance with both feature types. For binary classification, it reaches accuracies of 94.29% (HOG) and 94.18% (TDA); for five-class grading, it reaches 74.41% (HOG) and 74.69% (TDA). These results suggest that gradient-based and topological features provide complementary representations of retinal image structure, and they highlight the potential of integrating both approaches for interpretable and robust medical image classification.
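The abstract does not state the input resolution or the HOG parameters behind the 26,244-dimensional descriptor. As a plausibility check only, the sketch below computes the descriptor length for the standard HOG layout of overlapping blocks that slide one cell at a time. The function name `hog_feature_length`, the 224×224 input size, the 9 orientation bins, the 8×8-pixel cells, and the 2×2-cell blocks are all assumptions introduced for illustration. They happen to reproduce the reported count, but other configurations could as well.

```python
def hog_feature_length(height, width, orientations=9,
                       pixels_per_cell=(8, 8), cells_per_block=(2, 2)):
    """Length of a flattened HOG descriptor with blocks sliding one cell at a time.

    All default parameter values are assumptions, not taken from the abstract.
    """
    # Number of whole cells that fit along each image axis.
    cells_y = height // pixels_per_cell[0]
    cells_x = width // pixels_per_cell[1]

    # Overlapping blocks: one block per valid top-left cell position.
    blocks_y = cells_y - cells_per_block[0] + 1
    blocks_x = cells_x - cells_per_block[1] + 1
    if blocks_y < 1 or blocks_x < 1:
        raise ValueError("image is smaller than a single HOG block")

    # Each block contributes one orientation histogram per cell it contains.
    per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return blocks_y * blocks_x * per_block


# Assumed 224x224 input: 28x28 cells, 27x27 blocks, 36 values per block.
n_features = hog_feature_length(224, 224)
print(n_features)  # 26244
```

Under these assumptions the arithmetic is 27 × 27 blocks × 4 cells × 9 bins = 26,244, which matches the HOG feature count reported above and is roughly 33 times larger than the 800-dimensional TDA descriptor.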