Natural Image Classification via Quasi-Cyclic Graph Ensembles and Random-Bond Ising Models at the Nishimori Temperature

From arXiv. 38 pages, 8 figures, 4 tables. Presented at the 9th International Conference 'Deep Learning on Computational Physics (DLCP2025)' and accepted for the Moscow University Physics Bulletin, Physics series.

Modern multi-class image classification uses high-dimensional CNN features that incur large memory and computational costs and obscure the data manifold's geometry. Existing graph-based spectral classifiers work on synthetic or binary tasks but degrade on natural images with many classes because feature manifolds have non-trivial topology. We introduce a physics-inspired pipeline in which frozen MobileNetV2 features are interpreted as Ising spins on a sparse multi-edge-type quasi-cyclic LDPC graph, defining a Random-Bond Ising Model (RBIM). The model is operated at its Nishimori temperature, where the smallest eigenvalue of the Bethe-Hessian matrix vanishes. A spectral-topological correspondence links trapping sets in the Tanner graph to topological invariants via poles of the Ihara-Bass zeta function, enabling systematic suppression of harmful substructures that otherwise reduce top-1 accuracy by more than a factor of four. A fast quadratic-Newton estimator finds the Nishimori temperature in $\sim 9$ Arnoldi iterations, a sixfold speed-up over bisection. The resulting ensembles compress the original $1280$-dimensional MobileNetV2 representation to $32$ dimensions (ImageNet-10) or $64$ dimensions (ImageNet-100). We achieve $98.7\%$ top-1 accuracy on ImageNet-10 and $84.92\%$ on ImageNet-100 using a three-graph soft ensemble. Relative to MobileNetV2, our hard ensemble increases accuracy by $0.10\%$ while reducing FLOPs by a factor of $2.67$. Against ResNet-50, the soft ensemble loses only $1.09\%$ accuracy yet cuts FLOPs by a factor of $29$. The novelty lies in (a) establishing a rigorous link between graph trapping sets and algebraic-topological defects, (b) providing an efficient Nishimori-temperature estimator, and (c) demonstrating topology-guided LDPC graph embedding for highly compressed classifiers.
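To make the zero-eigenvalue criterion concrete, the following is a minimal sketch of the bisection baseline that the abstract compares against; it is not the authors' quadratic-Newton estimator. Several things here are assumptions rather than details taken from the abstract: the weighted Bethe-Hessian form written in the docstring (the form commonly used for RBIMs), the helper names, the bracket `[lo, hi]`, and the toy graph (a complete graph on four nodes with unit couplings instead of a quasi-cyclic LDPC graph). In place of an Arnoldi eigensolver, the sketch tests the sign of the smallest eigenvalue through a plain Cholesky factorisation, which succeeds exactly when the matrix is positive definite.

```python
import math


def bethe_hessian(J, beta):
    """Weighted Bethe-Hessian H(beta) for a symmetric coupling matrix J (0 = no edge).

    Assumed form (not given in the abstract), with w_ij = tanh(beta * J_ij):
      H_ij = delta_ij * (1 + sum_k w_ik^2 / (1 - w_ik^2)) - w_ij / (1 - w_ij^2)
    """
    n = len(J)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = 1.0
        for j in range(n):
            if i == j or J[i][j] == 0.0:
                continue
            w = math.tanh(beta * J[i][j])
            H[i][i] += w * w / (1.0 - w * w)
            H[i][j] = -w / (1.0 - w * w)
    return H


def is_positive_definite(H):
    """True iff a Cholesky factorisation succeeds, i.e. the smallest eigenvalue is > 0."""
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = H[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot: H is not positive definite
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


def nishimori_beta_bisection(J, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisect on beta for the point where the smallest eigenvalue of H(beta) crosses zero.

    Assumes a single crossing inside [lo, hi] (hypothetical default bracket).
    """
    if not is_positive_definite(bethe_hessian(J, lo)):
        raise ValueError("H(lo) is not positive definite; shrink lo")
    if is_positive_definite(bethe_hessian(J, hi)):
        raise ValueError("H(hi) is still positive definite; enlarge hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_positive_definite(bethe_hessian(J, mid)):
            lo = mid  # smallest eigenvalue still positive: move towards larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy example: complete graph K4 (3-regular) with unit ferromagnetic couplings.
K4 = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]
beta_n = nishimori_beta_bisection(K4)
```

For this toy graph the result can be checked by hand: on a $d$-regular graph with unit couplings, the smallest eigenvalue of the assumed $H(\beta)$ is $(1 - (d-1)\tanh\beta)/(1 + \tanh\beta)$, so it vanishes at $\tanh\beta = 1/(d-1) = 1/2$ for K4, and `beta_n` should agree with `math.atanh(0.5)`. Each bisection step costs one factorisation, which is the per-step cost that a faster root-finder such as the quadratic-Newton estimator described above aims to reduce.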
