Most neural network quantization methods apply a uniform bit precision across all spatial regions, ignoring the heterogeneous structural and textural complexity of visual data. This paper introduces MCAQ-YOLO, a morphological complexity-aware quantization framework for object detection. The framework uses five morphological metrics (fractal dimension, texture entropy, gradient variance, edge density, and contour complexity) to characterize local visual morphology and guide spatially adaptive bit allocation. By correlating these metrics with quantization sensitivity, MCAQ-YOLO dynamically adjusts bit precision according to spatial complexity. In addition, a curriculum-based quantization-aware training scheme progressively increases quantization difficulty to stabilize optimization and accelerate convergence. Experiments show a strong correlation between morphological complexity and quantization sensitivity, and MCAQ-YOLO achieves higher detection accuracy and faster convergence than uniform quantization. On a safety equipment dataset, MCAQ-YOLO attains 85.6% mAP@0.5 at an average of 4.2 bits and a 7.6x compression ratio, which is 3.5 percentage points above uniform 4-bit quantization, while adding only 1.8 ms of runtime overhead per image. Cross-dataset validation on COCO and Pascal VOC shows consistent gains, indicating that morphology-driven spatial quantization can improve efficiency and robustness in computationally constrained, safety-critical visual recognition tasks.
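To make the idea of complexity-guided bit allocation concrete, the following is a minimal sketch and not the MCAQ-YOLO implementation. It scores each image patch with three of the five named metrics (gradient variance, edge density, and texture entropy, all in simplified form) and maps the score linearly to a bit-width. The function names `patch_complexity` and `allocate_bits`, the equal weighting of the metrics, the 0.1 edge threshold, the 8-pixel patch size, and the 2 to 8 bit range are all assumptions made for illustration. The final lines compute a compression ratio against a 32-bit baseline, which is also an assumption, although 32 / 4.2 is approximately 7.6 and so matches the reported figures.

```python
import numpy as np


def patch_complexity(patch: np.ndarray) -> float:
    """Score a grayscale patch (values in [0, 1]) with a complexity in [0, 1].

    Hypothetical stand-in for a morphological complexity measure: the mean of
    three simplified metrics, weighted equally (an assumption).
    """
    gy, gx = np.gradient(patch.astype(np.float64))
    magnitude = np.hypot(gx, gy)

    # Gradient variance, squashed into [0, 1) so it is comparable to the others.
    grad_var = magnitude.var()
    grad_term = grad_var / (1.0 + grad_var)

    # Edge density: fraction of pixels whose gradient magnitude exceeds a
    # fixed threshold (0.1 is an arbitrary illustrative choice).
    edge_density = float((magnitude > 0.1).mean())

    # Texture entropy: Shannon entropy of a 16-bin intensity histogram,
    # normalized by its maximum value log2(16) = 4.
    hist, _ = np.histogram(patch, bins=16, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()
    entropy = float(-(p * np.log2(p)).sum()) / 4.0

    return (grad_term + edge_density + entropy) / 3.0


def allocate_bits(image: np.ndarray, patch: int = 8,
                  min_bits: int = 2, max_bits: int = 8) -> np.ndarray:
    """Return a per-patch bit-width map: more complex patches get more bits."""
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    bits = np.empty((rows, cols), dtype=np.int64)
    for i in range(rows):
        for j in range(cols):
            tile = image[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            score = patch_complexity(tile)
            bits[i, j] = min_bits + int(round(score * (max_bits - min_bits)))
    return bits


# Synthetic test image: flat left half, noisy (high-complexity) right half.
rng = np.random.default_rng(0)
image = np.full((32, 32), 0.5)
image[:, 16:] = rng.random((32, 16))

bit_map = allocate_bits(image)
avg_bits = float(bit_map.mean())
compression = 32.0 / avg_bits  # relative to an assumed 32-bit baseline

print(bit_map)
print(f"average bits: {avg_bits:.2f}, compression vs. 32-bit: {compression:.2f}x")
```

In this sketch the flat patches fall to the minimum bit-width while the noisy patches receive more bits, which is the qualitative behavior the abstract describes. A real system would quantize feature maps or weights rather than raw pixels and would calibrate the mapping against measured quantization sensitivity.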