Achieving universally high accuracy in object detection is challenging, and the mainstream focus in industry currently lies on detecting specific classes of objects. However, deploying one or more object detection networks requires GPU memory for training and storage capacity for inference, which makes it difficult to coordinate multiple object detection tasks under resource-constrained conditions. This paper introduces a lightweight fine-tuning strategy called Calibration side tuning, which combines aspects of adapter tuning and side tuning to adapt techniques that have proven successful in transformers for use with ResNet. The Calibration side tuning architecture incorporates maximal transition calibration and uses a small number of additional parameters to improve network performance while keeping the training process smooth. Furthermore, this paper analyzes multiple fine-tuning strategies and implements them within ResNet, thereby expanding the research on fine-tuning strategies for object detection networks. Extensive experiments on five benchmark datasets demonstrate that the proposed method outperforms the other state-of-the-art techniques it was compared against and achieves a better balance between the complexity and performance of fine-tuning schemes.