The emergence of foundation models has introduced a new deep learning paradigm that has driven many strong results in natural language processing and computer vision. Foundation models offer several advantages that benefit vision tasks, including excellent feature extraction, strong generalization, and good few-shot and zero-shot learning capacity. The license plate (LP) is the unique identity of a vehicle, yet LP styles and appearances differ across countries and regions, and even across vehicle types. However, recent deep learning based LP detectors are mainly trained on specific datasets, and these limited datasets constrain their effectiveness and robustness. To alleviate the negative impact of limited data, this paper attempts to exploit the advantages of foundation models. We customize a vision foundation model, the Segment Anything Model (SAM), for the LP detection task and propose the first LP detector based on a vision foundation model, named SamLP. Specifically, we design a Low-Rank Adaptation (LoRA) fine-tuning strategy that injects extra parameters into SAM and transfers SAM to the LP detection task. We then propose a promptable fine-tuning step that provides SamLP with promptable segmentation capacity. Experiments show that the proposed SamLP achieves promising detection performance compared with other LP detectors. SamLP also shows strong few-shot and zero-shot learning ability, which demonstrates the potential of transferring vision foundation models. The code is available at https://github.com/Dinghaoxuan/SamLP
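To make the idea of injecting extra parameters through LoRA concrete, the following is a minimal, dependency-free sketch of a generic LoRA-adapted linear layer. It is not the SamLP implementation: the class name `LoRALinear`, the helper `matmul`, the `alpha / rank` scaling, and the zero initialization of `B` are assumptions taken from the common LoRA formulation, in which a frozen weight W is used as W + (alpha / r) · B A and only the low-rank factors A and B are trained.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (out x in) plus a trainable low-rank update B @ A.

    Hypothetical illustration of generic LoRA, not the authors' code.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, kept frozen
        # A starts small and random, B starts at zero, so the update B @ A
        # is zero at first and the adapted layer matches the pretrained one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def effective_weight(self):
        """Return W + scale * (B @ A)."""
        delta = matmul(self.B, self.A)  # (out x rank) @ (rank x in)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, x):
        """Apply the adapted layer to a single input vector x."""
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def num_trainable(self):
        """Count the injected parameters (entries of A and B only)."""
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)
```

In this sketch a 4 x 4 layer with rank 1 trains 8 injected values instead of the 16 in the full weight matrix. The saving grows with layer size, which is the usual motivation for applying LoRA to a large model such as SAM.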