SamLP: A Customized Segment Anything Model for License Plate Detection

With the emergence of foundation models, this new deep learning paradigm has driven many powerful achievements in natural language processing and computer vision. Foundation models offer several advantages that benefit vision tasks, including strong feature extraction, robust generalization, and good few-shot and zero-shot learning capacity. License plates (LPs) serve as the unique identity of a vehicle, yet their styles and appearances vary across countries and regions, and even across vehicle types. However, recent deep learning based LP detectors are mainly trained on specific datasets, and these limited datasets constrain their effectiveness and robustness. To alleviate the negative impact of limited data, this paper attempts to exploit the advantages of foundation models. We customize a vision foundation model, the Segment Anything Model (SAM), for the LP detection task and propose SamLP, the first LP detector based on a vision foundation model. Specifically, we design a Low-Rank Adaptation (LoRA) fine-tuning strategy that injects extra parameters into SAM and transfers SAM to the LP detection task. We then propose a promptable fine-tuning step to give SamLP promptable segmentation capacity. Experiments show that SamLP achieves promising detection performance compared with other LP detectors. SamLP also shows strong few-shot and zero-shot learning ability, which demonstrates the potential of transferring vision foundation models. The code is available at https://github.com/Dinghaoxuan/SamLP
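The LoRA idea mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the SamLP implementation: the class name `LoRALinear`, the helper `matmul`, the `alpha / rank` scaling, and the initialization values are assumptions that follow the general LoRA recipe, and the abstract does not say which SAM layers receive the extra parameters. The sketch keeps a pretrained weight W frozen and adds a trainable low-rank update B·A, with B starting at zero so the adapted layer initially behaves exactly like the original one.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (out_dim x in_dim) plus a trainable low-rank update B @ A.

    Hypothetical illustration of generic LoRA, not code from the SamLP repository.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, kept frozen during fine-tuning
        # A starts small and random, B starts at zero, so B @ A is zero at first.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank  # assumed scaling convention

    def effective_weight(self):
        """Return W + scale * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """x is a batch of input vectors (batch x in_dim); returns batch x out_dim."""
        w_t = [list(col) for col in zip(*self.effective_weight())]
        return matmul(x, w_t)

    def trainable_params(self):
        """Count only the injected parameters in A and B."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    layer = LoRALinear(W, rank=1)
    print(layer.forward([[1.0, 0.0, 0.0]]))  # matches the frozen layer before training
    print(layer.trainable_params(), "trainable values vs", 2 * 3, "frozen values")
```

On a layer this small the saving is negligible, but for a weight of size d x d the injected matrices hold only 2·d·r values, which is why a small rank r keeps fine-tuning cheap when labelled data is limited.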

