In precision agriculture, detecting and recognizing insects is essential to keeping crops healthy and producing high-quality yields. Current machine vision models require large volumes of data to achieve high performance. However, there are approximately 5.5 million insect species in the world, and no existing insect dataset covers even a fraction of them, owing to varying geographic locations and acquisition costs. In this paper, we introduce "Insect-1M", a novel large-scale dataset designed for training insect-related foundation models. The dataset covers a broad range of insect species and comprises 1 million images with dense identification labels of the taxonomy hierarchy and insect descriptions, enabling foundation models to learn both visual and semantic information about insects. To efficiently establish an Insect Foundation Model, we then develop a micro-feature self-supervised learning method with a Patch-wise Relevant Attention mechanism capable of discerning the subtle differences among insect images. In addition, we introduce a Description Consistency loss to improve micro-feature modeling via insect descriptions. Our experiments demonstrate the effectiveness of the proposed approach in insect modeling, and it achieves state-of-the-art performance on standard benchmarks of insect-related tasks. Our Insect Foundation Model and dataset are intended to support the next generation of insect-related vision models, bringing them closer to the ultimate goal of precision agriculture.