Plant phenotyping is typically time-consuming and expensive: it requires large groups of researchers to meticulously measure biologically relevant plant traits, and it is the main bottleneck in understanding plant adaptation and the genetic architecture underlying complex traits at population scale. In this work, we address these challenges by using few-shot learning with convolutional neural networks (CNNs) to segment the leaf body and visible venation in 2,906 Populus trichocarpa leaf images obtained in the field. In contrast to previous methods, our approach (i) does not require experimental or image pre-processing, (ii) uses the raw RGB images at full resolution, and (iii) requires very few training samples (e.g., just eight images for vein segmentation). Traits relating to leaf morphology and vein topology are extracted from the resulting segmentations using traditional open-source image-processing tools, validated against real-world physical measurements, and used to conduct a genome-wide association study to identify genes controlling the traits. In this way, the current work is designed to provide the plant phenotyping community with (i) methods for fast and accurate image-based feature extraction that require minimal training data, and (ii) a new population-scale data set, including 68 different leaf phenotypes, for domain scientists and machine learning researchers. All of the few-shot learning code, data, and results are publicly available.
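The abstract does not specify which traits are measured or which image-processing tools are used, so the following is only a minimal sketch of how simple morphology traits could be read from a binary leaf segmentation. The function name `leaf_traits`, the `mm_per_pixel` scale parameter, the chosen traits, and the toy mask are all assumptions made for illustration; they are not the authors' pipeline or any of their 68 phenotypes.

```python
from typing import Dict, Sequence


def leaf_traits(mask: Sequence[Sequence[int]], mm_per_pixel: float = 1.0) -> Dict[str, float]:
    """Compute toy morphology traits from a binary mask (1 = leaf, 0 = background).

    Hypothetical helper for illustration only; not the paper's trait definitions.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    leaf = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    if not leaf:
        raise ValueError("mask contains no leaf pixels")

    def is_leaf(r: int, c: int) -> bool:
        # Pixels outside the image count as background.
        return 0 <= r < rows and 0 <= c < cols and bool(mask[r][c])

    # Perimeter estimate: number of pixel edges that face background
    # (4-connectivity), converted to physical units.
    exposed_edges = sum(
        1
        for r, c in leaf
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
        if not is_leaf(r + dr, c + dc)
    )

    # Axis-aligned bounding box as a crude stand-in for leaf length and width.
    row_idx = [r for r, _ in leaf]
    col_idx = [c for _, c in leaf]
    height_px = max(row_idx) - min(row_idx) + 1
    width_px = max(col_idx) - min(col_idx) + 1

    return {
        "area": len(leaf) * mm_per_pixel ** 2,
        "perimeter": exposed_edges * mm_per_pixel,
        "length": height_px * mm_per_pixel,
        "width": width_px * mm_per_pixel,
        "aspect_ratio": height_px / width_px,
    }


# Toy 5x5 mask standing in for a CNN segmentation output (assumed data).
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
traits = leaf_traits(toy_mask, mm_per_pixel=0.5)
print(traits)
```

The sketch uses plain Python lists so that it runs without dependencies. On full-resolution field images, an array library and an established region-measurement routine would be the more practical choice, and the pixel-to-millimetre scale would have to come from a calibration step that the abstract does not describe.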