Fusarium head blight (FHB) is a devastating disease of small grains that causes significant economic losses every year. Efficient, accurate, and timely detection of FHB during resistance screening is critical for wheat and barley breeding programs. In recent years, various image processing techniques based on supervised machine learning algorithms have been developed for the early detection of FHB. State-of-the-art convolutional neural network-based methods, such as U-Net, employ a series of encoding blocks to create a local representation and a series of decoding blocks to capture the semantic relations. However, these methods are often unable to model long-range dependencies in the input data, and their ability to model multi-scale objects with significant variations in texture and shape is limited. Vision transformers, an alternative architecture with an innate global self-attention mechanism for sequence-to-sequence prediction, may in turn have limited localization capability because they lack sufficient low-level detail. To overcome these limitations, a new Context Bridge is proposed to integrate the local representation capability of the U-Net network into the transformer model. In addition, the standard attention mechanism of the original transformer is replaced with Efficient Self-attention, which is less complex than other state-of-the-art methods. To train the proposed network, 12,000 wheat images were captured from an FHB-inoculated wheat field at the SDSU research farm in Volga, SD. In addition to healthy and unhealthy plants, these images encompass various stages of the disease. A team of expert pathologists annotated the images for training and evaluating the developed model. Extensive experiments across typical plant image segmentation tasks demonstrate the effectiveness of the transformer-based method for FHB detection.
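The abstract does not specify how Efficient Self-attention is formulated. As a minimal sketch of one common variant, assume that keys and values are computed from a spatially reduced copy of the token grid, so the attention matrix has shape (N, N/r²) instead of (N, N). The function names, the `reduction` parameter, and the use of average pooling for the reduction step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def efficient_self_attention(x, w_q, w_k, w_v, height, width, reduction=2):
    """Single-head attention with spatially reduced keys and values.

    x: (N, C) tokens in row-major order, with N = height * width.
    Assumption: the reduction is plain average pooling over
    reduction x reduction patches; a learned strided convolution is
    another common choice.
    Returns the attended tokens (N, d) and the attention weights (N, N / r^2).
    """
    n, c = x.shape
    assert n == height * width
    assert height % reduction == 0 and width % reduction == 0

    q = x @ w_q  # queries keep full resolution: (N, d)

    # Rebuild the 2-D grid and average each reduction x reduction patch.
    grid = x.reshape(height // reduction, reduction,
                     width // reduction, reduction, c)
    pooled = grid.mean(axis=(1, 3)).reshape(-1, c)  # (N / r^2, C)

    k = pooled @ w_k  # (N / r^2, d)
    v = pooled @ w_v  # (N / r^2, d)

    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return weights @ v, weights
```

With `reduction=1` the pooling step is the identity and the function reduces to standard scaled dot-product self-attention; larger values shrink the score matrix by a factor of `reduction ** 2`.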