High-resolution aerial imagery makes it possible to capture fine details when segmenting farmlands. However, small objects and features distort the delineation of object boundaries, and larger contextual views are needed to mitigate class confusion. In this work, we present an end-to-end trainable network for segmenting farmlands with contour levees from high-resolution aerial imagery. We devise a fusion block that contains multiple voting blocks to perform image segmentation and classification. Integrated with a backbone, the fusion block produces both semantic predictions and segmentation slices, and the segmentation slices are used to perform majority voting on the predictions. The network is thus trained to assign the most likely class label of a segment to all of its pixels, learning the concept of farmlands rather than analyzing the constituent pixels separately. We evaluate our method on images from the National Agriculture Imagery Program, where it achieves an average accuracy of 94.34%. Compared with state-of-the-art methods, the proposed method improves the average F1 score by 6.96% and 2.63%.
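The majority-voting step described above can be illustrated with a minimal NumPy sketch. This is not the network's trainable voting block: it is a plain post-hoc illustration that assumes per-pixel class predictions and integer segment ids are already available as arrays. The function name `majority_vote` and its parameters are hypothetical and chosen only for this example.

```python
import numpy as np

def majority_vote(pred_labels, segment_ids, num_classes):
    """Give every pixel the most frequent predicted class of its segment.

    Hypothetical helper for illustration only; the paper's voting blocks
    are part of a trainable network and are not reproduced here.
    """
    pred = np.asarray(pred_labels).ravel()
    seg = np.asarray(segment_ids).ravel()

    # Map arbitrary segment ids to consecutive indices 0..S-1.
    uniq, seg_idx = np.unique(seg, return_inverse=True)
    seg_idx = seg_idx.ravel()

    # counts[s, c] = number of pixels in segment s predicted as class c.
    counts = np.zeros((uniq.size, num_classes), dtype=np.int64)
    np.add.at(counts, (seg_idx, pred), 1)

    # Winning class per segment; ties go to the lowest class index.
    winners = counts.argmax(axis=1)

    # Broadcast each segment's winner back to its pixels.
    return winners[seg_idx].reshape(np.shape(pred_labels))

# Toy example: three segments, three classes.
pred = np.array([[1, 1, 0, 0],
                 [1, 0, 0, 1],
                 [2, 2, 0, 0]])
seg = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 1, 1]])
refined = majority_vote(pred, seg, num_classes=3)
print(refined)
```

In this toy input, the single disagreeing pixel in segment 0 and the one in segment 1 are overwritten by the label most pixels in their segment share, which is the effect of assigning a segment's most likely class to all of its pixels.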