AdaTreeFormer: Few Shot Domain Adaptation for Tree Counting from a Single High-Resolution Image

Estimating tree density and counting trees from a single aerial or satellite image is a difficult task in photogrammetry and remote sensing, yet it plays a crucial role in forest management. The huge variety of trees across varied topography severely hinders tree counting models from performing well. This paper proposes a framework that is learned from a source domain with sufficient labeled trees and adapted to a target domain with only a limited number of labeled trees. Our method, termed AdaTreeFormer, contains one shared encoder with a hierarchical feature extraction scheme to extract robust features from the source and target domains. It also contains three subnets: two for extracting self-domain attention maps from the source and target domains, respectively, and one for extracting cross-domain attention maps. For the latter, an attention-to-adapt mechanism is introduced to distill relevant information from the different domains while generating tree density maps, and a hierarchical cross-domain feature alignment scheme is proposed that progressively aligns the features from the source and target domains. We also incorporate adversarial learning into the framework to further reduce the gap between the source and target domains. AdaTreeFormer is evaluated on six designed domain adaptation tasks using three tree counting datasets, i.e., Jiangsu, Yosemite, and London. Experimental results show that AdaTreeFormer significantly surpasses the state of the art; e.g., in the cross-domain task from the Yosemite to the Jiangsu dataset, it reduces the absolute counting error by 15.9 points and increases the accuracy of the detected trees' locations by 10.8%. The code and datasets are available at https://github.com/HAAClassic/AdaTreeFormer.
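To make the idea of a cross-domain attention map concrete, the following is a minimal sketch of generic scaled dot-product attention in which the queries come from one domain and the keys and values from the other. It is not the AdaTreeFormer implementation: the function names, the toy feature values, and the choice of target tokens as queries are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_domain_attention(queries, keys, values):
    """Scaled dot-product attention across two domains (illustrative only).

    queries come from one domain, keys/values from the other. Each argument
    is a list of token vectors (lists of floats). Which domain supplies the
    queries is an assumption; the abstract does not say.
    """
    dim = len(queries[0])
    attended, weights = [], []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors, one output vector per query token.
        attended.append([
            sum(w_j * v[d] for w_j, v in zip(w, values))
            for d in range(len(values[0]))
        ])
    return attended, weights


# Hypothetical toy features: 2 target tokens attend over 3 source tokens.
target_tokens = [[1.0, 0.0], [0.0, 1.0]]
source_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, weights = cross_domain_attention(
    target_tokens, source_tokens, source_tokens
)
```

Each row of `weights` is a probability distribution over the source tokens, so it can be read as a small cross-domain attention map; a self-domain map would follow from passing the same domain's tokens as queries, keys, and values.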
