We propose a new active learning approach for efficiently estimating the geographic range of a species from a limited number of on-the-ground observations. We model the range of an unmapped species of interest as a weighted combination of estimated ranges obtained from a set of different species. We show that this candidate set of ranges can be generated using models trained on large, weakly supervised, community-collected observation data. Building on this, we develop a new active querying approach that sequentially selects the geographic locations to visit that best reduce our uncertainty over an unmapped species' range. We conduct a detailed evaluation of our approach and compare it to existing active learning methods using an evaluation dataset containing expert-derived ranges for one thousand species. Our results demonstrate that our method outperforms alternative active learning methods and approaches the performance of end-to-end trained models, even when using only a fraction of the data. This highlights the utility of active learning via transfer-learned spatial representations for species range estimation. It also emphasizes the value of leveraging emerging large-scale crowdsourced datasets, not only for modeling species' ranges, but also for actively discovering them.
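The two ideas described above, a range expressed as a weighted combination of candidate ranges and a query rule that targets uncertain locations, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the uniform initial weights, the entropy-based selection rule, and the Bayesian weight update are all assumptions chosen for illustration. Each candidate range is assumed to be a list of presence probabilities, one per location, and `oracle` is a hypothetical stand-in for a field visit that reports presence or absence.

```python
import math


def predict_range(weights, candidates):
    """Weighted combination of candidate ranges: one presence probability per location."""
    num_locations = len(candidates[0])
    return [
        sum(w * c[j] for w, c in zip(weights, candidates))
        for j in range(num_locations)
    ]


def binary_entropy(p):
    """Entropy of a Bernoulli(p) variable, used here as the uncertainty score (assumption)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_query(weights, candidates, visited):
    """Pick the unvisited location whose predicted presence is most uncertain."""
    probs = predict_range(weights, candidates)
    unvisited = [j for j in range(len(probs)) if j not in visited]
    return max(unvisited, key=lambda j: binary_entropy(probs[j]))


def update_weights(weights, candidates, location, present):
    """Reweight candidates by how well each one explains the new observation."""
    likelihoods = [c[location] if present else 1.0 - c[location] for c in candidates]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # No candidate explains the observation; keep the previous weights.
        return list(weights)
    return [u / total for u in unnormalized]


def estimate_range(candidates, oracle, num_queries):
    """Run the query loop and return the final weights and the estimated range."""
    weights = [1.0 / len(candidates)] * len(candidates)  # uniform prior (assumption)
    visited = set()
    for _ in range(min(num_queries, len(candidates[0]))):
        location = select_query(weights, candidates, visited)
        visited.add(location)
        weights = update_weights(weights, candidates, location, oracle(location))
    return weights, predict_range(weights, candidates)
```

In this sketch, each observation shifts weight toward the candidate ranges that agree with it, so the combined estimate sharpens as more locations are visited. A selection rule based on expected reduction in uncertainty over the whole range, rather than per-location entropy, would follow the abstract's wording more closely but needs more machinery than fits here.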