Planktonic organisms are key components of aquatic ecosystems and respond quickly to environmental change, so monitoring them is vital for understanding that change. Yet monitoring plankton at appropriate scales remains a challenge, limiting our understanding of how aquatic systems function and respond to change. Modern plankton imaging instruments can sample at high frequencies, opening new possibilities for studying plankton populations. However, manual analysis of the data is costly, time-consuming, and dependent on experts, which makes it unsuitable for large-scale application and calls for automatic solutions. The key problem in exploiting plankton datasets through image analysis is plankton recognition. Despite the large amount of research done, automatic methods have not been widely adopted for operational use. In this paper, we present a comprehensive survey of existing solutions for automatic plankton recognition. First, we identify the most notable challenges that make the development of plankton recognition systems difficult. Then, we provide a detailed description of the solutions to these challenges proposed in the plankton recognition literature. Finally, we propose a workflow for identifying the specific challenges in new datasets and the recommended approaches to address them. For many of the challenges, applicable solutions exist. However, important challenges remain unsolved: 1) the domain shift between datasets, which hinders the development of a general plankton recognition system that would work across different imaging instruments; 2) the difficulty of identifying and processing images of previously unseen classes; and 3) the uncertainty in expert annotations, which affects the training of machine learning models for recognition. These challenges should be addressed in future research.