Artificial Intelligence (AI) is revolutionizing biodiversity research by enabling advanced data analysis, species identification, and habitat monitoring, thereby enhancing conservation efforts. Ensuring reproducibility in AI-driven biodiversity research is crucial for fostering transparency, verifying results, and promoting the credibility of ecological findings. This study investigates the reproducibility of deep learning (DL) methods within the biodiversity domain. We design a methodology for evaluating the reproducibility of biodiversity-related publications that employ DL techniques across three stages. We define ten variables essential for method reproducibility, divided into four categories: resource requirements, methodological information, uncontrolled randomness, and statistical considerations. These categories subsequently serve as the basis for defining different levels of reproducibility. We manually extract the availability of these variables from a curated dataset of 61 publications identified using keywords provided by biodiversity experts. Our study shows that the dataset is shared in 47% of the publications; however, a significant number of the publications lack comprehensive information on deep learning methods, including details regarding randomness.