Reproducibility of computational pipelines is an expectation in biomedical science, particularly in critical domains such as human health. In this context, the need to report next-generation genome sequencing methods used in precision medicine spurred the development of IEEE 2791-2020, the standard for Bioinformatics Analyses Generated by High-Throughput Sequencing (HTS), known as the BioCompute Object (BCO). Championed by the US Food and Drug Administration, the BCO is a pragmatic framework for documenting pipelines; however, its reproducibility claims have not been systematically assessed. This study uses the PRIMAD model, a conceptual framework for describing computational experiments for reproducibility purposes, to systematically review the BCO for depth and coverage. A detailed mapping of BCO and PRIMAD elements onto a published BCO use case reveals potential omissions and necessary extensions within both frameworks. This finding underscores the importance of systematically validating reproducibility claims for published digital objects, which in turn enhances the reliability of scientific research in bioscience and related disciplines. This study, along with its artifacts, is reported as an RO-Crate, providing a structured reporting approach, and is available at https://doi.org/10.5281/zenodo.14317922.