Integrating command-line tools into the Galaxy platform makes complex computational methods accessible to a broader audience and supports reproducible research. However, developing tool wrappers by hand is time-consuming, error-prone, and knowledge-intensive. This bottleneck slows the deployment of new and updated tools and creates a gap between a tool's development and its availability to the scientific community. We have developed an automated approach that translates Python tool interfaces directly into Galaxy-compliant tool wrappers. Our method builds on argparse, the command-line argument parsing module in Python's standard library. By embedding structured metadata in the metavar attribute of input and output arguments, our system programmatically parses a tool's interface to extract the information a wrapper needs: parameter types, data formats, help text, and input/output definitions. It then uses this information to generate a complete, valid Galaxy tool XML wrapper without manual intervention. To validate the scalability and effectiveness of the approach, we applied it to anvi'o, a comprehensive bioinformatics platform comprising hundreds of individual programs. Our method parsed the argparse definitions for the entire anvi'o suite and generated functional Galaxy tool wrappers, so that anvi'o workflows can be executed within the Galaxy environment. By establishing a convention-based approach built on argparse, this work provides a scalable and generalizable solution that substantially reduces the effort required to make command-line tools available in Galaxy.
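To make the idea concrete, the following is a minimal sketch of how metadata carried in an argparse metavar could drive wrapper generation. It is not the authors' implementation: the `role:format` metavar convention, the tool name `example-tool`, and the helper names `build_parser` and `parser_to_galaxy_xml` are all assumptions made for illustration, and the sketch relies on the private `parser._actions` attribute of argparse, which is not part of its documented interface. Only a small subset of the Galaxy tool XML elements is emitted.

```python
import argparse
import xml.etree.ElementTree as ET


def build_parser():
    # Hypothetical tool interface. The "role:format" metavar convention
    # (e.g. "input:fasta") is an assumption made for this sketch.
    parser = argparse.ArgumentParser(prog="example-tool", description="Toy tool.")
    parser.add_argument("--contigs", metavar="input:fasta", required=True,
                        help="Contig sequences.")
    parser.add_argument("--min-length", type=int, default=1000,
                        help="Minimum contig length.")
    parser.add_argument("--report", metavar="output:tabular",
                        help="Summary table.")
    return parser


def parser_to_galaxy_xml(parser):
    """Return a simplified Galaxy-style tool XML string for an argparse parser."""
    tool = ET.Element("tool", id=parser.prog.replace("-", "_"),
                      name=parser.prog, version="0.1.0")
    ET.SubElement(tool, "description").text = parser.description or ""
    command = ET.SubElement(tool, "command")
    inputs = ET.SubElement(tool, "inputs")
    outputs = ET.SubElement(tool, "outputs")

    command_parts = [parser.prog]
    # parser._actions is a private argparse attribute; used here for brevity.
    for action in parser._actions:
        if action.dest == "help" or not action.option_strings:
            continue  # skip -h/--help and positionals in this sketch
        flag = action.option_strings[-1]
        name = action.dest
        metavar = action.metavar if isinstance(action.metavar, str) else ""
        role, _, fmt = metavar.partition(":")

        if role == "input" and fmt:
            # Dataset input: becomes a Galaxy data parameter.
            ET.SubElement(inputs, "param", name=name, type="data", format=fmt,
                          label=flag, help=action.help or "")
        elif role == "output" and fmt:
            # Dataset output: becomes an entry in <outputs>.
            ET.SubElement(outputs, "data", name=name, format=fmt, label=flag)
        else:
            # Plain parameter: map the Python type to a Galaxy parameter type.
            galaxy_type = {int: "integer", float: "float"}.get(action.type, "text")
            attrs = dict(name=name, type=galaxy_type, label=flag,
                         help=action.help or "")
            if action.default is not None:
                attrs["value"] = str(action.default)
            ET.SubElement(inputs, "param", **attrs)
        command_parts.append(f"{flag} '${name}'")

    command.text = " ".join(command_parts)
    return ET.tostring(tool, encoding="unicode")


if __name__ == "__main__":
    print(parser_to_galaxy_xml(build_parser()))
```

The sketch treats every argument without a recognized metavar as a simple typed parameter and ignores boolean flags, positional arguments, and multi-valued options, all of which a real generator would need to handle.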