Open-source scientific software is a major driver of scientific progress, yet its development and reuse remain difficult in collaborative settings. Researchers face four recurring challenges: discovering and reproducing existing routines, adapting them for new use cases, sharing and scaling them across collaborators, and stabilizing them with reproducible execution environments. We present Album, an open-source framework for packaging and sharing scientific routines as executable artifacts through two minimal primitives: (i) the solution, a Python-native executable entry point that combines machine-readable metadata, arguments, environment specifications, and lifecycle hooks; and (ii) the catalog, a decentralized, git-native distribution mechanism with indexed search and optional web rendering for discovery, provenance, and governance. Album uses a two-context execution model in which a host controller evaluates manifests and prepares per-solution environments, while lifecycle hooks execute inside isolated solution environments. This design supports reproducible execution, post-environment setup, and the composition of routines with incompatible dependencies. Album can also be used in conjunction with LLM agents: solutions can be drafted and revised with LLM assistance, and a Model Context Protocol (MCP) interface exposes cataloged solutions as callable tools for tool-grounded discovery and orchestration. We evaluate Album through four real-world imaging deployments spanning interactive visualization of electron microscopy data, integration of multiple segmentation methods, orchestration of cryo-electron tomography competition workflows, and mineral quantification pipelines. Overall, Album complements package managers, workflow systems, and container runtimes by making scientific routines executable, shareable artifacts. Documentation and examples are available at https://album.solutions.
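To make the two-context execution model concrete, the following is a minimal sketch in plain Python. It does not use Album's API. The manifest layout, the names `SOLUTION`, `resolve_args`, and `run_solution`, and the use of the current interpreter as a stand-in for a prepared per-solution environment are all illustrative assumptions. The host side reads the manifest and resolves arguments, and the run hook executes in a separate process.

```python
import json
import subprocess
import sys
import textwrap

# Hypothetical manifest: metadata, arguments, an environment spec, and a run hook.
# This layout is an assumption for illustration, not Album's solution format.
SOLUTION = {
    "name": "threshold-demo",
    "version": "0.1.0",
    "args": [{"name": "threshold", "default": 0.5}],
    # Stand-in for an isolated environment: here, simply the current interpreter.
    "environment": {"python": sys.executable},
    # The hook source is shipped as text and only ever runs in the child process.
    "run": textwrap.dedent("""
        import json, sys
        args = json.loads(sys.argv[1])
        values = [0.2, 0.7, 0.9]
        kept = [v for v in values if v >= args["threshold"]]
        print(json.dumps({"kept": kept}))
    """),
}


def resolve_args(solution, overrides):
    """Host context: merge user overrides into the manifest's declared defaults."""
    resolved = {arg["name"]: arg["default"] for arg in solution["args"]}
    unknown = set(overrides) - set(resolved)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    resolved.update(overrides)
    return resolved


def run_solution(solution, **overrides):
    """Host context: evaluate the manifest, then run the hook in a child process."""
    args = resolve_args(solution, overrides)
    interpreter = solution["environment"]["python"]
    completed = subprocess.run(
        [interpreter, "-c", solution["run"], json.dumps(args)],
        capture_output=True,
        text=True,
        check=True,
    )
    # The hook reports its result as JSON on standard output.
    return json.loads(completed.stdout)


if __name__ == "__main__":
    print(run_solution(SOLUTION, threshold=0.6))
```

Because the hook exchanges only serialized arguments and results with the host, two such routines could in principle point at different interpreters with incompatible dependencies and still be composed by the same host process.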