The Exascale Computing Project (ECP) was one of the largest open-source scientific software development projects ever undertaken. It supported approximately 1,000 staff from US Department of Energy laboratories and from university and industry partners. Of these, about 250 contributed to 70 scientific libraries and tools that support applications on multiple exascale computing systems, which were themselves under development at the same time. Funded as a construction project, ECP adopted an earned-value management system based on milestones and a key performance parameter system based, in part, on integrations. With accelerated delivery schedules and significant project risk, we also emphasized software quality through community policies, automated testing, and continuous integration. Software Development Kit teams facilitated cross-team collaboration. Products were delivered via E4S, a curated portfolio of libraries and tools. In this paper, we discuss the organizational and management elements that enabled the efficient and effective delivery of ECP libraries and tools, along with lessons learned and next steps.