The ATLAS Google Project was established as part of an ongoing evaluation of the use of commercial clouds by the ATLAS Collaboration, in anticipation of the potential future adoption of such resources by WLCG grid sites to fulfil or complement their computing pledges. Seamless integration of Google cloud resources into the worldwide ATLAS distributed computing infrastructure was achieved at large scale and for an extended period of time, and hence cloud resources are shown to be an effective mechanism to provide additional, flexible computing capacity to ATLAS. For the first time, a total cost of ownership analysis has been performed to identify the dominant cost drivers and explore effective mechanisms for cost control. Network usage significantly impacts the costs of certain ATLAS workflows, underscoring the importance of implementing such mechanisms. Resource bursting has been successfully demonstrated, whilst exposing the true cost of this type of activity. A follow-up to the project is underway to investigate methods for improving the integration of cloud resources in data-intensive distributed computing environments and for reducing the costs related to network connectivity, which are the primary expense when cloud resources are used extensively.