The ATLAS Google Project was established as part of an ongoing evaluation of the use of commercial clouds by the ATLAS Collaboration, in anticipation of the potential future adoption of such resources by WLCG grid sites to fulfil or complement their computing pledges. Seamless integration of Google cloud resources into the worldwide ATLAS distributed computing infrastructure was achieved at large scale and for an extended period of time, showing that cloud resources are an effective mechanism for providing additional, flexible computing capacity to ATLAS. For the first time, a total cost of ownership analysis has been performed to identify the dominant cost drivers and explore effective mechanisms for cost control. Network usage significantly impacts the costs of certain ATLAS workflows, underscoring the importance of implementing such mechanisms. Resource bursting has been successfully demonstrated, whilst exposing the true cost of this type of activity. A follow-up to the project is underway to investigate methods for improving the integration of cloud resources in data-intensive distributed computing environments and for reducing the costs related to network connectivity, which are the primary expense when cloud resources are used extensively.