Cloud vendors offer discounted spot instances to maximize surplus resource utilization, but these instances are subject to the risk of sudden interruption. Traditional pricing datasets have been employed to predict this risk, yet recent policy changes by cloud vendors have diminished their effectiveness. To promote spot instance usage, public cloud vendors provide instant availability datasets to help users mitigate interruption risks. While existing research utilizing this data has proposed methods to reduce interruptions, these studies have primarily focused on single-node instances, overlooking the stability of multi-node environments widely adopted for modern cloud workloads. This paper proposes SpotVista, a system that recommends a resource pool of reliable and cost-efficient multi-node spot instances by leveraging various publicly available datasets. To achieve this, SpotVista collects a large-scale multi-node availability dataset while overcoming significant query limitations. Through a thorough analysis of multi-node spot instance availability behavior, SpotVista establishes a methodology for recommending cost-efficient and reliable multi-node configurations. To evaluate how effectively the proposed methodology reflects multi-node availability and cost efficiency, extensive real-world interruption experiments were conducted. The results demonstrate that SpotVista outperforms the state-of-the-art work, SpotVerse, achieving 81.28% greater availability and 2.84% more cost savings in a multi-region setup. When compared to a publicly available service, AWS SpotFleet, SpotVista provides 21.6% higher stability and 26.3% greater cost savings.