In the 1980s, high-performance computing (HPC) became another tool for research in the open (non-defense) science and engineering research communities. However, HPC came with a high price tag; the first Cray-2 machines, released in 1985, cost between \$12 million and \$17 million, according to the Computer History Museum, and were largely available only at government research labs or through national supercomputing centers. In the 1990s, with demand for HPC increasing due to vast datasets, more complex modeling, and the growing computational needs of scientific applications, researchers began experimenting with building HPC machines from clusters of servers running the Linux operating system. By the late 1990s, two approaches to Linux-based parallel computing had emerged: the personal computer cluster methodology that became known as Beowulf and the Roadrunner architecture aimed at a more cost-effective supercomputer. While Beowulf attracted attention because of its low cost and thereby greater accessibility, Roadrunner took a different approach. While still affordable compared to vector processors and other commercially available supercomputers, Roadrunner integrated its commodity components with specialized networking technology. Furthermore, these systems initially served different purposes. While Beowulf focused on providing affordable parallel workstations for individual researchers at NASA, Roadrunner set out to provide a multi-user system that could compete with the commercial supercomputers that dominated the market at the time. This paper analyzes the technical decisions, performance implications, and long-term influence of both approaches. Through this analysis, we can start to judge the impact of both Roadrunner and Beowulf on the development of Linux-based supercomputers.
翻译:在20世纪80年代,高性能计算(HPC)成为开放性(非国防领域)科学与工程研究界的又一研究工具。然而,高性能计算价格高昂;据计算机历史博物馆记载,1985年推出的首批Cray-2机器每台造价在1200万至1700万美元之间,且主要仅限政府研究实验室或通过国家超级计算中心使用。20世纪90年代,随着庞大数据集、更复杂的建模以及科学应用对计算需求的日益增长,对高性能计算的需求不断上升,研究人员开始尝试使用运行Linux操作系统的服务器集群来构建高性能计算系统。到20世纪90年代末,基于Linux的并行计算出现了两种方法:后来被称为Beowulf的个人计算机集群方法,以及旨在实现更具成本效益的超级计算机的Roadrunner架构。Beowulf因其低成本带来的更高可及性而备受关注,而Roadrunner则另辟蹊径。尽管与向量处理器及其他商用超级计算机相比,Roadrunner仍具成本优势,但它将商品化组件与专用网络技术相结合。此外,这些系统最初服务于不同目的:Beowulf专注于为NASA的个体研究人员提供负担得起的并行工作站,而Roadrunner则致力于构建一个多用户系统,与当时主导市场的商用超级计算机展开竞争。本文分析了这两种方法的技术决策、性能影响及长期影响。通过这一分析,我们可以开始评判Roadrunner和Beowulf对基于Linux的超级计算机发展所产生的深远影响。