The emerging discipline of computational science uses computers to simulate or solve scientific problems spanning the natural, political, and social sciences. The discipline has grown rapidly over the past decade, driven by larger volumes of observational data and by large-scale simulations that were previously unavailable or infeasible. Managing these data and simulations, however, remains a significant challenge. The database management systems community has long been at the forefront of the theory and practice of formalizing and building systems that access and query large datasets. In this paper, we present EmpireDB, a vision for a data management system to accelerate computational science. We also identify challenges and opportunities for the database community to advance this fledgling field. Finally, we present preliminary evidence that the optimized components in EmpireDB could improve performance over contemporary implementations.