CkIO: Parallel File Input for Over-Decomposed Task-Based Systems

Parallel input performance is often neglected in large-scale parallel applications in computational science and engineering. Traditionally, input has received less attention because either input sizes are small (as in biomolecular simulations) or the time spent on input is insignificant compared with a simulation that runs for many timesteps. Newer applications, such as graph algorithms, place a premium on file input performance. In addition, over-decomposed systems such as Charm++/AMPI present challenges in this context that MPI applications do not. In the over-decomposition model, naive parallel I/O in which every task issues its own I/O request is impractical. Furthermore, the load balancing supported by models such as Charm++/AMPI precludes any assumption of data contiguity on individual nodes. We develop a new I/O abstraction that addresses these issues by separating the decomposition of the consumers of input data from that of the file-reader tasks that interact with the file system. This enables applications to scale the number of data consumers without affecting I/O behavior or performance. These ideas are implemented in a new input library, CkIO, built on Charm++, a well-known task-based system founded on over-decomposed partitions. CkIO is configurable through multiple parameters (such as the number of file readers and their placement) that can be tuned to characteristics of the application, such as file size and number of application objects. CkIO also supports effective overlap of input with application-level computation, as well as load balancing and migration. We describe the relevant challenges in understanding file system behavior and architecture, the design alternatives being explored, and preliminary performance data.
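
The separation of data consumers from file readers can be illustrated with a small sketch. It is not CkIO's API or its actual implementation: the names `Stripe`, `make_stripes`, and `route_request` are hypothetical, and contiguous near-equal striping of the file across readers is an assumption made only for illustration. The point it shows is that the bytes each reader fetches depend on the file size and the number of readers, not on how many consumers later ask for data.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Stripe:
    """Byte range [start, end) of the file held by one (hypothetical) reader task."""
    reader: int
    start: int
    end: int


def make_stripes(file_size: int, num_readers: int) -> list[Stripe]:
    """Split [0, file_size) into num_readers contiguous, near-equal stripes."""
    if file_size < 0 or num_readers <= 0:
        raise ValueError("file_size must be >= 0 and num_readers must be > 0")
    base, extra = divmod(file_size, num_readers)
    stripes = []
    start = 0
    for r in range(num_readers):
        # The first `extra` readers take one additional byte each.
        length = base + (1 if r < extra else 0)
        stripes.append(Stripe(r, start, start + length))
        start += length
    return stripes


def route_request(stripes: list[Stripe], offset: int, length: int):
    """Map a consumer's (offset, length) request onto reader stripes.

    Returns a list of (reader, offset_within_stripe, nbytes) pieces that
    together cover the requested range in file order.
    """
    end = offset + length
    if offset < 0 or length < 0 or end > stripes[-1].end:
        raise ValueError("request lies outside the file")
    pieces = []
    for s in stripes:
        lo = max(offset, s.start)
        hi = min(end, s.end)
        if lo < hi:  # the request overlaps this reader's stripe
            pieces.append((s.reader, lo - s.start, hi - lo))
    return pieces


if __name__ == "__main__":
    stripes = make_stripes(file_size=100, num_readers=4)
    # A consumer asking for bytes [20, 60) is served by readers 0, 1, and 2.
    print(route_request(stripes, offset=20, length=40))
    # prints [(0, 20, 5), (1, 0, 25), (2, 0, 10)]
```

In this sketch the file system sees only `num_readers` contiguous reads, however many consumers exist or wherever load balancing has placed them. Each consumer's request is resolved against the readers' buffers.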
