Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems

The semantics of HPC storage systems are defined by the consistency models they adhere to. Storage consistency models have been less studied than their counterparts in memory systems, with the exception of the POSIX standard and its strict consistency model. The use of POSIX consistency imposes a performance penalty that becomes more significant as the scale of parallel file systems increases and the access time to storage devices, such as node-local solid-state storage devices, decreases. While some efforts have been made to adopt relaxed storage consistency models, these models are often defined informally and ambiguously as by-products of a particular implementation. In this work, we establish a connection between memory consistency models and storage consistency models and revisit the key design choices of storage consistency models from a high-level perspective. Further, we propose a formal and unified framework for defining storage consistency models and a layered implementation that can be used to easily evaluate their relative performance for different I/O workloads. Finally, we conduct a comprehensive performance comparison of two relaxed consistency models on a range of commonly seen parallel I/O workloads, such as checkpoint/restart of scientific applications and random reads of deep learning applications. We demonstrate that for certain I/O scenarios, a weaker consistency model can significantly improve I/O performance. For instance, in the small random reads that are typically found in deep learning applications, session consistency achieved a 5x improvement in I/O bandwidth compared to commit consistency, even at small scales.
