As the volume of data being produced grows at an exponential rate and must be processed quickly, it is reasonable to keep that data very close to the compute devices to reduce transfer latency. This need has drawn close attention to local filesystems, in particular their inner workings, their performance, and, most importantly, their limitations. This study analyzes five popular Linux filesystems (EXT4, XFS, BtrFS, ZFS, and F2FS) by creating, storing, and then reading back one billion files on each local filesystem. The study also captured and analyzed read/write throughput, storage block usage, disk space utilization and overhead, and other metrics useful to system designers and integrators. Furthermore, the study explored side effects such as filesystem performance degradation during and after the creation of these large numbers of files and folders.
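The create/store/read-back methodology described above can be sketched as a simple harness. This is not the authors' benchmark code; the file count, payload size, and two-level directory layout below are illustrative assumptions, scaled far down from one billion files.

```python
# Minimal sketch of a create-then-read-back filesystem benchmark.
# Hypothetical parameters: 100 subdirectories, 64-byte payloads.
import os
import tempfile
import time

def create_and_read_back(root: str, num_files: int, payload: bytes = b"x" * 64):
    """Create num_files small files under root, then read each back.
    Returns (write_seconds, read_seconds)."""
    start = time.perf_counter()
    for i in range(num_files):
        # Spread files across subdirectories to bound per-directory entry
        # counts, a common tactic when scaling to very large file populations.
        subdir = os.path.join(root, f"d{i % 100:02d}")
        os.makedirs(subdir, exist_ok=True)
        with open(os.path.join(subdir, f"f{i}.dat"), "wb") as fh:
            fh.write(payload)
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    for i in range(num_files):
        path = os.path.join(root, f"d{i % 100:02d}", f"f{i}.dat")
        with open(path, "rb") as fh:
            assert fh.read() == payload  # verify the read-back phase
    read_s = time.perf_counter() - start
    return write_s, read_s

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        w, r = create_and_read_back(root, 1000)
        print(f"1000 files: write={w:.3f}s read={r:.3f}s")
```

A real run at the study's scale would also need to capture block and space usage (e.g. via `statvfs` or `df`) between phases, and would be sensitive to page-cache effects that this sketch ignores.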