Data redundancy techniques have been applied in many settings to provide fault tolerance and performance gains, but they are mostly found at the hardware, device driver, or file system level. For logical data, the use of integrity techniques has in practice largely been limited to verifying transferred files with cryptographic hashes. In this paper, we study the RAID scheme used with disk arrays and adapt it for use with logical data. We design such a system in theory and implement it in software, and we provide specifications for the procedures and file formats used. We then conduct rigorous experiments to test the effectiveness of the system across multiple use cases. In computer-generated benchmarks and simulated experiments, the system robustly recovers from arbitrary faults in large archive files using only a small fraction of redundant data. It achieves this by leveraging computing power during the data recovery process.
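To make the idea of RAID-style redundancy over logical data concrete, the following is a minimal sketch and not the system specified in the paper. It assumes a single XOR parity block (as in RAID 4/5) computed over fixed-size blocks of a byte string, with per-block SHA-256 digests used to locate a damaged block. The function names (`split_blocks`, `xor_blocks`, `protect`, `recover`), the block size, and the zero-padding convention are all hypothetical choices made for illustration. It also assumes that damage corrupts bytes in place without changing the file length.

```python
import hashlib
from functools import reduce


def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into equal-size blocks, zero-padding the last one (assumed convention)."""
    padded = data + b"\x00" * (-len(data) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def protect(data: bytes, block_size: int) -> tuple[bytes, list[bytes]]:
    """Return one parity block plus a SHA-256 digest for each data block."""
    blocks = split_blocks(data, block_size)
    parity = xor_blocks(blocks)
    digests = [hashlib.sha256(block).digest() for block in blocks]
    return parity, digests


def recover(damaged: bytes, parity: bytes, digests: list[bytes],
            block_size: int, original_len: int) -> bytes:
    """Rebuild at most one corrupted block from the parity block."""
    blocks = split_blocks(damaged, block_size)
    bad = [i for i, block in enumerate(blocks)
           if hashlib.sha256(block).digest() != digests[i]]
    if len(bad) > 1:
        raise ValueError("single parity cannot repair more than one damaged block")
    if bad:
        i = bad[0]
        # XOR of the parity with all healthy blocks yields the missing block.
        blocks[i] = xor_blocks(blocks[:i] + blocks[i + 1:] + [parity])
    return b"".join(blocks)[:original_len]


if __name__ == "__main__":
    original = bytes(range(256)) * 4            # 1024 bytes of sample data
    parity, digests = protect(original, block_size=100)
    damaged = bytearray(original)
    damaged[250:260] = b"\xff" * 10             # corrupt bytes inside one block
    repaired = recover(bytes(damaged), parity, digests, 100, len(original))
    print(repaired == original)
```

In this sketch the redundant data is one parity block plus one digest per block, so its size shrinks relative to the file as the number of blocks grows. The trade-off is that a single parity block can repair damage confined to one block only; damage spanning several blocks is detected but not repaired.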