Distributed storage systems must handle both data heterogeneity, which arises from non-uniform access demands, and device heterogeneity, which is caused by time-varying node reliability. In this paper, we study convertible codes, which address device heterogeneity by enabling one code to be transformed into another at minimum cost in the merge regime. We derive general lower bounds on the read and write costs of code conversion that apply to arbitrary linear codes. We then focus on Reed-Muller codes, which efficiently handle data heterogeneity, and construct explicit conversion procedures that, for the first time, address both forms of heterogeneity together in distributed data storage.