Current causally consistent data storage algorithms use partial or full replication to provide clients with data access in a distributed setting. We develop, for the first time, an erasure coding-based algorithm called CausalEC that ensures causal consistency for a collection of read-write objects stored across a distributed set of nodes over an asynchronous message-passing system. CausalEC can use an arbitrary linear erasure code for data storage and ensures the liveness, fault-tolerance, and storage properties prescribed by the erasure code. CausalEC retains a key benefit of previous replication-based algorithms: every write operation is "local", that is, a server performs only local actions before returning to the client that issued the write operation. For servers that store certain objects in an uncoded manner, read operations to those objects also return locally. In general, a server can return a read operation to an object after contacting a small subset of other servers, so long as the underlying erasure code allows the object to be decoded from that subset. Notably, unlike previous consistent erasure coding-based algorithms, CausalEC is compatible with cross-object erasure coding, where nodes encode values across multiple objects. CausalEC navigates the technical challenges of cross-object erasure coding, in particular those pertaining to re-encoding when writes update the values, and to ensuring that concurrent reads are served in a non-blocking manner during the transition to storing codeword symbols corresponding to the updated values.
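To make the idea of cross-object erasure coding concrete, the following is a minimal sketch of a toy linear code over two objects. It is not the CausalEC protocol and does not address causal consistency, concurrency, or failures. The three-server layout (server 1 stores X1 uncoded, server 2 stores X2 uncoded, server 3 stores X1 XOR X2) and all function names are assumptions chosen for illustration only.

```python
from typing import Dict


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length values (addition over GF(2))."""
    if len(a) != len(b):
        raise ValueError("values must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode(x1: bytes, x2: bytes) -> Dict[int, bytes]:
    """Hypothetical layout: server 1 holds X1, server 2 holds X2,
    server 3 holds the cross-object symbol X1 xor X2."""
    return {1: x1, 2: x2, 3: xor_bytes(x1, x2)}


def read_x1(server: int, symbols: Dict[int, bytes]) -> bytes:
    """Return X1 as seen from the given server.

    Server 1 stores X1 uncoded, so the read is local. In this toy
    sketch any other server decodes X1 from the symbols of servers
    2 and 3, which is the subset the code allows decoding from.
    """
    if server not in symbols:
        raise ValueError("unknown server")
    if server == 1:
        return symbols[1]
    return xor_bytes(symbols[3], symbols[2])


def reencode_after_write_x1(symbols: Dict[int, bytes], new_x1: bytes) -> Dict[int, bytes]:
    """Re-encode after a write to X1.

    Because the code is linear, the coded symbol can be updated from
    the old and new values of X1 alone, without reading X2.
    """
    delta = xor_bytes(symbols[1], new_x1)
    return {1: new_x1, 2: symbols[2], 3: xor_bytes(symbols[3], delta)}
```

In this sketch, a read of X1 at server 1 needs no communication, while a read at server 3 needs one other symbol. The hard part that the abstract describes, serving concurrent reads without blocking while servers move from old to new codeword symbols, is deliberately left out here.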