Trusted Execution Environments (TEEs) have enabled Byzantine Fault-Tolerant (BFT) consensus systems that offer confidentiality and improved scalability. However, TEEs do not provide state continuity: during recovery, a compromised host can roll back a crashed enclave to a stale persistent state, threatening both safety and availability. Existing defenses face a fundamental tradeoff: they either impose substantial overhead on critical consensus paths, reducing throughput and increasing latency, or incur prolonged recovery delays, hurting availability. We present the first systematic taxonomy of rollback-resilient recovery for confidential BFT consensus, distilling prior approaches into four categories and exposing their inherent limitations. Guided by this analysis, we design CHIMERA, a protocol-aware recovery framework that breaks this tradeoff. Our key insight is that rollback protection in consensus systems should not be uniform: different types of persistent state differ fundamentally in their distribution, update behavior, and representation. CHIMERA separates persistent state into metadata and logs according to these protocol-level properties and applies a distinct recovery mechanism to each. We formally model CHIMERA in Maude and verify its safety and liveness properties. We implement it on Braft and ZooKeeper using Intel TDX and evaluate it in both LAN and WAN settings. Results show that CHIMERA achieves higher throughput, lower recovery latency, and better availability than state-of-the-art rollback-resilient baselines.