Amortised analysis is a technique for proving a combined time bound for a batch of operations on a data structure, even if some of those operations are expensive. But the traditional method of amortised analysis yields incorrect time bounds when the data structure is used persistently. Persistence allows operations to be performed on previous versions of the data structure, which prevents us from amortising expensive restructuring work. In his seminal book, Chris Okasaki showed how to extend amortised analysis to persistent usage. His method works by extending the data structure with thunks and performing the analysis with debits rather than credits. His argument, that credits are unsound for analysing persistent usage, has become folklore. In this paper, we provide a new perspective on the role of debits in Okasaki's work. First, we set up an operational semantics of call-by-value lambda calculus with thunks, and show formally that traditional amortised analysis does not work in a persistent setting. Then we show that, contrary to the folklore, credit-based amortised analysis can be sound in a persistent setting as long as credits are only stored on thunks. Finally, we provide a formal semantics for Okasaki's debit-based approach. Our paper clarifies the formal foundation of Okasaki's work and makes it accessible to a wider audience.
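The failure of traditional amortised analysis under persistent usage can be illustrated with a small sketch. The code below is not taken from the paper: it assumes the classic two-list queue as the data structure, and a hypothetical cost model that charges one step per list cell visited during a reversal. All names (`Queue`, `snoc`, `tail`, `Thunk`, `steps`, and the three `*_cost` helpers) are illustrative. The sketch compares single-threaded usage, repeated operations on one old version, and a memoised thunk that shares the expensive reversal.

```python
from typing import Callable, NamedTuple, Optional

# Immutable cons list: None is the empty list, (head, tail) is a cell.
Cell = Optional[tuple]

steps = 0  # assumed cost model: one step per cell visited by reverse


def reverse(xs: Cell) -> Cell:
    """Reverse a cons list, counting one step per cell."""
    global steps
    out = None
    while xs is not None:
        head, xs = xs
        out = (head, out)
        steps += 1
    return out


class Queue(NamedTuple):
    front: Cell  # elements in dequeue order
    rear: Cell   # elements in reverse enqueue order


EMPTY = Queue(None, None)


def snoc(q: Queue, x: int) -> Queue:
    """Enqueue x; returns a new version and leaves q untouched."""
    return Queue(q.front, (x, q.rear))


def tail(q: Queue) -> Queue:
    """Drop the first element; assumes q is non-empty."""
    front, rear = q.front, q.rear
    if front is None:
        # Expensive restructuring: reverse the whole rear list.
        front, rear = reverse(rear), None
    _, rest = front
    return Queue(rest, rear)


def build(n: int) -> Queue:
    q = EMPTY
    for i in range(n):
        q = snoc(q, i)
    return q


def ephemeral_cost(n: int) -> int:
    """Single-threaded usage: each version is used exactly once."""
    global steps
    steps = 0
    q = build(n)
    for _ in range(n):
        q = tail(q)
    return steps


def persistent_cost(n: int, k: int) -> int:
    """Persistent usage: call tail k times on the same old version."""
    global steps
    steps = 0
    q = build(n)
    for _ in range(k):
        tail(q)  # every call repeats the full reversal
    return steps


class Thunk:
    """Memoised suspension: the computation runs at most once."""

    def __init__(self, compute: Callable[[], Cell]) -> None:
        self._compute = compute
        self._done = False
        self._value: Cell = None

    def force(self) -> Cell:
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


def shared_thunk_cost(n: int, k: int) -> int:
    """Force one suspended reversal k times; only the first force pays."""
    global steps
    steps = 0
    q = build(n)
    suspended = Thunk(lambda: reverse(q.rear))
    for _ in range(k):
        suspended.force()
    return steps
```

Under this cost model, `ephemeral_cost(100)` counts 100 steps, so the reversal is paid for by the 100 preceding `snoc` operations. `persistent_cost(100, 50)` counts 5000 steps, because the savings accumulated by those `snoc` operations are spent again on every call against the same version. `shared_thunk_cost(100, 50)` counts 100 steps, since memoisation lets every use of the suspension share one reversal. This shows only the memoisation ingredient; it is not a full reconstruction of Okasaki's queues or of the paper's formal semantics.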