The Ethereum blockchain uses the EIP-1559 algorithm to manage transaction inclusion and block assembly. However, EIP-1559 and much of the existing literature approach this problem from a static perspective, focusing on price evolution without modelling transaction dynamics within the mempool. Motivated by this limitation, we study a dynamic transaction scheduling problem in which transactions with heterogeneous sizes and per-unit values arrive over time and remain in the mempool until scheduled. To capture the stochastic mempool evolution, we formulate the problem as a Markov Decision Process (MDP) whose state represents the mempool configuration and whose actions correspond to block prices. We first provide a primal-dual interpretation of the static EIP-1559 mechanism, showing that block prices arise naturally as dual variables of a social-welfare maximization problem. Building on this perspective, we extend the framework to the dynamic setting and formulate an objective that maximizes long-run discounted reward while incorporating holding costs and overshoot penalties. We then employ a Natural Policy Gradient (NPG) algorithm to compute the optimal policy. Our results show that dynamic pricing stabilizes the mempool while maximizing long-run discounted reward. In particular, as the overshoot penalty increases, the average scheduled transaction volume converges to the target block capacity, and the resulting NPG updates closely resemble the EIP-1559 price update rule. Finally, we study two special cases of the MDP formulation: homogeneous transactions and uniform arrivals. In the homogeneous setting, where the protocol directly controls scheduled volume, we show that the optimal policy has a threshold structure. For uniform arrivals, we propose a bang-bang pricing mechanism and derive a lower bound on the block capacity needed to ensure system stability.
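For readers unfamiliar with the EIP-1559 price update rule that the NPG updates are compared against, the following is a minimal sketch of the base-fee update as described in the EIP-1559 specification. It is not the paper's model or its NPG algorithm; the function and parameter names are chosen here for illustration, and the constants (a change denominator of 8 and an elasticity multiplier of 2) are the values given in the specification.

```python
# Constants as given in the EIP-1559 specification.
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # caps the per-block change at 12.5%
ELASTICITY_MULTIPLIER = 2            # gas limit = 2 x gas target


def next_base_fee(parent_base_fee: int, parent_gas_used: int, parent_gas_limit: int) -> int:
    """Return the next block's base fee (in wei), using integer arithmetic.

    Illustrative helper: the name and signature are assumptions of this sketch.
    """
    target = parent_gas_limit // ELASTICITY_MULTIPLIER

    # Block exactly at target: the price stays unchanged.
    if parent_gas_used == target:
        return parent_base_fee

    # Block above target: raise the price in proportion to the overshoot,
    # by at least 1 wei.
    if parent_gas_used > target:
        overshoot = parent_gas_used - target
        increase = max(
            parent_base_fee * overshoot // target // BASE_FEE_MAX_CHANGE_DENOMINATOR,
            1,
        )
        return parent_base_fee + increase

    # Block below target: lower the price in proportion to the shortfall.
    shortfall = target - parent_gas_used
    decrease = parent_base_fee * shortfall // target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    return parent_base_fee - decrease
```

Under this rule the price moves by at most 12.5% per block and is stationary only when the scheduled volume equals the target, which is the static behaviour the abstract contrasts with a policy that also conditions on the mempool state.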