This paper introduces the Inside-Out Nested Particle Filter (IO-NPF), a novel, fully recursive algorithm for amortized sequential Bayesian experimental design in the non-exchangeable setting. We frame policy optimization as maximum likelihood estimation in a non-Markovian state-space model, achieving at most $\mathcal{O}(T^2)$ computational complexity in the number of experiments $T$. We provide theoretical convergence guarantees and introduce a backward sampling algorithm to reduce trajectory degeneracy. IO-NPF offers a practical, extensible, and provably consistent approach to sequential Bayesian experimental design, with improved efficiency over existing methods.