This paper introduces the Inside-Out Nested Particle Filter (IO-NPF), a novel, fully recursive algorithm for amortized sequential Bayesian experimental design in the non-exchangeable setting. We frame policy optimization as maximum likelihood estimation in a non-Markovian state-space model, with a computational complexity of at most $\mathcal{O}(T^2)$ in the number of experiments $T$. We provide theoretical convergence guarantees and introduce a backward sampling algorithm to reduce trajectory degeneracy. IO-NPF offers a practical, extensible, and provably consistent approach to sequential Bayesian experimental design, and demonstrates improved efficiency over existing methods.