When we exercise sequences of actions, their execution becomes more fluent and precise. Here, we consider the possibility that exercised action sequences can also be used to make planning faster and more accurate by focusing expansion of the search tree on paths that have been frequently used in the past, and by reducing deep planning problems to shallow ones via multi-step jumps in the tree. To capture such sequences, we use a flexible Bayesian action chunking mechanism which finds and exploits statistically reliable structure at different scales. This gives rise to shorter or longer routines that can be embedded into a Monte-Carlo tree search planner. We show the benefits of this scheme using a physical construction task patterned after tangrams.
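As a rough illustration of the idea, the sketch below extracts frequently repeated action subsequences from past episodes and offers them to a tree-search expansion step as multi-step moves. It makes strong simplifying assumptions: plain frequency counting stands in for the Bayesian chunking mechanism, which is not reproduced here, and every name (`mine_chunks`, `expansion_candidates`, `apply_move`, and the action labels) is hypothetical rather than taken from the paper.

```python
from collections import Counter


def mine_chunks(episodes, max_len=3, min_count=2):
    """Count contiguous action subsequences of length 2..max_len.

    Assumption: a simple frequency threshold replaces the Bayesian
    chunking described in the text.
    """
    counts = Counter()
    for episode in episodes:
        for n in range(2, max_len + 1):
            for i in range(len(episode) - n + 1):
                counts[tuple(episode[i:i + n])] += 1
    # Keep only subsequences seen at least min_count times.
    return {chunk: c for chunk, c in counts.items() if c >= min_count}


def expansion_candidates(primitive_actions, chunks):
    """Order the moves to try when expanding a search-tree node.

    Chunks come first (most frequent, then longest), so expansion is
    biased toward well-practised paths; primitive actions follow as
    one-step moves so the planner can still reach any state.
    """
    ranked = sorted(chunks, key=lambda c: (-chunks[c], -len(c), c))
    return ranked + [(a,) for a in primitive_actions]


def apply_move(state, move, step):
    """Apply a (possibly multi-step) move as a single jump in the tree."""
    for action in move:
        state = step(state, action)
    return state
```

Because a chunk is applied as one edge of the tree, a plan that needs several primitive steps occupies a single level of search depth, which is the sense in which a deep problem becomes a shallow one. A full planner would also need value estimates and a selection rule such as UCT, both omitted here.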