A significant challenge in applying planning technology to real-world problems is obtaining a planning model that accurately represents the problem's dynamics. Numeric Safe Action Models Learning (N-SAM) is a recently proposed algorithm that addresses this challenge by learning the preconditions and effects of actions from observations in domains that may involve both discrete and continuous state variables. N-SAM has several attractive properties: it runs in polynomial time and is guaranteed to output an action model that is safe, in the sense that plans generated with it are applicable and achieve their intended goals. To preserve this safety guarantee, however, N-SAM must observe a substantial number of examples of each action before including that action in the learned action model. We address this limitation and propose N-SAM*, an enhanced version of N-SAM that always returns an action model in which every observed action is applicable in at least some state, even if it was observed only once. N-SAM* does so without compromising the safety of the returned action model. We prove that N-SAM* is optimal in terms of sample complexity among all algorithms that guarantee safety. An empirical study on a set of benchmark domains shows that the action models returned by N-SAM* enable solving significantly more problems than those returned by N-SAM.