In robotics, likelihood-free inference (LFI) can provide the domain distribution that adapts a learned agent to a parametric set of deployment conditions. LFI assumes an arbitrary support for sampling, which remains constant as the initial generic prior is iteratively refined into more descriptive posteriors. However, a misspecified support can lead to suboptimal, yet falsely certain, posteriors. To address this issue, we propose three heuristic LFI variants: EDGE, MODE, and CENTRE. Each interprets the shift of the posterior mode across inference steps in its own way and, when integrated into an LFI step, adapts the support alongside posterior inference. We first expose the support misspecification issue and evaluate our heuristics on stochastic dynamical benchmarks. We then evaluate the impact of heuristic support adaptation on parameter inference and policy learning for a dynamic deformable linear object (DLO) manipulation task. Inference yields a finer length and stiffness classification for a parametric set of DLOs. When the resulting posteriors are used as domain distributions for simulation-based policy learning, they lead to more robust object-centric agent performance.
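The support-adaptation idea can be illustrated with a minimal toy sketch. All names, numbers, and update rules below are illustrative assumptions, not the paper's actual EDGE, MODE, or CENTRE heuristics: a uniform sampling support is re-centred on a crude posterior mode estimate after each rejection-ABC round, so that an initially misspecified support (one that excludes the true parameter) can drift toward it over inference steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def recentre_support(low, high, mode):
    # Keep the support width fixed and shift its centre onto the
    # estimated posterior mode (a CENTRE-like move; hypothetical rule).
    half = (high - low) / 2.0
    return mode - half, mode + half

# Misspecified initial support: the true parameter 2.3 lies outside [0, 2].
true_theta = 2.3
low, high = 0.0, 2.0
observation = true_theta  # noise-free observation of a toy identity system

for step in range(5):
    thetas = rng.uniform(low, high, size=5000)            # sample support
    sims = thetas + rng.normal(0.0, 0.2, size=5000)       # toy simulator
    accepted = thetas[np.abs(sims - observation) < 0.1]   # ABC acceptance
    if accepted.size == 0:
        break                                             # no evidence: keep support
    mode = np.median(accepted)                            # crude mode estimate
    low, high = recentre_support(low, high, mode)         # adapt support

print(low, high)
```

With the original fixed support, all accepted samples pile up against the upper bound at 2.0 and the posterior looks confidently wrong; with the re-centring step, the support moves until the true parameter is well inside it, which is the failure mode and remedy the abstract describes.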