In interactive task learning (ITL), AI agents learn new capabilities from limited human instruction provided during task execution. STAND is a data-efficient method for rule precondition induction designed specifically for these human-in-the-loop training scenarios. A key feature of STAND is that it is aware of its own learning: it can report accurate metrics of training progress back to users. STAND outperforms popular methods such as XGBoost, decision trees, random forests, and version spaces on small-data precondition induction tasks, and it accurately estimates when its performance will improve on holdout examples. In our evaluations, STAND shows more monotonic improvement than other models, with low rates of error recurrence. These features support a more consistent training experience, enabling human instructors to estimate when they are finished training and providing active-learning support by identifying trouble spots where more training is required.