Robots interacting with humans must not only generate learned movements in real time, but also infer the intent behind observed behaviors and estimate the confidence of their own inferences. This paper proposes a unified model, CERNet, that achieves all three capabilities within a single hierarchical predictive-coding recurrent neural network (PC-RNN) whose dynamically updated class embedding vector unifies motor generation and recognition. The model operates in two modes: generation and inference. In the generation mode, the class embedding constrains the hidden-state dynamics to a class-specific subspace; in the inference mode, it is optimized online to minimize prediction error, enabling real-time recognition. Validated on a humanoid robot across 26 kinesthetically taught alphabet letters, our hierarchical model achieves 76% lower trajectory-reproduction error than a parameter-matched single-layer baseline, maintains motion fidelity under external perturbations, and infers the demonstrated trajectory class online with 68% Top-1 and 81% Top-2 accuracy. Furthermore, internal prediction errors naturally reflect the model's confidence in its own recognition. Integrating robust generation, real-time recognition, and intrinsic uncertainty estimation within a single PC-RNN offers a compact and extensible approach to motor memory in physical robots, with potential applications in intent-sensitive human-robot collaboration.
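The two operating modes described above can be sketched in a few lines. The following is a minimal toy illustration, not the paper's implementation: the network is a single-layer RNN with random "trained" weights standing in for CERNet's learned hierarchical parameters, the dimensions are arbitrary, and inference uses a numerical-gradient loop in place of predictive-coding error propagation. It only shows the core idea: in generation mode the class embedding biases the hidden dynamics, and in inference mode a free embedding is optimized online to minimize prediction error, with the residual error serving as a confidence signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; CERNet is hierarchical and far larger).
N_CLASSES, EMB_DIM, HID_DIM, OBS_DIM, T = 3, 4, 16, 2, 30

# Fixed random weights stand in for trained parameters.
W_e = rng.normal(0, 0.5, (HID_DIM, EMB_DIM))       # embedding -> hidden bias
W_h = rng.normal(0, 0.3, (HID_DIM, HID_DIM))       # recurrent weights
W_o = rng.normal(0, 0.5, (OBS_DIM, HID_DIM))       # hidden -> observation
class_emb = rng.normal(0, 1.0, (N_CLASSES, EMB_DIM))  # one embedding per class

def generate(emb, steps=T):
    """Generation mode: the class embedding biases the recurrent dynamics,
    confining them to a class-specific subspace; roll out a trajectory."""
    h = np.zeros(HID_DIM)
    traj = []
    for _ in range(steps):
        h = np.tanh(W_h @ h + W_e @ emb)
        traj.append(W_o @ h)
    return np.array(traj)

def infer(observed, lr=0.01, iters=300):
    """Inference mode: optimize a free embedding online by (numerical)
    gradient descent on the prediction error against the observation."""
    def loss(e):
        return np.sum((generate(e) - observed) ** 2)

    emb = np.zeros(EMB_DIM)
    best_emb, best_err = emb.copy(), loss(emb)
    for _ in range(iters):
        base = loss(emb)
        if base < best_err:                 # track the best iterate seen
            best_err, best_emb = base, emb.copy()
        grad = np.zeros(EMB_DIM)
        for i in range(EMB_DIM):            # finite-difference gradient
            e = emb.copy()
            e[i] += 1e-4
            grad[i] = (loss(e) - base) / 1e-4
        emb -= lr * grad
    return best_emb, best_err

# Recognition: pick the stored class embedding nearest to the inferred one;
# the residual prediction error gauges the model's confidence.
demo = generate(class_emb[1])               # pretend a human demonstrated class 1
emb_hat, err = infer(demo)
pred = int(np.argmin(np.linalg.norm(class_emb - emb_hat, axis=1)))
```

In the full model the same prediction-error signal that drives the embedding update is what the abstract refers to as an intrinsic confidence estimate: a large residual after inference indicates an ambiguous or out-of-distribution demonstration.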