Natural co-speech gestures are an essential component of a good human-robot interaction (HRI) experience. However, current gesture generation approaches have notable limitations: the generated gestures may be unnatural, misaligned with the speech and its content, or lacking in diverse speaker styles. This work therefore reproduces the work of Yoon et al., which generates natural gestures in simulation from tri-modal inputs, and applies it to a robot. For objective evaluation, ``motion variance'' and the ``Fr\'echet Gesture Distance (FGD)'' are employed; human participants were then recruited to evaluate the gestures subjectively. Results show that the movements from the original paper transfer successfully to the robot, and that the generated gestures exhibit diverse styles and are correlated with the speech. Moreover, there are significant differences in likeability and perceived style between the different gestures.
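As a rough illustration of the FGD metric mentioned above, the sketch below computes a Fréchet distance between two sets of gesture feature vectors, assuming FGD follows the standard Fréchet (FID-style) formulation: fit a Gaussian to each feature set and evaluate \(\|\mu_1-\mu_2\|^2 + \mathrm{Tr}(\Sigma_1+\Sigma_2-2(\Sigma_1\Sigma_2)^{1/2})\). The function name and the use of raw feature arrays are illustrative; in practice the features would come from a pretrained gesture autoencoder.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_gesture_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_real, feats_gen: arrays of shape (n_samples, feat_dim),
    e.g. latent codes of real vs. generated gesture clips.
    (Illustrative sketch; not the authors' exact implementation.)
    """
    mu1, mu2 = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma1 = np.cov(feats_real, rowvar=False)
    sigma2 = np.cov(feats_gen, rowvar=False)

    # Matrix square root of the covariance product; small imaginary
    # parts can appear from numerical error and are discarded.
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

For identical feature distributions the distance is (numerically) zero, and it grows as the generated gesture statistics drift away from the real ones.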