Animation Fidelity in Self-Avatars: Impact on User Performance and Sense of Agency

The use of self-avatars is gaining popularity thanks to affordable VR headsets. Unfortunately, mainstream VR devices often use a small number of trackers and provide low-accuracy animations. Previous studies have shown that the Sense of Embodiment, and in particular the Sense of Agency, depends on the extent to which the avatar's movements mimic the user's movements. However, few works study this effect for tasks requiring precise interaction with the environment, i.e., tasks that require accurate manipulation, precise foot stepping, or correct body poses. In these cases, users are likely to notice inconsistencies between their self-avatar's pose and their actual pose. In this paper, we study the impact of the animation fidelity of the user avatar on a variety of tasks that focus on arm movement, leg movement, and body posture. We compare three animation techniques: two that use Inverse Kinematics (IK) to reconstruct the pose from sparse input (6 trackers), and a third that uses a professional motion capture (MoCap) system with 17 inertial sensors. We evaluate these techniques both quantitatively (completion time, unintentional collisions, pose accuracy) and qualitatively (Sense of Embodiment). Our results show that animation quality affects the Sense of Embodiment. Inertial-based MoCap performs significantly better in mimicking body poses. Surprisingly, IK-based solutions using fewer sensors outperformed MoCap in tasks requiring accurate positioning. We attribute this to MoCap's higher latency and to its positional drift, which causes errors at the end-effectors that are more noticeable in contact areas such as the feet.
