Automated driving has made remarkable progress, yet situations still arise in which human intervention is necessary. Teleoperation offers a scalable way to handle such cases, enabling remote operators to support vehicles without being physically present. In this context, video transmission is the operator's primary source of situational awareness, making video quality a decisive factor for both safety and task performance. In an online study, participants rated the subjective quality of compressed video sequences from the Zenseact Dataset. These ratings were then used to retrain the Video Multi-Method Assessment Fusion (VMAF) model, yielding an adapted variant tailored to teleoperation. The retrained model aligned more closely with human ratings than the original 4K VMAF model: the root mean square error (RMSE) decreased from 10.36 to 8.83 and the mean absolute deviation (MAD) from 8.71 to 6.38, corresponding to improvements of 15% and 27%, respectively. These results indicate that incorporating domain-specific data can enhance the predictive power of established quality metrics in safety-critical applications. At the same time, outlier cases emerged in which videos received high objective scores despite noticeable degradations in regions critical for the driving task.
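To make the reported error figures concrete, the following minimal Python sketch shows how RMSE and MAD between predicted and subjective scores could be computed, and how the stated relative improvements follow from the reported values. It assumes MAD denotes the mean absolute difference between model predictions and subjective ratings; the function names and the small score lists are hypothetical illustrations, not data or code from the study.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and subjective scores."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mad(predicted, observed):
    """Mean absolute difference (assumed meaning of MAD here)."""
    absolute = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(absolute) / len(absolute)


def relative_improvement(before, after):
    """Percentage reduction of an error metric."""
    return (before - after) / before * 100.0


# Hypothetical toy scores on a 0-100 scale, for illustration only.
toy_predicted = [80.0, 65.0, 40.0, 90.0]
toy_subjective = [70.0, 60.0, 50.0, 85.0]
toy_rmse = rmse(toy_predicted, toy_subjective)
toy_mad = mad(toy_predicted, toy_subjective)

# Error values reported in the abstract (original vs. retrained model).
rmse_gain = relative_improvement(10.36, 8.83)
mad_gain = relative_improvement(8.71, 6.38)

print(f"Toy RMSE: {toy_rmse:.2f}, toy MAD: {toy_mad:.2f}")
print(f"RMSE improvement: {rmse_gain:.0f}%")
print(f"MAD improvement: {mad_gain:.0f}%")
```

Rounded to whole percentages, the two relative reductions match the 15% and 27% stated above.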